aokolnychyi commented on a change in pull request #3565:
URL: https://github.com/apache/iceberg/pull/3565#discussion_r750687620



##########
File path: 
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkSchemaUtil.java
##########
@@ -319,7 +319,12 @@ public static void validateMetadataColumnReferences(Schema tableSchema, Schema r
   }
 
   public static Map<Integer, String> indexQuotedNameById(Schema schema) {
+    return indexQuotedNameById(schema, null);
+  }
+
+  public static Map<Integer, String> indexQuotedNameById(Schema schema, Set<Integer> fieldIds) {
     Function<String, String> quotingFunc = name -> String.format("`%s`", name.replace("`", "``"));
-    return TypeUtil.indexQuotedNameById(schema.asStruct(), quotingFunc);
+    Schema projectedSchema = fieldIds != null ? TypeUtil.select(schema, fieldIds) : schema;

Review comment:
       It depends. I think the biggest performance hit is calling `replace` on 
the underlying strings. We could pass a set of field IDs to the indexing util or 
just adapt the quoting function to use a regex to check whether the column needs 
to be quoted. 
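
A rough sketch of the second option, only to illustrate the idea: the class name `QuotingSketch`, the method `quoteIfNeeded`, and the `SIMPLE_NAME` pattern are hypothetical and not part of the PR. It assumes that a name made only of letters, digits, and underscores is safe to leave unquoted, which ignores reserved keywords, so a real change would likely need a stricter check.

```java
import java.util.function.Function;
import java.util.regex.Pattern;

final class QuotingSketch {
  // Assumption: names matching this pattern need no quoting.
  // Reserved keywords are NOT handled here.
  private static final Pattern SIMPLE_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

  private QuotingSketch() {
  }

  // Returns the name unchanged when it looks simple; otherwise escapes
  // embedded backticks and wraps the name in backticks, as the current
  // quoting function does for every name.
  static String quoteIfNeeded(String name) {
    if (SIMPLE_NAME.matcher(name).matches()) {
      return name;
    }
    return "`" + name.replace("`", "``") + "`";
  }

  // Same logic exposed as a Function so it can stand in for quotingFunc.
  static final Function<String, String> QUOTING_FUNC = QuotingSketch::quoteIfNeeded;
}
```

With this sketch, `replace` and the string concatenation only run for names that fail the pattern check, so the common case of plain column names is returned as-is. Whether the regex match is actually cheaper than the unconditional `replace` would need to be measured.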




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


