aokolnychyi commented on a change in pull request #3565:
URL: https://github.com/apache/iceberg/pull/3565#discussion_r750687620
##########
File path:
spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkSchemaUtil.java
##########
@@ -319,7 +319,12 @@ public static void validateMetadataColumnReferences(Schema tableSchema, Schema r
}
public static Map<Integer, String> indexQuotedNameById(Schema schema) {
+ return indexQuotedNameById(schema, null);
+ }
+
+ public static Map<Integer, String> indexQuotedNameById(Schema schema, Set<Integer> fieldIds) {
Function<String, String> quotingFunc = name -> String.format("`%s`", name.replace("`", "``"));
- return TypeUtil.indexQuotedNameById(schema.asStruct(), quotingFunc);
+ Schema projectedSchema = fieldIds != null ? TypeUtil.select(schema, fieldIds) : schema;
Review comment:
It depends. I think the biggest performance hit is calling `replace` on
the underlying strings. We could either pass a set of field ids to the
indexing util, or adapt the quoting function to use a regex to check whether
a column needs to be quoted at all.
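To illustrate the second option, here is a minimal sketch of a quoting function that checks a precompiled regex first and only calls `replace` when a name actually needs quoting. The class name `QuotingSketch`, the `SIMPLE_NAME` pattern, and the choice to return plain names unquoted are assumptions made for illustration, not part of this PR. In particular, the sketch ignores reserved words, which may still need quoting in Spark.

```java
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical helper, not part of SparkSchemaUtil.
final class QuotingSketch {
  // Assumption: names made only of ASCII letters, digits, and underscores,
  // and not starting with a digit, are safe to leave unquoted.
  private static final Pattern SIMPLE_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

  // Drop-in shape for the Function<String, String> used in the diff.
  static final Function<String, String> QUOTING_FUNC = QuotingSketch::quote;

  static String quote(String name) {
    // Fast path: the pattern is compiled once, and no replace call is made.
    if (SIMPLE_NAME.matcher(name).matches()) {
      return name;
    }
    // Slow path: escape embedded backticks and wrap the name, as the
    // existing quoting function does.
    return "`" + name.replace("`", "``") + "`";
  }
}
```

With this sketch, `QuotingSketch.quote("id")` returns the name unchanged, while a name containing a space or a backtick still goes through the escaping path. Whether this is actually faster than the current code would need to be measured.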
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]