rdblue commented on a change in pull request #830: Support name mapping resolution for parquet
URL: https://github.com/apache/incubator-iceberg/pull/830#discussion_r404458859
 
 

 ##########
 File path: parquet/src/main/java/org/apache/iceberg/parquet/ParquetDictionaryRowGroupFilter.java
 ##########
 @@ -75,6 +86,16 @@ public ParquetDictionaryRowGroupFilter(Schema schema, Expression unbound, boolea
    */
   public boolean shouldRead(MessageType fileSchema, BlockMetaData rowGroup,
                             DictionaryPageReadStore dictionaries) {
+    StructType struct;
+
+    if (nameMapping != null) {
+      MessageType project = ParquetSchemaUtil.pruneColumnsByName(fileSchema, schema, nameMapping);
+      struct = ParquetSchemaUtil.convert(project).asStruct();
+    } else {
+      struct = schema.asStruct();
+    }
+
+    this.expr = Binder.bind(struct, Expressions.rewriteNot(expr), caseSensitive);
 
 Review comment:
   This class doesn't need to know about the name mapping. The mapping should be used to add IDs to the file schema, so that most classes don't need to add specific support for it.
   
   Look at how the other fallback happens using `ParquetSchemaUtil.addFallbackIds`. I think the name mapping should mimic that fallback in all cases.
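
   A minimal sketch of the suggestion above, written against hypothetical stand-in types rather than the real Iceberg or Parquet classes. `Column`, `FileSchemaIds`, `applyNameMapping`, and `resolve` are names invented for illustration, the name mapping is simplified to a flat name-to-ID map, and the local `hasIds` and `addFallbackIds` are assumed here to be presence-based and position-based; they are not the `ParquetSchemaUtil` implementations. The point is only that IDs get resolved once on the file schema, so a filter such as `ParquetDictionaryRowGroupFilter` never sees the mapping:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a Parquet column; not an Iceberg or Parquet class. */
final class Column {
  final String name;
  final Integer id; // null when the file carries no field ID

  Column(String name, Integer id) {
    this.name = name;
    this.id = id;
  }
}

/** Hypothetical helper that resolves field IDs once, before any filter sees the schema. */
final class FileSchemaIds {
  private FileSchemaIds() {}

  /** True if at least one column already carries a field ID. */
  static boolean hasIds(List<Column> fileSchema) {
    for (Column col : fileSchema) {
      if (col.id != null) {
        return true;
      }
    }
    return false;
  }

  /** Assigns IDs by column name; columns missing from the mapping keep a null ID. */
  static List<Column> applyNameMapping(List<Column> fileSchema, Map<String, Integer> nameMapping) {
    List<Column> result = new ArrayList<>();
    for (Column col : fileSchema) {
      result.add(new Column(col.name, nameMapping.get(col.name)));
    }
    return result;
  }

  /** Assumed fallback: assigns IDs by position, starting at 1. */
  static List<Column> addFallbackIds(List<Column> fileSchema) {
    List<Column> result = new ArrayList<>();
    int nextId = 1;
    for (Column col : fileSchema) {
      result.add(new Column(col.name, nextId++));
    }
    return result;
  }

  /** Single entry point: file IDs win, then the name mapping, then the positional fallback. */
  static List<Column> resolve(List<Column> fileSchema, Map<String, Integer> nameMapping) {
    if (hasIds(fileSchema)) {
      return fileSchema;
    }
    if (nameMapping != null) {
      return applyNameMapping(fileSchema, nameMapping);
    }
    return addFallbackIds(fileSchema);
  }
}
```

   With something of this shape, callers would pass the result of `resolve(...)` to the filter, and the `nameMapping != null` branch in `shouldRead` would not be needed.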

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
