paul-rogers commented on a change in pull request #1954: DRILL-7509: Incorrect TupleSchema is created for DICT column when querying Parquet files URL: https://github.com/apache/drill/pull/1954#discussion_r366602180
########## File path: metastore/metastore-api/src/main/java/org/apache/drill/metastore/util/SchemaPathUtils.java ########## @@ -63,6 +64,30 @@ public static ColumnMetadata getColumnMetadata(SchemaPath schemaPath, TupleMetad return colMetadata; } + /** + * Checks if field indetified by the schema path is child in either {@code DICT} or {@code REPEATED MAP}. + * For such fields, nested in {@code DICT} or {@code REPEATED MAP}, + * filters can't be removed based on Parquet statistics. + * @param schemaPath schema path used in filter + * @param schema schema containing all the fields in the file + * @return {@literal true} if field is nested inside {@code DICT} (is {@code `key`} or {@code `value`}) + * or inside {@code REPEATED MAP} field, {@literal false} otherwise. + */ + public static boolean isFieldNestedInDictOrRepeatedMap(SchemaPath schemaPath, TupleMetadata schema) { Review comment: This is a kind of cluttered version of what was just suggested. Rather than a bunch of if-statements that answers a specific question, generalize to look up a member. You'd get the same result as this method if `member(String name)` returned `null` if no such member existed. (And, you'd do it generically without having to later extend this to `isFieldNestedInDictOrRepeatedMapOrSomeNewType()`. Also, why only `Repeated` Map? (Non-repeated) Maps also have fields and allow `a.b` notation. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services