paul-rogers commented on a change in pull request #1954: DRILL-7509: Incorrect 
TupleSchema is created for DICT column when querying Parquet files
URL: https://github.com/apache/drill/pull/1954#discussion_r366602180
 
 

 ##########
 File path: 
metastore/metastore-api/src/main/java/org/apache/drill/metastore/util/SchemaPathUtils.java
 ##########
 @@ -63,6 +64,30 @@ public static ColumnMetadata getColumnMetadata(SchemaPath 
schemaPath, TupleMetad
     return colMetadata;
   }
 
+  /**
+   * Checks if field indetified by the schema path is child in either {@code 
DICT} or {@code REPEATED MAP}.
+   * For such fields, nested in {@code DICT} or {@code REPEATED MAP},
+   * filters can't be removed based on Parquet statistics.
+   * @param schemaPath schema path used in filter
+   * @param schema schema containing all the fields in the file
+   * @return {@literal true} if field is nested inside {@code DICT} (is {@code 
`key`} or {@code `value`})
+   *         or inside {@code REPEATED MAP} field, {@literal false} otherwise.
+   */
+  public static boolean isFieldNestedInDictOrRepeatedMap(SchemaPath 
schemaPath, TupleMetadata schema) {
 
 Review comment:
   This is a kind of cluttered version of what was just suggested. Rather than 
a bunch of if-statements that answers a specific question, generalize to look 
up a member. You'd get the same result as this method if `member(String name)` 
returned `null` if no such member existed. (And, you'd do it generically 
without having to later extend this to 
`isFieldNestedInDictOrRepeatedMapOrSomeNewType()`.
   
   Also, why only `Repeated` Map? (Non-repeated) Maps also have fields and 
allow `a.b` notation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to