KazydubB commented on a change in pull request #1861: DRILL-7380: Query of a field inside of an array of structs returns null
URL: https://github.com/apache/drill/pull/1861#discussion_r328619538
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java
 ##########
 @@ -111,20 +110,37 @@ public DrillParquetReader(FragmentContext fragmentContext,
     this.numRecordsToRead = initNumRecordsToRead(recordsToRead, entry.getRowGroupIndex(), footer);
   }
 
+  /**
+   * Creates projection MessageType from projection columns and given schema.
+   *
+   * @param schema Parquet file schema
+   * @param projectionColumns columns to search
+   * @param columnsNotFound any projection column which wasn't found in schema is added to the list
+   * @return projection containing matched columns or null if no column matches the schema
+   */
   private static MessageType getProjection(MessageType schema,
-                                           Collection<SchemaPath> columns,
+                                           Collection<SchemaPath> projectionColumns,
                                            List<SchemaPath> columnsNotFound) {
-    MessageType projection = null;
-
-    String messageName = schema.getName();
-    List<ColumnDescriptor> schemaColumns = schema.getColumns();
-    // parquet type.union() seems to lose ConvertedType info when merging two columns that are the same type. This can
-    // happen when selecting two elements from an array. So to work around this, we use set of SchemaPath to avoid duplicates
-    // and then merge the types at the end
-    Set<SchemaPath> selectedSchemaPaths = new LinkedHashSet<>();
+    projectionColumns = adaptColumnsToParquetSchema(projectionColumns, schema);
+    List<SchemaPath> schemaColumns = getAllColumnsFrom(schema);
+    Set<SchemaPath> selectedSchemaPaths = matchProjectionWithSchemaColumns(projectionColumns, schemaColumns, columnsNotFound);
+    MessageType projection = convertSelectedColumnsToMessageType(schema, selectedSchemaPaths);
+    return projection;
 
 Review comment:
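   nit: the `projection` local variable can be avoided by returning the result of `convertSelectedColumnsToMessageType(schema, selectedSchemaPaths)` directly.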
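
   For illustration, a minimal sketch of the shape this nit suggests. It is not the actual Drill code: `String` and `Set<String>` stand in for Parquet's `MessageType` and Drill's `SchemaPath`, and `convert` is a hypothetical placeholder for `convertSelectedColumnsToMessageType`.

```java
import java.util.Set;

class InlineReturnSketch {

  // Hypothetical stand-in for convertSelectedColumnsToMessageType(schema, selectedSchemaPaths).
  // Returns null when nothing was selected, mirroring the Javadoc in the diff.
  static String convert(String schemaName, Set<String> selectedPaths) {
    return selectedPaths.isEmpty() ? null : schemaName + selectedPaths;
  }

  // Shape currently in the diff: assign to a local variable, then return it.
  static String withLocal(String schemaName, Set<String> selectedPaths) {
    String projection = convert(schemaName, selectedPaths);
    return projection;
  }

  // Suggested shape: return the result of the call directly.
  static String inlined(String schemaName, Set<String> selectedPaths) {
    return convert(schemaName, selectedPaths);
  }
}
```

   Both methods behave the same; the second only drops the intermediate variable, so in `getProjection` the last two added lines would collapse into a single `return` statement.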

