[GitHub] [drill] ihuzenko commented on a change in pull request #1954: DRILL-7509: Incorrect TupleSchema is created for DICT column when querying Parquet files

GitBox Wed, 15 Jan 2020 02:09:41 -0800

ihuzenko commented on a change in pull request #1954: DRILL-7509: Incorrect TupleSchema is created for DICT column when querying Parquet files
URL: https://github.com/apache/drill/pull/1954#discussion_r366495844


 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/FileMetadataCollector.java
 ##########
 @@ -278,17 +273,71 @@ static ColTypeInfo of(MessageType schema, Type type, String[] path, int depth, L
         int repetitionLevel = schema.getMaxRepetitionLevel(path);
         int definitionLevel = schema.getMaxDefinitionLevel(path);
 
-        return new ColTypeInfo(type.getOriginalType(), parentTypes, precision, scale, repetitionLevel, definitionLevel);
+        Type.Repetition repetition;
+        // Check if the primitive has LIST as parent, if it does - this is an array of primitives.
+        // (See ParquetReaderUtility#isLogicalListType(GroupType) for the REPEATED field structure.)
+        if (parentTypes.size() - 2 >= 0 && parentTypes.get(parentTypes.size() - 2) == OriginalType.LIST) {
+          repetition = Type.Repetition.REPEATED;
+        } else {
+          repetition = primitiveType.getRepetition();
+        }
+
+        return new ColTypeInfo()
 
 Review comment:
   Please extract the whole block inside ```if (type.isPrimitive()) {```, which creates the ```ColTypeInfo``` for a primitive type, into a separate method.
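
   A minimal sketch of what part of that extraction could look like, limited to the repetition check visible in the diff above. Everything here is an assumption made for illustration: the class name `RepetitionSketch`, the helper name `resolveRepetition`, and the two enums, which are stand-ins for Parquet's `OriginalType` and `Type.Repetition` so the example compiles on its own. It is not the code from the pull request.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class RepetitionSketch {

  // Stand-ins for Parquet's OriginalType and Type.Repetition (assumption:
  // only the values needed for this illustration are listed).
  enum OriginalType { LIST, MAP }
  enum Repetition { REQUIRED, OPTIONAL, REPEATED }

  /**
   * Hypothetical helper holding the check from the diff: if the second-to-last
   * entry of parentTypes is LIST, the primitive is treated as an array element
   * and reported as REPEATED; otherwise its declared repetition is kept.
   */
  static Repetition resolveRepetition(List<OriginalType> parentTypes, Repetition declared) {
    int index = parentTypes.size() - 2;
    if (index >= 0 && parentTypes.get(index) == OriginalType.LIST) {
      return Repetition.REPEATED;
    }
    return declared;
  }

  public static void main(String[] args) {
    // Second-to-last parent is LIST, so the declared repetition is overridden.
    List<OriginalType> underList = Arrays.asList(OriginalType.LIST, null);
    System.out.println(resolveRepetition(underList, Repetition.OPTIONAL));

    // Too few parents to inspect, so the declared repetition is returned.
    List<OriginalType> topLevel = Collections.emptyList();
    System.out.println(resolveRepetition(topLevel, Repetition.REQUIRED));
  }
}
```

   With the check in its own method, the ```if (type.isPrimitive()) {``` branch would only have to call the helper and pass the result on when building the ```ColTypeInfo```.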

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services