ihuzenko commented on a change in pull request #1954: DRILL-7509: Incorrect 
TupleSchema is created for DICT column when querying Parquet files
URL: https://github.com/apache/drill/pull/1954#discussion_r366495844
 
 

 ##########
 File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/metadata/FileMetadataCollector.java
 ##########
 @@ -278,17 +273,71 @@ static ColTypeInfo of(MessageType schema, Type type, String[] path, int depth, L
         int repetitionLevel = schema.getMaxRepetitionLevel(path);
         int definitionLevel = schema.getMaxDefinitionLevel(path);
 
-        return new ColTypeInfo(type.getOriginalType(), parentTypes, precision, scale, repetitionLevel, definitionLevel);
+        Type.Repetition repetition;
+        // Check if the primitive has LIST as parent, if it does - this is an array of primitives.
+        // (See ParquetReaderUtility#isLogicalListType(GroupType) for the REPEATED field structure.)
+        if (parentTypes.size() - 2 >= 0 && parentTypes.get(parentTypes.size() - 2) == OriginalType.LIST) {
+          repetition = Type.Repetition.REPEATED;
+        } else {
+          repetition = primitiveType.getRepetition();
+        }
+
+        return new ColTypeInfo()
 
 Review comment:
   Please extract the whole block inside ```if (type.isPrimitive()) {``` that 
creates the ```ColTypeInfo``` for a primitive type into a separate method. 
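
   As a rough illustration of the kind of extraction requested, the 
repetition check from the hunk above could live in its own small method. 
This is only a sketch, not the PR's actual code: the class name 
ColTypeInfoSketch, the method name resolveRepetition, and the two enums are 
hypothetical stand-ins (the real code uses Parquet's OriginalType and 
Type.Repetition), chosen so the example compiles without the Parquet library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Drill code base.
final class ColTypeInfoSketch {

  // Stand-ins for org.apache.parquet.schema.OriginalType and Type.Repetition.
  enum OriginalType { LIST, MAP, UTF8 }
  enum Repetition { REQUIRED, OPTIONAL, REPEATED }

  /**
   * Mirrors the check in the diff: if the entry two positions from the end
   * of parentTypes is LIST, the primitive is treated as an array element and
   * reported as REPEATED; otherwise its own repetition is returned.
   * Entries of parentTypes may be null (a parent without an original type).
   */
  static Repetition resolveRepetition(List<OriginalType> parentTypes, Repetition ownRepetition) {
    int listParentIndex = parentTypes.size() - 2;
    if (listParentIndex >= 0 && parentTypes.get(listParentIndex) == OriginalType.LIST) {
      return Repetition.REPEATED;
    }
    return ownRepetition;
  }

  public static void main(String[] args) {
    // LIST sits two positions from the end, so the primitive is an array element.
    System.out.println(resolveRepetition(
        Arrays.asList(OriginalType.LIST, null), Repetition.OPTIONAL));
    // No LIST at that position, so the primitive keeps its own repetition.
    System.out.println(resolveRepetition(
        Arrays.asList(OriginalType.MAP, null), Repetition.OPTIONAL));
  }
}
```

   Keeping the check in one method means the caller that builds the 
ColTypeInfo only passes in the parent types and the primitive's own 
repetition, which is roughly what moving the primitive branch into a 
separate method would achieve.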

