ychernysh commented on code in PR #2937:
URL: https://github.com/apache/drill/pull/2937#discussion_r1741155460
##########
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetTableMetadataUtils.java:
##########
@@ -661,6 +663,12 @@ static Map<SchemaPath, TypeProtos.MajorType> resolveFields(MetadataBase.ParquetT
// row groups in the file have the same schema, so using the first one
Map<SchemaPath, TypeProtos.MajorType> fileColumns = getFileFields(parquetTableMetadata, file);
fileColumns.forEach((columnPath, type) -> putType(columns, columnPath, type));
+ // If at least 1 parquet file to read doesn't contain a column, enforce this column
+ // DataMode to OPTIONAL in the overall table schema
Review Comment:
The first item is about resolving different data types even if there are no
missing columns, which I didn't cover.
`but only if the other types are REQUIRED` - is this condition necessary?
Regarding REPEATED - I haven't covered it in any way.
In theory, implementing these should not be that hard.
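
To make the rule in the code comment concrete, here is a minimal, standalone sketch of it. It is not the PR's code: `SchemaMergeSketch`, `mergeModes`, and the local `DataMode` enum are hypothetical stand-ins (column names are plain strings instead of `SchemaPath`, and only the data mode is modelled). It assumes a missing column relaxes REQUIRED to OPTIONAL and leaves REPEATED untouched, and it does not try to resolve differing types or modes between files.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SchemaMergeSketch {
  // Hypothetical stand-in for Drill's TypeProtos.DataMode.
  enum DataMode { REQUIRED, OPTIONAL, REPEATED }

  // Builds a table-level column -> mode map from per-file column maps.
  static Map<String, DataMode> mergeModes(List<Map<String, DataMode>> fileSchemas) {
    Map<String, DataMode> table = new LinkedHashMap<>();

    // Pass 1: union of all columns. The first mode seen wins;
    // conflicting modes between files are deliberately not resolved here.
    for (Map<String, DataMode> file : fileSchemas) {
      file.forEach(table::putIfAbsent);
    }

    // Pass 2: if any file lacks a column, relax REQUIRED to OPTIONAL.
    // Assumption: REPEATED (and already OPTIONAL) columns are left as-is.
    for (Map<String, DataMode> file : fileSchemas) {
      for (Map.Entry<String, DataMode> column : table.entrySet()) {
        boolean missing = !file.containsKey(column.getKey());
        if (missing && column.getValue() == DataMode.REQUIRED) {
          column.setValue(DataMode.OPTIONAL);
        }
      }
    }
    return table;
  }

  public static void main(String[] args) {
    Map<String, DataMode> first = Map.of(
        "id", DataMode.REQUIRED,
        "name", DataMode.REQUIRED,
        "tags", DataMode.REPEATED);
    Map<String, DataMode> second = Map.of("id", DataMode.REQUIRED);

    System.out.println(mergeModes(List.of(first, second)));
  }
}
```

With these two files, `id` stays REQUIRED because both files contain it, `name` becomes OPTIONAL because the second file lacks it, and `tags` stays REPEATED. Whether the REQUIRED check is actually needed, and what a missing REPEATED column should turn into, are exactly the open questions above.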