GitHub user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/644#discussion_r87231425
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/Metadata.java ---
@@ -185,7 +185,8 @@ private Metadata(FileSystem fs, ParquetFormatConfig formatConfig) {
         childFiles.add(file);
       }
     }
-    ParquetTableMetadata_v3 parquetTableMetadata = new ParquetTableMetadata_v3(true);
+    ParquetTableMetadata_v3 parquetTableMetadata = new ParquetTableMetadata_v3(DrillVersionInfo.getVersion(),
+        ParquetWriter.WRITER_VERSION);
--- End diff --
I'm a bit confused. The writer version applies to the Parquet files that
Drill writes. (Or, at least, that was the intention.)
Here, we're talking about metadata. There may well be a metadata writer,
but that would be a different writer, with its own version.
I'm not sure we want to initialize the metadata object with the current
writer version: there seems to be no correlation between the metadata object
and the writer version.
On the other hand, the metadata can certainly hold the writer version, but
it must be the value read from the Parquet file itself, not a value set by the
code. Otherwise, we have the difficult problem of making sure that the
code-set version number agrees with the actual version number in the file.
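
To make the last point concrete, here is a minimal sketch of taking the writer version from the file rather than from a constant in the code. It assumes the Parquet footer's key/value metadata has already been loaded into a Map. The class name, the method name, and the key `drill-writer.version` are hypothetical stand-ins for illustration, not Drill's actual API.

```java
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical sketch: none of these names are taken from Drill's code base.
final class WriterVersionFromFooter {

  // Assumed footer key; the real key name may differ.
  static final String WRITER_VERSION_KEY = "drill-writer.version";

  // Returns the writer version recorded in the file's footer key/value
  // metadata, or empty if the file carries no version or an unparsable one.
  static OptionalInt readWriterVersion(Map<String, String> footerKeyValues) {
    String raw = footerKeyValues.get(WRITER_VERSION_KEY);
    if (raw == null) {
      return OptionalInt.empty();
    }
    try {
      return OptionalInt.of(Integer.parseInt(raw.trim()));
    } catch (NumberFormatException e) {
      return OptionalInt.empty();
    }
  }
}
```

With this shape, a file that has no recorded version yields an empty result, so the metadata never claims a version that the file does not actually contain.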