kbendick commented on code in PR #5681:
URL: https://github.com/apache/iceberg/pull/5681#discussion_r960218943
##########
parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java:
##########
@@ -1011,8 +1011,10 @@ public <D> CloseableIterable<D> build() {
conf.unset(property);
}
optionsBuilder = HadoopReadOptions.builder(conf);
+ optionsBuilder.withCodecFactory(new ParquetCodecFactory(conf, 0));
} else {
optionsBuilder = ParquetReadOptions.builder();
+ optionsBuilder.withCodecFactory(new ParquetCodecFactory(new Configuration(), 0));
Review Comment:
Idea: making this a named constant, such as
`UNUSED_PARQUET_DECOMPRESSOR_PAGE_SIZE`, might be a good way to indicate
that the page size is ignored.
Since this solution is temporary, I don't feel strongly either way, but I
do like avoiding magic numbers.
A link to the upstream parquet-mr PR, or to an Iceberg issue about zstd
decompression, might also be more informative than only mentioning that the
page size is essentially ignored.
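A rough sketch of the named-constant idea, in case it helps. The constant
name is the one floated in this comment, and the value 0 is what the diff
passes. `CodecFactoryStandIn` and `newCodecFactory` are hypothetical
stand-ins so the snippet compiles without Parquet or Hadoop on the
classpath; they are not the real `ParquetCodecFactory` API.

```java
// Sketch only: shows a named constant replacing the literal 0 in the diff.
class NamedConstantSketch {

  // Name taken from the review suggestion; 0 matches the value in the diff.
  // The review describes the page size as essentially ignored on this path.
  static final int UNUSED_PARQUET_DECOMPRESSOR_PAGE_SIZE = 0;

  // Hypothetical stand-in for ParquetCodecFactory(Configuration, int).
  // It only records the page size it was given.
  static final class CodecFactoryStandIn {
    final int pageSize;

    CodecFactoryStandIn(int pageSize) {
      this.pageSize = pageSize;
    }
  }

  // Hypothetical helper: both branches of the diff could share one call
  // site like this instead of repeating the literal.
  static CodecFactoryStandIn newCodecFactory() {
    return new CodecFactoryStandIn(UNUSED_PARQUET_DECOMPRESSOR_PAGE_SIZE);
  }

  public static void main(String[] args) {
    CodecFactoryStandIn factory = newCodecFactory();
    System.out.println("page size passed to factory: " + factory.pageSize);
  }
}
```

In the real change the same constant would be passed to both
`new ParquetCodecFactory(...)` calls, ideally with a comment linking the
upstream issue.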
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]