venkateshwaracholan commented on code in PR #16614:
URL: https://github.com/apache/iceberg/pull/16614#discussion_r3385624626
##########
parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java:
##########
@@ -1595,6 +1573,53 @@ public <D> CloseableIterable<D> build() {
}
}
+ @VisibleForTesting
+ static ParquetReadOptions buildReadOptions(
+ InputFile file,
+ Map<String, String> properties,
+ Long start,
+ Long length,
+ FileDecryptionProperties fileDecryptionProperties) {
+ ParquetReadOptions.Builder optionsBuilder;
+ HadoopInputFile hadoopInputFile = null;
+ if (HadoopInputFile.class.isInstance(file)) {
+ hadoopInputFile = HadoopInputFile.class.cast(file);
+ }
+
+ if (hadoopInputFile != null) {
+ // remove read properties already set that may conflict with this read
+ Configuration conf = new Configuration(hadoopInputFile.getConf());
+ for (String property : READ_PROPERTIES_TO_REMOVE) {
+ conf.unset(property);
+ }
+ optionsBuilder = HadoopReadOptions.builder(conf);
+ } else {
+ optionsBuilder = ParquetReadOptions.builder(new PlainParquetConfiguration());
+ }
+
+ for (Map.Entry<String, String> entry : properties.entrySet()) {
+ optionsBuilder.set(entry.getKey(), entry.getValue());
+ }
+
+ if (start != null) {
+ optionsBuilder.withRange(start, start + length);
+ }
+
+ if (fileDecryptionProperties != null) {
+ optionsBuilder.withDecryption(fileDecryptionProperties);
+ }
+
+ if (properties.containsKey(ParquetInputFormat.HADOOP_VECTORED_IO_ENABLED)) {
+ optionsBuilder.withUseHadoopVectoredIo(
Review Comment:
For the vectored I/O concern, I added a regression test
(testVectoredIoDisabledDoesNotInvokeReadVectored) that tracks calls to
readVectored(...) on a RangeReadable input. When
parquet.hadoop.vectored.io.enabled=false, no vectored reads are issued; when
it's enabled, readVectored(...) is called. This verifies the setting affects
the actual read path rather than just the Parquet option value.
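
A minimal sketch of the call-counting idea follows. It is not the actual test from the PR: RangeSource, TrackingSource, and readAll are hypothetical stand-ins for the RangeReadable input and the Parquet read path, and the boolean flag stands in for parquet.hadoop.vectored.io.enabled. It only shows how counting calls on the input distinguishes "the option was set" from "the read path actually changed".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class VectoredIoTrackingSketch {

  /** Hypothetical stand-in for an input that supports ranged and vectored reads. */
  interface RangeSource {
    byte[] readRange(long offset, int length);

    List<byte[]> readVectored(List<long[]> ranges);
  }

  /** Counts calls per read path so a test can assert which one was taken. */
  static class TrackingSource implements RangeSource {
    final AtomicInteger rangeCalls = new AtomicInteger();
    final AtomicInteger vectoredCalls = new AtomicInteger();
    private final byte[] data;

    TrackingSource(byte[] data) {
      this.data = data;
    }

    @Override
    public byte[] readRange(long offset, int length) {
      rangeCalls.incrementAndGet();
      return copy(offset, length);
    }

    @Override
    public List<byte[]> readVectored(List<long[]> ranges) {
      vectoredCalls.incrementAndGet();
      List<byte[]> result = new ArrayList<>();
      for (long[] range : ranges) {
        // Each range is {offset, length}.
        result.add(copy(range[0], (int) range[1]));
      }
      return result;
    }

    private byte[] copy(long offset, int length) {
      return Arrays.copyOfRange(data, (int) offset, (int) offset + length);
    }
  }

  /** Hypothetical read path: issues vectored reads only when the flag is enabled. */
  static int readAll(RangeSource source, List<long[]> ranges, boolean vectoredEnabled) {
    int bytesRead = 0;
    if (vectoredEnabled) {
      for (byte[] chunk : source.readVectored(ranges)) {
        bytesRead += chunk.length;
      }
    } else {
      for (long[] range : ranges) {
        bytesRead += source.readRange(range[0], (int) range[1]).length;
      }
    }
    return bytesRead;
  }

  public static void main(String[] args) {
    List<long[]> ranges = Arrays.asList(new long[] {0, 4}, new long[] {8, 4});

    TrackingSource disabled = new TrackingSource(new byte[16]);
    readAll(disabled, ranges, false);
    System.out.println("disabled: vectored calls = " + disabled.vectoredCalls.get());

    TrackingSource enabled = new TrackingSource(new byte[16]);
    readAll(enabled, ranges, true);
    System.out.println("enabled: vectored calls = " + enabled.vectoredCalls.get());
  }
}
```

Asserting on the counters rather than on the built options is what ties the check to observable I/O behavior; the real test would wire the tracking input through the actual reader instead of the readAll placeholder.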