venkateshwaracholan commented on code in PR #16614:
URL: https://github.com/apache/iceberg/pull/16614#discussion_r3385624626


##########
parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java:
##########
@@ -1595,6 +1573,53 @@ public <D> CloseableIterable<D> build() {
     }
   }
 
+  @VisibleForTesting
+  static ParquetReadOptions buildReadOptions(
+      InputFile file,
+      Map<String, String> properties,
+      Long start,
+      Long length,
+      FileDecryptionProperties fileDecryptionProperties) {
+    ParquetReadOptions.Builder optionsBuilder;
+    HadoopInputFile hadoopInputFile = null;
+    if (HadoopInputFile.class.isInstance(file)) {
+      hadoopInputFile = HadoopInputFile.class.cast(file);
+    }
+
+    if (hadoopInputFile != null) {
+      // remove read properties already set that may conflict with this read
+      Configuration conf = new Configuration(hadoopInputFile.getConf());
+      for (String property : READ_PROPERTIES_TO_REMOVE) {
+        conf.unset(property);
+      }
+      optionsBuilder = HadoopReadOptions.builder(conf);
+    } else {
+      optionsBuilder = ParquetReadOptions.builder(new PlainParquetConfiguration());
+    }
+
+    for (Map.Entry<String, String> entry : properties.entrySet()) {
+      optionsBuilder.set(entry.getKey(), entry.getValue());
+    }
+
+    if (start != null) {
+      optionsBuilder.withRange(start, start + length);
+    }
+
+    if (fileDecryptionProperties != null) {
+      optionsBuilder.withDecryption(fileDecryptionProperties);
+    }
+
+    if (properties.containsKey(ParquetInputFormat.HADOOP_VECTORED_IO_ENABLED)) {
+      optionsBuilder.withUseHadoopVectoredIo(

Review Comment:
   For the vectored I/O concern, I added a regression test 
(testVectoredIoDisabledDoesNotInvokeReadVectored) that tracks calls to 
readVectored(...) on a RangeReadable input. When 
parquet.hadoop.vectored.io.enabled=false, no vectored reads are issued; when 
it's enabled, readVectored(...) is called. This verifies the setting affects 
the actual read path rather than just the Parquet option value.
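
   As a rough illustration of the call-tracking idea described above (not the
   actual test in this PR), the sketch below wraps an in-memory source and
   counts how often the vectored path is used. RangeSource, CountingSource,
   and readRanges are hypothetical stand-ins invented for this example; they
   are not Iceberg or Parquet APIs, and the boolean flag only imitates the
   effect of parquet.hadoop.vectored.io.enabled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class VectoredIoTrackingSketch {

  /** Hypothetical stand-in for an input that offers both plain and vectored range reads. */
  interface RangeSource {
    byte[] read(long offset, int length);

    // Each range is a two-element array: {offset, length}.
    List<byte[]> readVectored(List<long[]> ranges);
  }

  /** Serves bytes from memory and counts how often each read path is used. */
  static final class CountingSource implements RangeSource {
    private final byte[] data;
    final AtomicInteger plainCalls = new AtomicInteger();
    final AtomicInteger vectoredCalls = new AtomicInteger();

    CountingSource(byte[] data) {
      this.data = data;
    }

    @Override
    public byte[] read(long offset, int length) {
      plainCalls.incrementAndGet();
      return slice(offset, length);
    }

    @Override
    public List<byte[]> readVectored(List<long[]> ranges) {
      vectoredCalls.incrementAndGet();
      List<byte[]> result = new ArrayList<>();
      for (long[] range : ranges) {
        result.add(slice(range[0], (int) range[1]));
      }
      return result;
    }

    private byte[] slice(long offset, int length) {
      int from = (int) offset;
      return Arrays.copyOfRange(data, from, from + length);
    }
  }

  /** Hypothetical reader: the flag alone decides which read path is taken. */
  static List<byte[]> readRanges(
      RangeSource source, List<long[]> ranges, boolean vectoredEnabled) {
    if (vectoredEnabled) {
      return source.readVectored(ranges);
    }
    List<byte[]> result = new ArrayList<>();
    for (long[] range : ranges) {
      result.add(source.read(range[0], (int) range[1]));
    }
    return result;
  }

  /** Reads two ranges with the given flag and reports how many vectored calls were made. */
  static int vectoredCallsFor(boolean vectoredEnabled) {
    CountingSource source = new CountingSource(new byte[] {0, 1, 2, 3, 4, 5, 6, 7});
    List<long[]> ranges = Arrays.asList(new long[] {0, 2}, new long[] {4, 3});
    readRanges(source, ranges, vectoredEnabled);
    return source.vectoredCalls.get();
  }

  public static void main(String[] args) {
    System.out.println("disabled: " + vectoredCallsFor(false));
    System.out.println("enabled: " + vectoredCallsFor(true));
  }
}
```

   Counting calls on the input itself, rather than inspecting the built
   options object, is what lets a test of this shape show that the flag
   reaches the read path.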



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


