martin-g commented on code in PR #2680:
URL: https://github.com/apache/datafusion-comet/pull/2680#discussion_r2490539142
##########
common/src/main/java/org/apache/comet/parquet/NativeBatchReader.java:
##########
@@ -421,9 +516,30 @@ public void init() throws Throwable {
CometFileKeyUnwrapper keyUnwrapper = null;
if (encryptionEnabled) {
keyUnwrapper = new CometFileKeyUnwrapper();
-      keyUnwrapper.storeDecryptionKeyRetriever(file.filePath().toString(), conf);
+      keyUnwrapper.storeDecryptionKeyRetriever(filePath, conf);
}
+    // Filter out columns with preinitialized readers from sparkSchema before making the
+    // call to native
+ if (preInitializedReaders != null) {
+ StructType filteredSchema = new StructType();
+ StructField[] sparkFields = sparkSchema.fields();
+ for (int i = 0; i < sparkFields.length; i++) {
+        if (i >= preInitializedReaders.length || preInitializedReaders[i] == null) {
+ filteredSchema = filteredSchema.add(sparkFields[i]);
+ }
+ }
+ sparkSchema = filteredSchema;
Review Comment:
Is it possible that the filtering done here may lead to an
`ArrayIndexOutOfBoundsException` at
https://github.com/parthchandra/datafusion-comet/blob/d73bcbab9f80836d7229207f309283942501e9ab/common/src/main/java/org/apache/comet/parquet/NativeBatchReader.java#L985
?
Now that `sparkSchema` may have fewer fields than before, I see no new logic
to protect the `.fields()[i]` call there.
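The hazard the reviewer describes can be sketched outside of Spark with plain arrays (hypothetical names; `FilterIndexSketch`, `buildOldToNew`, and `filter` are not part of the PR). The idea: when fields are dropped by index, any later code that still indexes with *original* positions can overflow the filtered array, so one defensive option is to build an original-index-to-filtered-index map alongside the filter.

```java
import java.util.Arrays;

public class FilterIndexSketch {
    // Build an original-index -> filtered-index map; -1 marks a removed field.
    // Mirrors the PR's filter condition: keep field i when there is no
    // pre-initialized reader at position i.
    static int[] buildOldToNew(int numFields, Object[] preInitializedReaders) {
        int[] map = new int[numFields];
        int next = 0;
        for (int i = 0; i < numFields; i++) {
            if (i >= preInitializedReaders.length || preInitializedReaders[i] == null) {
                map[i] = next++;
            } else {
                map[i] = -1; // removed; must not be used to index the filtered array
            }
        }
        return map;
    }

    // Apply the map to produce the filtered field list.
    static String[] filter(String[] fields, int[] oldToNew) {
        String[] tmp = new String[fields.length];
        int n = 0;
        for (int i = 0; i < fields.length; i++) {
            if (oldToNew[i] >= 0) tmp[n++] = fields[i];
        }
        return Arrays.copyOf(tmp, n);
    }

    public static void main(String[] args) {
        String[] fields = {"a", "b", "c", "d"};
        Object[] readers = {null, new Object(), null, new Object()};
        int[] oldToNew = buildOldToNew(fields.length, readers);
        String[] filtered = filter(fields, oldToNew);
        // filtered has 2 entries, so indexing it with an original index such as 3
        // would throw ArrayIndexOutOfBoundsException; translate through the map first.
        int newIdx = oldToNew[2];
        System.out.println(newIdx >= 0 ? filtered[newIdx] : "removed"); // prints "c"
    }
}
```

Whether `NativeBatchReader` needs such a map, or whether the later `.fields()[i]` access is already bounded by the filtered schema, is exactly the question this review comment raises.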
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]