martin-g commented on code in PR #2680:
URL: https://github.com/apache/datafusion-comet/pull/2680#discussion_r2490539142
##########
common/src/main/java/org/apache/comet/parquet/NativeBatchReader.java:
##########
@@ -421,9 +516,30 @@ public void init() throws Throwable {
CometFileKeyUnwrapper keyUnwrapper = null;
if (encryptionEnabled) {
keyUnwrapper = new CometFileKeyUnwrapper();
-      keyUnwrapper.storeDecryptionKeyRetriever(file.filePath().toString(), conf);
+      keyUnwrapper.storeDecryptionKeyRetriever(filePath, conf);
}
+    // Filter out columns with preinitialized readers from sparkSchema before making the
+    // call to native
+ if (preInitializedReaders != null) {
+ StructType filteredSchema = new StructType();
+ StructField[] sparkFields = sparkSchema.fields();
+ for (int i = 0; i < sparkFields.length; i++) {
+        if (i >= preInitializedReaders.length || preInitializedReaders[i] == null) {
+ filteredSchema = filteredSchema.add(sparkFields[i]);
+ }
+ }
+ sparkSchema = filteredSchema;
Review Comment:
Is it possible that the filtering done here may lead to an
`ArrayIndexOutOfBoundsException` at
https://github.com/parthchandra/datafusion-comet/blob/d73bcbab9f80836d7229207f309283942501e9ab/common/src/main/java/org/apache/comet/parquet/NativeBatchReader.java#L985
?
Now that `sparkSchema` may have fewer fields than before, I see no new logic
to protect the `.fields()[i]` call there.
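The hazard the reviewer describes can be sketched outside of Spark with plain arrays (hypothetical names; `FilterIndexSketch`, `buildOldToNew`, and `filter` are not part of the PR). The idea: when fields are dropped by index, any later code that still indexes with *original* positions can overflow the filtered array, so one defensive option is to build an original-index-to-filtered-index map alongside the filter.

```java
import java.util.Arrays;

public class FilterIndexSketch {
    // Build an original-index -> filtered-index map; -1 marks a removed field.
    // Mirrors the PR's filter condition: keep field i when there is no
    // pre-initialized reader at position i.
    static int[] buildOldToNew(int numFields, Object[] preInitializedReaders) {
        int[] map = new int[numFields];
        int next = 0;
        for (int i = 0; i < numFields; i++) {
            if (i >= preInitializedReaders.length || preInitializedReaders[i] == null) {
                map[i] = next++;
            } else {
                map[i] = -1; // removed; must not be used to index the filtered array
            }
        }
        return map;
    }

    // Apply the map to produce the filtered field list.
    static String[] filter(String[] fields, int[] oldToNew) {
        String[] tmp = new String[fields.length];
        int n = 0;
        for (int i = 0; i < fields.length; i++) {
            if (oldToNew[i] >= 0) tmp[n++] = fields[i];
        }
        return Arrays.copyOf(tmp, n);
    }

    public static void main(String[] args) {
        String[] fields = {"a", "b", "c", "d"};
        Object[] readers = {null, new Object(), null, new Object()};
        int[] oldToNew = buildOldToNew(fields.length, readers);
        String[] filtered = filter(fields, oldToNew);
        // filtered has 2 entries, so indexing it with an original index such as 3
        // would throw ArrayIndexOutOfBoundsException; translate through the map first.
        int newIdx = oldToNew[2];
        System.out.println(newIdx >= 0 ? filtered[newIdx] : "removed"); // prints "c"
    }
}
```

Whether `NativeBatchReader` needs such a map, or whether the later `.fields()[i]` access is already bounded by the filtered schema, is exactly the question this review comment raises.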
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]