sezruby commented on issue #10511:
URL: https://github.com/apache/gluten/issues/10511#issuecomment-4623944376

   Opened #12240 with a Gluten-only fix.
   
   Root cause: `DeltaPostTransformRules` rewrote partition filters and 
`partitionSchema` to physical names, but Delta's 
`PreparedDeltaFileIndex.matchingFiles` / `DeltaLog.rewritePartitionFilters` 
resolve them against the *logical* `metadata.partitionSchema`. The 
physical-named filters therefore never resolved, so partition pruning silently 
became a no-op (and file-level stats skipping was disabled for the same reason).
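
The mismatch can be illustrated with a toy Python model. This is not Delta or Gluten code: the `prune` helper, the `col-a1b2` physical name, and the file list are all hypothetical, and the sketch assumes only that a filter whose column does not resolve against the logical partition schema is dropped instead of raising an error.

```python
# Toy model of partition pruning; none of these names are Delta/Gluten APIs.
LOGICAL_PARTITION_SCHEMA = {"date"}          # what the table metadata exposes
LOGICAL_TO_PHYSICAL = {"date": "col-a1b2"}   # hypothetical column-mapping entry

FILES = [
    {"path": "f1", "partition_values": {"date": "2024-01-01"}},
    {"path": "f2", "partition_values": {"date": "2024-01-02"}},
]

def prune(files, filters):
    """Keep files matching every filter that resolves against the logical schema.

    Filters on unknown columns are silently ignored, which is the assumed
    failure mode: no error, just no pruning.
    """
    usable = [(col, val) for col, val in filters if col in LOGICAL_PARTITION_SCHEMA]
    return [
        f for f in files
        if all(f["partition_values"][col] == val for col, val in usable)
    ]

logical_filter = [("date", "2024-01-01")]
physical_filter = [(LOGICAL_TO_PHYSICAL["date"], "2024-01-01")]

pruned_logical = [f["path"] for f in prune(FILES, logical_filter)]
pruned_physical = [f["path"] for f in prune(FILES, physical_filter)]
```

In this sketch the logical filter keeps only `f1`, while the physical-named filter is discarded and both files are scanned.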
   
   Fix: keep filters and partition schema logical on the scan node; only 
`output` / `dataSchema` / data fields of `requiredSchema` go physical. The 
native side gets a physical-translated copy via 
`DeltaScanTransformer.scanFilters` (filter binding to native is by `exprId`, so 
logical-named filter attributes still resolve correctly against the 
physical-named output). Tests added in `DeltaSuite` for both `name` and `id` 
modes cover partition pruning, multi-partition tables, partition+data filter 
combos, `IS [NOT] NULL`, and column rename.
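
Why binding by `exprId` tolerates the name mismatch can be sketched in a few lines of Python. This is a toy model, not Spark or Gluten code: the `Attr` class, the `bind_by_expr_id` helper, and the `col-*` physical names are hypothetical stand-ins for attribute references and the scan output.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attr:
    """Hypothetical stand-in for an attribute reference: a name plus a unique id."""
    name: str
    expr_id: int

def bind_by_expr_id(attr, output):
    """Return the ordinal of the output attribute sharing attr's expr_id."""
    for ordinal, out in enumerate(output):
        if out.expr_id == attr.expr_id:
            return ordinal
    raise LookupError(f"no output attribute with expr_id {attr.expr_id}")

# Scan output uses physical names; the filter still carries the logical name.
output = [Attr("col-a1b2", 1), Attr("col-c3d4", 2)]
filter_attr = Attr("date", 1)

ordinal = bind_by_expr_id(filter_attr, output)
name_match = [o for o in output if o.name == filter_attr.name]
```

The filter attribute binds to ordinal 0 even though a lookup by name finds nothing, which is the property the fix relies on.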


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

