sezruby commented on issue #10511: URL: https://github.com/apache/gluten/issues/10511#issuecomment-4623944376
Opened #12240 with a Gluten-only fix.

Root cause: `DeltaPostTransformRules` rewrote the partition filters and `partitionSchema` to physical names, but Delta's `PreparedDeltaFileIndex.matchingFiles` / `DeltaLog.rewritePartitionFilters` resolve them against the *logical* `metadata.partitionSchema`. As a result, partition pruning silently became a no-op, and file-level stats skipping was disabled for the same reason.

Fix: keep the filters and partition schema logical on the scan node; only `output`, `dataSchema`, and the data fields of `requiredSchema` go physical. The native side gets a physical-translated copy via `DeltaScanTransformer.scanFilters`. Filter binding to native is by `exprId`, so logical-named filter attributes still resolve correctly against the physical-named output.

Tests added in `DeltaSuite` for both the `name` and `id` column mapping modes, covering partition pruning, multi-partition, partition+data filter combos, `IS [NOT] NULL`, and column rename.
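To illustrate the mismatch, here is a toy model in Python. It is not Gluten or Delta code: `Attr`, `EqFilter`, `prune_by_name`, `bind_by_expr_id`, and the physical names `col-1a2b` / `col-3c4d` are all hypothetical, and the "skip filters that do not resolve" behavior is a simplifying assumption standing in for the silent no-op described above. It only shows why name-based resolution against the logical partition schema breaks with physical names, while `exprId`-based binding does not care which name the attribute carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    name: str     # column name as it appears in the plan
    expr_id: int  # stable identity, independent of the name


@dataclass(frozen=True)
class EqFilter:
    attr: Attr
    value: str


# Hypothetical table: logical partition column "region", data column "amount".
LOGICAL_PARTITION_SCHEMA = {"region"}
LOGICAL_TO_PHYSICAL = {"region": "col-1a2b", "amount": "col-3c4d"}

FILES = [
    {"path": "f1", "partition_values": {"region": "eu"}},
    {"path": "f2", "partition_values": {"region": "us"}},
]


def prune_by_name(files, filters):
    """Toy file index: resolve filter columns by name against the logical
    partition schema. Filters that do not resolve are skipped (assumption)."""
    usable = [f for f in filters if f.attr.name in LOGICAL_PARTITION_SCHEMA]
    return [
        file for file in files
        if all(file["partition_values"][f.attr.name] == f.value for f in usable)
    ]


def to_physical(attr):
    """Rename an attribute to its physical name, keeping the same expr_id."""
    return Attr(LOGICAL_TO_PHYSICAL[attr.name], attr.expr_id)


def bind_by_expr_id(filters, output):
    """Toy native binding: map each filter attribute to an output ordinal
    using expr_id only, ignoring names."""
    ordinal = {a.expr_id: i for i, a in enumerate(output)}
    return [ordinal[f.attr.expr_id] for f in filters]


region = Attr("region", expr_id=1)
amount = Attr("amount", expr_id=2)

logical_filter = EqFilter(region, "eu")
physical_filter = EqFilter(to_physical(region), "eu")

# Physical-named filter does not resolve by name, so no file is pruned.
pruned_physical = [f["path"] for f in prune_by_name(FILES, [physical_filter])]
# Logical-named filter resolves, so only the matching partition survives.
pruned_logical = [f["path"] for f in prune_by_name(FILES, [logical_filter])]

# The logical-named filter still binds against physical-named output.
physical_output = [to_physical(amount), to_physical(region)]
bound = bind_by_expr_id([logical_filter], physical_output)

print(pruned_physical, pruned_logical, bound)
```

In this model the physical-named filter keeps both files, the logical-named filter keeps only `f1`, and the logical-named filter binds to ordinal 1 of the physical-named output.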
