Re: [PR] [GLUTEN-10511][VL][Delta] Fix wrong result with partition filters under column mapping [gluten]

via GitHub Thu, 04 Jun 2026 21:35:58 -0700


sezruby commented on PR #12240:
URL: https://github.com/apache/gluten/pull/12240#issuecomment-4628227225


   @zhztheplayer @FelixYBW could you take a look when you get a chance?
   
   Quick context: this fixes the wrong-result bug in #10511 by narrowing the 
column-mapping rewrite. Only the reader-facing fields (`output`, `dataSchema`, 
and the data part of `requiredSchema`) become physical; the partition schema and 
filters stay logical so that Delta's `PreparedDeltaFileIndex` keeps working for 
partition pruning and stats-based file skipping. The native side gets a copy of 
the filters translated to physical names via a `scanFilters` override.
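
To illustrate the split described above, here is a minimal Python sketch. It is not the Gluten code: `ScanSketch`, its fields, and the physical names such as `col-a1` are hypothetical, and filters are modeled as plain `(column, op, literal)` tuples. It only assumes a flat logical-to-physical name map.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical stand-in for a pushed-down predicate: (column, op, literal).
Filter = Tuple[str, str, object]


@dataclass(frozen=True)
class ScanSketch:
    """Toy model of a scan node under column mapping (illustration only)."""
    logical_to_physical: Dict[str, str]  # assumed flat name map
    data_cols: List[str]                 # logical names of data columns
    filters: List[Filter]                # filters as written by the user

    def reader_schema(self) -> List[str]:
        # Reader-facing fields are rewritten to physical names.
        return [self.logical_to_physical[c] for c in self.data_cols]

    def index_filters(self) -> List[Filter]:
        # Filters handed to the file index stay logical, so partition
        # pruning and stats-based skipping see the names they expect.
        return list(self.filters)

    def scan_filters(self) -> List[Filter]:
        # The native reader gets a translated copy; names missing from
        # the map are passed through unchanged.
        m = self.logical_to_physical
        return [(m.get(c, c), op, v) for c, op, v in self.filters]


scan = ScanSketch(
    logical_to_physical={"id": "col-a1", "region": "col-b2"},
    data_cols=["id"],
    filters=[("region", "=", "us"), ("id", ">", 10)],
)
print(scan.reader_schema())
print(scan.index_filters())
print(scan.scan_filters())
```

The point of the sketch is that the same filter list is exposed twice: untouched for the index, and renamed for the reader.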
   
   The result is asymmetric on the scan node, which I know is a bit ugly. The 
reason is that vanilla Spark + Delta does the logical→physical translation 
just-in-time inside `DeltaParquetFileFormat.buildReaderWithPartitionValues`, 
and Gluten bypasses that hook. The cleaner shape is to keep EVERYTHING on the 
scan node logical and translate only at substrait emission 
(`BasicScanExecTransformer.doTransform`) — that would also let us drop the 
alias-back `ProjectExecTransformer` and the `scanFilters` override. But that's 
a multi-module refactor (touches the substrait emitter shared across 
Iceberg/Hudi/plain Parquet/Delta), so I left it as a follow-up noted in the 
docstring rather than scope-creep into a bug fix. Happy to take it on as a 
separate PR if you'd prefer.
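
For comparison, the follow-up shape (everything logical on the node, translation only at emission) could look roughly like the Python sketch below. This is an assumption about the design, not existing code: `emit_scan` and the dict it returns are hypothetical and merely stand in for the substrait emission step.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a pushed-down predicate: (column, op, literal).
Filter = Tuple[str, str, object]


def emit_scan(columns: List[str],
              filters: List[Filter],
              logical_to_physical: Dict[str, str]) -> dict:
    """Translate logical names to physical ones at emission time only."""
    def rename(name: str) -> str:
        # Names without a mapping entry are emitted unchanged.
        return logical_to_physical.get(name, name)

    return {
        "columns": [rename(c) for c in columns],
        "filters": [(rename(c), op, v) for c, op, v in filters],
    }


plan = emit_scan(
    columns=["id", "region"],
    filters=[("id", ">", 10)],
    logical_to_physical={"id": "col-a1", "region": "col-b2"},
)
print(plan)
```

With this shape the node itself never carries physical names, so nothing has to be aliased back afterwards.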
   
   Verified locally end-to-end with the prebuilt CI artifacts in 
`apache/gluten:centos-8-jdk8` — `VeloxDeltaSuite` passes 30/30 including all 12 
new tests. CI is also green on the latest commit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

