brijrajk commented on PR #12151:
URL: https://github.com/apache/gluten/pull/12151#issuecomment-4812584645

   ### Latest push — fix TPC-DS plan stability failures
   
   The previous push (`51185c76f`) introduced a DPP guard that broke 18 TPC-DS 
`check simplified sf100` tests (`tpcds-v1.4/q2`, `q10`, `q16`, etc.). Root 
cause and fix:
   
   **Root cause:** The `v: Attribute` guard prevented DPP/runtime-filter bloom 
filters from being rewritten to `VeloxBloomFilterMightContain`. When Velox 
tries to validate `FilterExecTransformer` with a vanilla 
`BloomFilterMightContain(ScalarSubquery, xxhash64(...))`, it cannot find a 
Substrait mapping, so validation fails and the filter (along with the bloom 
filter aggregate subquery) falls back to vanilla Spark. This changes the plan 
structure: `FilterExecTransformer` + `RegularHashAggregateExecTransformer` 
becomes `Filter` + `ObjectHashAggregate`, which no longer matches the golden files.
   
   **Fix (`7b59cc4a4`):** 3-case pattern match in 
`BloomFilterMightContainJointRewriteRule`:
   
   1. **User-facing** (`v: Attribute`): rewrite **both** the outer `MightContain` 
and the inner `Aggregate` to Velox format → byte format stays consistent across stages
   2. **DPP/runtime-filter** (`v` is `xxhash64(...)`, bf is `ScalarSubquery`): 
rewrite **only** the outer `MightContain` to `VeloxBloomFilterMightContain` and 
leave the inner aggregate as vanilla `bloom_filter_agg(xxhash64(...))`, since Velox 
handles it natively → `FilterExecTransformer` validates, and the simplified plan 
shows `bloom_filter_agg(xxhash64(...))`, matching the golden files
   3. **Pre-computed literal bytes** (bf is not a `ScalarSubquery`): rewrite 
only the outer `MightContain`
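
   For readers who want the shape of the three cases at a glance, here is a 
minimal, self-contained sketch. It is a toy model, not the actual rule: every 
case class and the `rewrite` function below are hypothetical stand-ins for the 
Spark/Gluten expression classes, and it assumes the user-facing case wraps a 
`bloom_filter_agg` inside a `ScalarSubquery`.

```scala
// Toy expression model. These are hypothetical stand-ins, NOT Spark/Gluten classes.
sealed trait Expr
case class Attribute(name: String) extends Expr
case class XxHash64(children: Seq[Expr]) extends Expr
case class BinaryLiteral(hex: String) extends Expr        // pre-computed bloom filter bytes
case class BloomFilterAgg(child: Expr) extends Expr       // vanilla bloom_filter_agg
case class VeloxBloomFilterAgg(child: Expr) extends Expr  // Velox-format aggregate
case class ScalarSubquery(agg: Expr) extends Expr
case class MightContain(bf: Expr, value: Expr) extends Expr
case class VeloxMightContain(bf: Expr, value: Expr) extends Expr

object BloomRewriteSketch {
  def rewrite(e: Expr): Expr = e match {
    // Case 1, user-facing: the probe value is a plain attribute.
    // Rewrite both the outer might-contain and the inner aggregate.
    case MightContain(ScalarSubquery(BloomFilterAgg(c)), v: Attribute) =>
      VeloxMightContain(ScalarSubquery(VeloxBloomFilterAgg(c)), v)

    // Case 2, DPP/runtime filter: the probe value is xxhash64(...) and the
    // bloom filter comes from a subquery. Rewrite only the outer call and
    // leave the inner aggregate untouched.
    case MightContain(sq: ScalarSubquery, v: XxHash64) =>
      VeloxMightContain(sq, v)

    // Case 3, pre-computed literal bytes: the bloom filter is not a subquery.
    // Rewrite only the outer call.
    case MightContain(bf, v) if !bf.isInstanceOf[ScalarSubquery] =>
      VeloxMightContain(bf, v)

    // Anything else is left as-is.
    case other => other
  }

  def main(args: Array[String]): Unit = {
    val key = Attribute("k")
    val hashed = XxHash64(Seq(key))
    println(rewrite(MightContain(ScalarSubquery(BloomFilterAgg(key)), key)))
    println(rewrite(MightContain(ScalarSubquery(BloomFilterAgg(hashed)), hashed)))
    println(rewrite(MightContain(BinaryLiteral("00ff"), key)))
  }
}
```

   The case order matters in this sketch: the user-facing case is tried first, 
so the DPP case only sees subquery-backed filters whose probe value is a hash 
expression, and the inner `bloom_filter_agg(xxhash64(...))` is left unchanged.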
   
   This should make both the TPC-DS plan stability tests and the bloom filter 
test suites pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

