brijrajk commented on code in PR #12151:
URL: https://github.com/apache/gluten/pull/12151#discussion_r3500310173


##########
gluten-ut/spark40/src/test/resources/backends-velox/tpcds-plan-stability/gluten-approved-plans-modified/q59.sf100/simplified.txt:
##########
@@ -23,52 +23,42 @@ VeloxColumnarToRow
                                         InputIteratorTransformer
                                           InputAdapter
                                             ColumnarBroadcastExchange #2
-                                              WholeStageCodegenTransformer (3)
-                                                FilterExecTransformer 
[d_date_sk,d_week_seq]
-                                                  Subquery #1
-                                                    VeloxColumnarToRow
-                                                      
WholeStageCodegenTransformer (2)
-                                                        
RegularHashAggregateExecTransformer [bloom_filter_agg(xxhash64(d_week_seq, 42), 
335, 8990, 0, 0),bloomFilter,buf,buf]
-                                                          
InputIteratorTransformer
-                                                            InputAdapter
-                                                              ColumnarExchange 
#3
-                                                                
VeloxResizeBatches
-                                                                  
WholeStageCodegenTransformer (1)
-                                                                    
FlushableHashAggregateExecTransformer [_pre_x] [buf,buf,buf]
+                                              RowToVeloxColumnar
+                                                WholeStageCodegen (1)
+                                                  Filter [d_date_sk,d_week_seq]
+                                                    Subquery #1
+                                                      ObjectHashAggregate 
[buf] [bloom_filter_agg(xxhash64(d_week_seq, 42), 335, 8990, 0, 
0),bloomFilter,buf]
+                                                        VeloxColumnarToRow
+                                                          ColumnarExchange #3
+                                                            VeloxResizeBatches
+                                                              
RowToVeloxColumnar
+                                                                
ObjectHashAggregate [d_week_seq] [buf,buf]
+                                                                  
VeloxColumnarToRow
+                                                                    
WholeStageCodegenTransformer (1)

Review Comment:
   @zhztheplayer Yes, `ObjectHashAggregate` is in the final executed plan, and 
the fallback is intentional.
   
   **`bloom_filter_agg` is the attribute name, not the physical function 
class.** It comes from `resultExpressions`, which is frozen when 
`ObjectHashAggregateExec` is first built with vanilla `BloomFilterAggregate`. 
Since `transformAllExpressions` only patches `aggregateExpressions` and not 
`resultExpressions`, the name stays the same regardless of any later 
substitution.
   
   **The physical function inside is `VeloxBloomFilterAggregate`, not 
vanilla.** `BloomFilterRuntimeFilterRewriteRule` substitutes it via 
`subq.withNewPlan(rewrittenPlan)`. `eval()` calls `serialize(buffer)` 
unconditionally, producing version=1 bytes that `VeloxBloomFilterMightContain` 
reads correctly.
   
   **JVM mode is unavoidable here.** `PlanSubqueries` (pos 2) prepares the 
subquery before `ApplyColumnarRulesAndInsertTransitions` (pos 10) runs, so 
`HeuristicTransform` sees vanilla `BloomFilterAggregate` with no Substrait 
mapping and emits `ObjectHashAggregateExec`. By the time our rule patches it at 
pos 10 on the main plan, `HeuristicTransform` has already finished.
   
   **Substituting earlier would break SPARK-54336.** If we substitute during 
subquery preparation, it also fires for `might_contain(empty_subquery, 
literal)`. Since `VeloxBloomFilterAggregate` has no cardinality guard, empty 
input returns bytes instead of `null`.
   
   Added `GlutenBloomFilterFallbackSuite."GLUTEN-12013: 
VeloxBloomFilterAggregate in JVM subquery produces correct Velox bytes"` that 
inspects `collectWithSubqueries(executedPlan)` for 
`ObjectHashAggregateExec(VeloxBloomFilterAggregate)` and verifies correct query 
results.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to