zhztheplayer opened a new pull request, #5435:
URL: https://github.com/apache/incubator-gluten/pull/5435

   Velox's bloom-filter agg/filter functions are logically different from 
Spark's versions. As a result, the Gluten/Spark Filter/Aggregate operators that 
host them also differ logically from their Spark counterparts. For such logical 
differences, we should use distinct functions to tell the implementations 
apart, rather than reusing Spark's function types in the Velox backend.
   
   
   The patch incorporates:
   
   1. Add `VeloxBloomFilterMightContain` / `VeloxBloomFilterAggregate`.
   2. When transforming `FilterExec` to `FilterExecTransformer`, transform 
`BloomFilterMightContain` / `BloomFilterAggregate` to 
`VeloxBloomFilterMightContain` / `VeloxBloomFilterAggregate` at the same time.
   3. Remove rule `FallbackBloomFilterAggIfNeeded`.
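   
   As a rough illustration of step 2, the rewrite can be thought of as a 
recursive rename over the expression tree. The sketch below is a toy model in 
Python, not Gluten's actual Scala code: `Expr`, `VELOX_REPLACEMENTS`, 
`rewrite_for_velox`, and the shape of the example condition are hypothetical 
and exist only for illustration. Only the four function names come from this 
patch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Expr:
    """Toy stand-in for an expression node: a name plus child expressions."""
    name: str
    children: Tuple["Expr", ...] = ()


# Spark function name -> Velox-specific counterpart added by the patch.
VELOX_REPLACEMENTS = {
    "BloomFilterMightContain": "VeloxBloomFilterMightContain",
    "BloomFilterAggregate": "VeloxBloomFilterAggregate",
}


def rewrite_for_velox(expr: Expr) -> Expr:
    """Recursively rename Spark bloom-filter functions to their Velox variants.

    Every other node keeps its name; only the children are rebuilt.
    """
    new_children = tuple(rewrite_for_velox(c) for c in expr.children)
    return Expr(VELOX_REPLACEMENTS.get(expr.name, expr.name), new_children)


# Hypothetical filter condition: probe a bloom filter built by an aggregate.
condition = Expr(
    "BloomFilterMightContain",
    (
        Expr("ScalarSubquery", (Expr("BloomFilterAggregate", (Expr("col_a"),)),)),
        Expr("col_b"),
    ),
)
rewritten = rewrite_for_velox(condition)
```

   Rewriting both functions in the same pass keeps the agg and filter sides on 
the same implementation within the rewritten subtree.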
   
   The patch makes the relevant code safer than before, since we now have two 
explicit function pairs: `BloomFilterMightContain` / `BloomFilterAggregate` for 
Spark and `VeloxBloomFilterMightContain` / `VeloxBloomFilterAggregate` for 
Velox. Thus we can more easily detect a mismatch between the agg and filter 
functions by checking their implementation types, rather than failing the query 
at runtime.
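   
   The kind of check this enables can be sketched as follows. This is a toy 
Python model under stated assumptions, not code from the patch: the 
`IMPLEMENTATION` table and `check_pair` are hypothetical names, and the real 
check would compare expression classes rather than strings.

```python
SPARK = "spark"
VELOX = "velox"

# Which implementation each function belongs to (names taken from the patch).
IMPLEMENTATION = {
    "BloomFilterMightContain": SPARK,
    "BloomFilterAggregate": SPARK,
    "VeloxBloomFilterMightContain": VELOX,
    "VeloxBloomFilterAggregate": VELOX,
}


def check_pair(might_contain: str, aggregate: str) -> None:
    """Raise early if the filter and agg functions use different implementations."""
    probe_impl = IMPLEMENTATION[might_contain]
    build_impl = IMPLEMENTATION[aggregate]
    if probe_impl != build_impl:
        raise ValueError(
            f"{might_contain} ({probe_impl}) cannot read a bloom filter "
            f"built by {aggregate} ({build_impl})"
        )
```

   For example, `check_pair("VeloxBloomFilterMightContain", 
"BloomFilterAggregate")` raises a `ValueError`, while a matching pair passes 
silently.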
   
   The patch still does not fix the failing test case in 
https://github.com/apache/incubator-gluten/pull/5433. However, Spark will 
report the failure sooner, since vanilla Spark doesn't implement the function 
`VeloxBloomFilterMightContain`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

