WangGuangxin commented on PR #6652:
URL: https://github.com/apache/incubator-gluten/pull/6652#issuecomment-2288332697

   > > > But since the bloom filter constructed by the native side is different from Spark's, an NPE will be thrown in this case
   > > 
   > > 
   > > I remember we made the Velox bloom-filter functions runnable in vanilla Spark operators, so there shouldn't be compatibility issues. #5435. Am I missing something? Or is there a bug?
   > 
   > @zhztheplayer Got it. Let me check.
   
   @zhztheplayer I dug into it. The main reason is that in this case it is a 
partition filter, so the bloom filter is evaluated on the driver side, and we 
need to change `VeloxBloomFilterMightContain`'s codegen from 
   
   ```
   val bf = ctx.addMutableState(className, "bloomFilter")
   ctx.addPartitionInitializationStatement(s"$bf = $className.readFrom($bfData);")
   ```
   
   to 
   
   ```
   val bf = ctx.addMutableState(className, "bloomFilter", bf => s"$bf = $className.readFrom($bfData);")
   ```
   
   Otherwise, the `bf` object is never initialized when the expression is 
evaluated on the driver (partition initialization only runs inside tasks), and 
an NPE is thrown.
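   To make the lifecycle difference concrete, here is a minimal self-contained sketch (hypothetical class names, not Spark's real generated code) contrasting a field that is set only by partition initialization with one initialized at declaration, which is what the three-argument `addMutableState` produces:

   ```scala
   // Hypothetical sketch of the two codegen initialization paths.
   object InitPathsDemo {
     // Models the original codegen: the bloom-filter field is assigned only by
     // initializePartition(), which Spark calls per input partition on executors,
     // mirroring addPartitionInitializationStatement.
     class PartitionInitPredicate {
       private var bf: String = null // stands in for the deserialized bloom filter
       def initializePartition(): Unit = { bf = "deserialized-bloom-filter" }
       def eval(): Boolean = bf.contains("bloom") // NPE if init never ran (driver side)
     }

     // Models the fix: the field is initialized together with its declaration,
     // mirroring addMutableState(className, name, initFunc).
     class EagerInitPredicate {
       private var bf: String = "deserialized-bloom-filter"
       def eval(): Boolean = bf.contains("bloom")
     }

     def main(args: Array[String]): Unit = {
       // Driver-side partition pruning never calls initializePartition:
       val onDriver = new PartitionInitPredicate
       val npe =
         try { onDriver.eval(); false }
         catch { case _: NullPointerException => true }
       println(s"NPE on driver without partition init: $npe")

       println(s"Eagerly initialized predicate evaluates: ${new EagerInitPredicate().eval()}")
     }
   }
   ```

   The sketch only models the call-ordering issue: partition-initialization statements are a per-task hook, so any code path that evaluates the predicate outside a task (here, driver-side partition pruning) sees the uninitialized field.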
   
   But even after this change, `VeloxBloomFilter` still cannot be evaluated on 
the driver side, since most of the JNI and resource-management code checks 
whether we are running inside a Spark task, such as 
   
   ```
   Runtimes.contextInstance
   ```
   and 
   ```
   Spillable reservation listener must be used in a Spark task.
   ``` 
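   The shape of those guards can be sketched as follows (a hypothetical simulation: the stand-in `currentTask` models Spark's notion of "am I inside a running task", which is null/absent on the driver; the names are illustrative, not Gluten's actual API):

   ```scala
   // Hypothetical sketch of a task-scope guard like the ones that reject
   // driver-side evaluation in Gluten's JNI / resource-management layers.
   object TaskScopeCheckDemo {
     // Simulated stand-in for task context lookup: empty on the driver,
     // populated inside a running task.
     private val currentTask = new ThreadLocal[Object]

     def inSparkTask: Boolean = currentTask.get() != null

     // Mirrors the kind of check that throws
     // "Spillable reservation listener must be used in a Spark task."
     def requireTaskScope(what: String): Unit =
       require(inSparkTask, s"$what must be used in a Spark task.")

     def enterTask(): Unit = currentTask.set(new Object) // simulate task start

     def main(args: Array[String]): Unit = {
       val failedOnDriver =
         try { requireTaskScope("Spillable reservation listener"); false }
         catch { case _: IllegalArgumentException => true }
       println(s"Guard rejects driver-side use: $failedOnDriver")

       enterTask()
       requireTaskScope("Spillable reservation listener") // passes inside a task
       println("Guard passes inside a task")
     }
   }
   ```

   So fixing the codegen alone is not enough: every task-scoped guard on the evaluation path would also have to tolerate running on the driver.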


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
