gianm commented on issue #6222: Add ability to pass in Bloom filter from Hive Queries
URL: https://github.com/apache/incubator-druid/pull/6222#issuecomment-415471216
 
 
   Hi @nishantmonu51, this is really cool and seems useful outside of Hive / 
Druid interop too. Is there a way to do this without making it so 
Hive-specific? Could the core bloom filter code be extracted into a library 
and put into a `druid-bloom-filter` extension?
   
   I am thinking of use cases like approximate `x IN (SELECT y FROM … WHERE …)` 
filters, done in two steps:
   
   - Run the subquery with a bloom filter aggregator that builds a bloom filter 
object for all `y` matching the inner query.
   - Turn around and apply that bloom filter object as a filter on the main 
query.
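
A rough sketch of those two steps, in plain Python rather than Druid code. Everything here is hypothetical and for illustration only: the `BloomFilter` class, `build_filter`, `apply_filter`, and the row layout are made-up names, and the hashing scheme (double hashing over a SHA-256 digest) is just one simple choice, not what Hive or Druid use.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter; not Druid's or Hive's implementation."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value):
        # Double hashing: derive all bit positions from two halves of one digest.
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so strides vary
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(value)
        )


def build_filter(inner_rows, predicate, num_bits=1 << 16, num_hashes=5):
    """Step 1: 'aggregate' every y matching the inner query into one filter."""
    bloom = BloomFilter(num_bits, num_hashes)
    for row in inner_rows:
        if predicate(row):
            bloom.add(row["y"])
    return bloom


def apply_filter(outer_rows, bloom):
    """Step 2: keep main-query rows whose x might be in the inner result."""
    return [row for row in outer_rows if bloom.might_contain(row["x"])]


# Hypothetical data standing in for the inner and main queries.
inner = [{"y": "a", "keep": True}, {"y": "b", "keep": False}, {"y": "c", "keep": True}]
outer = [{"x": "a"}, {"x": "b"}, {"x": "c"}, {"x": "d"}]

bloom = build_filter(inner, lambda row: row["keep"])
matched = apply_filter(outer, bloom)
```

Rows with `x` equal to `a` or `c` are always kept, because a bloom filter never drops a value that was added. Rows like `b` or `d` are normally filtered out, but can slip through as false positives, which is what makes the result approximate.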
   
   It's not exact (a bloom filter can report false positives, though never 
false negatives), but with appropriately sized filters it would still be useful 
for some use cases.
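
To make "appropriately sized" a bit more concrete, here is a small sketch using the standard textbook sizing formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions, for n expected values and a target false positive rate p. The function name `bloom_parameters` is made up for this example, and the formulas assume well-behaved, independent hash functions.

```python
import math


def bloom_parameters(expected_items, false_positive_rate):
    """Return (num_bits, num_hashes) from the standard sizing formulas."""
    if expected_items <= 0:
        raise ValueError("expected_items must be positive")
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")

    # m = -n * ln(p) / (ln 2)^2, rounded up to a whole number of bits.
    num_bits = math.ceil(
        -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)
    )
    # k = (m / n) * ln 2, with at least one hash function.
    num_hashes = max(1, round(num_bits / expected_items * math.log(2)))
    return num_bits, num_hashes


# Example: one million distinct y values at a 1% false positive rate.
num_bits, num_hashes = bloom_parameters(1_000_000, 0.01)
```

For one million values at a 1% false positive rate this comes out to roughly 9.6 million bits (about 1.2 MB) and 7 hash functions, so the filter stays small enough to ship along with the main query.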

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
