Hi @nishantmonu51, this is really cool and seems useful outside of Hive / Druid interop too. Is there a way to do this without making it so Hive-specific? Could the core bloom filter code be extracted into a library and put into a `druid-bloom-filter` extension?
I am thinking of use cases like approximate `x IN (SELECT y FROM … WHERE …)` filters, done like this:

- Run the subquery with a bloom filter aggregator that builds a bloom filter object for all `y` matching the inner query.
- Turn around and apply that bloom filter object as a filter on the main query.

It's not exact, but with appropriately sized filters it would still be useful for some use cases.

[ Full content available at: https://github.com/apache/incubator-druid/pull/6222 ]
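
To make the two steps above concrete, here is a toy sketch in plain Python. It is only an illustration under stated assumptions: nothing in it is Druid or Hive API, `BloomFilter` and `approximate_in` are hypothetical names, rows are plain dicts, and the sizing uses the textbook formulas rather than whatever the PR's filter does.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter (hypothetical; not the Hive or Druid implementation)."""

    def __init__(self, expected_items, fpp=0.01):
        # Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        n = max(1, expected_items)
        self.num_bits = max(8, math.ceil(-n * math.log(fpp) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, value):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(value)
        )


def approximate_in(outer_rows, inner_rows, x, y, inner_predicate, fpp=0.01):
    """Approximate `x IN (SELECT y FROM inner WHERE predicate)`."""
    # Step 1: "aggregate" every matching y from the inner query into a filter.
    inner = [row for row in inner_rows if inner_predicate(row)]
    bloom = BloomFilter(len(inner), fpp)
    for row in inner:
        bloom.add(row[y])
    # Step 2: apply the filter object as a row filter on the main query.
    return [row for row in outer_rows if bloom.might_contain(row[x])]
```

The result is always a superset of the exact `IN` answer: a Bloom filter has no false negatives, and the false-positive rate is governed by how the filter is sized relative to the number of distinct `y` values.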
