clintropolis commented on issue #3878: Filters on high cardinality dimensions should sometimes use dim index bitset + full scan instead of unioning bitsets of dim values
URL: https://github.com/apache/incubator-druid/issues/3878#issuecomment-465789290

Eh, I think the problem I encountered with bloom filters is more or less the same problem you describe, if not simpler: it depends only on cardinality, and selectivity doesn't even matter, so a threshold-based approach would almost always do the right thing if the threshold is set correctly. The filters you describe perform badly when they match a large percentage of overall rows, which is why I brought up selectivity in the first place, but I too am hoping that a threshold-based approach alone would be good enough, or at least better than what there is now.
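
To make the threshold idea concrete, here is a minimal sketch in Python, not Druid code. It assumes a toy column with a per-value index held as plain Python sets standing in for bitmaps. The names `build_index`, `filter_rows`, and `CARDINALITY_THRESHOLD`, and the cutoff value itself, are hypothetical and only illustrate the decision being discussed.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical cutoff; the thread does not propose a concrete value.
CARDINALITY_THRESHOLD = 1000


def build_index(rows: List[str]) -> Dict[str, Set[int]]:
    """Map each distinct value to the set of row ids holding it (a stand-in for bitmaps)."""
    index: Dict[str, Set[int]] = {}
    for row_id, value in enumerate(rows):
        index.setdefault(value, set()).add(row_id)
    return index


def filter_rows(
    rows: List[str],
    index: Dict[str, Set[int]],
    predicate: Callable[[str], bool],
    threshold: int = CARDINALITY_THRESHOLD,
) -> Tuple[Set[int], str]:
    """Return the matching row ids and the name of the strategy that was used."""
    if len(index) > threshold:
        # High cardinality: skip the per-value index and test each row directly.
        return {i for i, v in enumerate(rows) if predicate(v)}, "scan"
    # Low cardinality: test each distinct value once and union the matching sets.
    matched: Set[int] = set()
    for value, row_ids in index.items():
        if predicate(value):
            matched |= row_ids
    return matched, "index"
```

Both branches return the same row ids; only the work done to get there differs. The decision here looks at cardinality alone, which is the simpler case described above. A selectivity-aware variant would also need an estimate of the fraction of rows matched.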
