clintropolis commented on a change in pull request #6397: Adds bloom filter aggregator to 'druid-bloom-filters' extension
URL: https://github.com/apache/incubator-druid/pull/6397#discussion_r250590106
##########
File path: extensions-core/druid-bloom-filter/src/main/java/org/apache/druid/query/filter/BloomKFilter.java
##########
@@ -38,7 +40,13 @@
  * https://github.com/apache/hive/commit/87ce36b458350db141c4cb4b6336a9a01796370f#diff-e65fc506757ee058dc951d15a9a526c3L238
  * and this linked issue https://issues.apache.org/jira/browse/HIVE-20101.
  *
- * Todo: remove this and begin using hive-storage-api version again once https://issues.apache.org/jira/browse/HIVE-20893 is released
+ * Addtionally, a handful of methods have been added to in situ work with BloomKFilters that have been serialized to a
+ * ByteBuffer, e.g. all add and merge methods. Test methods were not added because we don't need them.. but would
+ * probably be chill to do so it is symmetrical.
+ *
+ * Todo: remove this and begin using hive-storage-api version again once

Review comment:
As I understand it, the Hive integration externally constructs a `BloomKFilter` from a set of values and then uses it with the `BloomDimFilter` that the original PR added to filter Druid queries. This aggregator uses the same bloom filter implementation, so there is now a native Druid way to generate input for `BloomDimFilter` from existing Druid data.
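
To illustrate the flow described in the comment (one side builds a bloom filter from a set of values, serializes it, and another side uses it to filter rows), here is a minimal sketch. It deliberately does not use `BloomKFilter` or `BloomDimFilter`: `ToyBloomFilter`, `BloomRoundTripSketch`, `buildFilter`, and `filterRows` are hypothetical names invented for this example, and the hashing and byte layout are assumptions that do not match the Hive or Druid serialization format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Toy bloom filter for illustration only; NOT Druid's or Hive's BloomKFilter.
class ToyBloomFilter {
    private final int numBits;
    private final int numHashes;
    private final BitSet bits;

    ToyBloomFilter(int numBits, int numHashes) {
        this(numBits, numHashes, new BitSet(numBits));
    }

    private ToyBloomFilter(int numBits, int numHashes, BitSet bits) {
        this.numBits = numBits;
        this.numHashes = numHashes;
        this.bits = bits;
    }

    // FNV-1a over the UTF-8 bytes; used as the second hash for double hashing.
    private static int fnv1a(String value) {
        int hash = 0x811C9DC5;
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }

    // i-th bit position for a value: (h1 + i * h2) mod numBits.
    private int index(String value, int i) {
        return Math.floorMod(value.hashCode() + i * fnv1a(value), numBits);
    }

    void add(String value) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(value, i));
        }
    }

    // May return true for values never added (false positive), never false for added ones.
    boolean mightContain(String value) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(value, i))) {
                return false;
            }
        }
        return true;
    }

    // Assumed layout for this sketch: [numBits:int][numHashes:int][bitset bytes].
    byte[] serialize() {
        byte[] payload = bits.toByteArray();
        return ByteBuffer.allocate(8 + payload.length)
            .putInt(numBits).putInt(numHashes).put(payload).array();
    }

    static ToyBloomFilter deserialize(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int numBits = buf.getInt();
        int numHashes = buf.getInt();
        byte[] payload = new byte[buf.remaining()];
        buf.get(payload);
        return new ToyBloomFilter(numBits, numHashes, BitSet.valueOf(payload));
    }
}

class BloomRoundTripSketch {
    // Producer side: stands in for whatever builds the filter (external system or aggregator).
    static byte[] buildFilter(List<String> values) {
        ToyBloomFilter filter = new ToyBloomFilter(1024, 3);
        for (String value : values) {
            filter.add(value);
        }
        return filter.serialize();
    }

    // Consumer side: keep only rows whose value might be in the serialized filter.
    static List<String> filterRows(byte[] serialized, List<String> rows) {
        ToyBloomFilter filter = ToyBloomFilter.deserialize(serialized);
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            if (filter.mightContain(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        byte[] serialized = buildFilter(Arrays.asList("apple", "banana"));
        System.out.println(filterRows(serialized, Arrays.asList("apple", "banana", "cherry")));
    }
}
```

Rows whose values were added always pass the filter; other rows usually do not, but may occasionally pass as false positives, which is why a bloom filter works as a pre-filter and not as an exact membership test.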
