jmbowles commented on issue #9413:
URL: https://github.com/apache/druid/issues/9413#issuecomment-1368219702

   I came across this thread while trying to accomplish a similar goal: I 
needed a way to build the base64-encoded bloom filter within Druid itself, 
from user-provided values, so that it could then be used to filter an extremely 
high-cardinality dimension. The solution was to add a second inline table with 
a single constant-valued column, which can then be ignored. 
   
   **Tested version: 0.22.1**
   
   1.  **Inline data showing the two columns, for illustration:**
   
   SELECT * FROM (VALUES 'a', 'b', 'c') AS blacklist (item), (VALUES 0) AS 
blacklist2 (item2)
   
   2.  **Build the bloom filter from the item column only:**
   
   SELECT BLOOM_FILTER(blacklist.item, 100) AS item_bloom FROM (VALUES 'a', 
'b', 'c') AS blacklist (item), (VALUES 0) AS blacklist2 (item2)
   
   ... which produces the base64-encoded bloom filter value: 
   
   
BAAAABAACAAAAAAAAAAgAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAABAAAAACAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAhAAAAAAAAAAAAAIAAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
   
   3.  **Use the base64 encoded bloom filter value in a query on an actual 
Druid data source:**
   
   SELECT * FROM some_druid_table WHERE 
bloom_filter_test(some_druid_target_dimension, 
'BAAAABAACAAAAAAAAAAgAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAABAAAAACAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAhAAAAAAAAAAAAAIAAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==')
 = true
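
   The three steps above could be scripted roughly as follows. This is only a 
sketch: the helper names (`sql_string`, `sql_ident`, `bloom_build_sql`, 
`bloom_test_sql`, `run_sql`) and the router URL in the usage comment are 
hypothetical and not part of Druid. It assumes the queries are sent to Druid's 
SQL HTTP endpoint (`/druid/v2/sql`) and that the bloom filter extension is 
loaded.

```python
import json
import re
import urllib.request


def sql_string(value):
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def sql_ident(name):
    """Quote an identifier with double quotes, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def bloom_build_sql(values, max_entries=100):
    """Step 2: aggregate the user-provided values into a single bloom filter."""
    items = ", ".join(sql_string(v) for v in values)
    return (
        f"SELECT BLOOM_FILTER(blacklist.item, {int(max_entries)}) AS item_bloom "
        f"FROM (VALUES {items}) AS blacklist (item), "
        "(VALUES 0) AS blacklist2 (item2)"
    )


def bloom_test_sql(table, dimension, bloom_b64):
    """Step 3: filter a datasource using the base64 value returned by step 2."""
    # Reject anything that is not plain base64 before inlining it as a literal.
    if not re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", bloom_b64):
        raise ValueError("not a base64 string")
    return (
        f"SELECT * FROM {sql_ident(table)} "
        f"WHERE bloom_filter_test({sql_ident(dimension)}, '{bloom_b64}') = true"
    )


def run_sql(base_url, sql):
    """POST a query to the Druid SQL endpoint and return the decoded JSON rows."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/druid/v2/sql",
        data=json.dumps({"query": sql}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Usage (hypothetical router URL; needs a running Druid cluster):
#   rows = run_sql("http://localhost:8888", bloom_build_sql(["a", "b", "c"]))
#   bloom = rows[0]["item_bloom"]
#   hits = run_sql("http://localhost:8888",
#                  bloom_test_sql("some_druid_table",
#                                 "some_druid_target_dimension", bloom))
```

   Quoting the values in one place keeps user input with embedded single 
quotes from breaking the generated VALUES list.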
   
   Note that step 3 needs no IN clause: the bloom_filter_test function takes 
its place. Because a bloom filter is probabilistic, the query can also match a 
small number of values that were not in the original list.
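
   To sanity-check the value produced in step 2 outside Druid, its header can 
be decoded. The sketch below assumes the serialized layout is one byte for the 
number of hash functions followed by a big-endian 32-bit count of 64-bit words 
in the bitset; that layout is my reading of the filter's serialization, not 
something stated in this thread, and `bloom_header` is a hypothetical helper.

```python
import base64
from typing import Tuple


def bloom_header(bloom_b64: str) -> Tuple[int, int]:
    """Return (hash function count, 64-bit word count) from the header.

    Assumed layout: 1 byte of hash count, then a big-endian int32 word count.
    """
    # The first 8 base64 characters decode to 6 bytes, which covers the
    # 5-byte header without needing the rest of the value.
    head = base64.b64decode(bloom_b64[:8])
    num_hashes = head[0]
    num_longs = int.from_bytes(head[1:5], "big")
    return num_hashes, num_longs


# The first 20 characters of the value shown in step 2 are enough here.
print(bloom_header("BAAAABAACAAAAAAAAAAg"))  # prints (4, 16)
```

   Under that assumption, the filter built with 100 as the second argument 
uses 4 hash functions over 16 words (1024 bits).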
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

