julianhyde commented on issue #12262:
URL: https://github.com/apache/druid/issues/12262#issuecomment-1061079561


   @gianm When you are doing a big join with filters on both sides - e.g. 
orders from customers in California for red products - then you only want to 
read customers who may have bought a red product, and also only want to read 
orders placed by a customer in California. Thus you want a filter implied by 
the join to travel both ways, and you can handle a few false positives. A good 
way is to generate a bloom filter as you are scanning customers and pass it to 
the scan of products, and vice versa.  
   
   In short, approximate semi-joins pushed down to each side of a join as 
pre-filters.
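
A rough sketch of that idea in Python, for illustration only. The tables, the `BloomFilter` class and the `run_query` helper are all hypothetical, and this is not how Druid or any other engine implements it. It builds a Bloom filter from each filtered side, uses the filters to pre-filter the other scans, and leaves the false positives for the exact join to discard.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: never a false negative, sometimes a false positive."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def build_filter(keys):
    bloom = BloomFilter()
    for key in keys:
        bloom.add(key)
    return bloom


# Hypothetical toy tables.
customers = [(1, "CA"), (2, "NY"), (3, "CA"), (4, "TX")]        # (id, state)
products = [(10, "red"), (11, "blue"), (12, "red")]             # (id, color)
orders = [(100, 1, 10), (101, 2, 10), (102, 3, 11),
          (103, 1, 12), (104, 4, 11)]            # (id, customer_id, product_id)


def run_query():
    ca_customers = {cid for cid, state in customers if state == "CA"}
    red_products = {pid for pid, color in products if color == "red"}

    # One direction: filters built on the dimension scans prune the orders scan.
    ca_filter = build_filter(ca_customers)
    red_filter = build_filter(red_products)
    candidate_orders = [o for o in orders
                        if ca_filter.might_contain(o[1])
                        and red_filter.might_contain(o[2])]

    # The other direction: orders for red products prune the customers scan.
    buyer_filter = build_filter(
        cid for _, cid, pid in orders if red_filter.might_contain(pid))
    candidate_customers = [cid for cid in ca_customers
                           if buyer_filter.might_contain(cid)]

    # The exact join discards whatever false positives slipped through.
    joined = sorted(oid for oid, cid, pid in candidate_orders
                    if cid in ca_customers and pid in red_products)
    return joined, candidate_orders, candidate_customers
```

Here `run_query()[0]` is `[100, 103]`, the two orders placed by a California customer for a red product. The candidate lists may contain extra rows, but because a Bloom filter has no false negatives they never lose a row the exact join needs.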
   
   The paper "[Sideways Information Passing for Push-Style Query Processing](https://repository.upenn.edu/cgi/viewcontent.cgi?article=1045&context=db_research)"
 from 2008 describes the idea pretty well. But we were doing it at Broadbase 
around 1998.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


