Gabriel Reid created CRUNCH-215:
-----------------------------------
Summary: Add BloomFilterJoinStrategy
Key: CRUNCH-215
URL: https://issues.apache.org/jira/browse/CRUNCH-215
Project: Crunch
Issue Type: New Feature
Reporter: Gabriel Reid
Assignee: Gabriel Reid
Bloom filters can be very effective for pre-filtering one side of a join when
the keys on one side are a small subset of the keys on the other side (i.e.
many keys on the larger side will not be joined).
The Bloom filter can be built up based on the keys of one side of the join (the
side with fewer keys), and then can be applied as a filter to the other side of
the join before it is sent through the shuffle and reduce phases.
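
As a rough illustration of the idea (not the proposed Crunch implementation),
the sketch below builds a small Bloom filter from the keys of the smaller side
and uses it to drop keys from the larger side. It is plain Java with no Crunch
or Hadoop APIs; the names BloomPrefilterSketch, SimpleBloomFilter and
prefilter, as well as the filter size and hash count, are hypothetical choices
made only for this example. A Bloom filter never yields false negatives, so no
joinable key is lost; the occasional false positive passes the filter and is
discarded later by the join itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical, in-memory sketch of Bloom filter pre-filtering for a join.
class BloomPrefilterSketch {

    /** Minimal Bloom filter over String keys (illustrative only). */
    static final class SimpleBloomFilter {
        private final BitSet bits;
        private final int numBits;
        private final int numHashes;

        SimpleBloomFilter(int numBits, int numHashes) {
            this.bits = new BitSet(numBits);
            this.numBits = numBits;
            this.numHashes = numHashes;
        }

        // Derive the i-th bit position from two base hashes (double hashing).
        private int position(String key, int i) {
            int h1 = key.hashCode();
            int h2 = (h1 >>> 16) | 1; // forced odd so the step is never zero
            return Math.floorMod(h1 + i * h2, numBits);
        }

        void add(String key) {
            for (int i = 0; i < numHashes; i++) {
                bits.set(position(key, i));
            }
        }

        // false: key was definitely never added; true: key was probably added.
        boolean mightContain(String key) {
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(position(key, i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /**
     * Builds the filter from the side with fewer keys and returns only those
     * keys of the larger side that might have a join partner.
     */
    static List<String> prefilter(List<String> smallSideKeys, List<String> largeSideKeys) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1 << 16, 3); // assumed sizing
        for (String key : smallSideKeys) {
            filter.add(key);
        }
        List<String> survivors = new ArrayList<>();
        for (String key : largeSideKeys) {
            if (filter.mightContain(key)) {
                survivors.add(key); // only these would go through the shuffle
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        List<String> small = Arrays.asList("k1", "k2");
        List<String> large = Arrays.asList("k1", "k7", "k2", "k8", "k9");
        System.out.println(prefilter(small, large));
    }
}
```

In a real MapReduce pipeline the filter would have to be built in one pass and
then distributed to the tasks that read the larger side; how that is done is
left open here.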