Gabriel Reid created CRUNCH-213:
-----------------------------------

             Summary: Add sharded join functionality
                 Key: CRUNCH-213
                 URL: https://issues.apache.org/jira/browse/CRUNCH-213
             Project: Crunch
          Issue Type: New Feature
            Reporter: Gabriel Reid
            Assignee: Gabriel Reid


Joins in which a large proportion of the values on one or both sides map to a 
single key can perform poorly: one reducer (or a small number of reducers) ends 
up handling most of the join work, leaving the rest of the cluster idle.

Sharded joining should be added to allow splitting up join keys, thereby 
distributing values mapped to a single key over multiple reducer partitions.
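
One way to picture the idea is the in-memory sketch below. It is only an 
illustration under stated assumptions, not the Crunch API or the eventual 
implementation: the class ShardedJoinSketch, the methods scatter, replicate, 
join and shardedJoin, and the numShards parameter are all hypothetical names. 
It assumes an inner join in which the left side is the skewed one: each left 
record goes to one randomly chosen shard of its key, while each right record 
is copied to every shard so that no match is lost.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch: plain Java collections stand in for reducer partitions.
public class ShardedJoinSketch {

    // A join key extended with a shard number; partitioning would use the pair.
    public static Map.Entry<String, Integer> shardedKey(String key, int shard) {
        return new AbstractMap.SimpleImmutableEntry<>(key, shard);
    }

    // Skewed side: each {key, value} record lands in exactly one random shard.
    public static Map<Map.Entry<String, Integer>, List<String>> scatter(
            List<String[]> records, int numShards, Random rng) {
        Map<Map.Entry<String, Integer>, List<String>> out = new HashMap<>();
        for (String[] kv : records) {
            Map.Entry<String, Integer> sk = shardedKey(kv[0], rng.nextInt(numShards));
            out.computeIfAbsent(sk, k -> new ArrayList<>()).add(kv[1]);
        }
        return out;
    }

    // Other side: each {key, value} record is copied to every shard of its key.
    public static Map<Map.Entry<String, Integer>, List<String>> replicate(
            List<String[]> records, int numShards) {
        Map<Map.Entry<String, Integer>, List<String>> out = new HashMap<>();
        for (String[] kv : records) {
            for (int shard = 0; shard < numShards; shard++) {
                out.computeIfAbsent(shardedKey(kv[0], shard), k -> new ArrayList<>())
                   .add(kv[1]);
            }
        }
        return out;
    }

    // Inner join on the (key, shard) pair; each pair is an independent unit of work.
    public static List<String> join(
            Map<Map.Entry<String, Integer>, List<String>> left,
            Map<Map.Entry<String, Integer>, List<String>> right) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Map.Entry<String, Integer>, List<String>> e : left.entrySet()) {
            List<String> matches = right.get(e.getKey());
            if (matches == null) {
                continue; // no right-side values for this key
            }
            for (String l : e.getValue()) {
                for (String r : matches) {
                    out.add(e.getKey().getKey() + ":" + l + "," + r);
                }
            }
        }
        return out;
    }

    public static List<String> shardedJoin(
            List<String[]> left, List<String[]> right, int numShards, long seed) {
        return join(scatter(left, numShards, new Random(seed)),
                    replicate(right, numShards));
    }

    public static void main(String[] args) {
        List<String[]> left = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            left.add(new String[] {"hot", "l" + i}); // all values share one key
        }
        List<String[]> right = new ArrayList<>();
        right.add(new String[] {"hot", "r0"});
        for (String row : shardedJoin(left, right, 4, 1L)) {
            System.out.println(row);
        }
    }
}
```

The trade-off in this sketch is that the replicated side is duplicated 
numShards times, so the approach only pays off when the skew on the scattered 
side outweighs the cost of the extra copies.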
