[ https://issues.apache.org/jira/browse/CRUNCH-216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13743949#comment-13743949 ]
Gabriel Reid commented on CRUNCH-216:
-------------------------------------
I have the feeling that it's better not to try to be too clever with that
stuff. Even when I remember to implement a decent scaleFactor method, getting
reliable sizes from the getSize method is still pretty hit-and-miss (i.e. it's
just really hard to do correctly).

On the other hand, when you're using a MapSideJoin there is usually going to be
a really big difference in the sizes of the collections being joined, so maybe
it would be ok even if the size heuristic isn't that reliable.
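To illustrate the point about a large size difference making an unreliable
estimate tolerable, here is a minimal sketch. It assumes the two size
estimates are already available as plain longs (for example from getSize),
and it only commits to a side when the estimates differ by a chosen ratio.
JoinSideChooser, Side, chooseInMemorySide and minRatio are hypothetical names
for illustration, not part of Crunch.

```java
// Hypothetical helper for illustration only; none of these names exist in Crunch.
final class JoinSideChooser {

    /** Which input should be replicated and loaded into memory. */
    enum Side { LEFT, RIGHT, UNDECIDED }

    /**
     * Picks a side only when one size estimate is at least minRatio times
     * smaller than the other, so a moderately wrong estimate cannot flip
     * the decision. Returns UNDECIDED when the estimates are too close.
     */
    static Side chooseInMemorySide(long leftSize, long rightSize, double minRatio) {
        if (leftSize < 0 || rightSize < 0 || minRatio < 1.0) {
            throw new IllegalArgumentException("sizes must be >= 0 and minRatio >= 1");
        }
        if ((double) leftSize * minRatio <= (double) rightSize) {
            return Side.LEFT;
        }
        if ((double) rightSize * minRatio <= (double) leftSize) {
            return Side.RIGHT;
        }
        return Side.UNDECIDED;
    }

    public static void main(String[] args) {
        // About 50 MB against about 200 GB: the gap is far larger than the ratio.
        System.out.println(chooseInMemorySide(50L << 20, 200L << 30, 10.0)); // LEFT
        // About 3 GB against about 4 GB: too close to trust the estimates.
        System.out.println(chooseInMemorySide(3L << 30, 4L << 30, 10.0));    // UNDECIDED
    }
}
```

With a ratio like 10, an estimate that is off by a factor of two or three
still leads to the same choice, which is the situation described above. When
the result is UNDECIDED, the caller would have to fall back on whatever order
the user supplied.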
> Transpose arguments in MapsideJoinStrategy.join
> -----------------------------------------------
>
> Key: CRUNCH-216
> URL: https://issues.apache.org/jira/browse/CRUNCH-216
> Project: Crunch
> Issue Type: Improvement
> Reporter: Gabriel Reid
>
> The MapsideJoinStrategy currently specifies that the smaller table in the
> join (i.e. the table to be replicated and loaded in memory) should be on the
> right-hand side of the join.
> This is the opposite of what is done in all other join strategies, making it
> impossible to just switch out another join strategy for a
> MapsideJoinStrategy. The MapsideJoinStrategy could be brought in line with
> the other JoinStrategies to expect the smaller of two tables to be provided
> as the left-side table.
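The transposition described in the issue can be sketched with an adapter that
swaps the inputs before delegating and swaps the value pairs back afterwards.
This is a simplified model over in-memory maps, not the Crunch API: SimpleJoin,
SmallOnRightJoin and TransposedJoin are hypothetical stand-ins, and the join is
a plain inner join.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a join strategy; not the Crunch interface.
interface SimpleJoin<K, U, V> {
    Map<K, Map.Entry<U, V>> join(Map<K, U> left, Map<K, V> right);
}

// Models the current convention: the small table is expected on the right
// and is used as the in-memory lookup table while the left side is streamed.
final class SmallOnRightJoin<K, U, V> implements SimpleJoin<K, U, V> {
    @Override
    public Map<K, Map.Entry<U, V>> join(Map<K, U> left, Map<K, V> right) {
        Map<K, Map.Entry<U, V>> out = new LinkedHashMap<>();
        for (Map.Entry<K, U> e : left.entrySet()) {
            V match = right.get(e.getKey());
            if (match != null) {
                out.put(e.getKey(), new SimpleEntry<>(e.getValue(), match));
            }
        }
        return out;
    }
}

// Adapter that accepts the small table on the left, like the other strategies.
final class TransposedJoin<K, U, V> implements SimpleJoin<K, U, V> {
    private final SimpleJoin<K, V, U> delegate;

    TransposedJoin(SimpleJoin<K, V, U> delegate) {
        this.delegate = delegate;
    }

    @Override
    public Map<K, Map.Entry<U, V>> join(Map<K, U> left, Map<K, V> right) {
        // Hand the inputs to the delegate in the order it expects.
        Map<K, Map.Entry<V, U>> swapped = delegate.join(right, left);
        Map<K, Map.Entry<U, V>> result = new LinkedHashMap<>();
        for (Map.Entry<K, Map.Entry<V, U>> e : swapped.entrySet()) {
            // Restore (left value, right value) order in the output pairs.
            result.put(e.getKey(),
                    new SimpleEntry<>(e.getValue().getValue(), e.getValue().getKey()));
        }
        return result;
    }
}

final class TransposedJoinDemo {
    public static void main(String[] args) {
        Map<String, String> small = new LinkedHashMap<>();
        small.put("us", "United States");
        Map<String, Integer> big = new LinkedHashMap<>();
        big.put("us", 7);
        big.put("fr", 3);

        SimpleJoin<String, String, Integer> join =
                new TransposedJoin<>(new SmallOnRightJoin<String, Integer, String>());
        // The small table is now passed on the left.
        System.out.println(join.join(small, big));
    }
}
```

An adapter like this keeps the call site identical across strategies; whether
the real fix should be a wrapper or a change to the join method itself is a
separate design question.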