[
https://issues.apache.org/jira/browse/CRUNCH-211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13679133#comment-13679133
]
Gabriel Reid commented on CRUNCH-211:
-------------------------------------
Nope, not committed yet -- it won't apply cleanly on top of CRUNCH-213, so
I'll get that one committed this evening and then commit this one as well.
> Add one-to-many join functionality
> ----------------------------------
>
> Key: CRUNCH-211
> URL: https://issues.apache.org/jira/browse/CRUNCH-211
> Project: Crunch
> Issue Type: Bug
> Reporter: Gabriel Reid
> Attachments: CRUNCH-211.patch, CRUNCH-211.patch
>
>
> A common pattern is a join between two tables where the left-side table
> contains a single value per key, and the right-side table contains multiple
> values per key. An example of such a join would be a join between users and
> web click entries:
> PTable<Long,User> usersById = ...;
> PTable<Long,WebClick> webClicksByUserId = ...;
> In this case it is sometimes desirable to process a User together with an
> Iterable of all of that user's WebClicks. The current join functionality
> replicates the User for each related WebClick, so each WebClick has to be
> handled in a separate call.
> Currently, the only way to get an Iterable of WebClicks together with a
> single User in a single method call is to materialize all WebClicks per
> user in memory using something like PTable#collectValues, and this approach
> breaks down when a user has a large number of WebClicks.
> The intention of this ticket is to add functionality whereby the User and
> an Iterable of its WebClicks are available in a single method call, without
> the WebClicks being materialized in memory (i.e. an approach that remains
> feasible for millions of WebClicks or more).
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira