[ https://issues.apache.org/jira/browse/PIG-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15839313#comment-15839313 ]
liyunzhang_intel commented on PIG-4891:
---------------------------------------
[~nkollar]: LGTM except for some minor issues; I left some comments on rb.
> Implement FR join by broadcasting small rdd not making more copys of data
> -------------------------------------------------------------------------
>
> Key: PIG-4891
> URL: https://issues.apache.org/jira/browse/PIG-4891
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: liyunzhang_intel
> Assignee: Nandor Kollar
> Fix For: spark-branch
>
>
> In the current implementation of FRJoin (PIG-4771), we simply set the
> replication factor of the data to 10 to make data access more efficient,
> because the existing FRJoin algorithm can be reused this way. We need to
> figure out how to implement FRJoin by broadcasting the small RDD in the
> current code base, if broadcasting turns out to improve performance
> significantly.
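
For illustration only, the broadcast idea in the description can be sketched in plain Java with no Spark or Pig dependency: the small relation is loaded once into an in-memory hash table (standing in for a broadcast variable), and each record of the large relation probes it. The names BroadcastJoinSketch, buildLookup, and join are hypothetical and are not taken from the PIG-4891 patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BroadcastJoinSketch {

    // Build an in-memory lookup table from the small relation.
    // Each row is {key, value}. In Spark, a structure like this is what
    // would be broadcast to the executors (assumption for illustration).
    static Map<String, List<String>> buildLookup(List<String[]> small) {
        Map<String, List<String>> lookup = new HashMap<>();
        for (String[] row : small) {
            lookup.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return lookup;
    }

    // Probe the lookup with every row of the large relation (inner join).
    // Output rows have the form "key,largeValue,smallValue".
    static List<String> join(List<String[]> large, Map<String, List<String>> lookup) {
        List<String> out = new ArrayList<>();
        for (String[] row : large) {
            List<String> matches = lookup.get(row[0]);
            if (matches == null) {
                continue; // no matching key in the small relation
            }
            for (String match : matches) {
                out.add(row[0] + "," + row[1] + "," + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> small = Arrays.asList(
                new String[] {"1", "a"},
                new String[] {"2", "b"});
        List<String[]> large = Arrays.asList(
                new String[] {"1", "x"},
                new String[] {"3", "z"},
                new String[] {"2", "y"});
        for (String line : join(large, buildLookup(small))) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is that the small side is materialized once and shared, rather than being stored with a higher replication factor and re-read by each task.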
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)