[ 
https://issues.apache.org/jira/browse/PIG-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nandor Kollar updated PIG-4891:
-------------------------------
    Attachment: PIG-4891_2.patch

> Implement FR join by broadcasting small rdd not making more copies of data
> --------------------------------------------------------------------------
>
>                 Key: PIG-4891
>                 URL: https://issues.apache.org/jira/browse/PIG-4891
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: Nandor Kollar
>             Fix For: spark-branch
>
>         Attachments: PIG-4891_2.patch
>
>
> In the current implementation of FRJoin (PIG-4771), we simply set the 
> replication factor of the data to 10 to make data access more efficient, 
> because the existing FRJoin algorithm can be reused this way. We need to 
> figure out how to implement FRJoin in the current code base by broadcasting 
> the small RDD, if broadcasting turns out to improve performance significantly.
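
For illustration only, the broadcast idea described above can be sketched in plain Python without any Spark dependency. This is not the patch's implementation: the function names `build_lookup` and `replicated_join`, the tuple layout, and the sample data are all hypothetical. The assumption is that the small relation fits in memory, is indexed once by join key, and that each partition of the large relation probes that index locally, so nothing has to be shuffled. In Spark the index would typically be shipped to executors as a broadcast variable.

```python
from collections import defaultdict


def build_lookup(small_relation, key_index=0):
    """Index the small relation by join key (hypothetical helper).

    This in-memory table stands in for what would be broadcast to every task.
    """
    lookup = defaultdict(list)
    for row in small_relation:
        lookup[row[key_index]].append(row)
    return dict(lookup)


def replicated_join(large_partition, lookup, key_index=0):
    """Inner-join one partition of the large relation against the lookup table."""
    for row in large_partition:
        # Rows with no match in the small relation are dropped (inner join).
        for match in lookup.get(row[key_index], []):
            yield row + match


# Made-up sample data: one small relation, one large relation in two partitions.
small = [(1, "a"), (2, "b"), (2, "c")]
large_partitions = [[(1, "x"), (3, "y")], [(2, "z")]]

lookup = build_lookup(small)
joined = [out for part in large_partitions for out in replicated_join(part, lookup)]
```

Each partition only reads the shared lookup table, which is why a single broadcast copy per executor could replace a raised replication factor on the underlying file.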



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
