[
https://issues.apache.org/jira/browse/PIG-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mohit Sabharwal updated PIG-4518:
---------------------------------
Attachment: PIG-4518.patch
> SparkOperator should correspond to complete Spark job
> -----------------------------------------------------
>
> Key: PIG-4518
> URL: https://issues.apache.org/jira/browse/PIG-4518
> Project: Pig
> Issue Type: Bug
> Components: spark
> Reporter: Mohit Sabharwal
> Assignee: Mohit Sabharwal
> Fix For: spark-branch
>
> Attachments: PIG-4518.patch
>
>
> SparkPlan, which was added in PIG-4374, creates a new SparkOperator for every
> shuffle boundary (denoted by the presence of a POGlobalRearrange in the
> corresponding physical plan). This is unnecessary for the Spark engine, since
> it relies on Spark itself to perform the shuffle (using groupBy(), reduceByKey()
> and CoGroupRDD) and does not need to explicitly identify "map" and "reduce"
> operations.
> It is also cleaner if a single SparkOperator represents a single complete
> Spark job (a short illustrative sketch follows below).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)