Mohit Sabharwal created PIG-4518:
------------------------------------

             Summary: SparkOperator should correspond to complete Spark job
                 Key: PIG-4518
                 URL: https://issues.apache.org/jira/browse/PIG-4518
             Project: Pig
          Issue Type: Bug
          Components: spark
            Reporter: Mohit Sabharwal
            Assignee: Mohit Sabharwal
             Fix For: spark-branch


SparkPlan, which was added in PIG-4374, creates a new SparkOperator for every
shuffle boundary (denoted by the presence of POGlobalRearrange in the
corresponding physical plan).  This is unnecessary for the Spark engine, since
it relies on Spark to do the shuffle (using groupBy(), reduceByKey() and
CoGroupedRDD) and does not need to explicitly identify "map" and "reduce"
operations.

It is also cleaner if a single SparkOperator represents a single complete Spark 
job.
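
To make the difference concrete, here is a rough sketch of the two ways of
carving a physical plan into operators. It is not Pig's actual code: the class
PlanSplitSketch, both method names, and the use of plain strings for physical
operators are hypothetical, and the assumption that the cut falls right after
POGlobalRearrange is only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: physical operators are modelled as plain strings,
// and each inner list stands in for one SparkOperator. Not Pig's real API.
class PlanSplitSketch {

    // Current behaviour as described above: close the current operator at
    // every shuffle boundary. Assumption: the cut falls right after
    // POGlobalRearrange.
    static List<List<String>> splitAtShuffleBoundaries(List<String> physicalOps) {
        List<List<String>> operators = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String op : physicalOps) {
            current.add(op);
            if (op.equals("POGlobalRearrange")) {
                operators.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            operators.add(current);
        }
        return operators;
    }

    // Proposed behaviour: the whole chain stays in one operator, and the
    // shuffle is left to Spark when the job runs.
    static List<List<String>> singleOperatorPerJob(List<String> physicalOps) {
        List<List<String>> operators = new ArrayList<>();
        operators.add(new ArrayList<>(physicalOps));
        return operators;
    }

    public static void main(String[] args) {
        List<String> plan = Arrays.asList(
                "POLoad", "POLocalRearrange", "POGlobalRearrange",
                "POPackage", "POForEach", "POStore");
        System.out.println(splitAtShuffleBoundaries(plan)); // two operators
        System.out.println(singleOperatorPerJob(plan));     // one operator
    }
}
```

With one POGlobalRearrange in the plan, the first method yields two operators
and the second yields one, which is the shape this issue asks for.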



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
