Mohit Sabharwal created PIG-4518:
------------------------------------
Summary: SparkOperator should correspond to complete Spark job
Key: PIG-4518
URL: https://issues.apache.org/jira/browse/PIG-4518
Project: Pig
Issue Type: Bug
Components: spark
Reporter: Mohit Sabharwal
Assignee: Mohit Sabharwal
Fix For: spark-branch
SparkPlan, which was added in PIG-4374, creates a new SparkOperator for every
shuffle boundary (denoted by the presence of a POGlobalRearrange in the
corresponding physical plan). This is unnecessary for the Spark engine, since it
relies on Spark itself to perform the shuffle (via groupBy(), reduceByKey() and
CoGroupRDD) and does not need to explicitly identify "map" and "reduce" operations.
It is also cleaner if a single SparkOperator represents a single complete Spark
job.
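For illustration only (this is not code from the Pig source tree, and the class
name is made up), a minimal word-count sketch against Spark's Java API shows why
the engine needs no explicit map/reduce boundary: the shuffle happens inside
reduceByKey() itself.
{code:java}
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

// Hypothetical stand-alone example; not a class in the Pig codebase.
public class ShuffleSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("shuffle-sketch").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Word count: the shuffle boundary is handled inside reduceByKey(),
        // so user code never splits the pipeline into "map" and "reduce" stages.
        JavaPairRDD<String, Integer> counts = sc
            .parallelize(Arrays.asList("a", "b", "a", "c", "b", "a"))
            .mapToPair(w -> new Tuple2<>(w, 1))
            .reduceByKey(Integer::sum);

        counts.collect().forEach(t -> System.out.println(t._1() + " -> " + t._2()));
        sc.stop();
    }
}
{code}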
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)