[
https://issues.apache.org/jira/browse/PIG-5165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979424#comment-15979424
]
Adam Szita commented on PIG-5165:
---------------------------------
I took a look at this. The problem is indeed that the predecessor operators of
POSkewedJoin get swapped, but the root cause is not in any optimization. There
are two places where this can go wrong:
1. At SparkPlan compilation in *SparkCompiler.java*: when we use
{{OperatorPlan.merge}} it may not keep the order of the plans we provided for
the merge. Behind the scenes the operators are only copied into maps, where any
ordering is lost, which is why we need to sort the operators in the resulting
merged plan.
2. After the plan is compiled and optimized we produce RDDs in
*JobGraphBuilder.java*, where for some reason we sort the predecessors of
operators; for POSkewedJoin we shouldn't. Optimization has changed parts of
the plan, and after that we can't rely on scope IDs to sort operators.
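To illustrate point 1, here is a minimal sketch of how copying operators through a hash-based map can lose the order in which they were supplied, and how an explicit sort restores it. This is not the code from the patch: {{MergeOrderSketch}}, {{mergeViaHashMap}} and {{restoreOrder}} are hypothetical names, and plain strings stand in for physical operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Pig.
class MergeOrderSketch {

    // Stand-in for a merge that keeps operators only in a hash map.
    // HashMap makes no guarantee about iteration order, so the order of
    // the returned list need not match the order of the inputs.
    static List<String> mergeViaHashMap(List<String> first, List<String> second) {
        Map<String, String> operators = new HashMap<>();
        for (String op : first) {
            operators.put(op, op);
        }
        for (String op : second) {
            operators.put(op, op);
        }
        return new ArrayList<>(operators.keySet());
    }

    // Re-establish the intended order by sorting on each operator's
    // position in a separately recorded list. The input is not modified.
    static List<String> restoreOrder(List<String> merged, List<String> intendedOrder) {
        List<String> sorted = new ArrayList<>(merged);
        sorted.sort(Comparator.comparingInt(op -> intendedOrder.indexOf(op)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> intended = Arrays.asList("loadLeft", "loadRight", "skewedJoin");
        List<String> merged = mergeViaHashMap(
                Arrays.asList("loadLeft"), Arrays.asList("loadRight", "skewedJoin"));
        System.out.println("after merge:   " + merged);
        System.out.println("after sorting: " + restoreOrder(merged, intended));
    }
}
```

The sort key here is the position in a recorded list rather than any kind of ID, since per point 2 an ID-based ordering is not reliable once the plan has been optimized.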
The fixes are in [^PIG-5165.0.patch]; [~kellyzly], please review.
This took me a lot of time to debug, especially because, as it turns out,
calling toString() on a PhysicalPlan may change the plan! (IDE debuggers
usually call toString() automatically to help the developer, which causes a
great deal of confusion.) Is there any reason why we're sorting the leaves of a
plan *in-place* instead of taking a copy of them and then sorting the copy
[here|https://github.com/apache/pig/blob/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java#L137]?
[~rohini] do you agree we should take a copy of the leaves first here? If so
I'd like to create another ticket and patch it.
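A minimal sketch of what I mean by taking a copy first. It assumes the list being sorted is state shared with the plan; {{LeafSortSketch}} and its methods are hypothetical and are not the PlanPrinter code, with strings standing in for operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; this is not Pig's PlanPrinter.
class LeafSortSketch {
    private final List<String> leaves = new ArrayList<>();

    LeafSortSketch(List<String> leaves) {
        this.leaves.addAll(leaves);
    }

    // Exposes the internal list, as assumed for this sketch.
    List<String> getLeaves() {
        return leaves;
    }

    // Printing that sorts the shared list: a side effect on the plan.
    String printSortingInPlace() {
        Collections.sort(leaves);
        return leaves.toString();
    }

    // Printing that sorts a private copy: the plan is left untouched.
    String printSortingCopy() {
        List<String> copy = new ArrayList<>(leaves);
        Collections.sort(copy);
        return copy.toString();
    }

    public static void main(String[] args) {
        LeafSortSketch plan = new LeafSortSketch(Arrays.asList("op3", "op1", "op2"));
        System.out.println("printed (copy):     " + plan.printSortingCopy());
        System.out.println("leaves afterwards:  " + plan.getLeaves());
        System.out.println("printed (in place): " + plan.printSortingInPlace());
        System.out.println("leaves afterwards:  " + plan.getLeaves());
    }
}
```

With the copy, printing (and therefore a debugger calling toString()) can no longer reorder anything the rest of the compilation depends on.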
> MultiQuery_Union_7 is failing with spark exec type
> --------------------------------------------------
>
> Key: PIG-5165
> URL: https://issues.apache.org/jira/browse/PIG-5165
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: Nandor Kollar
> Assignee: Adam Szita
> Fix For: spark-branch
>
> Attachments: PIG-5165.0.patch
>
>
> 1st output is fine, 2nd is different
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)