Self join wth implicit split has the join output in wrong order
---------------------------------------------------------------
Key: PIG-429
URL: https://issues.apache.org/jira/browse/PIG-429
Project: Pig
Issue Type: Bug
Affects Versions: types_branch
Reporter: Pradeep Kamath
Fix For: types_branch
Query:
{code}
A = load 'st10k' split by 'file';
B = filter A by $1 > 25;
D = join A by $0, B by $0;
dump D;
{code}
In the output the columns from B are projected out first and from A next. On
closer examination of the code, the ImplicitSplitInserter class adds in the
split and two splitoutput operators into the plan and tries the connect the
successors of LOad to these. However it does this by iterating over its
successors and disconnecting from them and connecting up the split-splitoutput
to the successors. However the order in which it gets its successors is NOT the
same as the order in which cogroup (join) expects its inputs. Hence the
discrepancy.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.