[ https://issues.apache.org/jira/browse/PIG-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pradeep Kamath updated PIG-429:
-------------------------------
Attachment: PIG-429.patch
patch for issue - passes all unit tests
> Self join with implicit split has the join output in wrong order
> ----------------------------------------------------------------
>
> Key: PIG-429
> URL: https://issues.apache.org/jira/browse/PIG-429
> Project: Pig
> Issue Type: Bug
> Affects Versions: types_branch
> Reporter: Pradeep Kamath
> Fix For: types_branch
>
> Attachments: PIG-429.patch
>
>
> Query:
> {code}
> A = load 'st10k' split by 'file';
> B = filter A by $1 > 25;
> D = join A by $0, B by $0;
> dump D;
> {code}
> In the output, the columns from B are projected first and the columns from A
> next. On closer examination of the code, the ImplicitSplitInserter class adds
> a split and two splitoutput operators to the plan and tries to connect the
> successors of Load to them. It does this by iterating over Load's successors,
> disconnecting Load from each one, and connecting the split-splitoutput pair
> to that successor instead. However, the order in which it gets the successors
> is NOT the same as the order in which cogroup (join) expects its inputs,
> hence the discrepancy.
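
To make the ordering problem above concrete, here is a minimal sketch in Java. It is not Pig's code and not the attached patch: ToyPlan, its methods, and the operator names are hypothetical, and the sketch assumes a plan that keeps an ordered predecessor list per operator, with a join's input position given by the list index. Under that assumption, a disconnect followed by a connect appends the new input at the end of the list, while replacing the input at its existing index keeps the order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy plan for illustration only; this is NOT Pig's LogicalPlan API.
class ToyPlan {
    // Ordered predecessor list per operator. For the join, index = input position.
    private final Map<String, List<String>> preds = new LinkedHashMap<>();

    // Appends 'from' as the last input of 'to'.
    void connect(String from, String to) {
        preds.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    // Removes 'from' from the inputs of 'to'; later inputs shift left.
    void disconnect(String from, String to) {
        preds.get(to).remove(from);
    }

    // Swaps 'oldFrom' for 'newFrom' at the same index, preserving input order.
    void replaceInput(String oldFrom, String newFrom, String to) {
        List<String> inputs = preds.get(to);
        inputs.set(inputs.indexOf(oldFrom), newFrom);
    }

    List<String> inputsOf(String op) {
        return preds.getOrDefault(op, new ArrayList<>());
    }

    // Models the query: A = load; B = filter A; D = join A, B.
    static ToyPlan selfJoin() {
        ToyPlan p = new ToyPlan();
        p.connect("load", "filter");
        p.connect("load", "join");   // join input 0 = A
        p.connect("filter", "join"); // join input 1 = B
        return p;
    }

    // Disconnect-then-connect: the split output lands after the filter.
    // (Wiring load -> split -> split outputs is omitted for brevity.)
    static List<String> rewireNaively() {
        ToyPlan p = selfJoin();
        p.disconnect("load", "join");
        p.connect("splitOutput1", "join");
        p.disconnect("load", "filter");
        p.connect("splitOutput2", "filter");
        return p.inputsOf("join");
    }

    // In-place replacement: the split output keeps the load's old position.
    static List<String> rewireInPlace() {
        ToyPlan p = selfJoin();
        p.replaceInput("load", "splitOutput1", "join");
        p.replaceInput("load", "splitOutput2", "filter");
        return p.inputsOf("join");
    }

    public static void main(String[] args) {
        System.out.println("naive:    " + rewireNaively());  // [filter, splitOutput1]
        System.out.println("in place: " + rewireInPlace());  // [splitOutput1, filter]
    }
}
```

In the naive variant the join sees the filter branch (B) first and the load branch (A) second, which matches the symptom reported in the description. Whether the real fix replaces inputs in place or restores the order some other way is not stated in this message.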