[
https://issues.apache.org/jira/browse/PIG-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096746#comment-15096746
]
Daniel Dai commented on PIG-4737:
---------------------------------
+1.
There is a ticket for NPE in StoreFuncDecorator: PIG-4731
> Check and fix clone implementation for all classes extending PhysicalOperator
> -----------------------------------------------------------------------------
>
> Key: PIG-4737
> URL: https://issues.apache.org/jira/browse/PIG-4737
> Project: Pig
> Issue Type: Bug
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4737-1.patch, PIG-4737-2.patch,
> PIG-4737-fixtestfailures.patch
>
>
> PhysicalOperator.clone() eventually calls Object.clone() which only does
> a shallow copy (javadoc wrongly says deep copy) and this causes issues with
> UnionOptimizer in Tez. Most of the clone is already fixed due to issues found
> earlier, but recently ran into an issue with POStream where after clone same
> reference was retained to binaryOutputQueue and binaryInputQueue and caused
> the script to hang.
> Mostly cloned operators in Union go to different tez vertex plans and the
> issue would not have occurred, but in the particular case due to replicated
> join and with the combination of multi-query and union optimization, both the
> cloned plans of union ended up in the same vertex(one that loads C). That
> single vertex will handle both the replicated joins and streaming in two
> sub-plans of split and store the final result in g.
> {code}
> A = LOAD 'a';
> B = LOAD 'b';
> C = LOAD 'c';
> D = JOIN C by $0, A by $0 using 'replicated';
> E = JOIN C by $0, B by $0 using 'replicated';
> F = UNION D, E;
> G = STREAM F through ....
> STORE G into 'g';
> {code}
> It is good to go through all classes extending PhysicalOperator and check if
> it deep clones objects that are not primitive types.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)