[ https://issues.apache.org/jira/browse/PIG-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096765#comment-15096765 ]

Rohini Palaniswamy commented on PIG-4737:
-----------------------------------------

Thanks Daniel. Will use PIG-4731 to address all the other test failures.

Committed https://issues.apache.org/jira/secure/attachment/12782090/PIG-4737-fixtestfailures.patch to trunk.

> Check and fix clone implementation for all classes extending PhysicalOperator
> -----------------------------------------------------------------------------
>
>                 Key: PIG-4737
>                 URL: https://issues.apache.org/jira/browse/PIG-4737
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.16.0
>
>         Attachments: PIG-4737-1.patch, PIG-4737-2.patch, 
> PIG-4737-fixtestfailures.patch
>
>
>     PhysicalOperator.clone() eventually calls Object.clone(), which only does 
> a shallow copy (the javadoc wrongly says deep copy), and this causes issues 
> with UnionOptimizer in Tez. Most clone implementations have already been 
> fixed because of issues found earlier, but we recently ran into an issue with 
> POStream: after clone, the copy retained the same references to 
> binaryOutputQueue and binaryInputQueue, which caused the script to hang. 
> Usually the cloned operators of a union go to different Tez vertex plans, so 
> the issue does not occur. In this particular case, because of the replicated 
> joins combined with multi-query and union optimization, both cloned plans of 
> the union ended up in the same vertex (the one that loads C). That single 
> vertex handles both replicated joins and the streaming in two sub-plans of a 
> split and stores the final result in g.
> {code}
> A = LOAD 'a';
> B = LOAD 'b';
> C = LOAD 'c';
> D = JOIN C by $0, A by $0 using 'replicated';
> E = JOIN C by $0, B by $0 using 'replicated';
> F = UNION D, E;
> G = STREAM F through ....
> STORE G into 'g';
> {code}
> It would be good to go through all classes extending PhysicalOperator and 
> check that each one deep clones its fields that are not primitive types.
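
For readers unfamiliar with the failure mode described above, here is a minimal Java sketch of it. It does not use Pig's real classes: StreamOp, inputQueue, outputQueue, shallowClone() and deepClone() are hypothetical stand-ins for an operator such as POStream and its binaryInputQueue/binaryOutputQueue fields. It only illustrates that Object.clone() copies references, so an original and its clone share the same queue unless clone() replaces it.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class CloneSketch {

    // Hypothetical stand-in for an operator with queue fields; not Pig's POStream.
    static class StreamOp implements Cloneable {
        BlockingQueue<String> inputQueue = new ArrayBlockingQueue<>(4);
        BlockingQueue<String> outputQueue = new ArrayBlockingQueue<>(4);

        // Object.clone() copies field values only, so the clone points at
        // the same queue objects as the original.
        StreamOp shallowClone() {
            try {
                return (StreamOp) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // unreachable: class is Cloneable
            }
        }

        // Gives the clone its own (empty) queues so the two operators
        // no longer exchange data through shared state.
        StreamOp deepClone() {
            StreamOp copy = shallowClone();
            copy.inputQueue = new ArrayBlockingQueue<>(4);
            copy.outputQueue = new ArrayBlockingQueue<>(4);
            return copy;
        }
    }

    public static void main(String[] args) {
        StreamOp original = new StreamOp();
        StreamOp shallow = original.shallowClone();
        StreamOp deep = original.deepClone();

        // A write through the shallow clone is visible to the original.
        shallow.inputQueue.add("tuple");

        System.out.println(shallow.inputQueue == original.inputQueue); // true
        System.out.println(deep.inputQueue == original.inputQueue);    // false
        System.out.println(original.inputQueue.size());                // 1
        System.out.println(deep.inputQueue.size());                    // 0
    }
}
```

When two such clones run in the same vertex, as in the script above, sharing the queues means each sub-plan can consume or block on the other's data, which is consistent with the hang reported here. Whether a fresh queue or a copied one is right for a given field depends on the operator, so treat the deepClone() above as one possible choice.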



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
