[ 
https://issues.apache.org/jira/browse/PIG-747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786124#action_12786124
 ] 

Pradeep Kamath commented on PIG-747:
------------------------------------

I did some investigation and here are some observations:
Consider the following foreach segment which is similar to the script above:
{code}
foreach a {
 X = 10;
 Y = X + X;
 generate Y;
}
{code}

Currently, it looks like the logical plan connects the same instance of 
LOConst (X) twice to the LOAdd (Y). In LogToPhyTranslationVisitor, each 
successor of an operator is supposed to get a different instance of the 
operator as its predecessor, because DependencyOrderWalkerWOSeenChk is used to 
visit the inner foreach plan and a new physical operator is created each time a 
logical operator is seen (even if it is the same instance of the logical 
operator). However, LogToPhyTranslationVisitor maintains a LogToPhyMap, which 
is a HashMap from each LogicalOperator to its translated PhysicalOperator. 
Since this is a HashMap and not a MultiMap, the LOConst gets mapped to the 
last POConst created, and POAdd gets connected to it twice. 
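
To make the overwrite concrete, here is a minimal, self-contained sketch of 
the behavior described above. It does not use Pig's real classes: LogicalOp, 
PhysicalOp and translate() are hypothetical stand-ins, and the only thing it 
is meant to show is that a second put() on the same key replaces the first 
value, so both inputs of the add resolve to the same physical instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LogToPhyMapSketch {
    // Hypothetical stand-in for a logical operator such as LOConst.
    static final class LogicalOp {
        final String name;
        LogicalOp(String name) { this.name = name; }
    }

    // Hypothetical stand-in for a physical operator such as POConst or POAdd.
    static final class PhysicalOp {
        final String name;
        final List<PhysicalOp> inputs = new ArrayList<>();
        PhysicalOp(String name) { this.name = name; }
    }

    // Translates Y = X + X and returns the stand-in for POAdd.
    static PhysicalOp translate() {
        LogicalOp x = new LogicalOp("LOConst");
        Map<LogicalOp, PhysicalOp> logToPhyMap = new HashMap<>();

        // A walker without a seen-check reaches X once per edge into Y,
        // and a fresh physical operator is created on every visit.
        logToPhyMap.put(x, new PhysicalOp("POConst#1"));
        logToPhyMap.put(x, new PhysicalOp("POConst#2")); // replaces POConst#1

        // Translating the add looks up each logical input (X and X again).
        PhysicalOp add = new PhysicalOp("POAdd");
        for (LogicalOp input : Arrays.asList(x, x)) {
            add.inputs.add(logToPhyMap.get(input));
        }
        return add;
    }

    public static void main(String[] args) {
        PhysicalOp add = translate();
        // Both inputs are the same POConst#2 instance; POConst#1 is orphaned.
        System.out.println(add.inputs.get(0).name + ", " + add.inputs.get(1).name);
        System.out.println(add.inputs.get(0) == add.inputs.get(1));
    }
}
```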

Options to solve this:
1) Change the design in LogToPhyTranslationVisitor to handle this by using a 
MultiMap. This might be pretty involved; I am not sure of the extent of the 
changes required.
2) Change the parser to create copies up front in the nested foreach of the 
LogicalPlan, so that LogToPhyTranslation doesn't need to worry about this 
case. This seems cleaner, but again I am unsure how easy it is.
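
As a rough illustration of what option 1 could look like, the sketch below 
keeps a queue of physical operators per logical operator and hands each 
consumer the next unused one. This is only one possible reading of the 
MultiMap idea, not the approach taken in any attached patch; LogicalOp, 
PhysicalOp and translate() are hypothetical stand-ins rather than Pig classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MultiMapSketch {
    // Hypothetical stand-in for a logical operator such as LOConst.
    static final class LogicalOp {
        final String name;
        LogicalOp(String name) { this.name = name; }
    }

    // Hypothetical stand-in for a physical operator such as POConst or POAdd.
    static final class PhysicalOp {
        final String name;
        final List<PhysicalOp> inputs = new ArrayList<>();
        PhysicalOp(String name) { this.name = name; }
    }

    // Translates Y = X + X with a multimap-style LogToPhyMap.
    static PhysicalOp translate() {
        LogicalOp x = new LogicalOp("LOConst");
        Map<LogicalOp, Deque<PhysicalOp>> logToPhyMap = new HashMap<>();

        // Every visit of X appends a new physical operator instead of
        // replacing the previous one.
        logToPhyMap.computeIfAbsent(x, k -> new ArrayDeque<>())
                   .add(new PhysicalOp("POConst#1"));
        logToPhyMap.computeIfAbsent(x, k -> new ArrayDeque<>())
                   .add(new PhysicalOp("POConst#2"));

        // Each lookup consumes one instance, so no physical operator ends
        // up with more than one successor.
        PhysicalOp add = new PhysicalOp("POAdd");
        for (LogicalOp input : Arrays.asList(x, x)) {
            add.inputs.add(logToPhyMap.get(input).poll());
        }
        return add;
    }

    public static void main(String[] args) {
        PhysicalOp add = translate();
        System.out.println(add.inputs.get(0).name + ", " + add.inputs.get(1).name);
    }
}
```

The cost of this design is that every place that reads the map has to know 
which instance it is entitled to, which is probably why the change could turn 
out to be involved.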



> Logical to Physical Plan Translation fails when temporary alias are created 
> within foreach
> ------------------------------------------------------------------------------------------
>
>                 Key: PIG-747
>                 URL: https://issues.apache.org/jira/browse/PIG-747
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.4.0
>            Reporter: Viraj Bhat
>            Assignee: Daniel Dai
>             Fix For: 0.7.0
>
>         Attachments: physicalplan.txt, physicalplanprob.pig, PIG-747-1.patch
>
>
> Consider the Pig script below, which calculates a new column F inside the 
> foreach:
> {code}
> A = load 'physicalplan.txt' as (col1,col2,col3);
> B = foreach A {
>    D = col1/col2;
>    E = col3/col2;
>    F = E - (D*D);
>    generate
>    F as newcol;
> };
> dump B;
> {code}
> This gives the following error:
> =======================================================================================================================================
> Caused by: org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogicalToPhysicalTranslatorException: ERROR 2015: Invalid physical operators in the physical plan
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:377)
>         at org.apache.pig.impl.logicalLayer.LOMultiply.visit(LOMultiply.java:63)
>         at org.apache.pig.impl.logicalLayer.LOMultiply.visit(LOMultiply.java:29)
>         at org.apache.pig.impl.plan.DependencyOrderWalkerWOSeenChk.walk(DependencyOrderWalkerWOSeenChk.java:68)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:908)
>         at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:122)
>         at org.apache.pig.impl.logicalLayer.LOForEach.visit(LOForEach.java:41)
>         at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
>         at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:246)
>         ... 10 more
> Caused by: org.apache.pig.impl.plan.PlanException: ERROR 0: Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.Divide multiple outputs.  This operator does not support multiple outputs.
>         at org.apache.pig.impl.plan.OperatorPlan.connect(OperatorPlan.java:158)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan.connect(PhysicalPlan.java:89)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:373)
>         ... 19 more
> =======================================================================================================================================
