[ 
https://issues.apache.org/jira/browse/PIG-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shravan Matthur Narayanamurthy updated PIG-422:
-----------------------------------------------

    Status: Patch Available  (was: Open)

This one broke because of the fix that made POUserFunc adhere to trunk 
behavior: we removed the Tuple-inside-a-Tuple check. The initial fix used a 
single constant expression that was a Tuple and relied on POUserFunc to remove 
the nesting before sending it to GFCross. 

So I split the list of objects inside the constant tuple into two constant 
expressions. However, that did not work because our plan structure is 
unordered. The two constants were accessed in arbitrary order, and GFCross does 
not work if we pass (1,2) instead of (2,1).

I think we need to be careful about this one. If a UDF is given constant 
arguments, as in UDF('2','1'), we create constant expressions and attach them 
to the UDF as inputs. However, I am not sure there is any guarantee that the 
two constant expressions will be pulled in the same order, since our plan 
doesn't support ordering.

I was able to fix this one because, luckily, the POUserFunc operator relies on 
its own inputs and not on the ones returned by getPredecessors() on the plan. I 
think most of the operators created earlier did that, since we did not have a 
handle to the plan the operator is part of. So I explicitly initialized the 
inputs of POUserFunc to the list of constant expressions, created in the right 
order, after connecting all the operators in the plan. I think we need to look 
through the code and see whether we can hit such problems elsewhere.
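
To make the ordering concern concrete, here is a minimal, self-contained 
sketch. It does not use Pig's real classes: Op, Plan, connect, buildUdf and 
the HashSet-backed edge storage are all hypothetical stand-ins, chosen only to 
show how an operator-held input list can preserve argument order when the plan 
itself makes no ordering promise.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins for illustration only; these are NOT Pig's classes.
class InputOrderSketch {

    /** Toy operator that keeps its own ordered list of inputs. */
    static class Op {
        final String name;
        List<Op> inputs = new ArrayList<>();

        Op(String name) { this.name = name; }

        @Override
        public String toString() { return name; }
    }

    /** Toy plan that records edges but promises nothing about their order. */
    static class Plan {
        private final Map<Op, Set<Op>> preds = new HashMap<>();

        void connect(Op from, Op to) {
            preds.computeIfAbsent(to, k -> new HashSet<>()).add(from);
        }

        /** Iteration order of the returned set is unspecified. */
        Set<Op> getPredecessors(Op op) {
            return preds.getOrDefault(op, new HashSet<>());
        }
    }

    /**
     * Connects each constant to the UDF in the plan, then pins the argument
     * order on the operator itself once all connections are made.
     */
    static Op buildUdf(Plan plan, String udfName, String... constants) {
        Op udf = new Op(udfName);
        List<Op> ordered = new ArrayList<>();
        for (String c : constants) {
            Op constant = new Op(c);
            plan.connect(constant, udf);
            ordered.add(constant);
        }
        udf.inputs = ordered; // explicit order, independent of the plan
        return udf;
    }

    public static void main(String[] args) {
        Plan plan = new Plan();
        Op udf = buildUdf(plan, "UDF", "2", "1");
        // The operator's own list keeps the order the arguments were given in.
        System.out.println("inputs: " + udf.inputs);
        // The plan's view may come back in either order.
        System.out.println("predecessors: " + plan.getPredecessors(udf));
    }
}
```

In this sketch, any operator that read its arguments from getPredecessors() 
instead of its own inputs list would be exposed to the same reordering, which 
is the kind of code worth auditing.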

> cross is broken
> ---------------
>
>                 Key: PIG-422
>                 URL: https://issues.apache.org/jira/browse/PIG-422
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Olga Natkovich
>            Assignee: Shravan Matthur Narayanamurthy
>             Fix For: types_branch
>
>         Attachments: 422.patch
>
>
> The following script fails:
> a = load 'data1' as (name, age, gpa);
> b = load 'data2' as (name, age, registration, contributions);
> c = filter a by age < 19 and gpa < 1.0;
> d = filter b by age < 19;
> e = cross c, d;
> store e into 'output';
> produces the following stack:
> 0808261638_3210_r_000000java.lang.ClassCastException: 
> org.apache.pig.data.DefaultDataBag cannot be cast to org.apache.pig.data.Tuple
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:264)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:220)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:231)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:220)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODistinct.getNext(PODistinct.java:76)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:270)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:351)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:158)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:123)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:175)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:241)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:217)
>         at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:156)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:206)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:176)
>         at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:87)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:391)
>         at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
