[ 
https://issues.apache.org/jira/browse/PIG-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-976:
-----------------------------

    Attachment: PIG-976.patch

The cause of this bug is the wrong assumption made by the multi-query 
optimizer. Namely, it assumed that the tuples emitted from a package object 
always had the key ('group') at the tuple's first field. When the key ('group') 
wasn't part of the output of the following foreach clause (as in the script 
above), the first field of the tuple (from a package object) was actually a bag 
(value), not the key.

This patch fixed this problem.

> Multi-query optimization throws ClassCastException
> --------------------------------------------------
>
>                 Key: PIG-976
>                 URL: https://issues.apache.org/jira/browse/PIG-976
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.4.0
>            Reporter: Ankur
>            Assignee: Richard Ding
>         Attachments: PIG-976.patch
>
>
> Multi-query optimization fails to merge 2 branches when 1 is a result of 
> Group By ALL and another is a result of Group By field1 where field 1 is of 
> type long. Here is the script that fails with multi-query on.
> data = LOAD 'test' USING PigStorage('\t') AS (a:long, b:double, c:double); 
> A = GROUP data ALL;
> B = FOREACH A GENERATE SUM(data.b) AS sum1, SUM(data.c) AS sum2;
> C = FOREACH B GENERATE (sum1/sum2) AS rate; 
> STORE C INTO 'result1';
> D = GROUP data BY a; 
> E = FOREACH D GENERATE group AS a, SUM(data.b), SUM(data.c);
> STORE E into 'result2';
>  
> Here is the exception from the logs
> java.lang.ClassCastException: org.apache.pig.data.DefaultTuple cannot be cast 
> to org.apache.pig.data.DataBag
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:399)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:180)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:145)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:197)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:235)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:240)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.runPipeline(PODemux.java:264)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.getNext(PODemux.java:254)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:196)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:174)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:63)
>       at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:906)
>       at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:786)
>       at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:698)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:228)
>       at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2206)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to