[jira] Commented: (PIG-976) Multi-query optimization throws ClassCastException

Pradeep Kamath (JIRA) Thu, 08 Oct 2009 12:11:57 -0700

    [ 
https://issues.apache.org/jira/browse/PIG-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763645#action_12763645
 ]


Pradeep Kamath commented on PIG-976:
------------------------------------

Reviewed the new patch - one comment is on POMultiQueryPackage:
{code}
203         Object obj = tuple.get(0);                                          
                                                                                
                                                                         
  204         if (obj instanceof PigNullableWritable) {                         
                                                                                
                                                                           
  205             ((PigNullableWritable)obj).setIndex(origIndex);               
                                                                                
                                                                           
  206         }                                                                 
                                                                                
                                                                           
  207         else {                                                            
                                                                                
                                                                           
  208             PigNullableWritable myObj = 
HDataType.getWritableComparableTypes(obj, (byte)0);                             
                                                                                
                             
  209             myObj.setIndex(origIndex);                                    
                                                                                
                                                                           
  210             tuple.set(0, myObj);                                          
                                                                                
                                                                           
  211         }                             
{code}

If obj is null then the above code in the else would give an exception - I 
think the code should check for obj == null and if so create a NullWritable 
object where NullWritable is a subclass of PigNullableWritable representing a 
null. Since only the getValueAsPigType() method is used in PODemux, that would 
always return null for this use case.

> Multi-query optimization throws ClassCastException
> --------------------------------------------------
>
>                 Key: PIG-976
>                 URL: https://issues.apache.org/jira/browse/PIG-976
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.4.0
>            Reporter: Ankur
>            Assignee: Richard Ding
>         Attachments: PIG-976.patch, PIG-976.patch
>
>
> Multi-query optimization fails to merge 2 branches when 1 is a result of 
> Group By ALL and another is a result of Group By field1 where field 1 is of 
> type long. Here is the script that fails with multi-query on.
> data = LOAD 'test' USING PigStorage('\t') AS (a:long, b:double, c:double); 
> A = GROUP data ALL;
> B = FOREACH A GENERATE SUM(data.b) AS sum1, SUM(data.c) AS sum2;
> C = FOREACH B GENERATE (sum1/sum2) AS rate; 
> STORE C INTO 'result1';
> D = GROUP data BY a; 
> E = FOREACH D GENERATE group AS a, SUM(data.b), SUM(data.c);
> STORE E into 'result2';
>  
> Here is the exception from the logs
> java.lang.ClassCastException: org.apache.pig.data.DefaultTuple cannot be cast 
> to org.apache.pig.data.DataBag
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:399)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:180)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:145)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:197)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:235)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:240)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.runPipeline(PODemux.java:264)
>       at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.getNext(PODemux.java:254)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:196)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:174)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:63)
>       at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:906)
>       at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:786)
>       at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:698)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:228)
>       at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2206)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (PIG-976) Multi-query optimization throws ClassCastException

Reply via email to