[ https://issues.apache.org/jira/browse/PIG-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12763645#action_12763645 ]
Pradeep Kamath commented on PIG-976:
------------------------------------

Reviewed the new patch. One comment, on POMultiQueryPackage:

{code}
Object obj = tuple.get(0);
if (obj instanceof PigNullableWritable) {
    ((PigNullableWritable)obj).setIndex(origIndex);
}
else {
    PigNullableWritable myObj = HDataType.getWritableComparableTypes(obj, (byte)0);
    myObj.setIndex(origIndex);
    tuple.set(0, myObj);
}
{code}

If obj is null, the else branch above would throw an exception. I think the code should check for obj == null and, if so, create a NullWritable object, where NullWritable is a subclass of PigNullableWritable representing a null. Since only the getValueAsPigType() method is used in PODemux, that would always return null for this use case.

> Multi-query optimization throws ClassCastException
> --------------------------------------------------
>
>                 Key: PIG-976
>                 URL: https://issues.apache.org/jira/browse/PIG-976
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.4.0
>            Reporter: Ankur
>            Assignee: Richard Ding
>         Attachments: PIG-976.patch, PIG-976.patch
>
>
> Multi-query optimization fails to merge two branches when one is the result of
> GROUP BY ALL and the other is the result of GROUP BY field1, where field1 is of
> type long. Here is the script that fails with multi-query on.
> data = LOAD 'test' USING PigStorage('\t') AS (a:long, b:double, c:double);
> A = GROUP data ALL;
> B = FOREACH A GENERATE SUM(data.b) AS sum1, SUM(data.c) AS sum2;
> C = FOREACH B GENERATE (sum1/sum2) AS rate;
> STORE C INTO 'result1';
> D = GROUP data BY a;
> E = FOREACH D GENERATE group AS a, SUM(data.b), SUM(data.c);
> STORE E into 'result2';
>
> Here is the exception from the logs
> java.lang.ClassCastException: org.apache.pig.data.DefaultTuple cannot be cast to org.apache.pig.data.DataBag
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:399)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:180)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:145)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:197)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:235)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:240)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.runPipeline(PODemux.java:264)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.PODemux.getNext(PODemux.java:254)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.processOnePackageOutput(PigCombiner.java:196)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:174)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:63)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:906)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:786)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:698)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:228)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2206)
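
The null check suggested in the review comment on POMultiQueryPackage could look roughly like the sketch below. It is not the actual patch and does not use the Pig classes: KeyWrapper, NullKeyWrapper, wrap() and restoreIndex() are hypothetical stand-ins for PigNullableWritable, the proposed null subclass, HDataType.getWritableComparableTypes() and the quoted snippet. That wrap() rejects null is an assumption made only to mirror the failure the comment describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NullKeySketch {

    /** Hypothetical stand-in for PigNullableWritable: a key plus a branch index. */
    static class KeyWrapper {
        private final Object value;
        private byte index;

        KeyWrapper(Object value) { this.value = value; }

        void setIndex(byte index) { this.index = index; }
        byte getIndex() { return index; }
        Object getValueAsPigType() { return value; }
    }

    /** Hypothetical stand-in for the proposed subclass that represents a null key. */
    static class NullKeyWrapper extends KeyWrapper {
        NullKeyWrapper() { super(null); }
    }

    /** Stand-in for the type-based wrapping call; assumed here to reject null. */
    static KeyWrapper wrap(Object obj) {
        if (obj == null) {
            throw new IllegalArgumentException("cannot infer a key type from null");
        }
        return new KeyWrapper(obj);
    }

    /** Mirrors the quoted snippet, with the extra obj == null branch added. */
    static void restoreIndex(List<Object> tuple, byte origIndex) {
        Object obj = tuple.get(0);
        if (obj instanceof KeyWrapper) {
            // Already wrapped: only the index needs to be restored.
            ((KeyWrapper) obj).setIndex(origIndex);
        } else {
            // Wrap the raw key, using the dedicated null wrapper for a null key.
            KeyWrapper myObj = (obj == null) ? new NullKeyWrapper() : wrap(obj);
            myObj.setIndex(origIndex);
            tuple.set(0, myObj);
        }
    }

    public static void main(String[] args) {
        List<Object> tuple = new ArrayList<>(Arrays.asList((Object) null, "payload"));
        restoreIndex(tuple, (byte) 1);
        KeyWrapper key = (KeyWrapper) tuple.get(0);
        // The wrapped null key still reports null as its value.
        System.out.println(key.getValueAsPigType() + " " + key.getIndex());
    }
}
```

In this sketch a consumer that only calls getValueAsPigType() sees null for the null key, which matches the behaviour the comment expects from PODemux.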