[ https://issues.apache.org/jira/browse/PIG-746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Richard Ding updated PIG-746:
-----------------------------

    Attachment: PIG-746.patch

As Pradeep suggested, the combiner optimizer code in this patch identifies projection of bags in the foreach following the group and in such cases decides not to use the combiner.

> Works in --exectype local, fails on grid - ERROR 2113: SingleTupleBag should never be serialized
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-746
>                 URL: https://issues.apache.org/jira/browse/PIG-746
>             Project: Pig
>          Issue Type: Bug
>            Reporter: David Ciemiewicz
>            Assignee: Richard Ding
>         Attachments: PIG-746.patch
>
>
> The script below works on Pig 2.0 local mode but fails when I run the same program on the grid.
> I was attempting to create a workaround for PIG-710.
> Here's the error:
> {code}
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2113: SingleTupleBag should never be serialized or serialized.
>         at org.apache.pig.data.SingleTupleBag.write(SingleTupleBag.java:129)
>         at org.apache.pig.data.DataReaderWriter.writeDatum(DataReaderWriter.java:147)
>         at org.apache.pig.data.DefaultTuple.write(DefaultTuple.java:291)
>         at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:83)
>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:439)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:101)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:219)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:208)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:86)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
> {code}
> Here's the program:
> {code}
> A = load 'filterbug.data' using PigStorage() as ( id, str );
> A = foreach A generate
>     id,
>     str,
>     (
>     str matches 'hello' or
>     str matches 'hello'
>     ? 1 : 0
>     ) as matched;
> describe A;
> B = group A by ( id );
> describe B;
> D = foreach B generate
>     group,
>     SUM(A.matched) as matchedcount,
>     A;
> describe D;
> E = filter D by matchedcount > 0;
> describe E;
> F = foreach E generate
>     FLATTEN(A);
> describe F;
> dump F;
> {code}
> Here's the data filterbug.data
> {code}
> a hello
> a goodbye
> b goodbye
> c hello
> c hello
> c hello
> e what
> {code}
>

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.