[ 
https://issues.apache.org/jira/browse/PIG-746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-746:
-----------------------------

    Attachment: PIG-746.patch

As Pradeep suggested, the combiner optimizer code in this patch detects when 
the foreach following a group projects a bag and, in such cases, does not use 
the combiner. 
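
To illustrate the decision described above, here is a minimal sketch in Python. 
It is a toy model, not the code in PIG-746.patch: the names Expr and 
can_use_combiner and the three expression kinds are hypothetical, and the rule 
that the combiner otherwise needs at least one algebraic function next to 
group-key projections is a simplifying assumption.

```python
from dataclasses import dataclass

# Hypothetical expression kinds for a foreach that follows a group.
GROUP_KEY = "group_key"  # e.g. `group`
ALGEBRAIC = "algebraic"  # e.g. `SUM(A.matched)`
BAG = "bag"              # e.g. `A`, the whole grouped bag


@dataclass(frozen=True)
class Expr:
    kind: str
    text: str


def can_use_combiner(generate_exprs):
    """Toy check: refuse the combiner as soon as any expression projects a bag."""
    kinds = [e.kind for e in generate_exprs]
    if BAG in kinds:
        # A projected bag must reach the reducer intact, so skip the combiner.
        return False
    # Assumed simplification: everything else must be a group key or an
    # algebraic function, with at least one algebraic function present.
    return ALGEBRAIC in kinds and all(k in (GROUP_KEY, ALGEBRAIC) for k in kinds)


# The `D = foreach B generate group, SUM(A.matched), A;` statement from the script.
d = [Expr(GROUP_KEY, "group"), Expr(ALGEBRAIC, "SUM(A.matched)"), Expr(BAG, "A")]
# The same statement without the bare projection of A.
d_without_bag = d[:2]
```

Under this model, can_use_combiner(d) is False because D projects the bag A, 
while can_use_combiner(d_without_bag) is True.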

> Works in --exectype local, fails on grid - ERROR 2113: SingleTupleBag should 
> never be serialized
> ------------------------------------------------------------------------------------------------
>
>                 Key: PIG-746
>                 URL: https://issues.apache.org/jira/browse/PIG-746
>             Project: Pig
>          Issue Type: Bug
>            Reporter: David Ciemiewicz
>            Assignee: Richard Ding
>         Attachments: PIG-746.patch
>
>
> The script below works on Pig 2.0 local mode but fails when I run the same 
> program on the grid.
> I was attempting to create a workaround for PIG-710.
> Here's the error:
> {code}
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2113: SingleTupleBag should never be serialized or serialized.
>         at org.apache.pig.data.SingleTupleBag.write(SingleTupleBag.java:129)
>         at org.apache.pig.data.DataReaderWriter.writeDatum(DataReaderWriter.java:147)
>         at org.apache.pig.data.DefaultTuple.write(DefaultTuple.java:291)
>         at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:83)
>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
>         at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:439)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:101)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:219)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:208)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:86)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
> {code}
> Here's the program:
> {code}
> A = load 'filterbug.data' using PigStorage() as ( id, str );
> A = foreach A generate
>         id,
>         str,
>         (
>         str matches 'hello' or
>         str matches 'hello'
>         ? 1 : 0
>         )                       as matched;
> describe A;
> B = group A by ( id );
> describe B;
> D = foreach B generate
>         group,
>         SUM(A.matched)  as matchedcount,
>         A;
> describe D;
> E = filter D by matchedcount > 0;
> describe E;
> F = foreach E generate
>         FLATTEN(A);
> describe F;
> dump F;
> {code}
> Here's the data file, filterbug.data:
> {code}
> a       hello
> a       goodbye
> b       goodbye
> c       hello
> c       hello
> c       hello
> e       what
> {code}
>               

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
