[jira] Updated: (PIG-454) group by followed by group ALL causes error in reduce

Alan Gates (JIRA) Wed, 24 Sep 2008 17:20:06 -0700

     [ https://issues.apache.org/jira/browse/PIG-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Alan Gates updated PIG-454:
---------------------------

    Attachment: PIG-454.patch

CombinerOptimizer is a visitor that walks the entire plan of MapReduceOpers.
It was not resetting its state between operators, so when more than one
operator in the plan could use the combiner, it got confused about which key
to set in the combiner.
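
To illustrate the kind of bug described above, here is a minimal sketch of a
visitor that carries state from one operator to the next. It is not the code
from the patch: the class names (VisitorStateSketch, Job, CombinerVisitor),
the resetPerJob flag, and the key labels "K1" and "K2" are all hypothetical
and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Pig's classes.
public class VisitorStateSketch {

    /** One operator in the plan, with the key type its combiner should use. */
    static final class Job {
        final String name;
        final String keyType;
        final boolean combinable;
        String combinerKeyType; // filled in by the visitor; null if unset

        Job(String name, String keyType, boolean combinable) {
            this.name = name;
            this.keyType = keyType;
            this.combinable = combinable;
        }
    }

    /** Visitor that caches the key type it saw while visiting an operator. */
    static final class CombinerVisitor {
        private final boolean resetPerJob;
        private String seenKeyType; // scratch state meant to be per operator

        CombinerVisitor(boolean resetPerJob) {
            this.resetPerJob = resetPerJob;
        }

        void visit(Job job) {
            if (resetPerJob) {
                seenKeyType = null; // start every operator with clean state
            }
            if (!job.combinable) {
                return;
            }
            if (seenKeyType == null) {
                seenKeyType = job.keyType;
            }
            // Without the reset, the second operator inherits the first key.
            job.combinerKeyType = seenKeyType;
        }
    }

    /** Walks a two-operator plan and returns the combiner key chosen for each. */
    static List<String> run(boolean resetPerJob) {
        List<Job> plan = Arrays.asList(
                new Job("first group", "K1", true),
                new Job("second group", "K2", true));
        CombinerVisitor visitor = new CombinerVisitor(resetPerJob);
        List<String> chosen = new ArrayList<>();
        for (Job job : plan) {
            visitor.visit(job);
            chosen.add(job.combinerKeyType);
        }
        return chosen;
    }

    public static void main(String[] args) {
        System.out.println("without reset: " + run(false)); // [K1, K1]
        System.out.println("with reset:    " + run(true));  // [K1, K2]
    }
}
```

In the sketch, the second operator is given the first operator's key unless
the visitor clears its state at the start of each visit.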

> group by followed by group ALL causes error in reduce
> -----------------------------------------------------
>
>                 Key: PIG-454
>                 URL: https://issues.apache.org/jira/browse/PIG-454
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Pradeep Kamath
>            Assignee: Alan Gates
>             Fix For: types_branch
>
>         Attachments: PIG-454.patch
>
>
> Script:
> {code}
> a = load 'st10k' as (name, age, gpa);
> b = group a by name;
> c = foreach b generate flatten(group), COUNT(a) as cnt;
> d = group c all;
> e = foreach d generate AVG(c.cnt);
> dump e;
> {code}
> Error:
> {noformat}
> 2008-09-23 17:58:12,002 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Job failed!
> 2008-09-23 17:58:12,004 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) tip_200809051428_0117_m_000000 java.io.IOException: wrong key class: org.apache.pig.impl.io.NullableTuple is not class org.apache.pig.impl.io.NullableText
>         at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:995)
>         at org.apache.hadoop.mapred.MapTask$CombineOutputCollector.collect(MapTask.java:1079)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:155)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine.reduce(PigCombiner.java:56)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.combineAndSpill(MapTask.java:872)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:779)
>         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:691)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:220)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
> ...
> {noformat}

