[jira] Commented: (PIG-1108) Incorrect map output key type in MultiQuery optimization

Ankur (JIRA) Wed, 25 Nov 2009 23:41:05 -0800

    [ 
https://issues.apache.org/jira/browse/PIG-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782787#action_12782787
 ]


Ankur commented on PIG-1108:
----------------------------

In my test run on 0.6.0 branch, disabling MQ did not work. Pig client logs 
showed that MQ was still kicking in and the mappers failed with the same error 
message as in description. It will be good if we can add few points about 
"SecondaryKey" here - 
http://wiki.apache.org/pig/PigMultiQueryPerformanceSpecification

> Incorrect map output key type in MultiQuery optimization
> --------------------------------------------------------
>
>                 Key: PIG-1108
>                 URL: https://issues.apache.org/jira/browse/PIG-1108
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Ankur
>            Assignee: Richard Ding
>
> When trying to merge 2 split plans, one of which never progresses along an 
> M/R boundary, PIG sets the map-output key type incorrectly resulting in the 
> following error:-
> java.io.IOException: Type mismatch in key from map: expected 
> org.apache.pig.impl.io.NullableText, recieved 
> org.apache.pig.impl.io.NullableTuple
>       at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:807)
>       at 
> org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:249)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:238)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:93)
>       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>       at org.apache.hadoop.mapred.Child.main(Child.java:159)
> Here is a small script to be used a reproducible test case
> rmf plan1
> rmf plan2
> A = LOAD 'data' USING PigStorage() as (a: int, b: chararray);
> SPLIT A into plan1 IF (a>5), plan2 IF (a<5);
> B = GROUP plan1 BY b;
> C = FOREACH B {
>               tmp = ORDER plan1 BY a desc;
>               GENERATE FLATTEN(group) as b, tmp;
>               };
> D = FILTER C BY b is not null;
> STORE D into 'plan1';
> STORE plan2 into 'plan2';

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (PIG-1108) Incorrect map output key type in MultiQuery optimization

Reply via email to