[ https://issues.apache.org/jira/browse/PIG-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683714#comment-13683714 ]
Jeremy Karn commented on PIG-3355:
----------------------------------
I should also mention that this bug manifests itself in a couple of different
ways. The job generally crashes at the point where the schema no longer
matches the data tuple. The most common exceptions we've seen look like
these:
java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
    at java.util.ArrayList.get(ArrayList.java:322)
    at org.apache.pig.data.DefaultTuple.get(DefaultTuple.java:159)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getValueTuple(POPackage.java:341)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNext(POPackage.java:264)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:416)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:407)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:261)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)

2013-06-13 15:28:14,188 java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableText, recieved org.apache.pig.impl.io.NullableBytesWritable
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:845)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map.collect(PigGenericMapReduce.java:127)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:273)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
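
For readers unfamiliar with the first failure mode: the IndexOutOfBoundsException
trace above bottoms out in ArrayList.get called from DefaultTuple.get, which is
what one would expect when a field position taken from the schema points past
the end of a shorter data tuple. The following is only a minimal sketch of that
mismatch, not Pig's code. It assumes a plain java.util.List as a stand-in for
the tuple's fields, and the class and method names (SchemaMismatchSketch,
readField, failsOnRead) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: a List stands in for a tuple's fields.
final class SchemaMismatchSketch {

    // Reads the field at the position the schema claims it lives at.
    static Object readField(List<?> tuple, int schemaIndex) {
        return tuple.get(schemaIndex);
    }

    // Returns true if reading that position throws IndexOutOfBoundsException.
    static boolean failsOnRead(List<?> tuple, int schemaIndex) {
        try {
            readField(tuple, schemaIndex);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Assumed scenario: the data tuple carries only two fields...
        List<Object> tuple = new ArrayList<>(Arrays.asList("a", 1));
        // ...while the schema still refers to a third field at index 2.
        System.out.println(failsOnRead(tuple, 2)); // prints "true"
        // A position that exists in the tuple reads without error.
        System.out.println(failsOnRead(tuple, 1)); // prints "false"
    }
}
```

The index 2 and size 2 are chosen to mirror the "Index: 2, Size: 2" message in
the trace; the exact exception text varies between Java versions.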
> ColumnMapKeyPrune bug with distinct operator
> --------------------------------------------
>
> Key: PIG-3355
> URL: https://issues.apache.org/jira/browse/PIG-3355
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.9.2, 0.10.1, 0.11.1
> Reporter: Jeremy Karn
> Attachments: PIG-3355.patch
>
>
> We came across a bug that happens when you have a distinct operator
> immediately followed by a union where the result of the union has at least
> one column that will be pruned by ColumnMapKeyPrune. There's a test showing
> an example script in the submitted patch.