[ https://issues.apache.org/jira/browse/PIG-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729882#action_12729882 ]
Pradeep Kamath commented on PIG-880:
------------------------------------
The root cause of this issue is that, when interpreting map data, PigStorage
returns each map value as the type it deduces from the data: string data is
returned as String and integer data as Integer. The logical layer in Pig,
however, assumes the map values to be of type ByteArray, since it cannot assume
any other type. If one of the sampled values forming the quantile list is null,
it is assumed to have the type of the reduce key of the final order by job. In
this case the order by key is smap#'name', so it is taken to be ByteArray, while
the values resulting from the map lookup are actually of type String. This
mismatch results in the above exception. If nulls are filtered out,
map.collect() fails instead, because Hadoop thinks the map key type is bytearray
but it gets a Text (string).

A proposal to fix this is to change TextDataParser, which is used by PigStorage
for reading map data, to return ByteArray type for the values in the map.
Thoughts?
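
To make the mismatch concrete outside of Pig, here is a minimal, self-contained
sketch. It assumes nothing about Pig's real classes: ByteArrayValue is a
hypothetical stand-in for Pig's byte-array type, and copyFails is a hypothetical
helper that only mimics copying sampled keys into an array of the type the plan
expects. It goes through the same ArrayList.toArray call that appears in the
stack trace.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Pig's byte-array type; NOT the real Pig class.
final class ByteArrayValue {
    final byte[] bytes;

    ByteArrayValue(String s) {
        this.bytes = s.getBytes(StandardCharsets.UTF_8);
    }
}

class QuantileTypeMismatchSketch {
    // Mimics copying the sampled quantile keys into an array whose element
    // type is what the plan expects (byte arrays). Returns true if the copy
    // is rejected with an ArrayStoreException.
    static boolean copyFails(List<Object> sampledKeys) {
        try {
            sampledKeys.toArray(new ByteArrayValue[0]);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Values as the loader deduces them today: plain Strings.
        List<Object> deducedTypes = new ArrayList<>();
        deducedTypes.add("alice");
        deducedTypes.add("bob");

        // Values as the proposal would return them: wrapped byte arrays.
        List<Object> byteArrayTypes = new ArrayList<>();
        byteArrayTypes.add(new ByteArrayValue("alice"));
        byteArrayTypes.add(new ByteArrayValue("bob"));

        System.out.println("String values fail: " + copyFails(deducedTypes));
        System.out.println("ByteArray values fail: " + copyFails(byteArrayTypes));
    }
}
```

With String values the copy is rejected; once the values are wrapped in the
byte-array stand-in, the same copy succeeds. This only illustrates the type
mismatch and is not the actual TextDataParser change.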
> Order by is broken with complex fields
> --------------------------------------
>
> Key: PIG-880
> URL: https://issues.apache.org/jira/browse/PIG-880
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.3.0
> Reporter: Olga Natkovich
> Fix For: 0.4.0
>
>
> Pig script:
> a = load 'studentcomplextab10k' as (smap:map[],c2,c3);
> f = foreach a generate smap#'name', smap#'age', smap#'gpa' ;
> s = order f by $0;
> store s into 'sc.out';
> Stack:
> Caused by: java.lang.ArrayStoreException
> at java.lang.System.arraycopy(Native Method)
> at java.util.Arrays.copyOf(Arrays.java:2763)
> at java.util.ArrayList.toArray(ArrayList.java:305)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.convertToArray(WeightedRangePartitioner.java:154)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.configure(WeightedRangePartitioner.java:96)
> ... 5 more
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:230)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
> at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:769)
> at org.apache.pig.PigServer.execute(PigServer.java:762)
> at org.apache.pig.PigServer.access$100(PigServer.java:91)
> at org.apache.pig.PigServer$Graph.execute(PigServer.java:933)
> at org.apache.pig.PigServer.executeBatch(PigServer.java:245)
> at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
> at org.apache.pig.Main.main(Main.java:389)