[ https://issues.apache.org/jira/browse/PIG-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729882#action_12729882 ]
Pradeep Kamath commented on PIG-880:
------------------------------------

The root cause is that, when interpreting map data, PigStorage returns each map value as whatever type it deduces from the data: string values come back as String and integer values as Integer. The logical layer in Pig, however, treats map values as ByteArray, since it cannot assume any more specific type.

If one of the sampled values that form the quantile list is null, it is assumed to have the type of the reduce key of the final order by job. Here the order by key is smap#'name', so that type is taken to be ByteArray, while the values actually produced by the map lookup are of type String. This mismatch causes the ArrayStoreException reported in the issue. Even if nulls are filtered out, map.collect() fails, because Hadoop expects the map key type to be bytearray but receives a Text (string).

A proposed fix is to change TextDataParser, which PigStorage uses for reading map data, to return ByteArray for the values in the map.

Thoughts?
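To make the failure mode concrete, here is a minimal, self-contained Java sketch of the mismatch described in the comment. It is not Pig code: the names QuantileTypeMismatch, Key, BytesKey, TextKey, conversionFails and asBytes are hypothetical stand-ins invented for illustration. It relies only on a documented property of ArrayList.toArray(T[]), the call at the top of the reported stack: it throws ArrayStoreException when an element's runtime type is not assignable to the component type of the target array. The last case sketches the idea behind the proposed fix, assuming every value is handed out uniformly as bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class QuantileTypeMismatch {
    /** Hypothetical stand-in for a sort-key wrapper; not a Pig class. */
    interface Key {}

    /** Hypothetical bytearray key: the type the plan assumes for map values. */
    static final class BytesKey implements Key {
        final byte[] value;
        BytesKey(byte[] value) { this.value = value; }
    }

    /** Hypothetical string key: the type the loader actually deduced. */
    static final class TextKey implements Key {
        final String value;
        TextKey(String value) { this.value = value; }
    }

    /**
     * Copies the sampled keys into an array typed after the declared key
     * type. Returns true if the copy fails with ArrayStoreException.
     */
    static boolean conversionFails(List<Key> samples, Key[] typedTarget) {
        try {
            samples.toArray(typedTarget);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    /** Sketch of the proposal: always hand out map values as bytes. */
    static Key asBytes(String raw) {
        return new BytesKey(raw.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        List<Key> deduced = new ArrayList<>();
        deduced.add(new TextKey("alice"));
        deduced.add(new TextKey("bob"));

        // Declared type is bytearray, elements are strings: the copy fails.
        System.out.println(conversionFails(deduced, new BytesKey[0])); // true
        // Array type matches the runtime element type: the copy succeeds.
        System.out.println(conversionFails(deduced, new TextKey[0])); // false

        List<Key> uniform = new ArrayList<>();
        uniform.add(asBytes("alice"));
        uniform.add(asBytes("bob"));
        // Values returned uniformly as bytes match the declared type.
        System.out.println(conversionFails(uniform, new BytesKey[0])); // false
    }
}
```

A null element would not trigger the exception by itself, since null can be stored in an array of any reference type; the failure comes from non-null elements whose runtime type disagrees with the declared one.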
> Order by is borken with complex fields
> --------------------------------------
>
>                 Key: PIG-880
>                 URL: https://issues.apache.org/jira/browse/PIG-880
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.3.0
>            Reporter: Olga Natkovich
>             Fix For: 0.4.0
>
>
> Pig script:
> a = load 'studentcomplextab10k' as (smap:map[],c2,c3);
> f = foreach a generate smap#'name', smap#'age', smap#'gpa' ;
> s = order f by $0;
> store s into 'sc.out'
> Stack:
> Caused by: java.lang.ArrayStoreException
>         at java.lang.System.arraycopy(Native Method)
>         at java.util.Arrays.copyOf(Arrays.java:2763)
>         at java.util.ArrayList.toArray(ArrayList.java:305)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.convertToArray(WeightedRangePartitioner.java:154)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.configure(WeightedRangePartitioner.java:96)
>         ... 5 more
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:230)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:179)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:204)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)
>         at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:769)
>         at org.apache.pig.PigServer.execute(PigServer.java:762)
>         at org.apache.pig.PigServer.access$100(PigServer.java:91)
>         at org.apache.pig.PigServer$Graph.execute(PigServer.java:933)
>         at org.apache.pig.PigServer.executeBatch(PigServer.java:245)
>         at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:112)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:140)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>         at org.apache.pig.Main.main(Main.java:389)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.