[ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914128#action_12914128 ]
Yan Zhou commented on PIG-1645:
-------------------------------
The problem is that both RandomSampleLoader and PoissonSampleLoader keep
internal state from previous invocations. When split combination (PIG-1518) is
on, that state should be reset each time a different underlying split is
worked on under the same umbrella split.
When temporary file compression is disabled, Pig's internal storage creates
empty files, which the split combiner discards. That leaves the one non-empty
split as the only split to be worked on, so the problem does not show up in
that case.
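
To illustrate the failure mode described above, here is a minimal,
self-contained sketch. It does not use Pig's classes: FirstNSampler,
CombinedSplitDemo, and the resetOnNewSplit flag are hypothetical names made up
for this example, and the method name prepareToRead is only borrowed to
suggest the point at which a loader is handed a new underlying split. The
actual fix in Pig may look different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a sampling loader; NOT Pig's RandomSampleLoader.
// It returns at most samplesPerSplit records from each underlying split.
class FirstNSampler {
    private final int samplesPerSplit;
    private final boolean resetOnNewSplit; // assumption: toggles the fix on/off
    private Iterator<String> reader;
    private int taken; // internal state that can go stale across splits

    FirstNSampler(int samplesPerSplit, boolean resetOnNewSplit) {
        this.samplesPerSplit = samplesPerSplit;
        this.resetOnNewSplit = resetOnNewSplit;
    }

    // Called once for every underlying split inside a combined split.
    void prepareToRead(List<String> split) {
        reader = split.iterator();
        if (resetOnNewSplit) {
            taken = 0; // clear state left over from the previous split
        }
    }

    // Returns the next sample, or null when this split yields no more samples.
    String getNext() {
        if (taken >= samplesPerSplit || !reader.hasNext()) {
            return null;
        }
        taken++;
        return reader.next();
    }
}

class CombinedSplitDemo {
    // Walks every underlying split of one combined ("umbrella") split.
    static List<String> sampleAll(FirstNSampler sampler, List<List<String>> combinedSplit) {
        List<String> out = new ArrayList<>();
        for (List<String> split : combinedSplit) {
            sampler.prepareToRead(split);
            for (String s = sampler.getNext(); s != null; s = sampler.getNext()) {
                out.add(s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> combined = Arrays.asList(
                Arrays.asList("a1", "a2", "a3"),
                Arrays.asList("b1", "b2", "b3"));
        // Stale counter: the second underlying split contributes nothing.
        System.out.println(sampleAll(new FirstNSampler(2, false), combined)); // [a1, a2]
        // Counter reset per split: both underlying splits are sampled.
        System.out.println(sampleAll(new FirstNSampler(2, true), combined)); // [a1, a2, b1, b2]
    }
}
```

With a single non-empty split, as in the uncompressed case, both variants
behave the same, which fits the observation that the problem stays hidden
there.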
> Using both small split combination and temporary file compression on a query
> of ORDER BY may cause crash
> --------------------------------------------------------------------------------------------------------
>
> Key: PIG-1645
> URL: https://issues.apache.org/jira/browse/PIG-1645
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Yan Zhou
> Assignee: Yan Zhou
> Fix For: 0.8.0
>
>
> The stack looks like the following:
> java.lang.NullPointerException
>     at java.util.Arrays.binarySearch(Arrays.java:2043)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:72)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:52)
>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:565)
>     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:638)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:314)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
>     at org.apache.hadoop.mapred.Child.main(Child.java:211)