[jira] Commented: (PIG-1645) Using both small split combination and temporary file compression on a query of ORDER BY may cause crash
[ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914556#action_12914556 ]

Thejas M Nair commented on PIG-1645:

+1

> Using both small split combination and temporary file compression on a query of ORDER BY may cause crash
>
>                 Key: PIG-1645
>                 URL: https://issues.apache.org/jira/browse/PIG-1645
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Yan Zhou
>            Assignee: Yan Zhou
>             Fix For: 0.8.0
>
>         Attachments: PIG-1645.patch
>
>
> The stack looks like the following:
> java.lang.NullPointerException
>   at java.util.Arrays.binarySearch(Arrays.java:2043)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:72)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:52)
>   at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:565)
>   at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:638)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:314)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
>   at org.apache.hadoop.mapred.Child.main(Child.java:211)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (PIG-1645) Using both small split combination and temporary file compression on a query of ORDER BY may cause crash
[ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914541#action_12914541 ]

Yan Zhou commented on PIG-1645:
-------------------------------

The possibility of failure also depends on the block distribution, since split combination makes use of that information.

[jira] Commented: (PIG-1645) Using both small split combination and temporary file compression on a query of ORDER BY may cause crash
[ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914128#action_12914128 ]

Yan Zhou commented on PIG-1645:
-------------------------------

The problem is that both RandomSampleLoader and PoissonSampleLoader keep internal state from previous invocations. When split combination (PIG-1518) is on, that state should be reset each time a different underlying split is worked on under the same umbrella split.

When temporary file compression is disabled, Pig internal storage creates empty files, which the split combiner discards. That leaves the single non-empty split as the only split to be worked on, so the problem does not show up in that case.
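
To illustrate the kind of stale-state problem described in the comment above, here is a minimal Java sketch. It is not Pig's code: `SketchSampleLoader`, `prepareToRead`, `offer`, and `sampleCombined` are hypothetical names, and the "take the first N rows" sampling rule is an assumption chosen only to keep the example short. It shows a loader whose per-split counter must be reset when the reader moves to the next underlying split of a combined split.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a sampling loader; NOT Pig's RandomSampleLoader
// or PoissonSampleLoader. It keeps the first N rows of each split as samples.
class SketchSampleLoader {
    private final int samplesPerSplit;
    private int taken = 0; // internal state carried between offer() calls

    SketchSampleLoader(int samplesPerSplit) {
        this.samplesPerSplit = samplesPerSplit;
    }

    // Called each time the reader moves to a new underlying split.
    // With resetState == false the counter from the previous split leaks through.
    void prepareToRead(boolean resetState) {
        if (resetState) {
            taken = 0;
        }
    }

    // Returns true if the row is kept as a sample.
    boolean offer(String row) {
        if (taken < samplesPerSplit) {
            taken++;
            return true;
        }
        return false;
    }

    // Simulates one combined ("umbrella") split made of several underlying
    // splits, all read through the same loader instance.
    static List<String> sampleCombined(List<List<String>> splits,
                                       int perSplit,
                                       boolean resetState) {
        SketchSampleLoader loader = new SketchSampleLoader(perSplit);
        List<String> samples = new ArrayList<>();
        for (List<String> split : splits) {
            loader.prepareToRead(resetState);
            for (String row : split) {
                if (loader.offer(row)) {
                    samples.add(row);
                }
            }
        }
        return samples;
    }
}
```

In this sketch, when the state is not reset the second underlying split contributes no samples, because the loader still believes its quota is filled. How the real loaders' state leads to the NullPointerException in the stack trace is not spelled out in the thread, so the sketch makes no claim about that.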