[jira] Updated: (PIG-1645) Using both small split combination and temporary file compression on a query of ORDER BY may cause crash

Yan Zhou (JIRA) Fri, 24 Sep 2010 09:09:55 -0700

     [ https://issues.apache.org/jira/browse/PIG-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Yan Zhou updated PIG-1645:
--------------------------

    Attachment: PIG-1645.patch

test-core passed.

test-patch results:

     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     -1 release audit.  The applied patch generated 459 release audit warnings (more than the trunk's current 457 warnings).

The scenario is truly a corner case. The following query *might* have caused 
the problem:

A = load '/tmp/test/jsTst2.txt' as (fn, age:int);
B = load '/tmp/test/sample.txt' as (fn, age:int);
C = join A by fn, B by fn USING 'replicated';
D = ORDER C BY B::age;
dump D;

where sample.txt has only one row, whose join key matches a single record in 
jsTst2.txt, and jsTst2.txt should be several HDFS blocks in size. Even so, the 
failure is intermittent: it depends on whether any of the logically empty 
files is placed in the first underlying split of the list of combined splits. 
Compute nodes' host names seem to play a role too. Running in local mode does 
not seem to trigger the failure.

The 2 additional release audit warnings are due to jdiff. The patch adds no new files.

> Using both small split combination and temporary file compression on a query 
> of ORDER BY may cause crash
> --------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1645
>                 URL: https://issues.apache.org/jira/browse/PIG-1645
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Yan Zhou
>            Assignee: Yan Zhou
>             Fix For: 0.8.0
>
>         Attachments: PIG-1645.patch
>
>
> The stack looks like the following:
> java.lang.NullPointerException
>     at java.util.Arrays.binarySearch(Arrays.java:2043)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:72)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:52)
>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:565)
>     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:116)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:638)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:314)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
>     at org.apache.hadoop.mapred.Child.main(Child.java:211)
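
For illustration only: the top frame of the quoted trace is Arrays.binarySearch itself, and java.util.Arrays.binarySearch does throw a NullPointerException when it is handed a null array. This message does not confirm that this is what happened here. The sketch below reproduces that generic failure and shows one possible guard. The class and method names are hypothetical, this is not Pig's WeightedRangePartitioner code, and the fallback to partition 0 is an assumption, not necessarily what PIG-1645.patch does.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of Pig.
class BinarySearchNullDemo {

    // Stand-in for a range lookup: find the partition index for `key`
    // among sorted boundary values. Throws NullPointerException when
    // `boundaries` is null, because binarySearch dereferences the array.
    public static int unguardedPartition(Integer[] boundaries, Integer key) {
        int idx = Arrays.binarySearch(boundaries, key);
        // A negative result encodes -(insertionPoint) - 1.
        return idx >= 0 ? idx : -(idx + 1);
    }

    // Guarded variant (assumption): with no boundaries available,
    // send every key to partition 0 instead of failing.
    public static int guardedPartition(Integer[] boundaries, Integer key) {
        if (boundaries == null || boundaries.length == 0) {
            return 0;
        }
        return unguardedPartition(boundaries, key);
    }

    // Reports whether the unguarded lookup fails with an NPE.
    public static boolean throwsNpe(Integer[] boundaries, Integer key) {
        try {
            unguardedPartition(boundaries, key);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Integer[] boundaries = {10, 20, 30};
        System.out.println(guardedPartition(boundaries, 25)); // prints 2
        System.out.println(guardedPartition(null, 25));       // prints 0
        System.out.println(throwsNpe(null, 25));              // prints true
    }
}
```

Whether a silent fallback is appropriate depends on why the boundaries are missing; failing fast with a descriptive error may be the better choice if a missing array indicates corrupt or unreadable input.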

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
