Anthony Hsu created PIG-4392:
--------------------------------

             Summary: RANK BY fails when default_parallel is greater than cardinality of field being ranked by
                 Key: PIG-4392
                 URL: https://issues.apache.org/jira/browse/PIG-4392
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.11.1
            Reporter: Anthony Hsu


To reproduce:
{code:title=input.txt}
1 2 3
4 5 6
7 8 9
{code}
{code:title=rank.pig}
set default_parallel 4;

d = load 'input.txt' using PigStorage(' ') as (a:int, b:int, c:int);
e = rank d by a;
dump e;
{code}
With {{default_parallel}} set to {{4}}, the script fails. If it is set to {{3}}, 
the script succeeds. Field {{a}} has three distinct values in this input, so my 
guess is that RANK BY fails when {{default_parallel}} exceeds the cardinality of 
the field being ranked by.
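
To check this guess against other inputs, the cardinality of the ranked field 
can be counted first and compared with {{default_parallel}}. The following is 
only a sketch: the relation names ({{vals}}, {{uniq}}, {{grp}}, {{cnt}}) are 
made up for illustration and are not part of the failing script.
{code:title=cardinality.pig}
-- Load the same input as rank.pig.
d = load 'input.txt' using PigStorage(' ') as (a:int, b:int, c:int);

-- Keep only the field being ranked by and drop duplicate values.
vals = foreach d generate a;
uniq = distinct vals;

-- Count the distinct values; for the input.txt above this should print (3).
grp = group uniq all;
cnt = foreach grp generate COUNT(uniq);
dump cnt;
{code}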

I'm seeing this issue with Pig 0.11.1 (which has the PIG-2932 patch applied) 
and Hadoop 2.3.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
