[ https://issues.apache.org/jira/browse/PIG-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14302792#comment-14302792 ]
Anthony Hsu commented on PIG-4392:
----------------------------------

Patch looks good to me. Just some minor comments and questions:
* Add a space after the semicolons in the for loop declaration: {{for (int i=0;i<job.getJob().getNumReduceTasks();i++) {}}
* Is the order of the tuples in {{iter}} in the test case guaranteed?
* Why does the order of the tuples get reversed?

> RANK BY fails when default_parallel is greater than cardinality of field
> being ranked by
> ----------------------------------------------------------------------------------------
>
>                 Key: PIG-4392
>                 URL: https://issues.apache.org/jira/browse/PIG-4392
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11.1
>            Reporter: Anthony Hsu
>            Assignee: Daniel Dai
>             Fix For: 0.15.0
>
>         Attachments: PIG-4392-1.patch
>
>
> To reproduce:
> {code:title=input.txt}
> 1 2 3
> 4 5 6
> 7 8 9
> {code}
> {code:title=rank.pig}
> set default_parallel 4;
> d = load 'input.txt' using PigStorage(' ') as (a:int, b:int, c:int);
> e = rank d by a;
> dump e;
> {code}
> If {{default_parallel}} is set to {{3}}, the script succeeds. So I'm guessing
> RANK BY has issues if the {{default_parallel}} exceeds the cardinality of the
> field being ranked by.
> I'm seeing this issue with Pig 0.11.1 (which has the PIG-2932 patch applied)
> and Hadoop 2.3.0.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
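
For reference, the spacing suggestion in the review comment would look roughly like the sketch below. This is only an illustration of the formatting, not the code from the patch: the class {{LoopSpacing}} and the helper {{getNumReduceTasks()}} are hypothetical stand-ins for {{job.getJob().getNumReduceTasks()}}, and the value 4 is an assumption chosen to mirror {{default_parallel 4}} in the reproduction script.

```java
public class LoopSpacing {
    // Hypothetical stand-in for job.getJob().getNumReduceTasks() in the patch.
    // The value 4 is assumed, mirroring "set default_parallel 4" above.
    static int getNumReduceTasks() {
        return 4;
    }

    // Counts loop iterations, using the suggested spacing:
    // a space after each semicolon (and around the operators).
    static int countIterations() {
        int count = 0;
        for (int i = 0; i < getNumReduceTasks(); i++) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countIterations());
    }
}
```

The loop body here is a placeholder; only the {{for}} header reflects the requested change.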