[ https://issues.apache.org/jira/browse/PIG-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306055#comment-14306055 ]
Rohini Palaniswamy commented on PIG-4392:
-----------------------------------------

+1. This is still a good fix to have on the client side in addition to PIG-4220.

> RANK BY fails when default_parallel is greater than cardinality of field being ranked by
> ----------------------------------------------------------------------------------------
>
>                 Key: PIG-4392
>                 URL: https://issues.apache.org/jira/browse/PIG-4392
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11.1
>            Reporter: Anthony Hsu
>            Assignee: Daniel Dai
>             Fix For: 0.15.0
>
>         Attachments: PIG-4392-1.patch, PIG-4392-2.patch
>
>
> To reproduce:
> {code:title=input.txt}
> 1 2 3
> 4 5 6
> 7 8 9
> {code}
> {code:title=rank.pig}
> set default_parallel 4;
> d = load 'input.txt' using PigStorage(' ') as (a:int, b:int, c:int);
> e = rank d by a;
> dump e;
> {code}
> If {{default_parallel}} is set to {{3}}, the script succeeds. So I'm guessing
> RANK BY has issues if the {{default_parallel}} exceeds the cardinality of the
> field being ranked by.
> I'm seeing this issue with Pig 0.11.1 (which has the PIG-2932 patch applied)
> and Hadoop 2.3.0.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)