[ https://issues.apache.org/jira/browse/PIG-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802397#action_12802397 ]
Alan Gates commented on PIG-1195:
---------------------------------

The sorting algorithm in DefaultComparator does not match the sorting algorithm in DefaultTuple.compare. The algorithm used here (in DefaultComparator) first compares the values of each column, and only considers the overall size of the tuples once one tuple has run out of fields. The algorithm used in DefaultTuple.compare first compares tuple size, then individual column values. So in this algorithm (5, 3) > (4, 3, 1), but in DefaultTuple's algorithm (5, 3) < (4, 3, 1). We should use the same algorithm in both places.

> InternalSortedBag should take care of sort order
> ------------------------------------------------
>
>                 Key: PIG-1195
>                 URL: https://issues.apache.org/jira/browse/PIG-1195
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Daniel Dai
>             Fix For: 0.6.0
>
>         Attachments: PIG-1195-1.patch, PIG-1195-2.patch
>
>
> InternalSortedBag always uses ascending order. It should obey the sort order
> specified in the script.
> For example, the following script does not do the right thing if we turn off
> secondary sort (which means we rely on InternalSortedBag to sort):
> {code}
> A = load 'input' as (a0:int);
> B = group A ALL;
> C = foreach B {
>     D = order A by a0 desc;
>     generate D;
> };
> dump C;
> {code}
> If we run it using the command line "java -Xmx512m
> -Dpig.exec.nosecondarykey=true -jar pig.jar 1.pig", the sort order for D is
> ascending instead of descending.
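
The mismatch described in the comment above can be illustrated with a minimal Java sketch. This is not the code from DefaultComparator or DefaultTuple; the class name TupleOrderSketch and the methods columnFirst and sizeFirst are hypothetical, and tuples are modelled as plain lists of integers with no null handling. It only shows why the two orderings disagree on (5, 3) and (4, 3, 1).

```java
import java.util.List;

// Hypothetical illustration only; not Pig's actual implementation.
public class TupleOrderSketch {

    // Column-first ordering: compare field by field, and fall back to
    // tuple size only when one tuple has run out of fields.
    static int columnFirst(List<Integer> a, List<Integer> b) {
        int shared = Math.min(a.size(), b.size());
        for (int i = 0; i < shared; i++) {
            int c = Integer.compare(a.get(i), b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.size(), b.size());
    }

    // Size-first ordering: the shorter tuple sorts first, and fields are
    // compared only when both tuples have the same number of fields.
    static int sizeFirst(List<Integer> a, List<Integer> b) {
        int c = Integer.compare(a.size(), b.size());
        if (c != 0) {
            return c;
        }
        for (int i = 0; i < a.size(); i++) {
            c = Integer.compare(a.get(i), b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        List<Integer> x = List.of(5, 3);
        List<Integer> y = List.of(4, 3, 1);
        // Column-first: 5 > 4 decides, so x sorts after y.
        System.out.println(columnFirst(x, y) > 0); // true
        // Size-first: 2 fields < 3 fields decides, so x sorts before y.
        System.out.println(sizeFirst(x, y) < 0);   // true
    }
}
```

The two methods agree whenever both tuples have the same number of fields, so the inconsistency only shows up when a bag mixes tuples of different sizes.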