[ 
https://issues.apache.org/jira/browse/PIG-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12802397#action_12802397
 ] 

Alan Gates commented on PIG-1195:
---------------------------------

The sorting algorithm in DefaultComparator does not match the sorting algorithm 
in DefaultTuple.compare.

The algorithm in DefaultComparator first compares the values of each column,
and only considers the overall size of the tuples once one tuple has run out
of fields. The algorithm in DefaultTuple.compare first compares tuple size,
then individual column values. So under DefaultComparator's algorithm
(5, 3) > (4, 3, 1), but under DefaultTuple's algorithm (5, 3) < (4, 3, 1).
We should use the same algorithm in both places.
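
To make the difference concrete, here is a minimal Java sketch of the two
orderings described above. It is not the code from the patch or from
DefaultTuple: the class name TupleOrderSketch and the methods columnFirst and
sizeFirst are hypothetical, and tuples are simplified to lists of integers.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Pig's actual implementation.
public class TupleOrderSketch {

    // Column-first: compare fields pairwise, and fall back to tuple size
    // only once one tuple has run out of fields.
    public static int columnFirst(List<Integer> a, List<Integer> b) {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a.get(i), b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.size(), b.size());
    }

    // Size-first: the shorter tuple sorts first; fields are compared
    // only when both tuples have the same size.
    public static int sizeFirst(List<Integer> a, List<Integer> b) {
        int c = Integer.compare(a.size(), b.size());
        if (c != 0) {
            return c;
        }
        for (int i = 0; i < a.size(); i++) {
            c = Integer.compare(a.get(i), b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        List<Integer> x = Arrays.asList(5, 3);
        List<Integer> y = Arrays.asList(4, 3, 1);
        // Column-first: 5 > 4 decides it, so x sorts after y.
        System.out.println(columnFirst(x, y) > 0); // prints "true"
        // Size-first: size 2 < size 3 decides it, so x sorts before y.
        System.out.println(sizeFirst(x, y) < 0);   // prints "true"
    }
}
```

The two methods disagree whenever the tuples differ in size and an earlier
column would order them the opposite way, which is the case in the example
above.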

> InternalSortedBag should take care of sort order
> ------------------------------------------------
>
>                 Key: PIG-1195
>                 URL: https://issues.apache.org/jira/browse/PIG-1195
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>            Reporter: Daniel Dai
>             Fix For: 0.6.0
>
>         Attachments: PIG-1195-1.patch, PIG-1195-2.patch
>
>
> InternalSortedBag always uses ascending order. We should obey the sort order
> specified in the script.
> For example, the following script does not do the right thing if we turn off 
> secondary sort (which means, we will rely on InternalSortedBag to sort):
> {code}
> A = load 'input' as (a0:int);
> B = group A ALL;
> C = foreach B {
>     D = order A by a0 desc;
>     generate D;
> };
> dump C;
> {code}
> If we run it using the command line "java -Xmx512m 
> -Dpig.exec.nosecondarykey=true -jar pig.jar 1.pig",
> the sort order for D is ascending.

