[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478611#comment-13478611
 ] 

Koji Noguchi commented on PIG-2975:
-----------------------------------

bq.  you can use BinSedesTuple.BinInterSedesTupleRawComparator and it works 
fine. 

Ah. That's how we can work around the problem. Thanks Jonathan! 

Now, stepping back a bit.  Isn't it the case that the original developer 
intentionally used BytesWritable.Comparator for performance, knowingly 
sacrificing the sort order? Instantiating an object for every compare is known 
to be slow.  I lack the overall picture of how often this NullableBytesWritable 
sorting is used, but I am assuming that is the case.
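
To illustrate the trade-off, here is a minimal sketch in plain Java. It does 
not use Pig's or Hadoop's actual classes or serialization format: the 
length-prefixed layout and all names (RawVsDeserializedCompare, compareRaw, 
compareDeserialized) are hypothetical. It only shows how a comparison over 
serialized bytes avoids allocation but can order values differently from a 
comparison over the rebuilt payloads.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class RawVsDeserializedCompare {

    // Hypothetical wire format (an assumption, not Pig's): 4-byte big-endian length, then payload.
    static byte[] serialize(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Rebuilds the payload from the serialized form; allocates a new array on every call.
    static byte[] deserialize(byte[] serialized) {
        ByteBuffer buf = ByteBuffer.wrap(serialized);
        byte[] payload = new byte[buf.getInt()];
        buf.get(payload);
        return payload;
    }

    // Unsigned lexicographic comparison of two byte ranges; allocates nothing.
    static int compareBytes(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff;
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return aLen - bLen;
    }

    // Raw style: compare the serialized buffers as they are, length prefix included.
    static int compareRaw(byte[] a, byte[] b) {
        return compareBytes(a, 0, a.length, b, 0, b.length);
    }

    // Object style: rebuild both payloads first (two allocations per call), then compare them.
    static int compareDeserialized(byte[] a, byte[] b) {
        byte[] pa = deserialize(a);
        byte[] pb = deserialize(b);
        return compareBytes(pa, 0, pa.length, pb, 0, pb.length);
    }

    public static void main(String[] args) {
        byte[] a = serialize("22".getBytes(StandardCharsets.UTF_8));
        byte[] b = serialize("3".getBytes(StandardCharsets.UTF_8));
        // The length prefix (2 vs 1) decides the raw comparison before any payload byte is read.
        System.out.println("raw: " + Integer.signum(compareRaw(a, b)));                   // raw: 1
        System.out.println("deserialized: " + Integer.signum(compareDeserialized(a, b))); // deserialized: -1
    }
}
```

In this sketch the two comparators disagree on "22" versus "3", which is the 
kind of ordering difference one accepts when choosing the allocation-free path. 
Whether this matches the real cause in NullableBytesWritable is not something 
the sketch establishes.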

We are talking about two different issues here.
* Incorrect results when order-by is used.  [0.11 and trunk]
* Sort order of bytearray is sometimes incorrect. [in all recent branches, 
including 0.8, 0.9, etc.]

It would be nice if we could solve both without sacrificing performance.  
However, given that 0.11 is already rolled and any complicated change can lead 
to new unexpected bugs, can we revert PIG-2862 for 0.11 so that we keep the 
previous behavior?

                
> TestTypedMap.testOrderBy failing with incorrect result 
> -------------------------------------------------------
>
>                 Key: PIG-2975
>                 URL: https://issues.apache.org/jira/browse/PIG-2975
>             Project: Pig
>          Issue Type: Sub-task
>    Affects Versions: 0.11
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Blocker
>             Fix For: 0.11
>
>         Attachments: PIG-2975-0_jco.patch, pig-2975-trunk_v01.txt, 
> pig-2975-trunk_v02-broken.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
>     at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main    -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}
