[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-2975:
------------------------------

    Attachment: pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt

bq. or we can just have a special lightweight comparator that special cases 
DataByteArrays, and delegates to BinInterSedesRawComparator otherwise.

This one was faster than I expected: 414 seconds on average, versus 398 
seconds for the simple raw compare (including the header).
(Much faster than my bulky union approach at 436 seconds.)

I also tried moving this special-case comparator inside 
BinInterSedesRawComparator.compare, but that pushed the runtime back up to 
over 600 seconds.

It's just one extra hop (a method call) plus one extra check (Tuple_1), but 
somehow the JVM couldn't handle it well.
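
For readers following along, here is a minimal sketch of the "special case, otherwise delegate" shape described above. It is not the attached patch: the RawComparator interface, the TAG_BYTEARRAY value, and the one-tag-byte layout are all hypothetical stand-ins, and the real BinInterSedes encoding is different.

```java
public class LightweightComparatorSketch {

    /** Hypothetical stand-in for a comparator over serialized bytes. */
    public interface RawComparator {
        int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2);
    }

    // Hypothetical type tag for a serialized byte array (assumption:
    // one tag byte followed directly by the payload bytes).
    public static final byte TAG_BYTEARRAY = 1;

    /** Handles the byte-array case itself and delegates everything else. */
    public static final class LightweightComparator implements RawComparator {
        private final RawComparator fallback;

        public LightweightComparator(RawComparator fallback) {
            this.fallback = fallback;
        }

        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            // Fast path: both sides carry the byte-array tag.
            if (l1 > 0 && l2 > 0
                    && b1[s1] == TAG_BYTEARRAY && b2[s2] == TAG_BYTEARRAY) {
                return compareBytes(b1, s1 + 1, l1 - 1, b2, s2 + 1, l2 - 1);
            }
            // Anything else goes to the full comparator.
            return fallback.compare(b1, s1, l1, b2, s2, l2);
        }
    }

    /** Unsigned lexicographic comparison; a shorter prefix sorts first. */
    public static int compareBytes(byte[] b1, int s1, int l1,
                                   byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return l1 - l2;
    }

    public static void main(String[] args) {
        // Dummy fallback that treats everything as equal.
        RawComparator full = (b1, s1, l1, b2, s2, l2) -> 0;
        RawComparator cmp = new LightweightComparator(full);

        byte[] left = {TAG_BYTEARRAY, 97};
        byte[] right = {TAG_BYTEARRAY, 98};
        System.out.println(cmp.compare(left, 0, left.length,
                                       right, 0, right.length));
    }
}
```

The point of keeping the fast path in its own small class, rather than inside the fallback's compare method, is that it matches the arrangement that measured well above; the sketch makes no claim about why the JVM treated the two arrangements differently.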

Adding test cases now.
                
> TestTypedMap.testOrderBy failing with incorrect result 
> -------------------------------------------------------
>
>                 Key: PIG-2975
>                 URL: https://issues.apache.org/jira/browse/PIG-2975
>             Project: Pig
>          Issue Type: Sub-task
>    Affects Versions: 0.11
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Blocker
>             Fix For: 0.11
>
>         Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
>     at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main    -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
