[ 
https://issues.apache.org/jira/browse/PIG-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-2236.
-----------------------------

    Resolution: Won't Fix

> ORDER BY is broken when in combination with LIMIT and FLATTEN
> -------------------------------------------------------------
>
>                 Key: PIG-2236
>                 URL: https://issues.apache.org/jira/browse/PIG-2236
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0, 0.8.1
>            Reporter: Sungho Ryu
>
> ORDER BY does not correctly sort the result when used in combination with 
> LIMIT and FOREACH / FLATTEN.
> ---  Input data
> A   1000
> A   128
> A   127
> A   0
> A   1
> A   2
> B   0
> B   1
> B   128
> B   1001
> B   2
> B   127
> C   0
> C   1
> C   128
> C   1000
> C   127
> C   2
> D   0
> D   1
> D   128
> D   1000
> D   2
> D   127
> -----  Test script
> data =  LOAD 'data' AS (k:chararray, v:int);
> grouped = GROUP data BY k;
> limited = LIMIT grouped BY 2;
> output = FOREACH limited {
>         ordered = ORDER data BY v;
>         GENERATE FLATTEN(ordered);
> };
> output = LIMIT output 10000;  -- a workaround for PIG-2231
> STORE output INTO 'result';
> ---- Desired output 
> A       0
> A       1
> A       2
> A       127
> A       128
> A       1000
> B       0
> B       1
> B       2
> B       127
> B       128
> B       1001
> ---  Actual output
> A       0
> A       1
> A       128
> A       1000
> A       2
> A       127
> B       0
> B       1
> B       128
> B       1001
> B       2
> B       127
> --------------
> As the result shows, ORDER BY does not correctly sort numbers in [2,128) when 
> LIMIT is applied  before or after.
> If I remove the both of LIMIT statements, I get the correct result. (tested 
> on 0.8.0, 0.8.1)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to