[jira] [Commented] (PIG-1963) in nested foreach, accumutive udf taking input from order-by does not get results in order

Thejas M Nair (JIRA) Mon, 04 Apr 2011 15:23:47 -0700

    [ https://issues.apache.org/jira/browse/PIG-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015661#comment-13015661 ]


Thejas M Nair commented on PIG-1963:
------------------------------------

The MYCONCATBAG UDF in the query in the description concatenates the entries 
in the bag in the order in which they are received.
When the query is run with the property pig.accumulative.batchsize=2 
and the input -
{code}
100     apple
200     orange
300     strawberry
300     pear
100     apple
300     pear
400     apple
{code}

it gives the output -
{code}
(100,(100)(100),(apple)(apple))
(200,(200),(orange))
(300,(300)(300)(300),(pear)(strawberry)(pear)) -- this should be (300,(300)(300)(300),(pear)(pear)(strawberry))
(400,(400),(apple))
{code}
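
The wrong row for key 300 is consistent with each accumulator batch being sorted on its own rather than the whole bag being sorted before batching. The sketch below is only a model of that hypothesis in Python, not Pig code: the helper names (concat_bag, batches, sort_per_batch) are made up for illustration, and it assumes the bag entries arrive in input-file order and that a batch holds at most pig.accumulative.batchsize entries.

```python
# Hypothetical model of the observed behaviour; none of this is Pig source code.

def concat_bag(values):
    """Mimic MYCONCATBAG: join the entries in the order they are received."""
    return "".join(f"({v})" for v in values)

def batches(values, size):
    """Split a bag into consecutive batches of at most `size` entries."""
    return [values[i:i + size] for i in range(0, len(values), size)]

def sort_per_batch(values, size):
    """Assumed faulty path: every batch is ordered independently."""
    out = []
    for batch in batches(values, size):
        out.extend(sorted(batch))
    return out

# f2 values for key 300, assumed to arrive in input-file order.
bag_300 = ["strawberry", "pear", "pear"]

observed = concat_bag(sort_per_batch(bag_300, 2))  # batch size from the report
expected = concat_bag(sorted(bag_300))             # order-by over the whole bag

print(observed)
print(expected)
```

Under these assumptions the first batch (strawberry, pear) is sorted to (pear)(strawberry) and the second batch contributes (pear), which reproduces the reported row. Keys 100, 200 and 400 are unaffected because their bags either fit in one batch or contain identical values.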

> in nested foreach, accumutive udf taking input from order-by does not get 
> results in order
> ------------------------------------------------------------------------------------------
>
>                 Key: PIG-1963
>                 URL: https://issues.apache.org/jira/browse/PIG-1963
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Thejas M Nair
>
> This happens only when secondary sort is not being used for the order-by. 
> For example -
> {code}
> a1 = load 'fruits.txt' as (f1:int,f2);
> a2 = load 'fruits.txt' as (f1:int,f2);
> b = cogroup a1 by f1, a2 by f1;
> d = foreach b {
>    sort1 = order a1 by f2;
>    sort2 = order a2 by f2; -- secondary sort not getting used here,
>                            -- MYCONCATBAG gets results in wrong order
>    generate group, MYCONCATBAG(sort1.f1), MYCONCATBAG(sort2.f2);
> }
> -- explain d;
> dump d;
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
