[ 
https://issues.apache.org/jira/browse/PIG-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916538#action_12916538
 ] 

Thejas M Nair commented on PIG-1629:
------------------------------------

A similar optimization can be done for an inner filter as well:
C = foreach B{ D = filter A by x > 0; generate group, MyUDF(D);}

Changes required:
- The group physical/MR plan implementation needs to support an inner limit/filter.
- Logical optimizer rules are needed to make the load/filter an inner plan of group.
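
To illustrate the intent of these changes, here is a rough sketch in Python (not Pig code) contrasting the two strategies. The function names, the in-memory dict of bags, and the "keep the first n rows" policy are all assumptions made for illustration; they do not reflect how Pig's physical or MR plans are actually implemented.

```python
from collections import defaultdict


def group_then_limit(rows, key, n):
    """Build the full bag for every key, then trim each bag to n rows."""
    bags = defaultdict(list)
    for row in rows:
        bags[key(row)].append(row)  # every row is held in memory
    return {k: bag[:n] for k, bag in bags.items()}


def group_with_inner_limit(rows, key, n, predicate=None):
    """Apply the (hypothetical) inner filter and limit while grouping."""
    bags = defaultdict(list)
    for row in rows:
        bag = bags[key(row)]  # create the group even if all its rows are filtered out
        if predicate is not None and not predicate(row):
            continue  # inner filter: the row never enters the bag
        if len(bag) < n:
            bag.append(row)  # inner limit: the bag never grows past n rows
    return dict(bags)
```

Both functions return the same bags when no filter is given, but the second never holds more than n rows per group at a time.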


> Need ability to limit bags produced during GROUP + LIMIT
> --------------------------------------------------------
>
>                 Key: PIG-1629
>                 URL: https://issues.apache.org/jira/browse/PIG-1629
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Olga Natkovich
>            Assignee: Thejas M Nair
>             Fix For: 0.9.0
>
>
> Currently, the code below will construct the full group in memory and then 
> trim it. This uses more memory than needed.
> A = load 'data' as (x, y, z);
> B = group A by x;
> C = foreach B{
> D = limit A 100;
> generate group, MyUDF(D);}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
