[
https://issues.apache.org/jira/browse/PIG-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916538#action_12916538
]
Thejas M Nair commented on PIG-1629:
------------------------------------
A similar optimization can be done for an inner filter as well:
C = foreach B{ D = filter A by x > 0; generate group, MyUDF(D);}
Changes required:
- The group physical/MR plan implementation needs to support an inner limit/filter.
- Logical optimizer rules are needed to make the load/filter an inner plan of the group.
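
To illustrate the intent of the two changes, here is a minimal conceptual sketch in Python, not Pig's actual physical plan code. The function names (group_then_trim, group_with_inner_plan) and their parameters are hypothetical and exist only for this example. It contrasts building every full bag and trimming afterwards with applying the inner filter/limit while each bag is being built.

```python
from collections import defaultdict


def group_then_trim(rows, key, limit):
    """Current behaviour as described in the issue: build full bags, then trim."""
    bags = defaultdict(list)
    for row in rows:
        bags[key(row)].append(row)  # every row is held in memory
    return {k: bag[:limit] for k, bag in bags.items()}


def group_with_inner_plan(rows, key, predicate=None, limit=None):
    """Hypothetical sketch: apply the inner filter/limit during grouping."""
    bags = defaultdict(list)
    for row in rows:
        bag = bags[key(row)]  # create the group even if no row survives
        if predicate is not None and not predicate(row):
            continue  # inner filter: drop the row before it enters the bag
        if limit is None or len(bag) < limit:
            bag.append(row)  # inner limit: bag never grows past `limit`
    return dict(bags)


if __name__ == "__main__":
    data = [(1, "a"), (1, "b"), (1, "c"), (2, "d")]
    print(group_with_inner_plan(data, key=lambda r: r[0], limit=2))
```

In this sketch each bag holds at most `limit` rows at any time, which is the memory saving the issue asks for. The result matches the trim-afterwards version for the same input order.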
> Need ability to limit bags produced during GROUP + LIMIT
> --------------------------------------------------------
>
> Key: PIG-1629
> URL: https://issues.apache.org/jira/browse/PIG-1629
> Project: Pig
> Issue Type: Improvement
> Reporter: Olga Natkovich
> Assignee: Thejas M Nair
> Fix For: 0.9.0
>
>
> Currently, the code below will construct the full group in memory and then
> trim it. This uses more memory than needed.
> A = load 'data' as (x, y, z);
> B = group A by x;
> C = foreach B{
> D = limit A 100;
> generate group, MyUDF(D);}