[ https://issues.apache.org/jira/browse/PIG-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-484:
---------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Patch checked in. I ran performance tests on large data and saw no significant change in run time. That is expected, since this change targets scalability rather than performance.

> PERFORMANCE: streaming data to aggregate functions
> --------------------------------------------------
>
>                 Key: PIG-484
>                 URL: https://issues.apache.org/jira/browse/PIG-484
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: types_branch
>            Reporter: Olga Natkovich
>            Assignee: Pradeep Kamath
>             Fix For: types_branch
>
>         Attachments: PIG-484.patch
>
>
> Currently, for queries like
> A = load 'data';
> B = group A by $0;
> C = foreach B generate group, MIN(A.$1), MAX(A.$1);
> the data is put into a bag before being passed to the aggregate functions.
> This is unnecessary and inefficient. In this case, the data can simply be
> streamed to the functions.
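
To make the difference between the two strategies concrete, here is a minimal sketch in Python. It is not Pig's implementation and does not reflect the contents of PIG-484.patch; the function names and the assumption that rows arrive as (key, value) pairs already sorted by key (as they would after the group step) are purely illustrative.

```python
from itertools import groupby
from operator import itemgetter


def min_max_materialized(sorted_rows):
    """Baseline: collect each group's values into a list (the "bag") first."""
    out = {}
    for key, grp in groupby(sorted_rows, key=itemgetter(0)):
        bag = [value for _, value in grp]  # whole group held in memory
        out[key] = (min(bag), max(bag))
    return out


def min_max_streamed(sorted_rows):
    """Streaming: update a running min and max per row; no bag is built."""
    out = {}
    for key, grp in groupby(sorted_rows, key=itemgetter(0)):
        lo = hi = None
        for _, value in grp:
            if lo is None or value < lo:
                lo = value
            if hi is None or value > hi:
                hi = value
        out[key] = (lo, hi)
    return out


if __name__ == "__main__":
    rows = [("a", 3), ("a", 1), ("b", 7), ("b", 9), ("b", 2)]
    print(min_max_materialized(rows))
    print(min_max_streamed(rows))
```

Both functions return the same result, but the streamed version keeps only two values per group in memory, whatever the group size. That matches the comment above: the gain is in scalability for very large groups rather than in run time.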