optimize bag usage
------------------
Key: PIG-49
URL: https://issues.apache.org/jira/browse/PIG-49
Project: Pig
Issue Type: Improvement
Reporter: Olga Natkovich
(1) Currently, we always load the entire bag into memory, even though in most
cases we only need to stream through it. This wastes both memory and CPU.
(2) If we perform multiple computations on the same group, we iterate over
the bag that represents the group several times. This is very inefficient,
especially for spilled bags, which have to be read back from disk on each pass.
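
To make the two points concrete, here is a minimal sketch of what a streaming,
single-pass evaluation could look like. It is not Pig's actual API: the
Accumulator interface, the Count/Sum/Max classes, and the evaluate method are
all hypothetical names chosen for illustration, and the group is modeled as a
plain Iterator<Long> rather than a real bag of tuples. The idea is that each
computation keeps only constant-size state, so the bag never has to be
materialized (point 1), and all computations are fed from the same iteration,
so the bag is traversed only once (point 2).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: none of these names come from the Pig codebase.
final class SinglePassBagSketch {

    /** Hypothetical accumulator: sees one value at a time, keeps O(1) state. */
    interface Accumulator {
        void accept(long value);
        long result();
    }

    static final class Count implements Accumulator {
        private long n;
        public void accept(long value) { n++; }
        public long result() { return n; }
    }

    static final class Sum implements Accumulator {
        private long total;
        public void accept(long value) { total += value; }
        public long result() { return total; }
    }

    static final class Max implements Accumulator {
        private long max = Long.MIN_VALUE;
        public void accept(long value) { max = Math.max(max, value); }
        public long result() { return max; }
    }

    /**
     * Streams through the group exactly once and feeds every value to every
     * accumulator. The group is consumed through its iterator, so it is never
     * held in memory as a whole by this method.
     */
    static long[] evaluate(Iterator<Long> group, List<Accumulator> accumulators) {
        while (group.hasNext()) {
            long value = group.next();
            for (Accumulator a : accumulators) {
                a.accept(value);
            }
        }
        long[] results = new long[accumulators.size()];
        for (int i = 0; i < results.length; i++) {
            results[i] = accumulators.get(i).result();
        }
        return results;
    }

    public static void main(String[] args) {
        // A small in-memory list stands in for a (possibly spilled) bag.
        Iterator<Long> group = Arrays.asList(3L, 5L, 4L).iterator();
        long[] r = evaluate(group,
                Arrays.<Accumulator>asList(new Count(), new Sum(), new Max()));
        System.out.println(Arrays.toString(r)); // prints [3, 12, 5]
    }
}
```

This only works for computations that can be expressed incrementally; an
operation that needs the whole group at once (a sort, for example) would still
require the bag to be materialized or spilled.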