Olga Natkovich commented on PIG-975:

A couple of questions/comments on the patch:

- Why do we need to synchronize in add? Who else is accessing the bag, since 
it is no longer managed by the spillable manager?
- The memory fraction should be a Java property so that users can control it 
if they choose to.
- Why do we have a limit of only 100 tuples in memory, since we already have 
a memory limit? Also, if we do need it, shouldn't it be configurable?
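
To make the last two points concrete, here is a minimal sketch of reading both 
settings from Java system properties with fallback defaults. The class name, 
the property names, and the validation rules are all hypothetical and are not 
taken from the patch; the sketch only illustrates the general approach.

```java
/**
 * Hypothetical sketch: none of these names come from the PIG-975 patch.
 * Shows how a memory fraction and an in-memory tuple limit could be read
 * from Java system properties instead of being hard-coded.
 */
public class BagConfigSketch {
    // Assumed property names, for illustration only.
    public static final String MEM_FRACTION_PROP = "pig.example.bag.memusage";
    public static final String MAX_TUPLES_PROP = "pig.example.bag.maxtuples";

    /** Returns the configured fraction in (0, 1], or defaultValue if unset or invalid. */
    public static float memoryFraction(float defaultValue) {
        String raw = System.getProperty(MEM_FRACTION_PROP);
        if (raw == null) {
            return defaultValue;
        }
        try {
            float f = Float.parseFloat(raw.trim());
            // NaN fails both comparisons, so it also falls back to the default.
            return (f > 0f && f <= 1f) ? f : defaultValue;
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    /** Returns the configured positive tuple limit, or defaultValue if unset or invalid. */
    public static int maxTuplesInMemory(int defaultValue) {
        // Integer.getInteger returns null when the property is missing or not numeric.
        Integer v = Integer.getInteger(MAX_TUPLES_PROP);
        return (v != null && v > 0) ? v : defaultValue;
    }

    /** Converts a fraction of the JVM's maximum heap into a byte budget. */
    public static long memoryBudgetBytes(float fraction) {
        return (long) (Runtime.getRuntime().maxMemory() * (double) fraction);
    }
}
```

With something along these lines, a user could override the defaults on the 
command line, for example with -Dpig.example.bag.maxtuples=1000, while the 
current value of 100 would remain only as the fallback.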

> Need a databag that does not register with SpillableMemoryManager and spill 
> data pro-actively
> ---------------------------------------------------------------------------------------------
>                 Key: PIG-975
>                 URL: https://issues.apache.org/jira/browse/PIG-975
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.2.0
>            Reporter: Ying He
>            Assignee: Ying He
>             Fix For: 0.2.0
>         Attachments: PIG-975.patch, PIG-975.patch2
> POPackage uses DefaultDataBag during the reduce process to hold data. It is 
> registered with SpillableMemoryManager and prone to OutOfMemoryException.  
> It's better to proactively manage the usage of memory. The bag fills 
> in memory up to a specified amount and dumps the rest to disk.  The amount of 
> memory to hold tuples is configurable. This can avoid out-of-memory errors.

