[ https://issues.apache.org/jira/browse/PIG-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12759284#action_12759284 ]
Olga Natkovich commented on PIG-975:
------------------------------------

A couple of questions/comments on the patch:

- Why do we need to synchronize in add? Who else is accessing the bag, since it is no longer managed by the spillable manager?
- The memory fraction should be a Java property so that users can control it if they so choose.
- Why do we have a limit of only 100 tuples in memory when we already have a memory limit? Also, if we do need it, shouldn't it be configurable?

> Need a databag that does not register with SpillableMemoryManager and spill data pro-actively
> ---------------------------------------------------------------------------------------------
>
>                 Key: PIG-975
>                 URL: https://issues.apache.org/jira/browse/PIG-975
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.2.0
>            Reporter: Ying He
>            Assignee: Ying He
>             Fix For: 0.2.0
>
>         Attachments: PIG-975.patch, PIG-975.patch2
>
>
> POPackage uses DefaultDataBag during the reduce process to hold data. It is
> registered with SpillableMemoryManager and prone to OutOfMemoryException.
> It's better to proactively manage the memory usage. The bag fills
> memory up to a specified amount and dumps the rest to disk. The amount of
> memory used to hold tuples is configurable. This can avoid out-of-memory errors.
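
To make the behavior under discussion concrete, here is a minimal Java sketch of a bag that keeps a bounded number of tuples in memory and writes the rest to a temporary file. It is not the code in PIG-975.patch. The class name ProactiveSpillBag, the property name example.bag.inmemory.tuples, and the use of plain strings as tuples are all assumptions made for illustration. To stay short it caps the tuple count rather than a memory fraction, and add is left unsynchronized on the assumption that a single thread owns the bag.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not Pig's DataBag API and not the patch under review.
class ProactiveSpillBag {
    // Hypothetical property name, chosen only for this example.
    static final String LIMIT_PROPERTY = "example.bag.inmemory.tuples";

    private final int maxInMemory;
    private final List<String> inMemory = new ArrayList<>();
    private File spillFile;
    private BufferedWriter spillWriter;
    private long spilled;

    ProactiveSpillBag(int maxInMemory) {
        this.maxInMemory = maxInMemory;
    }

    // Reads the cap from a Java system property, defaulting to 100 tuples.
    static ProactiveSpillBag fromSystemProperty() {
        return new ProactiveSpillBag(Integer.getInteger(LIMIT_PROPERTY, 100));
    }

    // Unsynchronized: assumes only one thread touches the bag.
    // Assumes a tuple's text contains no line breaks.
    void add(String tuple) {
        if (inMemory.size() < maxInMemory) {
            inMemory.add(tuple);
            return;
        }
        try {
            if (spillWriter == null) {
                // Open the spill file lazily, on the first overflow.
                spillFile = File.createTempFile("bag", ".spill");
                spillFile.deleteOnExit();
                spillWriter = new BufferedWriter(new FileWriter(spillFile));
            }
            spillWriter.write(tuple);
            spillWriter.newLine();
            spilled++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int inMemoryCount() { return inMemory.size(); }

    long spilledCount() { return spilled; }

    // Returns in-memory tuples first, then spilled tuples in insertion order.
    List<String> readAll() {
        List<String> all = new ArrayList<>(inMemory);
        if (spillWriter == null) {
            return all;
        }
        try {
            spillWriter.flush();
            try (BufferedReader reader = new BufferedReader(new FileReader(spillFile))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    all.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return all;
    }
}
```

With this sketch, a user could set the cap at launch with -Dexample.bag.inmemory.tuples=500 and build the bag through fromSystemProperty(), which is the kind of control the second comment asks for.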