[ https://issues.apache.org/jira/browse/PIG-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Noguchi updated PIG-5390:
------------------------------
    Priority: Minor  (was: Major)
    Summary: Avoid adding self-spilling bags to SpillableMemoryManager  (was: Possible race condition from Self-spilling bags registering with SpillableMemoryManager )
    Issue Type: Improvement  (was: Bug)

Given that synchronization was added in PIG-3212 and PIG-3466, I'm changing the summary of this Jira and lowering its severity.

The question here is: shall we stop adding InternalSortedBag and InternalDistinctBag to SpillableMemoryManager?

> Avoid adding self-spilling bags to SpillableMemoryManager
> ----------------------------------------------------------
>
>                 Key: PIG-5390
>                 URL: https://issues.apache.org/jira/browse/PIG-5390
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Minor
>
> This is a follow-up from PIG-5380 where [~rohini] pointed out
> {quote}
> I think same change is required in InternalSortedBag as well as code is
> exactly same and it can spill too -
> https://github.com/apache/pig/blob/trunk/src/org/apache/pig/data/InternalSortedBag.java#L133
> . We most likely haven't seen issues with it as the probability could be
> very less as it will proactively spill if it exceeds cached memory limit.
> {quote}
> Looking at the history and the source, this is a critical bug given all these
> self-spilling bags are designed on the premise that no other threads would
> touch them. The comment in the source clearly says
> {code}
>  * This bag is not registered with SpillableMemoryManager. It calculates
>  * the number of tuples to hold in memory and spill pro-actively into files.
> {code}
>
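
For illustration only, here is a minimal Java sketch of the "self-spilling" idea that the quoted source comment describes: the bag decides on its own when to write tuples to disk and never hands a reference to itself to an external memory manager, so no other thread can trigger a spill on it. The class name SelfSpillingBagSketch, its methods, the fixed tuple limit, and the use of plain strings as tuples are all assumptions made for this example; this is not Pig's InternalSortedBag or InternalDistinctBag code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for a self-spilling bag (not a Pig class). */
class SelfSpillingBagSketch {
    private final int maxInMemory;                        // tuples kept in memory before a spill
    private final List<String> inMemory = new ArrayList<>();
    private final List<Path> spillFiles = new ArrayList<>();
    private long size = 0;

    SelfSpillingBagSketch(int maxInMemory) {
        if (maxInMemory < 1) {
            throw new IllegalArgumentException("maxInMemory must be >= 1");
        }
        this.maxInMemory = maxInMemory;
        // Deliberately no registration with any memory manager:
        // only the thread that owns this bag ever calls spill().
    }

    /** Adds one tuple and spills proactively once the in-memory limit is reached. */
    void add(String tuple) {
        inMemory.add(tuple);
        size++;
        if (inMemory.size() >= maxInMemory) {
            spill();
        }
    }

    /** Writes the in-memory tuples to a temp file, one per line, then clears them. */
    private void spill() {
        try {
            Path file = Files.createTempFile("bag-spill-", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, inMemory);
            spillFiles.add(file);
            inMemory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    long size() { return size; }
    int spillCount() { return spillFiles.size(); }
    int inMemoryCount() { return inMemory.size(); }

    /** Reads back spilled tuples followed by in-memory ones, in insertion order. */
    List<String> readAll() {
        List<String> all = new ArrayList<>();
        try {
            for (Path file : spillFiles) {
                all.addAll(Files.readAllLines(file));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        all.addAll(inMemory);
        return all;
    }
}
```

In this sketch the absence of locks is safe only because nothing outside the owning thread holds a reference that could call spill(). Registering such a bag with a manager that spills from another thread would break that premise unless synchronization is added, which is the trade-off this issue is asking about.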