[ https://issues.apache.org/jira/browse/PIG-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879397#comment-16879397 ]
Koji Noguchi commented on PIG-5390:
-----------------------------------

Looking further,
bq. all these self-spilling bags are designed on the premise that no other threads would touch them

This was true till 0.11. In PIG-2923, we incorrectly added these SelfSpill bags to SpillableMemoryManager. Then, for InternalCachedBag, I took it back out of SpillableMemoryManager in PIG-3147 but didn't notice the other two bags. Later, for InternalSortedBag and InternalDistinctBag, instead of taking them out of SpillableMemoryManager, we incorrectly added synchronization to these SelfSpill bags (PIG-3212 and PIG-3466, respectively).

> Possible race condition from Self-spilling bags registering with SpillableMemoryManager
> ----------------------------------------------------------------------------------------
>
>                 Key: PIG-5390
>                 URL: https://issues.apache.org/jira/browse/PIG-5390
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Major
>
> This is a follow up from PIG-5380 where [~rohini] pointed out
> {quote}
> I think same change is required in InternalSortedBag as well as code is
> exactly same and it can spill too -
> https://github.com/apache/pig/blob/trunk/src/org/apache/pig/data/InternalSortedBag.java#L133
> . We most likely haven't seen issues with it as the probability could be
> very less as it will proactively spill if it exceeds cached memory limit.
> {quote}
> Looking at the history and the source, this is a critical bug given all these
> self-spilling bags are designed on the premise that no other threads would
> touch them. Comment in the source clearly say
> {code}
>  * This bag is not registered with SpillableMemoryManager. It calculates
>  * the number of tuples to hold in memory and spill pro-actively into files."
> {code}
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)