[ 
https://issues.apache.org/jira/browse/PIG-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-5390:
------------------------------
      Priority: Minor  (was: Major)
       Summary: Avoid adding self-spilling bags to SpillableMemoryManager
(was: Possible race condition from Self-spilling bags registering with
SpillableMemoryManager)
    Issue Type: Improvement  (was: Bug)

Given that synchronization was added in PIG-3212 and PIG-3466, I'm changing the
summary of this Jira and lowering its priority. The remaining question is:
should we stop adding InternalSortedBag and InternalDistinctBag to
SpillableMemoryManager?
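
To make the trade-off concrete, here is a minimal sketch of a bag that spills
proactively and can still tolerate an external spill() call. It is not Pig's
actual code: the class SelfSpillingBag, its fields, and the in-memory list that
stands in for spill files are all hypothetical names chosen for illustration.
It only shows why registration requires synchronization, and why a bag that is
never registered would not need it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a self-spilling bag; not Pig's InternalSortedBag.
final class SelfSpillingBag {
    private final int maxInMemory;                            // assumed in-memory tuple limit
    private final List<String> inMemory = new ArrayList<>();
    private final List<String> spilled = new ArrayList<>();   // stands in for spill files
    private int spillCount = 0;

    SelfSpillingBag(int maxInMemory) {
        if (maxInMemory < 1) {
            throw new IllegalArgumentException("maxInMemory must be >= 1");
        }
        this.maxInMemory = maxInMemory;
    }

    // Called by the owning thread; spills proactively once the limit is reached.
    synchronized void add(String tuple) {
        inMemory.add(tuple);
        if (inMemory.size() >= maxInMemory) {
            spill();
        }
    }

    // If the bag is registered with a memory manager, another thread may call
    // this. Synchronization keeps it from interleaving with add().
    synchronized long spill() {
        int n = inMemory.size();
        if (n == 0) {
            return 0;
        }
        spilled.addAll(inMemory);
        inMemory.clear();
        spillCount++;
        return n;
    }

    synchronized int size() { return inMemory.size() + spilled.size(); }
    synchronized int inMemorySize() { return inMemory.size(); }
    synchronized int spillCount() { return spillCount; }

    public static void main(String[] args) throws InterruptedException {
        SelfSpillingBag bag = new SelfSpillingBag(4);
        // Simulates a memory manager thread asking the bag to spill.
        Thread manager = new Thread(() -> {
            for (int i = 0; i < 100; i++) {
                bag.spill();
            }
        });
        manager.start();
        for (int i = 0; i < 10; i++) {
            bag.add("t" + i);
        }
        manager.join();
        System.out.println("tuples=" + bag.size() + " inMemory=" + bag.inMemorySize());
    }
}
```

If the bag were never handed to the manager, only the owning thread would call
add() and spill(), and the synchronized keywords in this sketch could be
dropped.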

> Avoid adding self-spilling bags to SpillableMemoryManager 
> ----------------------------------------------------------
>
>                 Key: PIG-5390
>                 URL: https://issues.apache.org/jira/browse/PIG-5390
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Minor
>
> This is a follow-up to PIG-5380, where [~rohini] pointed out:
> {quote}
> I think same change is required in InternalSortedBag as well as code is 
> exactly same and it can spill too - 
> https://github.com/apache/pig/blob/trunk/src/org/apache/pig/data/InternalSortedBag.java#L133
>  . We most likely haven't seen issues with it as the probability could be 
> very less as it will proactively spill if it exceeds cached memory limit.
> {quote}
> Looking at the history and the source, this is a critical bug, given that all
> these self-spilling bags are designed on the premise that no other thread
> would touch them. The comment in the source clearly says:
> {code}
>  * This bag is not registered with SpillableMemoryManager. It calculates
>  * the number of tuples to hold in memory and spill pro-actively into files.
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)