[ 
https://issues.apache.org/jira/browse/PIG-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879477#comment-16879477
 ] 

Rohini Palaniswamy commented on PIG-5390:
-----------------------------------------

bq. Question here would be, shall we stop adding InternalSortedBag and 
InternalDistinctBag to SpillableMemoryManager?
  No, we cannot take out that spill. There could be multiple bags, user UDFs, or 
multiple input and output sort buffers for a vertex in Tez causing memory 
pressure, and we will have to spill to avoid OOM. The proactive spill will not 
kick in in that case. We need to keep the proactive spill as well, because we 
don't want a bag to grow too far beyond its memory limit and end up causing a 
full spill of all bags.
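
 To illustrate why both paths are needed, here is a minimal sketch, not the 
actual Pig code. All names in it (SpillSketch, MemoryManager, SelfSpillingBag, 
create, spillAll) are hypothetical stand-ins. It assumes a bag that spills on 
its own once it reaches a tuple limit, and a manager that can also ask it to 
spill from another thread when overall memory is low, which is why the bag's 
methods are synchronized.

```java
import java.util.ArrayList;
import java.util.List;

public class SpillSketch {
    /** Hypothetical stand-in for anything that can move in-memory data to disk. */
    interface Spillable {
        long spill();
    }

    /** Hypothetical stand-in for a central manager that spills registered bags. */
    static class MemoryManager {
        private final List<Spillable> registered = new ArrayList<>();

        synchronized void register(Spillable s) {
            registered.add(s);
        }

        /** Invoked on overall memory pressure, whatever each bag's own limit is. */
        synchronized long spillAll() {
            long total = 0;
            for (Spillable s : registered) {
                total += s.spill();
            }
            return total;
        }
    }

    /** A bag that spills proactively and can also be spilled by the manager. */
    static class SelfSpillingBag implements Spillable {
        private final int maxInMemory;
        private final List<String> inMemory = new ArrayList<>();
        private long spilledCount = 0; // stands in for tuples written to spill files

        private SelfSpillingBag(int maxInMemory) {
            this.maxInMemory = maxInMemory;
        }

        static SelfSpillingBag create(int maxInMemory, MemoryManager mgr) {
            SelfSpillingBag bag = new SelfSpillingBag(maxInMemory);
            mgr.register(bag); // still registered, so external spills can reach it
            return bag;
        }

        synchronized void add(String tuple) {
            inMemory.add(tuple);
            if (inMemory.size() >= maxInMemory) {
                spill(); // proactive spill: the bag enforces its own limit
            }
        }

        /** May be called by the bag itself or by the manager's thread. */
        @Override
        public synchronized long spill() {
            long n = inMemory.size();
            spilledCount += n;
            inMemory.clear();
            return n;
        }

        synchronized int inMemorySize() {
            return inMemory.size();
        }

        synchronized long spilledCount() {
            return spilledCount;
        }
    }

    public static void main(String[] args) {
        MemoryManager mgr = new MemoryManager();
        SelfSpillingBag bag = SelfSpillingBag.create(3, mgr);
        bag.add("a");
        bag.add("b");
        // The bag is below its own limit, yet memory pressure elsewhere forces a spill.
        System.out.println("spilled by manager: " + mgr.spillAll());
        System.out.println("left in memory: " + bag.inMemorySize());
    }
}
```

 In this sketch the external spill works only because add() and spill() share 
a lock. A bag written on the premise that no other thread touches it would 
need the same kind of guard before being registered.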

 We should just remove the misleading comment that these bags don't spill via 
SpillableMemoryManager. 

> Avoid adding self-spilling bags to SpillableMemoryManager 
> ----------------------------------------------------------
>
>                 Key: PIG-5390
>                 URL: https://issues.apache.org/jira/browse/PIG-5390
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>            Priority: Minor
>
> This is a follow up from PIG-5380 where [~rohini] pointed out 
> {quote}
> I think same change is required in InternalSortedBag as well as code is 
> exactly same and it can spill too - 
> https://github.com/apache/pig/blob/trunk/src/org/apache/pig/data/InternalSortedBag.java#L133
>  . We most likely haven't seen issues with it as the probability could be 
> very low, since it will proactively spill if it exceeds the cached memory limit.
> {quote}
> Looking at the history and the source, this is a critical bug given all these 
> self-spilling bags are designed on the premise that no other threads would 
> touch them.  The comment in the source clearly says
> {code}
>  * This bag is not registered with SpillableMemoryManager. It calculates
>  * the number of tuples to hold in memory and spill pro-actively into files."
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
