[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904810#comment-14904810
 ] 

Siddharth Seth commented on TEZ-2850:
-------------------------------------

I think it would be better to add (or compute) another parameter that limits 
the number of segments retained in memory,
i.e. spill to disk if 1) the memory threshold is exceeded, or 2) the segment 
count limit is reached. 
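
A minimal sketch of the two-condition spill check described above. This is 
illustrative only and is not the MergeManager code: the class SpillPolicy, its 
fields, and the method shouldSpill are hypothetical names.

```java
/** Hypothetical helper (not part of Tez): decides when in-memory segments should be spilled. */
final class SpillPolicy {
    private final long memoryThresholdBytes;  // assumed: existing memory-based trigger
    private final int maxInMemorySegments;    // assumed: the proposed segment-count limit

    SpillPolicy(long memoryThresholdBytes, int maxInMemorySegments) {
        if (memoryThresholdBytes <= 0 || maxInMemorySegments <= 0) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.memoryThresholdBytes = memoryThresholdBytes;
        this.maxInMemorySegments = maxInMemorySegments;
    }

    /**
     * Returns true if either condition holds:
     * 1) the bytes held in memory exceed the memory threshold, or
     * 2) the number of retained segments has reached the segment limit.
     */
    boolean shouldSpill(long usedBytes, int segmentCount) {
        return usedBytes > memoryThresholdBytes
            || segmentCount >= maxInMemorySegments;
    }
}
```

With many tiny map outputs the byte count can stay far below the memory 
threshold, so in this sketch the second condition is the one that triggers 
the spill.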

This could be a configurable parameter that serves mainly as an upper limit. 
We should try capping the effective value based on a rough estimate of segment 
size.
The JVM heap size cannot be used as the available-memory figure, since multiple 
Inputs/Outputs could be running in the same JVM. We could limit this to a small 
fraction of the memory allocated to the shuffle.
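
One possible reading of the capping idea, as a rough sketch: derive a segment 
count from a small fraction of the shuffle memory and an estimated per-segment 
size, then clamp it to the configured upper limit. All names here 
(SegmentLimit, compute, and its parameters) are hypothetical, and how the 
per-segment estimate would be obtained is left open.

```java
/** Hypothetical helper (not part of Tez): computes an effective in-memory segment limit. */
final class SegmentLimit {
    private SegmentLimit() {}

    /**
     * @param shuffleMemoryBytes       memory allocated to the shuffle (not the JVM heap)
     * @param fraction                 small fraction of that memory to budget, in (0, 1]
     * @param estimatedBytesPerSegment rough per-segment size estimate (an assumption)
     * @param configuredMax            configured upper limit on retained segments
     * @return a limit between 1 and configuredMax
     */
    static int compute(long shuffleMemoryBytes, double fraction,
                       long estimatedBytesPerSegment, int configuredMax) {
        if (shuffleMemoryBytes <= 0 || estimatedBytesPerSegment <= 0 || configuredMax <= 0) {
            throw new IllegalArgumentException("sizes and limits must be positive");
        }
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        long budgetBytes = (long) (shuffleMemoryBytes * fraction);
        long bySize = budgetBytes / estimatedBytesPerSegment;
        long capped = Math.min(bySize, (long) configuredMax);
        // Always allow at least one segment in memory.
        return (int) Math.max(1L, capped);
    }
}
```

Because the budget is taken from the shuffle allocation rather than the JVM 
heap, other Inputs/Outputs running in the same JVM do not affect the result.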

> Tez MergeManager OOM for small Map Outputs
> ------------------------------------------
>
>                 Key: TEZ-2850
>                 URL: https://issues.apache.org/jira/browse/TEZ-2850
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Saikat
>            Assignee: Saikat
>         Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>




