[
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904810#comment-14904810
]
Siddharth Seth commented on TEZ-2850:
-------------------------------------
I think it would be better to add (or compute) another parameter that limits
the number of segments retained in memory,
i.e. spill to disk if 1) the memory threshold is exceeded, or 2) the
segment-count limit is reached.
This could be a configurable parameter that serves mainly as an upper limit.
We should try capping the value based on a rough estimate of the size of
segments.
The JVM size cannot be used as the available-memory parameter, since multiple
Inputs/Outputs could be running in the same JVM. We could limit this to a small
fraction of the memory allocated to the shuffle.
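
A rough sketch of the two-condition spill check described above. This is only
an illustration, not existing Tez code: the class SpillPolicy, its parameter
names, and the way the segment cap is derived (a fraction of the shuffle
memory divided by an estimated per-segment size) are all assumptions made for
the example.

```java
// Hypothetical sketch; none of these names are Tez APIs.
public final class SpillPolicy {
    private final long memoryThresholdBytes; // existing in-memory merge threshold
    private final int maxSegments;           // effective cap on retained segments

    /**
     * @param memoryThresholdBytes     spill once in-memory bytes exceed this
     * @param configuredMaxSegments    configurable upper limit on segment count
     * @param shuffleMemoryBytes       memory allocated to this shuffle (not the JVM size)
     * @param segmentBudgetFraction    small fraction of shuffle memory budgeted for segments
     * @param estimatedBytesPerSegment rough estimate of the size of one segment
     */
    public SpillPolicy(long memoryThresholdBytes, int configuredMaxSegments,
                       long shuffleMemoryBytes, double segmentBudgetFraction,
                       long estimatedBytesPerSegment) {
        if (configuredMaxSegments < 1 || estimatedBytesPerSegment < 1) {
            throw new IllegalArgumentException("limits must be positive");
        }
        this.memoryThresholdBytes = memoryThresholdBytes;
        // Cap derived from the rough segment-size estimate (assumed formula).
        long budget = (long) (shuffleMemoryBytes * segmentBudgetFraction);
        long derived = Math.max(1L, budget / estimatedBytesPerSegment);
        // The configured value only acts as an upper limit on the derived cap.
        this.maxSegments = (int) Math.min((long) configuredMaxSegments, derived);
    }

    public int maxSegments() {
        return maxSegments;
    }

    /** Spill if 1) the memory threshold is exceeded, or 2) the segment limit is reached. */
    public boolean shouldSpill(long bytesInMemory, int segmentsInMemory) {
        return bytesInMemory > memoryThresholdBytes || segmentsInMemory >= maxSegments;
    }
}
```

With many small map outputs the byte threshold may never be crossed, so in
this sketch the segment-count condition is the one that triggers the spill.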
> Tez MergeManager OOM for small Map Outputs
> ------------------------------------------
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Saikat
> Assignee: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>