[ 
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903787#comment-14903787
 ] 

Siddharth Seth commented on TEZ-2850:
-------------------------------------

Nice find!

I believe this change was made to reduce the number of times the checksum is 
computed, and to try to compute it in chunks of 4096 bytes for better 
performance. cc [~gopalv]

Other than the 4K buffer, there are a bunch of other objects, references, etc. 
per Segment - I wouldn't be surprised if this adds up to a KB. Along with the 
memory spill limit, adding a limit on the number of in-memory segments would help.
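
A rough sketch of what such a combined limit could look like, in case it helps 
the discussion. This is not the actual MergeManager code: the class name, the 
method names and the per-segment overhead constant (the 4K buffer plus roughly 
1 KB of objects and references) are all assumptions made for illustration.

```java
// Hypothetical sketch only; not taken from the Tez code base.
public class InMemorySegmentBudget {
    // Assumed fixed cost per segment: 4 KB checksum buffer + ~1 KB of objects/references.
    static final long ASSUMED_OVERHEAD_BYTES = 4096 + 1024;

    private final long maxBytes;    // memory limit for in-memory segments
    private final int maxSegments;  // proposed cap on the number of in-memory segments
    private long usedBytes = 0;
    private int segments = 0;

    public InMemorySegmentBudget(long maxBytes, int maxSegments) {
        this.maxBytes = maxBytes;
        this.maxSegments = maxSegments;
    }

    /**
     * Returns true if the segment may be kept in memory. A false return means
     * the caller should merge or spill before accepting more segments.
     */
    public synchronized boolean tryReserve(long payloadBytes) {
        long cost = payloadBytes + ASSUMED_OVERHEAD_BYTES;
        if (segments + 1 > maxSegments || usedBytes + cost > maxBytes) {
            return false;
        }
        usedBytes += cost;
        segments++;
        return true;
    }

    /** Gives back the budget of a segment that was merged or spilled. */
    public synchronized void release(long payloadBytes) {
        usedBytes -= payloadBytes + ASSUMED_OVERHEAD_BYTES;
        segments--;
    }

    public synchronized long usedBytes() { return usedBytes; }

    public synchronized int segmentCount() { return segments; }
}
```

With these assumed numbers, a 10-byte map output is charged 5130 bytes, so for 
small outputs the fixed overhead dominates the accounting and the segment cap 
would usually be the limit that triggers first.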

The memory-to-memory merger would normally have helped in this case, but that's 
not tested and should not be enabled.

> Tez MergeManager OOM for small Map Outputs
> ------------------------------------------
>
>                 Key: TEZ-2850
>                 URL: https://issues.apache.org/jira/browse/TEZ-2850
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Saikat
>         Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
