[ https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903787#comment-14903787 ]
Siddharth Seth commented on TEZ-2850:
-------------------------------------
Nice find!
I believe this change was made to reduce the number of times the checksum is
computed, and to compute it in chunks of 4096 bytes for better performance.
cc [~gopalv]
Other than the 4K buffer, there are a bunch of other objects, references, etc.
per Segment; I wouldn't be surprised if these add up to a KB. Along with the
memory spill limit, adding a limit on the number of in-memory segments would help.
The memory-to-memory merger would normally have helped in this case, but that's
not tested and should not be enabled.
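
To put rough numbers on the per-Segment overhead described above, here is a
minimal sketch. It is not Tez code: the class, the method names and the cap
value are hypothetical, the 1 KB figure is the guess from this comment rather
than a measurement, and it assumes the memory spill limit only accounts for
payload bytes.

```java
// Illustrative sketch only; none of these names exist in Tez.
class SegmentOverheadSketch {
    // 4096-byte checksum buffer per segment, as mentioned in the comment.
    static final long CHECKSUM_BUFFER_BYTES = 4096;
    // Other per-segment objects and references: a rough guess, not measured.
    static final long OTHER_OVERHEAD_BYTES = 1024;

    /** Fixed overhead for a number of in-memory segments, ignoring payload. */
    static long overheadBytes(long segmentCount) {
        return segmentCount * (CHECKSUM_BUFFER_BYTES + OTHER_OVERHEAD_BYTES);
    }

    /**
     * Hypothetical trigger: start a merge/spill when either the accounted
     * payload bytes or the number of in-memory segments reaches its limit.
     */
    static boolean shouldSpill(long accountedBytes, long segmentCount,
                               long memoryLimitBytes, long maxSegments) {
        return accountedBytes >= memoryLimitBytes || segmentCount >= maxSegments;
    }

    public static void main(String[] args) {
        long segments = 100_000;          // many tiny map outputs
        long payloadBytes = segments * 10; // 10 bytes of data each (assumed)

        System.out.println("payload bytes:  " + payloadBytes);
        System.out.println("overhead bytes: " + overheadBytes(segments));

        // A byte limit alone never fires here; a segment cap does.
        System.out.println(shouldSpill(payloadBytes, segments, 100_000_000L, Long.MAX_VALUE));
        System.out.println(shouldSpill(payloadBytes, segments, 100_000_000L, 10_000));
    }
}
```

With these assumed figures, 100,000 tiny segments carry about 1 MB of payload
but roughly 512 MB of fixed overhead, which is why a cap on the segment count
helps where a byte-based limit does not.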
> Tez MergeManager OOM for small Map Outputs
> ------------------------------------------
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png, TEZ-2850_test.patch
>
>