[
https://issues.apache.org/jira/browse/TEZ-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903481#comment-14903481
]
Saikat edited comment on TEZ-2850 at 9/22/15 9:35 PM:
------------------------------------------------------
This is an unusual scenario that we hit while running a Tez job.
A reducer vertex task fetches around 200000 map outputs, each of roughly 100
bytes.
So the total map output size is around 200000 * 100 ~ 20 MB.
The MergeManager has a merge threshold check: once the committed memory
crosses this threshold, the InMemoryMerger is triggered, and it merges and
spills the in-memory fetched map outputs to disk to free up memory.
In our scenario, mergeThreshold (~500 MB) >> commitMemory (~20 MB), so the
in-memory merger never gets triggered.
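As a rough illustration of why the check never fires here, below is a minimal
sketch, not the actual MergeManager code. The class name, the method
shouldStartMerge and the exact threshold value of 500 MB are assumptions made
for the example; only the numbers 200000 and ~100 bytes come from our job.

```java
public class MergeThresholdSketch {

    // Hypothetical helper: an in-memory merge starts only once the committed
    // memory reaches the merge threshold.
    static boolean shouldStartMerge(long commitMemory, long mergeThreshold) {
        return commitMemory >= mergeThreshold;
    }

    public static void main(String[] args) {
        long numMapOutputs = 200_000L;   // map outputs fetched by the reducer
        long bytesPerOutput = 100L;      // approximate payload per map output
        long commitMemory = numMapOutputs * bytesPerOutput;
        long mergeThreshold = 500L * 1024 * 1024; // assumed ~500 MB threshold

        System.out.println("commitMemory   = " + commitMemory + " bytes");
        System.out.println("mergeThreshold = " + mergeThreshold + " bytes");
        System.out.println("merge started  = "
                + shouldStartMerge(commitMemory, mergeThreshold));
    }
}
```

With these numbers the payload is far below the threshold, so nothing is
merged or spilled before close().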
Finally, when finalMerge() is called in close(), MergeManager calls
createInMemorySegments() to do the final merge.
At this point, when Tez creates an IFileInputStream object for the
InMemoryReader, the IFileInputStream allocates a buffer of 4096 bytes
(hard coded).
Thus the total size of a single in-memory segment comes to around 5 KB, even
though the data in this segment is only on the order of 100 bytes. So, for
200000 map outputs, the total size is 200000 * 5000 ~ 1 GB, which causes an
OOM.
Attached is a snapshot of the heap dump which shows this scenario.
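The arithmetic above can be sketched as follows. This is only a
back-of-the-envelope estimate, not Tez code: the class and method names are
hypothetical, and the ~5000 bytes per segment is the approximate figure we saw
in the heap dump, not an exact accounting of every object involved.

```java
public class SegmentFootprintSketch {

    // Buffer size allocated per IFileInputStream, as described above.
    static final long READ_BUFFER_BYTES = 4096L;

    // Hypothetical estimate: total bytes held when every segment carries a
    // fixed per-segment cost, regardless of how small its payload is.
    static long totalBytes(long segments, long bytesPerSegment) {
        return segments * bytesPerSegment;
    }

    public static void main(String[] args) {
        long segments = 200_000L;
        long payloadBytes = 100L;

        // Payload alone: about 20 MB.
        System.out.println("payload only     = "
                + totalBytes(segments, payloadBytes));
        // Payload plus the fixed read buffer: a lower bound on the footprint.
        System.out.println("payload + buffer = "
                + totalBytes(segments, payloadBytes + READ_BUFFER_BYTES));
        // Using the ~5000 bytes per segment observed in the heap dump: ~1 GB.
        System.out.println("observed ~5 KB   = "
                + totalBytes(segments, 5000L));
    }
}
```

The fixed buffer dominates: the footprint grows with the number of segments,
not with the amount of data they hold.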
> Tez MergeManager OOM for small Map Outputs
> ------------------------------------------
>
> Key: TEZ-2850
> URL: https://issues.apache.org/jira/browse/TEZ-2850
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Saikat
> Attachments: OOM_1.png, OOM_2.png, OOM_3.png
>
>