[ 
https://issues.apache.org/jira/browse/TEZ-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172227#comment-17172227
 ] 

Rajesh Balamohan commented on TEZ-4211:
---------------------------------------

Thanks for the note, [~abstractdog]. I had an offline discussion with [~sseth] as 
well. We flush the in-memory segments to make room for reduce tasks, which may 
need further memory. Not releasing the in-memory segments (i.e., avoiding disk 
spills) could trigger OOM in certain jobs. It would be easier to tune 
"tez.runtime.task.input.post-merge.buffer.percent" to a 
higher value to attain the same effect as this patch (that is, it would try to 
retain the in-memory segments). This can be done on a case-by-case basis, 
depending on the deployment.
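
To illustrate the trade-off, here is a rough, self-contained sketch of how a post-merge memory budget could decide how much in-memory data gets flushed. This is not the MergeManager code: the class and the helpers postMergeLimit and bytesToSpill are hypothetical names, and the assumption that the budget is simply "task memory multiplied by the configured fraction" is a simplification made for illustration only.

```java
import java.util.List;

// Illustrative sketch only; names and logic are assumptions, not Tez code.
final class PostMergeBudgetSketch {

    // Hypothetical helper: bytes allowed to stay in memory after the final
    // merge, assuming the budget is a plain fraction of the task memory
    // (the fraction standing in for
    // tez.runtime.task.input.post-merge.buffer.percent).
    static long postMergeLimit(long totalTaskMemory, double bufferPercent) {
        if (bufferPercent < 0.0 || bufferPercent > 1.0) {
            throw new IllegalArgumentException("bufferPercent must be in [0, 1]");
        }
        return (long) (totalTaskMemory * bufferPercent);
    }

    // Hypothetical helper: flush segments in order until the bytes that
    // remain in memory fit within the limit; returns the bytes flushed.
    static long bytesToSpill(List<Long> segmentSizes, long limit) {
        long inMemory = 0;
        for (long size : segmentSizes) {
            inMemory += size;
        }
        long spilled = 0;
        for (long size : segmentSizes) {
            if (inMemory - spilled <= limit) {
                break; // the rest can be retained in memory
            }
            spilled += size;
        }
        return spilled;
    }

    public static void main(String[] args) {
        List<Long> segments = List.of(300L, 300L, 400L);
        for (double percent : new double[] {0.0, 0.5, 1.0}) {
            long limit = postMergeLimit(1000L, percent);
            System.out.println("percent=" + percent
                + " limit=" + limit
                + " spilled=" + bytesToSpill(segments, limit));
        }
    }
}
```

In this toy model a fraction of 0.0 flushes everything, while a higher fraction leaves more of the segments in memory, at the cost of leaving less headroom for the reduce side.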

> Optimise MergeManager final merge
> ---------------------------------
>
>                 Key: TEZ-4211
>                 URL: https://issues.apache.org/jira/browse/TEZ-4211
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Priority: Major
>         Attachments: TEZ-4211.2.patch, TEZ-4211.wip.patch
>
>
> There are cases where all of the data is held in memory and no disk segments 
> are present in MergeManager. Currently, MergeManager spills the in-memory 
> segments to disk before proceeding.
>  
> [https://github.com/apache/tez/blob/master/tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/shuffle/orderedgrouped/MergeManager.java#L1184]
>  
> {code:java}
> if (numMemDiskSegments > 0 && ioSortFactor > onDiskMapOutputs.size()) {
> ...
> ..
> TezMerger.writeFile(rIter, writer, progressable, 
> TezRuntimeConfiguration.TEZ_RUNTIME_RECORDS_BEFORE_PROGRESS_DEFAULT);
> ...
> ..
>  {code}
> This can be optimised to avoid spilling to disk when only in-memory segments 
> are present.
> Snippet from the logs of one of the apps (Q78):
> {noformat}
>  [ShuffleAndMergeRunner {Map_1} ()] 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: 
> finalMerge with #inMemoryOutputs=4112, size=839646500 and #onDiskOutputs=0, 
> size=0
>  [ShuffleAndMergeRunner {Map_1} ()] 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: 
> finalMerge with #inMemoryOutputs=4112, size=859378362 and #onDiskOutputs=0, 
> size=0
>  [ShuffleAndMergeRunner {Map_1} ()] 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: 
> finalMerge with #inMemoryOutputs=4112, size=856145179 and #onDiskOutputs=0, 
> size=0
>  [ShuffleAndMergeRunner {Map_1} ()] 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: 
> finalMerge with #inMemoryOutputs=4112, size=849878734 and #onDiskOutputs=0, 
> size=0
>  [ShuffleAndMergeRunner {Map_1} ()] 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: 
> finalMerge with #inMemoryOutputs=4112, size=842666749 and #onDiskOutputs=0, 
> size=0
>  [ShuffleAndMergeRunner {Map_1} ()] 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: 
> finalMerge with #inMemoryOutputs=4112, size=839533127 and #onDiskOutputs=0, 
> size=0
>  [ShuffleAndMergeRunner {Map_1} ()] 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: 
> finalMerge with #inMemoryOutputs=4112, size=860448335 and #onDiskOutputs=0, 
> size=0
>  [ShuffleAndMergeRunner {Map_1} ()] 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: 
> finalMerge with #inMemoryOutputs=4112, size=844468505 and #onDiskOutputs=0, 
> size=0
>  [ShuffleAndMergeRunner {Map_1} ()] 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: 
> finalMerge with #inMemoryOutputs=4112, size=850099810 and #onDiskOutputs=0, 
> size=0
>  [ShuffleAndMergeRunner {Map_1} ()] 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: 
> finalMerge with #inMemoryOutputs=4112, size=849206236 and #onDiskOutputs=0, 
> size=0
>  [ShuffleAndMergeRunner {Map_1} ()] 
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager: 
> finalMerge with #inMemoryOutputs=4112, size=840238680 and #onDiskOutputs=0, 
> size=0
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)