[ 
https://issues.apache.org/jira/browse/TEZ-3103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated TEZ-3103:
----------------------------
    Attachment: TEZ-3103.001.patch

Attaching a patch that bumps the commitMemory accordingly when the mem-to-mem 
merge target segment is closed.

> Shuffle can hang when memory to memory merging enabled
> ------------------------------------------------------
>
>                 Key: TEZ-3103
>                 URL: https://issues.apache.org/jira/browse/TEZ-3103
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: TEZ-3103.001.patch
>
>
> The shuffle process can hang when memory-to-memory merging is enabled.  As 
> the memory-to-memory merge progresses, it closes out the input segments, which 
> in turn lowers the commitMemory associated with those segments.  However, when 
> the merge completes, it fails to increase the commitMemory accordingly for the 
> resulting merged segment.  This effectively "leaks" shuffle memory, and we 
> can end up in a situation where there is insufficient memory to perform any 
> more in-memory shuffles but commitMemory is too low to trigger a merge.  All 
> the fetcher threads eventually end up waiting on a merge that will never 
> occur, and the shuffle hangs.
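
For illustration, here is a rough sketch of the accounting problem described above. It is a toy model, not the Tez merge manager code and not the attached patch: the class name, the method names, the single merge threshold, and the assumption that the merged segment is the same size as its inputs are all hypothetical.

```java
// Toy model of the commitMemory accounting described in this issue.
// All names here are hypothetical; this is not the Tez implementation.
class CommitMemorySketch {
    private final long mergeThreshold; // commitMemory level that would trigger a merge
    private long usedMemory;           // bytes held by in-memory segments
    private long commitMemory;         // bytes in closed segments that count toward a merge

    CommitMemorySketch(long mergeThreshold) {
        this.mergeThreshold = mergeThreshold;
    }

    // A fetcher finished copying a segment into memory and closed it.
    void closeFetchedSegment(long bytes) {
        usedMemory += bytes;
        commitMemory += bytes;
    }

    // A memory-to-memory merge consumed inputBytes of closed segments and
    // produced one merged segment (assumed here to be the same size).
    void memToMemMerge(long inputBytes, boolean applyFix) {
        // Closing the input segments releases their memory and lowers commitMemory.
        usedMemory -= inputBytes;
        commitMemory -= inputBytes;

        // The merged segment still occupies shuffle memory.
        usedMemory += inputBytes;

        // The reported bug: without this step the merged segment is never
        // counted in commitMemory, so that memory is effectively "leaked".
        if (applyFix) {
            commitMemory += inputBytes;
        }
    }

    boolean wouldTriggerMerge() {
        return commitMemory >= mergeThreshold;
    }

    long getUsedMemory() {
        return usedMemory;
    }

    long getCommitMemory() {
        return commitMemory;
    }

    public static void main(String[] args) {
        for (boolean applyFix : new boolean[] {false, true}) {
            CommitMemorySketch m = new CommitMemorySketch(100);
            m.closeFetchedSegment(60);
            m.closeFetchedSegment(60);
            m.memToMemMerge(120, applyFix);
            System.out.println("applyFix=" + applyFix
                + " used=" + m.getUsedMemory()
                + " commit=" + m.getCommitMemory()
                + " wouldTriggerMerge=" + m.wouldTriggerMerge());
        }
    }
}
```

In the unfixed case the model still holds all of the merged segment's memory while commitMemory has dropped to zero, which is the state where fetchers would wait on a merge that never gets triggered.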



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
