[ 
https://issues.apache.org/jira/browse/TEZ-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-3297:
----------------------------------
    Attachment: TEZ-3297.2.branch-0.7.patch

Thanks a lot [~sseth], [~bikassaha]. Will commit it shortly.

[~jeagles] - Attaching the patch for branch-0.7 as well. Will commit it to 
branch-0.7 too.

> Deadlock scenario in AM during ShuffleVertexManager auto reduce
> ---------------------------------------------------------------
>
>                 Key: TEZ-3297
>                 URL: https://issues.apache.org/jira/browse/TEZ-3297
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Zhiyuan Yang
>            Priority: Critical
>         Attachments: TEZ-3297.1.patch, TEZ-3297.2.branch-0.7.patch, 
> TEZ-3297.2.patch, am_log, thread_dump
>
>
> Here is what's happening in the attached thread dump.
> App Pool thread #9 does the auto reduce on V2 and initializes the new edge 
> manager. It holds the V2 write lock and wants the read lock of source vertex V1. 
> At the same time, another App Pool thread, #2, schedules a task of V1 and gets 
> the output spec, so it holds the V1 read lock and wants the V2 read lock. 
> Also, the dispatcher thread wants the V1 write lock to begin the state machine 
> transition. Since the dispatcher thread is at the head of the V1 ReadWriteLock 
> queue, thread #9 cannot get the V1 read lock even though thread #2 is holding 
> the V1 read lock. 
> This is a circular lock scenario: #2 blocks the dispatcher, the dispatcher 
> blocks #9, and #9 blocks #2.
> There is no problem with ReadWriteLock behavior in this case. Please see this 
> Java bug report: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6816565.
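
For readers who want to see the ReadWriteLock behavior described above in 
isolation, here is a minimal sketch. It is not Tez code: the class, method, and 
variable names are hypothetical, and it assumes the vertex lock behaves like a 
default (non-fair) java.util.concurrent.locks.ReentrantReadWriteLock. One thread 
holds the read lock, a writer queues up behind it, and a third thread then asks 
for the read lock with a timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not part of Tez.
class QueuedWriterBlocksReader {

    /** Returns true if a new reader got the read lock while a writer was queued. */
    static boolean newReaderAcquiresWhileWriterQueued() {
        // Stand-in for the V1 vertex lock; the default constructor is non-fair.
        ReentrantReadWriteLock v1Lock = new ReentrantReadWriteLock();
        CountDownLatch readHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Plays the role of thread #2: holds the read lock until released.
        Thread holder = new Thread(() -> {
            v1Lock.readLock().lock();
            try {
                readHeld.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                v1Lock.readLock().unlock();
            }
        });

        // Plays the role of the dispatcher: waits in the queue for the write lock.
        Thread writer = new Thread(() -> {
            v1Lock.writeLock().lock();
            v1Lock.writeLock().unlock();
        });

        try {
            holder.start();
            readHeld.await();
            writer.start();
            // Wait until the writer is actually parked in the lock's queue.
            while (!v1Lock.hasQueuedThread(writer)) {
                Thread.sleep(10);
            }

            // Plays the role of thread #9: a new reader arriving behind the writer.
            // The timed tryLock is used because the untimed tryLock() barges
            // past queued threads.
            boolean acquired = v1Lock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                v1Lock.readLock().unlock();
            }

            release.countDown();
            holder.join();
            writer.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("new reader acquired: "
                + newReaderAcquiresWhileWriterQueued());
    }
}
```

On a stock JDK this is expected to report that the new reader did not acquire 
the lock: the read lock is only held in shared mode, but the new reader still 
waits behind the queued writer. In the scenario above the holder never releases, 
because it is itself waiting on a lock held by the blocked reader, which closes 
the cycle.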



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
