[ https://issues.apache.org/jira/browse/TEZ-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang resolved TEZ-2359.
-----------------------------
    Resolution: Invalid

My mistake: this issue shows up in my work on TEZ-1273, not on master.

> Deadlock in DAGAppMaster
> ------------------------
>
>                 Key: TEZ-2359
>                 URL: https://issues.apache.org/jira/browse/TEZ-2359
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jeff Zhang
>            Priority: Blocker
>
> {code}
> Found one Java-level deadlock:
> =============================
> "Timer-1":
>   waiting for ownable synchronizer 0x00000007cd0f8a30, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "Dispatcher thread: Central"
> "Dispatcher thread: Central":
>   waiting to lock monitor 0x00007fb829866d18 (object 0x00000007cd5ab958, a org.apache.tez.dag.app.rm.YarnTaskSchedulerService),
>   which is held by "DelayedContainerManager"
> "DelayedContainerManager":
>   waiting for ownable synchronizer 0x00000007cd0f8a30, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
>   which is held by "Dispatcher thread: Central"
> Java stack information for the threads listed above:
> ===================================================
> "Timer-1":
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00000007cd0f8a30> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
>       at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
>       at org.apache.tez.dag.app.DAGAppMaster.checkAndHandleSessionTimeout(DAGAppMaster.java:2015)
>       - locked <0x00000007cd0f2ff0> (a org.apache.tez.dag.app.DAGAppMaster)
>       at org.apache.tez.dag.app.DAGAppMaster$3.run(DAGAppMaster.java:1825)
>       at java.util.TimerThread.mainLoop(Timer.java:555)
>       at java.util.TimerThread.run(Timer.java:505)
> "Dispatcher thread: Central":
>       at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.dagComplete(YarnTaskSchedulerService.java:842)
>       - waiting to lock <0x00000007cd5ab958> (a org.apache.tez.dag.app.rm.YarnTaskSchedulerService)
>       at org.apache.tez.dag.app.rm.TaskSchedulerEventHandler.dagCompleted(TaskSchedulerEventHandler.java:566)
>       at org.apache.tez.dag.app.DAGAppMaster.checkForCompletion(DAGAppMaster.java:832)
>       at org.apache.tez.dag.app.DAGAppMaster.access$4800(DAGAppMaster.java:201)
>       at org.apache.tez.dag.app.DAGAppMaster$DAGFinishedTransition.transition(DAGAppMaster.java:2362)
>       at org.apache.tez.dag.app.DAGAppMaster$DAGFinishedTransition.transition(DAGAppMaster.java:2356)
>       at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>       at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>       at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>       at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>       - locked <0x00000007cd1d0208> (a org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine)
>       at org.apache.tez.dag.app.DAGAppMaster.handle(DAGAppMaster.java:510)
>       at org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterEventHandler.handle(DAGAppMaster.java:879)
>       at org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterEventHandler.handle(DAGAppMaster.java:868)
>       at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
>       at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:113)
>       at java.lang.Thread.run(Thread.java:745)
> "DelayedContainerManager":
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x00000007cd0f8a30> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
>       at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
>       at org.apache.tez.dag.app.DAGAppMaster.getState(DAGAppMaster.java:531)
>       at org.apache.tez.dag.app.DAGAppMaster$RunningAppContext.getAMState(DAGAppMaster.java:1522)
>       at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignDelayedContainer(YarnTaskSchedulerService.java:585)
>       - locked <0x00000007cd5ab958> (a org.apache.tez.dag.app.rm.YarnTaskSchedulerService)
>       at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$600(YarnTaskSchedulerService.java:82)
>       at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.run(YarnTaskSchedulerService.java:1877)
>       - locked <0x00000007cd5ab958> (a org.apache.tez.dag.app.rm.YarnTaskSchedulerService)
> Found 1 deadlock.
> {code}
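
Reading the dump above: "Dispatcher thread: Central" owns the ReentrantReadWriteLock and waits for the YarnTaskSchedulerService monitor, while "DelayedContainerManager" holds that monitor and waits for the read lock. "Timer-1" is queued behind the same lock but is not part of the cycle. Below is a minimal Java sketch of that lock-order inversion. It is not the Tez code: the class DeadlockSketch, its fields, and the thread names are hypothetical stand-ins. It reports the cycle with ThreadMXBean.findDeadlockedThreads(), assuming the JVM supports monitoring of ownable synchronizers.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class DeadlockSketch {
    // Hypothetical stand-ins for the AM state lock and the scheduler monitor.
    private final ReentrantReadWriteLock stateLock = new ReentrantReadWriteLock();
    private final Object scheduler = new Object();
    // Ensures each thread holds its first lock before either asks for the second.
    private final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

    private void awaitBoth() {
        bothHoldFirstLock.countDown();
        try {
            bothHoldFirstLock.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /**
     * Starts two daemon threads that take the locks in opposite order and
     * returns the ids of the deadlocked threads, or null if none were found
     * within the polling window.
     */
    public long[] reproduce() throws InterruptedException {
        Thread dispatcher = new Thread(() -> {
            stateLock.writeLock().lock();      // holds the write lock ...
            awaitBoth();
            synchronized (scheduler) { }       // ... then wants the monitor
        }, "dispatcher");
        Thread delayed = new Thread(() -> {
            synchronized (scheduler) {         // holds the monitor ...
                awaitBoth();
                stateLock.readLock().lock();   // ... then wants the read lock
            }
        }, "delayed");
        // Daemon threads, so the stuck pair does not keep the JVM alive.
        dispatcher.setDaemon(true);
        delayed.setDaemon(true);
        dispatcher.start();
        delayed.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 100; i++) {        // poll for up to about 5 seconds
            long[] ids = mx.findDeadlockedThreads();
            if (ids != null) {
                return ids;
            }
            Thread.sleep(50);
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] ids = new DeadlockSketch().reproduce();
        System.out.println(ids == null
                ? "no deadlock detected"
                : ids.length + " threads deadlocked");
    }
}
```

In a sketch like this, the cycle goes away if both threads take the two locks in the same order, or if the state is read before entering the synchronized block. Whether either approach fits the TEZ-1273 work is not something the dump alone can tell.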



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
