Jeff Zhang created TEZ-2359:
-------------------------------

             Summary: Deadlock in DAGAppMaster
                 Key: TEZ-2359
                 URL: https://issues.apache.org/jira/browse/TEZ-2359
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Jeff Zhang


{code}
Found one Java-level deadlock:
=============================
"Timer-1":
  waiting for ownable synchronizer 0x00000007cd0f8a30, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
  which is held by "Dispatcher thread: Central"
"Dispatcher thread: Central":
  waiting to lock monitor 0x00007fb829866d18 (object 0x00000007cd5ab958, a org.apache.tez.dag.app.rm.YarnTaskSchedulerService),
  which is held by "DelayedContainerManager"
"DelayedContainerManager":
  waiting for ownable synchronizer 0x00000007cd0f8a30, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
  which is held by "Dispatcher thread: Central"

Java stack information for the threads listed above:
===================================================
"Timer-1":
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007cd0f8a30> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
        at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
        at org.apache.tez.dag.app.DAGAppMaster.checkAndHandleSessionTimeout(DAGAppMaster.java:2015)
        - locked <0x00000007cd0f2ff0> (a org.apache.tez.dag.app.DAGAppMaster)
        at org.apache.tez.dag.app.DAGAppMaster$3.run(DAGAppMaster.java:1825)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)
"Dispatcher thread: Central":
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.dagComplete(YarnTaskSchedulerService.java:842)
        - waiting to lock <0x00000007cd5ab958> (a org.apache.tez.dag.app.rm.YarnTaskSchedulerService)
        at org.apache.tez.dag.app.rm.TaskSchedulerEventHandler.dagCompleted(TaskSchedulerEventHandler.java:566)
        at org.apache.tez.dag.app.DAGAppMaster.checkForCompletion(DAGAppMaster.java:832)
        at org.apache.tez.dag.app.DAGAppMaster.access$4800(DAGAppMaster.java:201)
        at org.apache.tez.dag.app.DAGAppMaster$DAGFinishedTransition.transition(DAGAppMaster.java:2362)
        at org.apache.tez.dag.app.DAGAppMaster$DAGFinishedTransition.transition(DAGAppMaster.java:2356)
        at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        - locked <0x00000007cd1d0208> (a org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine)
        at org.apache.tez.dag.app.DAGAppMaster.handle(DAGAppMaster.java:510)
        at org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterEventHandler.handle(DAGAppMaster.java:879)
        at org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterEventHandler.handle(DAGAppMaster.java:868)
        at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
        at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:113)
        at java.lang.Thread.run(Thread.java:745)
"DelayedContainerManager":
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007cd0f8a30> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
        at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
        at org.apache.tez.dag.app.DAGAppMaster.getState(DAGAppMaster.java:531)
        at org.apache.tez.dag.app.DAGAppMaster$RunningAppContext.getAMState(DAGAppMaster.java:1522)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignDelayedContainer(YarnTaskSchedulerService.java:585)
        - locked <0x00000007cd5ab958> (a org.apache.tez.dag.app.rm.YarnTaskSchedulerService)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$600(YarnTaskSchedulerService.java:82)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.run(YarnTaskSchedulerService.java:1877)
        - locked <0x00000007cd5ab958> (a org.apache.tez.dag.app.rm.YarnTaskSchedulerService)

Found 1 deadlock.
{code}
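
The dump above reads as a lock-order inversion between two threads, with "Timer-1" as a bystander queued behind the same lock. "Dispatcher thread: Central" appears to hold the DAGAppMaster ReentrantReadWriteLock in write mode while waiting for the YarnTaskSchedulerService monitor in dagComplete(). "DelayedContainerManager" holds that monitor while waiting for the read lock in getState(). A minimal sketch of the same pattern follows. It is not Tez code: the class LockOrderSketch, the fields stateLock and schedulerMonitor, and the method readLockTimesOut are hypothetical names, and the timed tryLock is there only so the sketch terminates instead of hanging. It is not a proposed fix.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the two locks seen in the dump; not Tez code.
class LockOrderSketch {
    // Plays the role of the ReentrantReadWriteLock guarding DAGAppMaster state.
    private final ReentrantReadWriteLock stateLock = new ReentrantReadWriteLock();
    // Plays the role of the YarnTaskSchedulerService monitor.
    private final Object schedulerMonitor = new Object();

    /**
     * Runs the two threads in the inverted order from the dump.
     * Returns true if the reader could not get the read lock within timeoutMs
     * while it was holding the monitor.
     */
    public boolean readLockTimesOut(long timeoutMs) {
        CountDownLatch writeLockHeld = new CountDownLatch(1);
        CountDownLatch monitorHeld = new CountDownLatch(1);
        AtomicBoolean timedOut = new AtomicBoolean(false);

        // Like the dispatcher: takes the write lock first, then wants the monitor.
        Thread dispatcher = new Thread(() -> {
            stateLock.writeLock().lock();
            try {
                writeLockHeld.countDown();
                monitorHeld.await();
                synchronized (schedulerMonitor) {
                    // dagComplete()-style work would happen here.
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                stateLock.writeLock().unlock();
            }
        }, "dispatcher-sketch");

        // Like DelayedContainerManager: takes the monitor first, then wants the read lock.
        Thread delayed = new Thread(() -> {
            try {
                writeLockHeld.await();
                synchronized (schedulerMonitor) {
                    monitorHeld.countDown();
                    // A plain lock() here would block forever; the timeout keeps the sketch finite.
                    if (stateLock.readLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                        stateLock.readLock().unlock();
                    } else {
                        timedOut.set(true);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "delayed-sketch");

        dispatcher.start();
        delayed.start();
        try {
            dispatcher.join();
            delayed.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return timedOut.get();
    }

    public static void main(String[] args) {
        // prints "true"
        System.out.println(new LockOrderSketch().readLockTimesOut(200));
    }
}
```

The writer cannot release stateLock until it enters the monitor, and the reader does not leave the monitor until tryLock returns, so the timed read-lock attempt always fails. With an untimed lock() in place of tryLock, the two threads would wait on each other indefinitely, matching the cycle jstack reports.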


