Alexey Ozeritskiy created KAFKA-2133:
----------------------------------------

             Summary: Deadlock in DeleteTopicsThread
                 Key: KAFKA-2133
                 URL: https://issues.apache.org/jira/browse/KAFKA-2133
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.8.2.1
            Reporter: Alexey Ozeritskiy
            Priority: Critical


Controller hangs after deleting multiple topics.

jstack:
1. delete-topics-thread holds controllerLock and is blocked putting a request into the blocking queue:
{code}
"delete-topics-thread-2" prio=10 tid=0x00007f3a8d4e4000 nid=0x6924 waiting on condition [0x00007f3507684000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000047196e738> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:349)
        at kafka.controller.ControllerChannelManager.sendRequest(ControllerChannelManager.scala:57)
        - locked <0x000000045eab3078> (a java.lang.Object)
        at kafka.controller.KafkaController.sendRequest(KafkaController.scala:668)
        at kafka.controller.ControllerBrokerRequestBatch$$anonfun$sendRequestsToBrokers$2.apply(ControllerChannelManager.scala:299)
        at kafka.controller.ControllerBrokerRequestBatch$$anonfun$sendRequestsToBrokers$2.apply(ControllerChannelManager.scala:291)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
        at kafka.controller.ControllerBrokerRequestBatch.sendRequestsToBrokers(ControllerChannelManager.scala:291)
        at kafka.controller.KafkaController.sendUpdateMetadataRequest(KafkaController.scala:976)
        at kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$onTopicDeletion(TopicDeletionManager.scala:303)
        at kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1$$anonfun$apply$mcV$sp$4.apply(TopicDeletionManager.scala:424)
        at kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1$$anonfun$apply$mcV$sp$4.apply(TopicDeletionManager.scala:396)
        at scala.collection.immutable.HashSet$HashSet1.foreach(HashSet.scala:153)
        at scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:306)
        at kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply$mcV$sp(TopicDeletionManager.scala:396)
        at kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply(TopicDeletionManager.scala:390)
        at kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply(TopicDeletionManager.scala:390)
        at kafka.utils.Utils$.inLock(Utils.scala:535)
        at kafka.controller.TopicDeletionManager$DeleteTopicsThread.doWork(TopicDeletionManager.scala:390)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
{code}

2. The Controller-2-to-broker send thread is waiting for controllerLock, so it cannot take messages from the blocking queue:
{code}
"Controller-2-to-broker-3-send-thread" prio=10 tid=0x00007f3a8d4a3000 nid=0x64d1 waiting on condition [0x00007f3507786000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000468babde8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
        at kafka.utils.Utils$.inLock(Utils.scala:533)
        at kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$deleteTopicStopReplicaCallback(TopicDeletionManager.scala:371)
        at kafka.controller.TopicDeletionManager$$anonfun$startReplicaDeletion$2$$anonfun$apply$3.apply(TopicDeletionManager.scala:338)
        at kafka.controller.TopicDeletionManager$$anonfun$startReplicaDeletion$2$$anonfun$apply$3.apply(TopicDeletionManager.scala:338)
        at kafka.controller.ControllerBrokerRequestBatch$$anonfun$addStopReplicaRequestForBrokers$2$$anonfun$apply$mcVI$sp$2.apply(ControllerChannelManager.scala:229)
        at kafka.controller.ControllerBrokerRequestBatch$$anonfun$addStopReplicaRequestForBrokers$2$$anonfun$apply$mcVI$sp$2.apply(ControllerChannelManager.scala:229)
        at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:160)
        - locked <0x000000045ea2fec8> (a java.lang.Object)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
{code}
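
The interaction between the two threads above can be reproduced outside Kafka with a small sketch. This is not Kafka code: the class name, the request strings, and the queue capacity of 1 are all hypothetical, chosen only to mirror the pattern visible in the two traces (a producer that enqueues while holding a lock, and a consumer whose callback needs the same lock before it can take the next item). A timed offer() stands in for the put() seen in the trace so that the demo terminates instead of hanging. The holdLock=false case is only a control to show that the lock is what causes the stall; it is not a proposed patch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone sketch of the lock/queue cycle; not Kafka code.
class ControllerDeadlockSketch {

    /**
     * Returns true if the producer could not enqueue within timeoutMs.
     * holdLock=true mirrors the delete-topics-thread trace (enqueue under the lock).
     */
    static boolean producerStalls(boolean holdLock, long timeoutMs) throws InterruptedException {
        ReentrantLock controllerLock = new ReentrantLock();
        // Bounded queue; capacity 1 is an assumption that makes the stall easy to trigger.
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(1);
        CountDownLatch senderTookFirst = new CountDownLatch(1);

        queue.put("stop-replica"); // request whose callback needs the lock

        // Stand-in for the send thread: take a request, then run a callback under the lock.
        Thread sender = new Thread(() -> {
            try {
                while (true) {
                    String request = queue.take();
                    if (request.equals("shutdown")) {
                        return;
                    }
                    senderTookFirst.countDown();
                    controllerLock.lock(); // blocks while the producer holds the lock
                    try {
                        // callback body would go here
                    } finally {
                        controllerLock.unlock();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        boolean stalled;
        if (holdLock) {
            controllerLock.lock();
        }
        try {
            sender.start();
            senderTookFirst.await();            // sender is now heading into its callback
            queue.put("update-metadata-1");     // fills the queue
            // With the lock held, the sender can never drain the queue, so this times out.
            stalled = !queue.offer("update-metadata-2", timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            if (holdLock) {
                controllerLock.unlock();
            }
        }

        queue.put("shutdown"); // lock is released, so the sender drains and exits
        sender.join();
        return stalled;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("holding lock:     stalled=" + producerStalls(true, 300));
        System.out.println("not holding lock: stalled=" + producerStalls(false, 2000));
    }
}
```

With put() in place of the timed offer() and the lock held, neither thread could make progress, which matches the two WAITING (parking) states in the jstack output.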



