[ 
https://issues.apache.org/jira/browse/KAFKA-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017827#comment-17017827
 ] 

leibo commented on KAFKA-6879:
------------------------------

Hello [~hachikuji], I have hit this issue many times on Kafka 2.1.1. The 
details are in https://issues.apache.org/jira/browse/KAFKA-8532

So I think the controller deadlock problem has not been completely fixed.

> Controller deadlock following session expiration
> ------------------------------------------------
>
>                 Key: KAFKA-6879
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6879
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 1.1.0
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Critical
>             Fix For: 1.1.1, 2.0.0
>
>
> We have observed an apparent deadlock situation which occurs following a 
> session expiration. The suspected deadlock occurs between the ZooKeeper 
> client's "initializationLock" and the latch inside the Expire event, which 
> we use to ensure all events have been handled.
> In the logs, we see the "Session expired" message following acquisition of 
> the initialization lock: 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/zookeeper/ZooKeeperClient.scala#L358
> But we never see any logs indicating that the new session is being 
> initialized. In fact, the controller logs are basically empty from that point 
> on. The problem we suspect is that completion of the 
> {{beforeInitializingSession}} callback requires that all events have finished 
> processing in order to count down the latch: 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/KafkaController.scala#L1525.
> But an event which was dequeued just prior to the acquisition of the write 
> lock may be unable to complete because it is awaiting acquisition of the 
> initialization lock: 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/zookeeper/ZooKeeperClient.scala#L137.
> The impact is that the broker continues in a zombie state. It continues 
> fetching and is periodically added to ISRs, but it never receives any further 
> requests from the controller since it is not registered.
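
The lock/latch cycle described above can be illustrated with a small
standalone Java sketch. This is not Kafka code: the class name, the thread
name, and the timeouts are all assumptions made for illustration, and only
the two roles (a read/write "initializationLock" and an Expire latch) come
from the description. Timeouts stand in for the indefinite waits so that the
program terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the suspected cycle; none of these names are Kafka APIs.
public class ControllerDeadlockSketch {

    /** Outcome of one simulated session expiration. */
    static final class Result {
        final boolean eventGotReadLock;     // did the in-flight event get the lock?
        final boolean expireLatchReleased;  // did the Expire latch ever count down?

        Result(boolean eventGotReadLock, boolean expireLatchReleased) {
            this.eventGotReadLock = eventGotReadLock;
            this.expireLatchReleased = expireLatchReleased;
        }
    }

    static Result simulate() throws InterruptedException {
        ReentrantReadWriteLock initializationLock = new ReentrantReadWriteLock();
        CountDownLatch writeLockHeld = new CountDownLatch(1);
        CountDownLatch expireHandled = new CountDownLatch(1);
        AtomicBoolean eventGotReadLock = new AtomicBoolean(false);

        // Models an event that was dequeued before the expiration and now
        // needs the read lock to talk to ZooKeeper. The Expire event sits
        // behind it in the queue, so the latch is counted down only after
        // this event completes.
        Thread eventThread = new Thread(() -> {
            try {
                writeLockHeld.await();
                // The real thread would block here forever; a timeout is
                // used so the sketch can report the outcome.
                if (initializationLock.readLock().tryLock(100, TimeUnit.MILLISECONDS)) {
                    try {
                        eventGotReadLock.set(true);
                        expireHandled.countDown();
                    } finally {
                        initializationLock.readLock().unlock();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "controller-event-thread");
        eventThread.start();

        // Models the session-expiration path: take the write lock, then wait
        // (as beforeInitializingSession does) for the Expire latch.
        initializationLock.writeLock().lock();
        try {
            writeLockHeld.countDown();
            boolean released = expireHandled.await(300, TimeUnit.MILLISECONDS);
            eventThread.join();
            return new Result(eventGotReadLock.get(), released);
        } finally {
            initializationLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Result r = simulate();
        System.out.println("event acquired read lock: " + r.eventGotReadLock);
        System.out.println("expire latch released:    " + r.expireLatchReleased);
    }
}
```

In this model neither side makes progress: the event cannot take the read
lock while the write lock is held, and the write lock holder is waiting on a
latch that only the event thread can release. Without the timeouts both
threads would wait indefinitely, which matches the empty controller logs
described above.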



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
