[
https://issues.apache.org/jira/browse/KAFKA-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jun Rao resolved KAFKA-3215.
----------------------------
Resolution: Fixed
Fix Version/s: 0.9.0.0
> controller may not be started when there are multiple ZK session expirations
> ----------------------------------------------------------------------------
>
> Key: KAFKA-3215
> URL: https://issues.apache.org/jira/browse/KAFKA-3215
> Project: Kafka
> Issue Type: Bug
> Components: core
> Reporter: Jun Rao
> Assignee: Flavio Junqueira
> Labels: controller
> Fix For: 0.9.0.0
>
>
> Suppose that broker 1 is the controller and it experiences two consecutive
> ZK session expirations. In this case, two ZK session expiration events are fired.
> 1. When handling the first ZK session expiration event,
> SessionExpirationListener.handleNewSession() can elect broker 1 itself as the
> new controller and initialize its state properly.
> 2. When handling the second ZK session expiration event,
> SessionExpirationListener.handleNewSession() first calls
> onControllerResignation(), which sets ReplicaStateMachine.hasStarted to
> false. It then proceeds with controller election in
> ZookeeperLeaderElector.elect() and tries to create the controller node in ZK.
> This fails since broker 1 has already registered itself as the controller
> node in ZK. We simply ignore the failure to create the controller node,
> on the assumption that the controller must be on another broker.
> However, in this case the controller is broker 1 itself, and
> ReplicaStateMachine.hasStarted is still false.
> 3. Now, if a new broker event is fired, it is ignored in
> BrokerChangeListener.handleChildChange since ReplicaStateMachine.hasStarted
> is false. We end up in a situation where a controller is alive but does not
> react to any broker change event.
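
The three steps above can be reproduced with a small toy model. This is only a
sketch in Python, not Kafka code: FakeZk, ToyController and reproduce() are
hypothetical names, the in-memory "controller node" stands in for the real ZK
ephemeral node, and has_started mirrors ReplicaStateMachine.hasStarted. It
assumes both expirations happen before either handleNewSession() callback runs,
which is consistent with the description but not stated in it.

```python
class FakeZk:
    """Hypothetical in-memory stand-in for the ZK controller node."""

    def __init__(self):
        self.controller_id = None

    def expire_session(self):
        # An expired session takes its ephemeral controller node with it.
        self.controller_id = None

    def create_controller_node(self, broker_id):
        # Creation fails if some broker is already registered as controller.
        if self.controller_id is not None:
            return False
        self.controller_id = broker_id
        return True


class ToyController:
    """Hypothetical model of the controller-side state on one broker."""

    def __init__(self, broker_id, zk):
        self.broker_id = broker_id
        self.zk = zk
        self.has_started = False  # mirrors ReplicaStateMachine.hasStarted
        self.handled_events = []

    def on_controller_resignation(self):
        self.has_started = False

    def elect(self):
        if self.zk.create_controller_node(self.broker_id):
            # Winning the election (re)initializes controller state.
            self.has_started = True
        # Otherwise the failure is ignored, on the assumption that the
        # controller lives on another broker. This is the step that goes wrong.

    def handle_new_session(self):
        self.on_controller_resignation()
        self.elect()

    def handle_child_change(self, brokers):
        # Broker change events are dropped while the state machine is stopped.
        if not self.has_started:
            return False
        self.handled_events.append(list(brokers))
        return True


def reproduce():
    zk = FakeZk()
    controller = ToyController(broker_id=1, zk=zk)
    controller.elect()               # broker 1 starts out as controller

    zk.expire_session()              # two consecutive session expirations
    zk.expire_session()

    controller.handle_new_session()  # step 1: re-elected, state initialized
    controller.handle_new_session()  # step 2: resigns, node creation fails
    return zk, controller
```

After reproduce(), zk.controller_id is still 1 while controller.has_started is
False, so a later handle_child_change() call on broker 1 is ignored (step 3).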