[ 
https://issues.apache.org/jira/browse/KAFKA-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190794#comment-15190794
 ] 

Flavio Junqueira commented on KAFKA-3215:
-----------------------------------------

[~junrao] Let me see if I understand this issue correctly.

bq. broker 1 is the controller and it has 2 consecutive ZK session expirations

As I understand this, one possible run that reflects this is the following:

# zkclient creates a session S1
# S1 session expires
# zkclient queues the session expiration event to deliver to the kafka broker
# zkclient creates a new session S2
# S2 expires
# zkclient queues the session expiration for S2 while the event for S1 still 
hasn't been delivered
# zkclient creates a third session S3
# broker 1 processes the session expiration of S1
# broker 1 successfully elects itself leader/controller in session S3
# broker 1 processes session expiration for S2

After this last step, broker 1 is in an inconsistent state: it holds the 
controller znode but thinks the replica state machine isn't properly 
initialized. Also, the broker won't give up leadership because the ephemeral 
znode has been created in the current session.
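
To make the run above concrete, here is a minimal in-memory sketch of it. It is 
not Kafka or zkclient code: ToyZk, ToyBroker and run_scenario are hypothetical 
names, has_started stands in for ReplicaStateMachine.hasStarted, and the 
handler follows the behaviour described in the issue (resign first, then 
ignore a failed create on the assumption that another broker is controller).

```python
from collections import deque


class ToyZk:
    """Hypothetical stand-in for the controller ephemeral znode in ZooKeeper."""

    def __init__(self):
        self.controller = None  # (broker_id, session_id) or None

    def expire(self, session_id):
        # An ephemeral znode is removed when its owning session expires.
        if self.controller and self.controller[1] == session_id:
            self.controller = None

    def create_ephemeral(self, broker_id, session_id):
        # Fails (returns False) if the znode already exists.
        if self.controller is not None:
            return False
        self.controller = (broker_id, session_id)
        return True


class ToyBroker:
    """Hypothetical broker modelling the behaviour described in the issue."""

    def __init__(self, broker_id, zk):
        self.broker_id = broker_id
        self.zk = zk
        self.session = None
        self.has_started = False  # stands in for ReplicaStateMachine.hasStarted
        self.pending = deque()    # queued session expiration events

    def elect(self):
        if self.zk.create_ephemeral(self.broker_id, self.session):
            self.has_started = True
        # On failure, assume another broker is the controller and do nothing.

    def handle_new_session(self):
        self.has_started = False  # effect of onControllerResignation()
        self.elect()


def run_scenario():
    zk = ToyZk()
    broker = ToyBroker(1, zk)
    broker.session = "S1"
    broker.elect()                # broker 1 is the controller in S1
    zk.expire("S1")               # S1 expires, event queued, S2 created
    broker.pending.append("S1")
    broker.session = "S2"
    zk.expire("S2")               # S2 expires, event queued, S3 created
    broker.pending.append("S2")
    broker.session = "S3"
    states = []
    while broker.pending:         # both events are delivered only now
        broker.pending.popleft()
        broker.handle_new_session()
        states.append((zk.controller, broker.has_started))
    return states
```

After the second event is handled, the znode still names broker 1 in S3 as 
controller while has_started is False, which is the stuck state described 
above.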

I think this was a problem in 0.8.2, but not a problem in 0.9 because we fixed 
it in KAFKA-1387. With ZKWatchedEphemeral, if the create call reports that the 
znode already exists, we check whether the existing znode is owned by the same 
session; if so, the operation returns ok and the controller becomes leader. 
Does that make sense?
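
As a rough illustration of that check (a sketch only, not the actual 
KAFKA-1387 code): create_checked_ephemeral and the dict-based znode store 
below are hypothetical. In real ZooKeeper the owning session of an ephemeral 
znode is available from Stat.getEphemeralOwner().

```python
def create_checked_ephemeral(znodes, path, data, session_id):
    """Return True if the caller may treat the create as successful.

    znodes is a hypothetical in-memory map: path -> {"data", "owner"}.
    """
    existing = znodes.get(path)
    if existing is None:
        # Normal case: the znode does not exist yet, so create it.
        znodes[path] = {"data": data, "owner": session_id}
        return True
    # The znode already exists: report ok only if the current session owns it.
    return existing["owner"] == session_id
```

With this check, a second create attempt from session S3 on a znode already 
owned by S3 returns ok, so the broker goes on to initialize as controller 
instead of assuming that another broker holds the role.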

> controller may not be started when there are multiple ZK session expirations
> ----------------------------------------------------------------------------
>
>                 Key: KAFKA-3215
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3215
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>            Reporter: Jun Rao
>              Labels: controller
>
> Suppose that broker 1 is the controller and it has 2 consecutive ZK session 
> expirations. In this case, two ZK session expiration events will be fired.
> 1. When handling the first ZK session expiration event, 
> SessionExpirationListener.handleNewSession() can elect broker 1 itself as the 
> new controller and initialize the states properly.
> 2. When handling the second ZK session expiration event, 
> SessionExpirationListener.handleNewSession() first calls 
> onControllerResignation(), which will set ReplicaStateMachine.hasStarted to 
> false. It then continues to do controller election in 
> ZookeeperLeaderElector.elect() and try to create the controller node in ZK. 
> This will fail since broker 1 has already registered itself as the controller 
> node in ZK. In this case, we simply ignore the failure to create the 
> controller node since we assume the controller must be in another broker. 
> However, in this case, the controller is broker 1 itself, but the 
> ReplicaStateMachine.hasStarted is still false.
> 3. Now, if a new broker event is fired, we will ignore the event in 
> BrokerChangeListener.handleChildChange since ReplicaStateMachine.hasStarted 
> is false. We are then in a situation where the controller is alive, but 
> won't react to any broker change event.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
