[ 
https://issues.apache.org/jira/browse/KAFKA-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-5453:
----------------------------
    Fix Version/s:     (was: 2.2.0)
                   2.1.0

> Controller may miss requests sent to the broker when zk session timeout 
> happens.
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-5453
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5453
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.11.0.0
>            Reporter: Jiangjie Qin
>            Priority: Major
>             Fix For: 2.1.0
>
>
> The issue I encountered was the following:
> 1. Partition reassignment was in progress; one replica of a partition was 
> being reassigned from broker 1 to broker 2.
> 2. Controller received an ISR change notification indicating that broker 2 
> had caught up.
> 3. Controller was sending StopReplicaRequest to broker 1.
> 4. Broker 1's zk session timed out. Controller removed broker 1 from the 
> cluster and cleaned up the queue, i.e. the StopReplicaRequest was removed 
> from the ControllerChannelManager.
> 5. Broker 1 reconnected to zk and acted as if it were still a follower 
> replica of the partition.
> 6. Broker 1 will always receive an exception from the leader because it is 
> not in the replica list.
> I am not sure what the correct fix is here. It seems that broker 1 in this 
> case should ask the controller for the latest replica assignment.
> There are two related bugs:
> 1. When a {{NotAssignedReplicaException}} is thrown from 
> {{Partition.updateReplicaLogReadResult()}}, the other partitions in the same 
> request fail to update their fetch timestamp and offset and thus also 
> drop out of the ISR.
> 2. The {{NotAssignedReplicaException}} was not properly returned to the 
> replicas; instead, an {{UnknownServerException}} was returned.
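
The first related bug can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kafka's actual code: the function names, the exception class, and the data layout below are all hypothetical stand-ins. It contrasts a loop in which the first exception aborts the remaining per-partition updates with one possible alternative that records the error per partition and keeps going.

```python
class NotAssignedReplicaError(Exception):
    """Hypothetical stand-in for Kafka's NotAssignedReplicaException."""


def update_replica(assigned, partition, replica_id):
    # Hypothetical per-partition update: reject replicas outside the assignment.
    if replica_id not in assigned[partition]:
        raise NotAssignedReplicaError(
            f"replica {replica_id} is not assigned to {partition}"
        )


def update_all_fail_fast(assigned, partitions, replica_id):
    """Models the reported behaviour: the first exception aborts the loop."""
    updated = []
    try:
        for partition in partitions:
            update_replica(assigned, partition, replica_id)
            updated.append(partition)
    except NotAssignedReplicaError:
        pass  # the remaining partitions in the request are never updated
    return updated


def update_all_isolated(assigned, partitions, replica_id):
    """One possible alternative: record the error per partition and continue."""
    updated, errors = [], {}
    for partition in partitions:
        try:
            update_replica(assigned, partition, replica_id)
            updated.append(partition)
        except NotAssignedReplicaError as exc:
            errors[partition] = exc
    return updated, errors


# Broker 1 was reassigned away from t-0 but is still a replica of t-1 and t-2.
assigned = {"t-0": {2, 3}, "t-1": {1, 2}, "t-2": {1, 3}}
partitions = ["t-0", "t-1", "t-2"]

stale = update_all_fail_fast(assigned, partitions, replica_id=1)
fresh, failed = update_all_isolated(assigned, partitions, replica_id=1)
```

In the fail-fast variant, the error on t-0 means t-1 and t-2 are never updated, which matches the symptom described above of unrelated partitions dropping out of the ISR. Whether per-partition isolation is the right fix for Kafka itself is left open here.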



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
