Dhruvil Shah created KAFKA-9961:
-----------------------------------

             Summary: Brokers may be left in an inconsistent state after 
reassignment
                 Key: KAFKA-9961
                 URL: https://issues.apache.org/jira/browse/KAFKA-9961
             Project: Kafka
          Issue Type: Bug
            Reporter: Dhruvil Shah


When completing a reassignment, the controller sends StopReplicaRequest to 
replicas that are not in the target assignment and removes them from the 
assignment in ZK. We do not have any retry mechanism to ensure that the broker 
is able to process the StopReplicaRequest successfully. Under certain 
circumstances, this could leave brokers in an inconsistent state, where they 
continue being the follower for this partition and end up with an inconsistent 
metadata cache.
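
To make the inconsistent state concrete, it can be pictured as a mismatch 
between the assignment the controller holds and the partitions a broker still 
hosts locally. The following is only an illustrative sketch: 
`StrayReplicaCheck`, `strayPartitions` and the plain map/set inputs are 
hypothetical names chosen for this example, not existing Kafka code.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Kafka: finds partitions a broker still
// hosts although the controller's assignment no longer lists that broker.
final class StrayReplicaCheck {

    // assignment: partition name -> broker ids in the current assignment.
    // hosted: partition names for which the broker still has a local replica.
    static Set<String> strayPartitions(int brokerId,
                                       Map<String, List<Integer>> assignment,
                                       Set<String> hosted) {
        Set<String> stray = new TreeSet<>();
        for (String partition : hosted) {
            List<Integer> replicas = assignment.get(partition);
            // A missing entry means the partition is not assigned to anyone.
            if (replicas == null || !replicas.contains(brokerId)) {
                stray.add(partition);
            }
        }
        return stray;
    }
}
```

In the scenario above, a broker that missed the StopReplicaRequest would show 
up here with the reassigned partition in its stray set.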

We have seen messages like the following being spammed in the broker logs when 
we get into this situation:
{code:java}
While recording the replica LEO, the partition topic-1 hasn't been created.
{code}
This happens because the broker has received neither an updated 
LeaderAndIsrRequest for the new leader nor a StopReplicaRequest from the 
controller when the replica was removed from the assignment.

Note that we would require a restart of the affected broker to fix this 
situation. A controller failover would not fix it as the broker could continue 
being a replica for the partition until it receives a StopReplicaRequest, which 
would never happen in this case.

There seem to be a couple of problems we should address:
 # We need a mechanism to retry replica deletions after partition reassignment 
is complete. The main challenge here is to be able to deal with cases where a 
broker has been decommissioned and may never come back up.
 # We could perhaps consider a mechanism to reconcile replica states across 
brokers, something similar to the solution proposed in 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-550%3A+Mechanism+to+Delete+Stray+Partitions+on+Broker].
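
As a rough illustration of the first point, the controller could keep track of 
replica deletions that have not been acknowledged and resend them, while giving 
up on brokers that are known to be gone. This is a minimal sketch under stated 
assumptions, not a proposed implementation: `PendingReplicaDeletions`, its 
methods, the `maxAttempts` bound and the `decommissioned` set are all 
hypothetical, and how the controller would learn that a broker is 
decommissioned is left open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical controller-side bookkeeping, not Kafka's actual code.
final class PendingReplicaDeletions {
    private final int maxAttempts;
    // brokerId -> (partition -> StopReplicaRequests sent so far)
    private final Map<Integer, Map<String, Integer>> pending = new TreeMap<>();

    PendingReplicaDeletions(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    // Called when a reassignment completes and the replica is dropped.
    void track(int brokerId, String partition) {
        pending.computeIfAbsent(brokerId, id -> new TreeMap<>())
               .putIfAbsent(partition, 0);
    }

    // Called when the broker confirms it processed the StopReplicaRequest.
    void acknowledge(int brokerId, String partition) {
        Map<String, Integer> partitions = pending.get(brokerId);
        if (partitions != null) {
            partitions.remove(partition);
            if (partitions.isEmpty()) {
                pending.remove(brokerId);
            }
        }
    }

    boolean isPending(int brokerId, String partition) {
        Map<String, Integer> partitions = pending.get(brokerId);
        return partitions != null && partitions.containsKey(partition);
    }

    // Returns the deletions to (re)send in this round, keyed by broker id.
    Map<Integer, List<String>> nextRound(Set<Integer> decommissioned) {
        // Assumption: brokers in this set will never come back, so stop retrying.
        pending.keySet().removeIf(decommissioned::contains);
        Map<Integer, List<String>> toSend = new TreeMap<>();
        for (Map.Entry<Integer, Map<String, Integer>> broker : pending.entrySet()) {
            // Drop deletions that have already used up their attempts.
            broker.getValue().values().removeIf(sent -> sent >= maxAttempts);
            for (Map.Entry<String, Integer> entry : broker.getValue().entrySet()) {
                entry.setValue(entry.getValue() + 1);
                toSend.computeIfAbsent(broker.getKey(), id -> new ArrayList<>())
                      .add(entry.getKey());
            }
        }
        pending.values().removeIf(Map::isEmpty);
        return toSend;
    }
}
```

Bounding the attempts is just one way to avoid retrying forever. A real 
mechanism would probably also need to surface deletions that were abandoned, so 
that they can be reconciled later as described in the second point.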



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
