cmccabe opened a new pull request, #15230: URL: https://github.com/apache/kafka/pull/15230
When a topic is deleted while a broker is down, that broker will see "stray replicas" the next time it starts up. These replicas contain data that used to be important, but which now needs to be deleted. Stray replica deletion is handled during the initial metadata publishing step on the broker.

Previously, we deleted these stray replicas after starting up BOTH LogManager and ReplicaManager. However, this wasn't quite correct: the presence of the stray replicas confused ReplicaManager. Instead, we should delete the stray replicas BEFORE starting ReplicaManager.

This bug was triggered when a topic was deleted and re-created while a broker was down, and some of the replicas of the re-created topic landed on that broker. The impact was that the stray replicas were deleted, but the new replicas for the next iteration of the topic never got created. This, in turn, led to persistent under-replication until the next time the broker was restarted.
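
The ordering described above can be illustrated with a minimal sketch. This is a toy model, not Kafka's code: the class StrayReplicaSketch, the methods findStrays and startBroker, and the partition-to-topic-ID maps are all hypothetical. It also assumes that a stray is recognized by a topic ID on disk that is missing from, or different to, the current metadata. It does not reproduce how ReplicaManager was confused in the old ordering; it only shows the intended sequence of deleting strays first and creating assigned replicas second.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model for illustration only; none of this is Kafka's actual API.
final class StrayReplicaSketch {

    // onDisk:   partition name -> topic ID found in the local log directory
    // assigned: partition name -> topic ID the current metadata assigns to this broker
    // Assumption: a replica is stray if its partition is no longer assigned here,
    // or if the topic was re-created and now carries a different topic ID.
    static Set<String> findStrays(Map<String, String> onDisk, Map<String, String> assigned) {
        Set<String> strays = new TreeSet<>();
        for (Map.Entry<String, String> e : onDisk.entrySet()) {
            String currentId = assigned.get(e.getKey());
            if (currentId == null || !currentId.equals(e.getValue())) {
                strays.add(e.getKey());
            }
        }
        return strays;
    }

    // Models broker startup in the corrected order and returns the replicas
    // the broker hosts afterwards (partition name -> topic ID).
    static Map<String, String> startBroker(Map<String, String> onDisk, Map<String, String> assigned) {
        Map<String, String> hosted = new HashMap<>(onDisk);

        // Step 1: delete stray replicas BEFORE the replica manager starts.
        for (String stray : findStrays(onDisk, assigned)) {
            hosted.remove(stray);
        }

        // Step 2: start the (modelled) replica manager, which creates every
        // assigned replica that is not already present on disk.
        for (Map.Entry<String, String> e : assigned.entrySet()) {
            hosted.putIfAbsent(e.getKey(), e.getValue());
        }
        return hosted;
    }
}
```

In this model, a broker that still holds a partition of a topic deleted and re-created during its downtime ends up hosting the replica under the new topic ID, which is the outcome the fix is meant to guarantee.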