dajac opened a new pull request #9140:
URL: https://github.com/apache/kafka/pull/9140


   https://github.com/apache/kafka/pull/8672 introduced a bug that crashes the 
replica fetcher threads. The issue is that 
https://github.com/apache/kafka/pull/8672 deletes the Partitions prior to 
stopping the replica fetchers. As the replica fetchers rely on accessing the 
Partition in the ReplicaManager, they crash with an unhandled 
NotLeaderOrFollowerException.
   
   This PR reverts the code to the original ordering to avoid this issue.
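
To illustrate why the ordering matters, here is a minimal, single-threaded sketch of the two orderings. It is not the ReplicaManager code: the class `StopReplicaOrderingSketch`, its methods, and the `NotLeaderOrFollower` exception are hypothetical stand-ins, and the concurrent fetcher thread is modelled by running one fetcher iteration between the two steps.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the ordering problem; none of these names are
// Kafka's actual classes or methods.
class StopReplicaOrderingSketch {
    // Stand-in for the exception the real fetcher thread fails with.
    static class NotLeaderOrFollower extends RuntimeException {
        NotLeaderOrFollower(String tp) { super("partition not hosted: " + tp); }
    }

    // topic-partition -> state; stands in for the broker's partition registry.
    private final Map<String, String> partitions = new HashMap<>();
    // Partitions the (modelled) fetcher still polls.
    private final Set<String> fetched = new HashSet<>();

    void addFollower(String tp) {
        partitions.put(tp, "follower");
        fetched.add(tp);
    }

    // One fetcher iteration: every partition still fetched must be resolvable.
    void fetchOnce() {
        for (String tp : fetched) {
            if (!partitions.containsKey(tp)) throw new NotLeaderOrFollower(tp);
        }
    }

    // Stops a replica in one of the two orders, with a fetcher iteration
    // between the two steps. Returns true if the fetcher survived.
    boolean stopReplica(String tp, boolean stopFetcherFirst) {
        try {
            if (stopFetcherFirst) {
                fetched.remove(tp);     // original ordering: stop fetching first
                fetchOnce();
                partitions.remove(tp);  // then delete the partition
            } else {
                partitions.remove(tp);  // regressed ordering: delete first
                fetchOnce();            // fetcher still polls the deleted partition
                fetched.remove(tp);
            }
            return true;
        } catch (NotLeaderOrFollower e) {
            return false;
        }
    }

    static boolean fetcherSurvives(boolean stopFetcherFirst) {
        StopReplicaOrderingSketch broker = new StopReplicaOrderingSketch();
        broker.addFollower("topic-0");
        return broker.stopReplica("topic-0", stopFetcherFirst);
    }

    public static void main(String[] args) {
        System.out.println("stop fetcher first: survives = " + fetcherSurvives(true));
        System.out.println("delete partition first: survives = " + fetcherSurvives(false));
    }
}
```

In this model the fetcher survives only when it is stopped before the partition is deleted, which is the ordering this PR restores. The real failure depends on thread timing, so this sketch shows the invariant rather than reproducing the race.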
   
   The regression has been caught by our system test: 
`kafkatest.tests.core.reassign_partitions_test`.
   
   I have not managed to reproduce the issue in a unit test without 
reimplementing the entire system test in Java. I am not sure that makes sense 
as we already have it in Python.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


