hachikuji commented on PR #11687: URL: https://github.com/apache/kafka/pull/11687#issuecomment-1141445344
@Kvicii Thanks for the patch. I checked this out and `ControllerIntegrationTest.testTopicIdUpgradeAfterReassigningPartitions` still fails 5-10% of the time when run locally. The new failure looks like this: ``` org.opentest4j.AssertionFailedError: expected same topic ID but it can not be found ==> Expected :Some(6_zllbH4TES3yUSJZ4ATsw) Actual :None at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55) at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62) at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182) at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1152) at kafka.controller.ControllerIntegrationTest.testTopicIdUpgradeAfterReassigningPartitions(ControllerIntegrationTest.scala:1473) ``` I think we need to account for a delay between when the change hits zk and when it is reflected in the `ControllerContext`. I've pushed a patch here on top of this branch: https://github.com/apache/kafka/commit/f2dd2f5558d56a7d2f667d71b4c3453466dc3df8. This logic first waits for a consistent `ControllerContext`, then asserts zookeeper state. I wasn't able to reproduce the failure anymore with this patch. Feel free to include it here. Also, could we move the unrelated cleanups into a separate PR? I know it can be difficult to overlook messy code, but it makes review more difficult since we have to look at every line to see which changes actually matter. And, on the other hand, it is easy to review cleanup PRs if they are not changing any logic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org