C0urante opened a new pull request, #13806: URL: https://github.com/apache/kafka/pull/13806
I frequently see these test cases fail locally because `member::wakeup` is invoked fewer times than expected (2 instead of 3). This turns out to be due to a benign race condition in `DistributedHerder::addRequest`, which is documented in a comment in that method. A comment is also added to `DistributedHerderTest::testTaskRequestedZombieFencingForwardingToLeader` explaining why we relax our expectations about how many invocations of `member::wakeup` take place.

As an alternative, we could consider adding synchronization around access to the herder request queue, but the additional implementation complexity and risk of deadlock don't seem worth the finer-grained testing logic we'd be able to write.
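To make the race easier to picture, here is a minimal sketch. It is not the Kafka implementation. It assumes the wakeup is only issued when the newly added request is still at the head of the queue after insertion, so a concurrent poll by the herder thread can legitimately skip it. `RequestQueueSketch`, `Request`, `pollNext`, and the `betweenAddAndPeek` hook are all hypothetical names introduced for illustration; the hook exists only to simulate the interleaving deterministically.

```java
import java.util.Comparator;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-in for a herder request queue (not Kafka code).
class RequestQueueSketch {
    static final class Request {
        final long deadlineMs;
        final long seq;
        Request(long deadlineMs, long seq) {
            this.deadlineMs = deadlineMs;
            this.seq = seq;
        }
    }

    // Ordered by deadline, then by insertion sequence to break ties.
    private final ConcurrentSkipListSet<Request> requests = new ConcurrentSkipListSet<>(
        Comparator.<Request>comparingLong(r -> r.deadlineMs).thenComparingLong(r -> r.seq));
    private final AtomicLong seq = new AtomicLong();
    // Counts wakeups; stands in for invocations of member::wakeup.
    final AtomicInteger wakeups = new AtomicInteger();

    // Adds a request and wakes the consumer only if the request is still the
    // head of the queue. betweenAddAndPeek is an assumed test hook that runs
    // in the window where another thread could touch the queue.
    Request addRequest(long deadlineMs, Runnable betweenAddAndPeek) {
        Request req = new Request(deadlineMs, seq.incrementAndGet());
        requests.add(req);
        betweenAddAndPeek.run();
        if (peek() == req) {
            wakeups.incrementAndGet();
        }
        return req;
    }

    // What the consuming thread would call to take the next request, or null.
    Request pollNext() {
        return requests.pollFirst();
    }

    private Request peek() {
        try {
            return requests.first();
        } catch (NoSuchElementException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        RequestQueueSketch q = new RequestQueueSketch();
        q.addRequest(100, () -> { });          // head of queue: wakeup issued
        q.addRequest(200, () -> { });          // not the head: no wakeup needed
        q.addRequest(50, () -> q.pollNext());  // consumer takes it first: wakeup skipped
        System.out.println("wakeups = " + q.wakeups.get());
    }
}
```

In the third call the request has already been picked up by the time the adder checks the head, so skipping the wakeup loses nothing. Under this assumption, a test that counts wakeups can only safely assert a lower bound or a range, not an exact number.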