[GitHub] [kafka] C0urante opened a new pull request, #13806: MINOR: Fix flaky DistributedHerderTest cases related to zombie fencing

via GitHub Sat, 03 Jun 2023 11:07:36 -0700


C0urante opened a new pull request, #13806:
URL: https://github.com/apache/kafka/pull/13806


   I frequently see failures when running these test cases locally caused by 
fewer-than-expected (2 instead of 3) invocations of `member::wakeup`.
   
   This turns out to be due to a benign race condition in 
`DistributedHerder::addRequest`, which this PR documents in a comment in that 
method. It also adds a comment to 
`DistributedHerderTest::testTaskRequestedZombieFencingForwardingToLeader` 
explaining why the test relaxes its expectations about how many invocations of 
`member::wakeup` take place.
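
   For illustration, here is a minimal sketch of how a race of this kind could 
lose a wakeup. It is a simplified, hypothetical model and not the actual 
`DistributedHerder` code: the class `WakeupRaceSketch`, its methods, and the 
hook parameter are all assumptions made for the example. It assumes the adding 
thread enqueues a request and then only triggers a wakeup if that request is 
still at the head of the queue; a counter stands in for `member::wakeup`.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simplified model of a request queue with a conditional wakeup.
// This is NOT the actual DistributedHerder implementation.
class WakeupRaceSketch {
    private final ConcurrentLinkedQueue<String> requests = new ConcurrentLinkedQueue<>();
    private final AtomicInteger wakeups = new AtomicInteger();

    // Enqueue a request, then "wake up" the worker only if the request is
    // still at the head of the queue. The hook runs between the two steps so
    // that another thread's interleaving can be simulated deterministically.
    void addRequest(String request, Runnable betweenAddAndPeek) {
        requests.add(request);
        betweenAddAndPeek.run();
        if (request.equals(requests.peek())) {
            wakeups.incrementAndGet(); // stands in for member.wakeup()
        }
    }

    // What the worker thread would do when it picks up the next request.
    String pollRequest() {
        return requests.poll();
    }

    // Returns how many wakeups one addRequest call produced.
    static int wakeupsFor(boolean workerPollsEarly) {
        WakeupRaceSketch sketch = new WakeupRaceSketch();
        Runnable hook = workerPollsEarly ? sketch::pollRequest : () -> { };
        sketch.addRequest("fence-zombies", hook);
        return sketch.wakeups.get();
    }

    public static void main(String[] args) {
        System.out.println("no interleaving: " + wakeupsFor(false) + " wakeup(s)");
        System.out.println("worker polls first: " + wakeupsFor(true) + " wakeup(s)");
    }
}
```

   In this sketch the wakeup is skipped only when the worker has already taken 
the request off the queue, so no work is lost; that is consistent with treating 
the race as benign and with a test that tolerates one fewer wakeup.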
   
   As an alternative, we could add synchronization around access to the herder 
request queue. But the additional implementation complexity and risk of 
deadlock don't seem worth the finer-grained testing logic that would let us 
write.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
