XComp commented on PR #21137: URL: https://github.com/apache/flink/pull/21137#issuecomment-1299313832
Thanks for your patience on that PR. I gave it another thought: You're right about the issue with the 5ms delay being necessary to enter the synchronized block of the `grantLeadership` call hierarchy. The issue is that we don't have any means of verify that we're in the synchronized block of `DefaultLeaderElectionService#lock` in `DefaultLeaderElectionService#onGrantLeadership` but we trigger this implicly through `TestingLeaderElectionDriver#isLeader()`. This applies to your and my proposal and, therefore, the delay (i.e. the 5ms) are actually necessary. I'm not a big fan of these 5ms because they tend to make these tests flaky. We should be really clear in an inline comment on why this is necessary. That said, I tend to prefer my proposal, because it keeps the test-specific logic in the test method. Your proposal adds a test-specific callback to `TestingLeaderElectionDriver` which is some kind of generic test utility class. Mixing test utility classes with actual test implementations messes things up in the long run usually. That's the reason we started to come up with `Testing*` implementations like I proposed with `TestingJobMasterServiceProcess2` in my PR. WDYT? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
