lhotari commented on PR #21335: URL: https://github.com/apache/pulsar/pull/21335#issuecomment-1754637243
> I'm afraid that if we simply increase the waiting time, more resources are consumed and the CI can be more unstable. Do you have some insights that this issue is flaky due to this timeout too short, or it's the major flaky tests that we should workaround?

In this case, increasing the timeout won't noticeably slow down the tests in general. This is a very local change. One reason to make this change is to validate an assumption. My assumption in this particular case is that 500 milliseconds isn't sufficient in CI, perhaps because of a pause caused by GC or something similar. Increasing the timeout from 500 ms to 1500 ms will rule out that possibility without causing actual delays or harm.

> The ideal way is to build a determinate happens-before order and wait forever, but it's more challenging to implement so I don't insist it for such a fix.

I agree. A large part of the problem is non-optimal test design. The flaky test problem in Pulsar has been going on for years, and it's like a game of whack-a-mole: when you eliminate one issue, new problems pop up elsewhere.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
