gianm commented on code in PR #15726:
URL: https://github.com/apache/druid/pull/15726#discussion_r1514853293


##########
server/src/test/java/org/apache/druid/discovery/BaseNodeRoleWatcherTest.java:
##########
@@ -116,6 +136,69 @@ public void testGeneralUseSimulation()
     assertListener(listener3, true, nodesAdded, nodesRemoved);
   }
 
+  @Test(timeout = 60_000L)
+  public void testRegisterListenerBeforeTimeout() throws InterruptedException
+  {
+    BaseNodeRoleWatcher nodeRoleWatcher = new BaseNodeRoleWatcher(exec, 
NodeRole.BROKER, 1);

Review Comment:
   There's a timing race in this test: we have a timeout of 1 second but 
nothing really guarantees that `registerListener` is called prior to the 
timeout firing. It is also not ideal that there is a `sleep(1500)`, for similar 
reasons. On busy CI servers there can be random pauses that cause these tests 
to fail, test the wrong thing, or otherwise not work properly. The sleep also 
makes the test take longer than necessary.
   
   `testGetAllNodesBeforeTimeout` has a similar problem.
   
   In both cases, we can fix it by splitting up the timeout-scheduler from the 
constructor. This would work:
   
   - Move the `listenerExecutor.schedule` call for 
`this::cacheInitializedTimedOut` out of the constructor, put it in some new 
package-private method like `void scheduleTimeout(long timeout)`.
   - Make the constructor itself package-private.
   - Add a public static factory method like `create(ScheduledExecutorService, 
NodeRole)` that constructs an instance and then calls 
`scheduleTimeout(DEFAULT_TIMEOUT)` on it. Use this one in production code.
   - Add a package-private `awaitInitialization()` method that awaits the 
initialization latch.
   - In test code, call the regular constructor (which doesn't schedule the 
timeout), then do the pre-timeout part of the test, then call 
`scheduleTimeout(0)` (which would trigger timeout immediately), then call 
`awaitInitialization()` (to wait for the timeout to actually happen), then do 
the post-initialization part of the test.
   
   With this structure, the test should complete in a few milliseconds and be 
robust to any potential runtime timing issues.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to