Jackie-Jiang commented on a change in pull request #4674: Fix flakiness in
BoundedAccountingExecutorTest
URL: https://github.com/apache/incubator-pinot/pull/4674#discussion_r331183808
##########
File path:
pinot-core/src/test/java/org/apache/pinot/core/query/scheduler/resources/BoundedAccountingExecutorTest.java
##########
@@ -83,17 +86,44 @@ public void run() {
}).start();
syncer.startupBarrier.await();
+ // At this point, 'limit' jobs should have executed the
startupBarrier.await() call.
+ // The other jobs should be waiting on a semaphore for permits inside the
BoundedAccountingExecutor.execute()
+ // so they have not executed running.incrementAndGet() yet.
assertEquals(running.get(), limit);
verify(accountant, times(limit)).incrementThreads();
// reset will clear the counts on incrementThreads
reset(accountant);
- int pendingJobs = jobs - limit;
+ final int pendingJobs = jobs - limit;
+ // Before the pendingJobs get to startupBarrier, reset it to a different
value
+ // since we cannot change the limit of the CyclicBarrier once created.
+ // The new limit will be pending jobs plus the await we will call in this
thread.
syncer.startupBarrier = new CyclicBarrier(pendingJobs + 1);
+
+ // Now let the running threads complete and call running.decrementAndGet.
As
+ // they exit, the pending jobs will acquire permits and start to increment
+ // the running counter and wait on startupBarrier.await().
syncer.validationBarrier.await();
// verify additional jobs are run as soon as current job finishes
syncer.validationBarrier = new CyclicBarrier(pendingJobs + 1);
+ // When we run the test in a small number of cores, it is possible that
the running jobs
+ // have not yet gotten to execute running.decrementAndGet(), but the
pending jobs have already
+ // done the increment. So, we need to wait until we check the running
counter to equal the
+ // pending jobs.
+ TestUtils.waitForCondition(new Function<Void, Boolean>() {
Review comment:
This can be simplified as:
`TestUtils.waitForCondition(aVoid -> running.get() == pendingJobs, 10_000,
"Invalid number of running jobs" + running.get());`
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]