I'm repeatedly running "./gradlew check -Pvalidation.git.failOnModified=false" to find problems with the commit that I am contemplating.

I noticed that as the build system gets closer to the end of the test run, multiple threads go idle.

I suspect that the build system allocates tests to threads with an even split of all tests at the beginning, after which each thread processes the list of tests it has been given in sequence.  As the run proceeds, threads that finish their list go idle and are no longer given work.  I've got a server with 12 real CPU cores, so I get a lot of threads by default from the build system.

What I am hoping we can do instead is queue up the list of tests and assign the next test to whichever thread has gone idle.  That way all the threads will stay occupied longer and the run will likely complete faster.  And when test threads do begin staying idle, I will be able to see exactly how many tests are left to execute.  Short-running tests will also be a lot less likely to be waiting for a longer test to finish.
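
To illustrate the difference I have in mind, here is a minimal sketch in Java.  It is only a simulation and not how the build system actually works: the class name, method names, and test durations are all made up, and "even split" is modeled as dealing tests out round-robin, which is just my guess at the current behavior.

```java
import java.util.PriorityQueue;

// Hypothetical simulation; none of these names come from Gradle or Solr.
class TestScheduling {

    // Assumed current behavior: tests are dealt out round-robin up front,
    // and each worker runs its own list in sequence.
    // Returns the time at which the last worker finishes.
    static long staticSplit(long[] durations, int workers) {
        long[] load = new long[workers];
        for (int i = 0; i < durations.length; i++) {
            load[i % workers] += durations[i];
        }
        long finish = 0;
        for (long l : load) {
            finish = Math.max(finish, l);
        }
        return finish;
    }

    // Proposed behavior: one shared queue, and the next test always goes
    // to the worker that becomes idle first.
    static long sharedQueue(long[] durations, int workers) {
        // Each entry is the time at which one worker becomes free.
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int i = 0; i < workers; i++) {
            freeAt.add(0L);
        }
        long finish = 0;
        for (long d : durations) {
            long end = freeAt.poll() + d; // earliest idle worker takes the test
            freeAt.add(end);
            finish = Math.max(finish, end);
        }
        return finish;
    }

    public static void main(String[] args) {
        // Made-up durations (in minutes) with a few long tests mixed in.
        long[] durations = {9, 1, 1, 9, 1, 1, 9, 1, 1};
        System.out.println("static split: " + staticSplit(durations, 3)); // 27
        System.out.println("shared queue: " + sharedQueue(durations, 3)); // 12
    }
}
```

With these invented numbers the up-front split leaves two workers idle after 3 minutes while the third grinds through all the long tests, whereas the shared queue keeps everyone busy.  Real results would depend on the actual test durations and on how the build system really distributes work.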

Right now I am looking at a check run that has been going for 56 minutes.  Only one thread is running tests, and that thread has been "Executing test org.apache...api.collections.CollectionTooManyReplicasTest" for quite a while.  I have no idea how many more tests that thread has left to run.  I am curious whether the test system has an absolute timeout for individual test classes.  With strace, I am seeing activity that looks like the test is waiting for a condition that will probably never arrive.  I haven't looked at the code.  I'm going to cancel this run and start it again.

https://www.dropbox.com/s/usd0aj5m7w66csp/test_run_gradlew_check_SOLR-8803.png?dl=0

Is that queuing idea too difficult to implement?

Thanks,
Shawn

