I wrote one a year or two ago. It’s a little tricky here and there, but not too 
bad. I just made a second test task and used the Gradle code from the test task 
as building blocks. You just need a little strategy for how to keep the JVMs 
full. You could probably steal the old Ant build's strategy. I did something 
like adding a test queue for each JVM. A configurable number of JVMs could be 
set as fast and the rest were slow. I also added another test annotation 
called slowest. I assigned a configurable weight to each test based on no 
annotation, slow, or slowest. Then I separated out the slowest tests and 
started feeding both test groups to the JVM queues (slowest and then slow and 
no annotation). Queues would only allow another test entry if the weight sum on 
the queue was under a configurable value, attempting to keep a sufficient 
number of tests queued, but no more than that. The slowest tests would go to 
the slow JVMs and the rest would go to the fast JVMs - until the slowest tests 
were all queued, then the rest of the tests would open up to going to any JVM 
queue. If the queues were full, the test feeder would just wait until there was 
room on a queue. This way, a JVM always had tests lined up to come in right 
away next, but not lined up so much that at the end, you’d have some JVMs with 
a bunch in their queue and some JVMs sitting idle with nothing in their queue. 
With the right config, you’d get full use of your hardware till almost the 
bitter end. I had 32 cores, though; if you have far fewer or are not as 
interested in peak hardware usage, much simpler strategies would be just fine.
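
A rough Java sketch of the feeding strategy described above, for illustration only. It is not the original code: every class, field, and method name here is invented, the weights and cap are arbitrary, and it is single-threaded (a real feeder would block until a queue has room rather than return). It only shows the weight-capped queues and the rule that the slowest tests go to the slow JVMs first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: all names are invented for illustration.
class WeightedTestFeeder {
    enum Speed { NORMAL, SLOW, SLOWEST }

    static final class TestCase {
        final String name;
        final Speed speed;
        TestCase(String name, Speed speed) { this.name = name; this.speed = speed; }
    }

    // Pending tests for one forked JVM, plus the sum of their weights.
    static final class JvmQueue {
        final boolean fast;
        final Deque<TestCase> pending = new ArrayDeque<>();
        int weightSum;
        JvmQueue(boolean fast) { this.fast = fast; }
    }

    final List<JvmQueue> queues = new ArrayList<>();
    final Deque<TestCase> slowest = new ArrayDeque<>(); // tests annotated slowest
    final Deque<TestCase> rest = new ArrayDeque<>();    // slow and unannotated tests
    final Map<Speed, Integer> weights;
    final int maxQueueWeight;

    WeightedTestFeeder(int jvmCount, int fastJvmCount, int maxQueueWeight,
                       Map<Speed, Integer> weights, List<TestCase> tests) {
        this.maxQueueWeight = maxQueueWeight;
        this.weights = weights;
        for (int i = 0; i < jvmCount; i++) {
            queues.add(new JvmQueue(i < fastJvmCount)); // first N queues are "fast"
        }
        for (TestCase t : tests) {
            (t.speed == Speed.SLOWEST ? slowest : rest).add(t);
        }
    }

    // Backlog this queue may draw from right now, or null if nothing is eligible.
    Deque<TestCase> sourceFor(JvmQueue q) {
        if (!q.fast && !slowest.isEmpty()) return slowest; // slow JVMs take slowest first
        return rest.isEmpty() ? null : rest;               // otherwise any JVM takes the rest
    }

    // Queue tests until every queue is at its weight cap or has nothing eligible.
    // Returns how many tests were queued; call again after a JVM takes a test.
    int feed() {
        int queued = 0;
        boolean progress = true;
        while (progress) {
            progress = false;
            for (JvmQueue q : queues) {
                Deque<TestCase> source = sourceFor(q);
                // A queue accepts another test only while its weight sum is under the cap.
                if (source == null || q.weightSum >= maxQueueWeight) continue;
                TestCase next = source.pollFirst();
                q.pending.addLast(next);
                q.weightSum += weights.get(next.speed);
                queued++;
                progress = true;
            }
        }
        return queued;
    }

    // Called when a JVM is ready for its next test; frees room on that queue.
    TestCase take(JvmQueue q) {
        TestCase t = q.pending.pollFirst();
        if (t != null) q.weightSum -= weights.get(t.speed);
        return t;
    }

    public static void main(String[] args) {
        Map<Speed, Integer> weights = new EnumMap<>(Speed.class);
        weights.put(Speed.NORMAL, 1);
        weights.put(Speed.SLOW, 2);
        weights.put(Speed.SLOWEST, 4);
        List<TestCase> tests = Arrays.asList(
            new TestCase("A", Speed.SLOWEST), new TestCase("B", Speed.SLOWEST),
            new TestCase("c", Speed.NORMAL), new TestCase("d", Speed.SLOW),
            new TestCase("e", Speed.NORMAL), new TestCase("f", Speed.NORMAL));
        WeightedTestFeeder feeder = new WeightedTestFeeder(2, 1, 4, weights, tests);
        System.out.println("initially queued: " + feeder.feed());
        for (JvmQueue q : feeder.queues) {
            System.out.println((q.fast ? "fast" : "slow") + " queue weight: " + q.weightSum);
        }
    }
}
```

The cap is what keeps the tail of the run balanced: each queue holds just enough work to stay busy, so leftover tests stay in the shared backlog and can go to whichever JVM frees up.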

It’s a fun little project to work on anyway. I set it aside when I started 
pushing every test in a standard run to a max of 10 seconds and saw you could 
even keep pushing that down from there, so this workaround became less 
interesting without practical use at the time. Otherwise, I found it to be 
enjoyable work. The Gradle code already has the annoying stuff figured out, 
so it’s just moving those blocks around, dealing with a couple issues due to 
the change in approach, and mostly just playing with some fun ideas around 
keeping the JVMs busy for the duration of the run.

Mark Miller

On October 9, 2022 at 6:01 GMT, Shawn Heisey <[email protected]> wrote:

I'm repeatedly running "./gradlew check
-Pvalidation.git.failOnModified=false" to find problems with the commit
that I am contemplating.

I noticed that as the build system gets closer to the end of the test
run, multiple threads go idle.

I suspect that the way the build system is allocating tests to threads
is just an even split of all tests at the beginning, then each thread
processes the list of tests it has been given in sequence. As the run
proceeds, threads go idle and are no longer given work. I've got a
server with 12 real CPU cores, so I get a lot of threads by default from
the build system.

What I am hoping we can do is have it instead queue up the list of tests
and assign the next test to a thread that has gone idle. That way all
the threads will be occupied longer and will likely complete faster.
And when test threads begin staying idle, I will be able to see exactly
how many tests are left to execute. Short running tests will be a lot
less likely to be waiting for a longer test to finish.
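
To make the difference between the two allocation schemes concrete, here is a small Java simulation, offered as a sketch only. It does not reflect how Gradle actually distributes tests: the up-front split is modeled as round-robin, which is an assumption, and the class name, method names, and durations are all hypothetical.

```java
import java.util.PriorityQueue;

// Hypothetical simulation: compares total run time under two allocation schemes.
class SchedulingComparison {
    // Up-front split: test i is assigned to worker i % workers before anything runs.
    // The run ends when the most heavily loaded worker finishes.
    static long staticSplit(long[] durations, int workers) {
        long[] load = new long[workers];
        for (int i = 0; i < durations.length; i++) {
            load[i % workers] += durations[i];
        }
        long max = 0;
        for (long l : load) max = Math.max(max, l);
        return max;
    }

    // Shared queue: each test, in order, goes to whichever worker becomes idle first.
    static long sharedQueue(long[] durations, int workers) {
        PriorityQueue<Long> freeAt = new PriorityQueue<>(); // time each worker goes idle
        for (int i = 0; i < workers; i++) freeAt.add(0L);
        long finish = 0;
        for (long d : durations) {
            long end = freeAt.poll() + d; // earliest idle worker picks up the test
            freeAt.add(end);
            finish = Math.max(finish, end);
        }
        return finish;
    }

    public static void main(String[] args) {
        long[] durations = {10, 1, 1, 1, 10, 1, 1, 1}; // made-up test durations
        System.out.println("static split: " + staticSplit(durations, 2));
        System.out.println("shared queue: " + sharedQueue(durations, 2));
    }
}
```

With these made-up durations and two workers, the up-front split puts both long tests on the same worker, while the shared queue lets the other worker absorb the short ones.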

Right now I am looking at a check run that has been going for 56
minutes. Only one thread is running tests, and that thread has been
"Executing test
org.apache...api.collections.CollectionTooManyReplicasTest" for quite a
while and right now I have no idea how many more tests that thread has
left to run. I am curious whether the test system has an absolute
timeout for individual test classes. With strace, I am seeing activity
that looks like the test is awaiting a condition that will probably
never arrive. I haven't looked at the code. I'm going to cancel this
run and start it again.

https://www.dropbox.com/s/usd0aj5m7w66csp/test_run_gradlew_check_SOLR-8803.png?dl=0

Is that queuing idea too difficult to implement?

Thanks,
Shawn
