I wrote one a year or two ago. It's a little tricky here and there, but not too bad. I just made a second test task and used the Gradle code from the test task as building blocks. You just need a little strategy for how to keep the JVMs full. You could probably steal the old Ant build's strategy.

I did something like adding a test queue for each JVM. A configurable number of JVMs could be set as fast and the rest were slow. I also added another test annotation called slowest. I assigned a configurable weight to each test based on no annotation, slow, or slowest. Then I separated out the slowest tests and started feeding both test groups to the JVM queues (slowest, and then slow and no annotation). Queues would only allow another test entry if the weight sum on the queue was under a configurable value, attempting to keep a sufficient number of tests queued, but no more than that. The slowest tests would go to the slow JVMs and the rest would go to the fast JVMs, until the slowest tests were all queued; then the rest of the tests would open up to going to any JVM queue. If the queues were full, the test feeder would just wait until there was room on a queue.

This way, a JVM always had tests lined up to come in right away next, but not lined up so much that at the end you'd have some JVMs with a bunch in their queue and some JVMs sitting idle with nothing in their queue. With the right config, you'd get full use of your hardware till almost the bitter end. I had 32 cores though; if you have a lot less or are not as interested in peak hardware usage, much simpler strategies would be just fine.
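To make the queueing rules above concrete, here is a rough single-threaded sketch of the feeder logic. It is not the code I wrote back then, and it is not Gradle API: the class and method names (`WeightedTestFeeder`, `feed`, `take`), the choice of the least-loaded eligible queue, and dropping a test's weight from the sum when the JVM takes it are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of weight-bounded per-JVM test queues; not actual Gradle code. */
class WeightedTestFeeder {
    enum Kind { NORMAL, SLOW, SLOWEST }

    static final class TestCase {
        final String name;
        final Kind kind;
        TestCase(String name, Kind kind) { this.name = name; this.kind = kind; }
    }

    /** One queue per forked JVM, flagged as belonging to a "slow" or "fast" JVM. */
    static final class JvmQueue {
        final boolean slowJvm;
        final Deque<TestCase> pending = new ArrayDeque<>();
        int weightSum = 0;
        JvmQueue(boolean slowJvm) { this.slowJvm = slowJvm; }
    }

    final int[] weights;          // configurable weight per Kind, indexed by ordinal()
    final int maxQueueWeight;     // a queue accepts entries only while its sum is under this
    final List<JvmQueue> queues = new ArrayList<>();
    final Deque<TestCase> slowest = new ArrayDeque<>();
    final Deque<TestCase> rest = new ArrayDeque<>();   // slow first, then unannotated

    WeightedTestFeeder(int fastJvms, int slowJvms, int maxQueueWeight,
                       int[] weights, List<TestCase> tests) {
        this.maxQueueWeight = maxQueueWeight;
        this.weights = weights;
        for (int i = 0; i < fastJvms; i++) queues.add(new JvmQueue(false));
        for (int i = 0; i < slowJvms; i++) queues.add(new JvmQueue(true));
        for (TestCase t : tests) if (t.kind == Kind.SLOWEST) slowest.add(t);
        for (TestCase t : tests) if (t.kind == Kind.SLOW) rest.add(t);
        for (TestCase t : tests) if (t.kind == Kind.NORMAL) rest.add(t);
    }

    /** Places as many tests as currently fit; returns how many were placed. */
    int feed() {
        int placed = 0;
        boolean progress = true;
        while (progress) {
            progress = false;
            // Slowest tests only ever go to slow JVMs.
            if (!slowest.isEmpty() && offer(slowest, true, false)) { placed++; progress = true; }
            // Other tests go to fast JVMs until every slowest test is queued, then to any JVM.
            if (!rest.isEmpty() && offer(rest, false, slowest.isEmpty())) { placed++; progress = true; }
        }
        return placed;
    }

    /** Moves the head of source to the least-loaded eligible queue that is under the cap. */
    private boolean offer(Deque<TestCase> source, boolean wantSlowJvm, boolean anyJvm) {
        JvmQueue best = null;
        for (JvmQueue q : queues) {
            boolean eligible = anyJvm || q.slowJvm == wantSlowJvm;
            if (eligible && q.weightSum < maxQueueWeight
                    && (best == null || q.weightSum < best.weightSum)) {
                best = q;
            }
        }
        if (best == null) return false;   // all eligible queues are full; caller waits and retries
        TestCase t = source.poll();
        best.pending.add(t);
        best.weightSum += weights[t.kind.ordinal()];
        return true;
    }

    /** Called when a JVM is ready for its next test; returns null if its queue is empty. */
    TestCase take(JvmQueue q) {
        TestCase t = q.pending.poll();
        if (t != null) q.weightSum -= weights[t.kind.ordinal()];
        return t;
    }

    public static void main(String[] args) {
        List<TestCase> tests = List.of(
                new TestCase("A", Kind.NORMAL), new TestCase("B", Kind.SLOWEST),
                new TestCase("C", Kind.SLOW), new TestCase("D", Kind.NORMAL),
                new TestCase("E", Kind.SLOWEST));
        // One fast JVM, one slow JVM, queue cap 4, weights: none=1, slow=2, slowest=4.
        WeightedTestFeeder feeder = new WeightedTestFeeder(1, 1, 4, new int[] {1, 2, 4}, tests);
        System.out.println("placed in first pass: " + feeder.feed());
        feeder.take(feeder.queues.get(1));   // the slow JVM starts its first test
        System.out.println("placed after one take: " + feeder.feed());
    }
}
```

In a real run each JVM would call something like `take` from its own worker, and the feeder would block and call `feed` again whenever a queue frees up; the cap is the knob that trades a little queueing latency against idle JVMs at the end of the run.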
It's a fun little project to work on anyway. I set it aside when I started pushing every test in a standard run to a max of 10 seconds and saw you could even keep pushing that down from there, so this workaround became less interesting without practical use at the time. Otherwise, I found it to be enjoyable work. The Gradle code already has the annoying stuff figured out, so it's just moving those blocks around, dealing with a couple of issues due to the change in approach, and mostly just playing with some fun ideas around keeping the JVMs busy for the duration of the run.

Mark Miller

On October 9, 2022 at 6:01 GMT, Shawn Heisey <[email protected]> wrote:

I'm repeatedly running "./gradlew check -Pvalidation.git.failOnModified=false" to find problems with the commit that I am contemplating.

I noticed that as the build system gets closer to the end of the test run, multiple threads go idle. I suspect that the way the build system is allocating tests to threads is just an even split of all tests at the beginning, then each thread processes the list of tests it has been given in sequence. As the run proceeds, threads go idle and are no longer given work. I've got a server with 12 real CPU cores, so I get a lot of threads by default from the build system.

What I am hoping we can do is have it instead queue up the list of tests and assign the next test to a thread that has gone idle. That way all the threads will be occupied longer and the run will likely complete faster. And when test threads begin staying idle, I will be able to see exactly how many tests are left to execute. Short-running tests will be a lot less likely to be waiting for a longer test to finish.

Right now I am looking at a check run that has been going for 56 minutes.
Only one thread is running tests, and that thread has been "Executing test org.apache...api.collections.CollectionTooManyReplicasTest" for quite a while, and right now I have no idea how many more tests that thread has left to run.

I am curious whether the test system has an absolute timeout for individual test classes. With strace, I am seeing activity that looks like the test is awaiting a condition that will probably never arrive. I haven't looked at the code. I'm going to cancel this run and start it again.

https://www.dropbox.com/s/usd0aj5m7w66csp/test_run_gradlew_check_SOLR-8803.png?dl=0

Is that queuing idea too difficult to implement?

Thanks,
Shawn
