Cassandra's build.xml supports parallel test runners. This functionality is available through `-Dtest.runners` and the `testparallel` ant macro.
It has always been there, but it hasn't been active recently because both ci-cassandra and circleci call testclasslist instead of test. Recently testclasslist was updated to enable multiple runners too, and since then we have seen a lot more test failures.

The distributed in-jvm tests just don't work with parallel runners, and currently they need `-Dtest.runners=1` specified to work. There are also plenty of flaky tests where tests use fixed ports (StorageServiceServerTest), byteman (e.g. BMUnitRunner), or conf files on disk.

From here, I can see two ways forward:
 a) fix everything to be parallel ready, or
 b) remove test.runners and parallelise with docker instead.

All in all, I think it is a bit odd to do (a) when docker is readily available, especially on the CI servers, where we are concerned about build times.

For (b), an example patch that removes everything related to 'testparallel' and 'test.runners' from the build.xml is here:
https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/16587-2/trunk

Replacing 'ant task parallelism' with docker containers would then be done something like this:
https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16587-2/trunk

(This is just a quick PoC, aimed at the ci-cassandra agents that have 4 cores and 16 GB of RAM available to each executor. I imagine instead something that spawns a number of containers based on system resources, like we currently do with get-cores and get-mem.)

Also worth noting is the overhead here: compared with the ant approach, docker builds everything from scratch in each container. This too can be improved easily enough.

What are folks' opinions?
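
To make the "spawn containers based on system resources" idea a little more concrete, here is a rough sketch of the sizing arithmetic only. It is not what the PoC does: the function names and the per-container figures (2 cores, 4 GB) are placeholders I made up for illustration, and the inputs stand in for whatever get-cores and get-mem report.

```python
def container_count(total_cores, total_mem_gb,
                    cores_per_container=2, mem_gb_per_container=4):
    """Return how many test containers fit on one executor.

    The per-container defaults are hypothetical placeholders,
    not values taken from cassandra-builds.
    """
    by_cores = total_cores // cores_per_container
    by_mem = total_mem_gb // mem_gb_per_container
    # Whichever resource runs out first sets the limit; always run at least one.
    return max(1, min(by_cores, by_mem))


def split_round_robin(test_classes, n):
    """Deal test classes out to n containers, one list per container."""
    return [test_classes[i::n] for i in range(n)]


if __name__ == "__main__":
    # A ci-cassandra executor as described above: 4 cores, 16 GB.
    n = container_count(total_cores=4, total_mem_gb=16)
    for idx, chunk in enumerate(split_round_robin(["A", "B", "C", "D", "E"], n)):
        print(idx, chunk)
```

Each resulting list could then be handed to one container as its own testclasslist input, so no two runners share ports or conf files on disk.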