While I'm certain we could push all of these tests through to support
parallelism, I think it would require continual work: there is a class
of tests that won't always work under concurrency, and the breakage
won't be obvious until the damage is done.
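
To make that class of failure concrete, here is a minimal sketch of the
fixed-port case from the list quoted below. It is not taken from
StorageServiceServerTest; the class and method names are hypothetical.
It shows that a second listener on an already-bound port is rejected,
while asking the OS for port 0 gives each caller its own free port.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical illustration only; not code from the Cassandra test suite.
public class PortCollisionSketch {

    // Port 0 asks the OS to pick any free ephemeral port.
    static ServerSocket bind(int port) throws IOException {
        return new ServerSocket(port);
    }

    // Simulates two runners wanting the same fixed port:
    // the second bind should fail while the first listener is open.
    static boolean secondBindFails() throws IOException {
        try (ServerSocket first = bind(0)) {
            int port = first.getLocalPort();
            try (ServerSocket second = bind(port)) {
                return false; // both listeners coexisted (not expected)
            } catch (BindException e) {
                return true;  // the usual "address already in use" failure
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("second bind on same port fails: " + secondBindFails());
        try (ServerSocket a = bind(0); ServerSocket b = bind(0)) {
            System.out.println("ephemeral ports: " + a.getLocalPort() + ", " + b.getLocalPort());
        }
    }
}
```

The catch is that a test like this only fails when two runners happen
to overlap, which is why it tends to show up as a flaky rather than as
a hard failure.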

I'm +1 on punting to docker to parallelize.
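
For what it's worth, the docker route mostly comes down to splitting
the test class list into one shard per container and running each
shard with a single runner. A rough sketch of that split, assuming a
plain round-robin assignment (the class name, method name, and test
names are all hypothetical, and this is not what the PoC patch does):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of cassandra-builds.
public class TestShardSketch {

    // Round-robin split of test class names into `shards` lists,
    // one list per container.
    static List<List<String>> split(List<String> classes, int shards) {
        if (shards < 1) {
            throw new IllegalArgumentException("shards must be >= 1");
        }
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < shards; i++) {
            out.add(new ArrayList<>());
        }
        for (int i = 0; i < classes.size(); i++) {
            out.get(i % shards).add(classes.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        // Placeholder names standing in for real test classes.
        List<String> classes = Arrays.asList("ATest", "BTest", "CTest", "DTest", "ETest");
        List<List<String>> shards = split(classes, 2);
        for (int i = 0; i < shards.size(); i++) {
            System.out.println("container " + i + ": " + shards.get(i));
        }
    }
}
```

Since each container has its own ports and filesystem, the tests in a
shard never see the tests in another shard, which is the isolation the
ant runners can't give us.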

On Mon, Apr 12, 2021 at 1:17 PM Mick Semb Wever <m...@apache.org> wrote:
>
> Cassandra's build.xml supports parallel test runners. This
> functionality is available through `-Dtest.runners` and the
> `testparallel` ant macro.
>
> It's always been there, but hasn't been active recently since both
> ci-cassandra and circleci call testclasslist instead of test.
>
> Recently testclasslist was updated to enable multiple runners too.
> Since then we have seen a lot more test failures… The distributed
> in-jvm tests just don't work with parallel runners, and currently
> need `-Dtest.runners=1` specified to work. There are also plenty of
> flaky tests: those that use fixed ports (StorageServiceServerTest) or
> byteman (e.g. BMUnitRunner), and those that touch conf files on disk.
>
> From here, I can see two ways forward, a) fix everything to be
> parallel ready or b) remove test.runners and parallelise with docker
> instead.
>
> All in all, I think it is kinda odd to do (a) when docker is readily
> available, especially on the CI servers where we are concerned about
> build times.
>
> For (b)… to remove everything related to 'testparallel' and
> 'test.runners' from the build.xml an example patch is here:
> https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/16587-2/trunk
>
> Then replacing 'ant task parallelism' with docker containers would be
> done something like this:
> https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16587-2/trunk
> (this is just a quick PoC, aimed at the ci-cassandra agents that have
> 4 cores and 16gb ram available to each executor, but I imagine instead
> something that spawns a number of containers based on system
> resources, like we currently do with get-cores and get-mem). Also
> worth noting is the overhead here: compared with the ant approach,
> docker builds everything in each container from scratch, but this
> too can be improved easily enough.
>
> What are folks' opinions?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
