[ https://issues.apache.org/jira/browse/CASSANDRA-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324852#comment-17324852 ]
Michael Semb Wever commented on CASSANDRA-16604:
------------------------------------------------
Patch at:
- https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16604
The patch:
- for most test types: creates inner splits (run by docker containers) based
on available CPU and memory,
- retries {{git clone …}} commands (a common failure on ci-cassandra is git
timeouts),
- replaces any gitbox git clones with github (which times out less),
- tidies up where/how the code is built before running tests (can shave off a
minute from container runs),
- removes any remaining occurrences of {{-Dtest.runners=1}},
- requires the test scripts to be executed with {{bash}} instead of {{sh}}.
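
As a rough illustration of the first two points (this is not the code in the
patch, which is bash), the idea can be sketched in Python. All names here
(inner_split_count, retry, git_clone) and the per-split requirements of
2 cores and 8 GB are hypothetical values chosen for the example; the patch's
actual thresholds and retry counts may differ.

```python
import subprocess
import time


def inner_split_count(cores, mem_gb, cores_per_split=2, mem_gb_per_split=8):
    """How many containers fit in the available CPU and memory (at least 1).

    The per-split defaults are assumptions made for this sketch.
    """
    by_cpu = cores // cores_per_split
    by_mem = mem_gb // mem_gb_per_split
    # The scarcer resource decides how many containers can run side by side.
    return max(1, min(by_cpu, by_mem))


def retry(action, attempts=3, delay_s=5.0):
    """Call action() until it succeeds or attempts run out.

    Re-raises the last error if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay_s)


def git_clone(url, dest):
    # check=True raises CalledProcessError on a non-zero exit status,
    # which retry() treats as a failed attempt.
    return retry(lambda: subprocess.run(["git", "clone", url, dest], check=True))
```

With these assumed defaults, inner_split_count(4, 16) returns 2, i.e. two
containers on an agent with 4 cores and 16 GB available to the executor.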
Basic CI:
- https://ci-cassandra.apache.org/job/Cassandra-devbranch-test-parallel/6/
- reports one quarter less build time (with a parallelism of two docker
containers). Further time can be saved by using local docker images and keeping
images updated with the latest {{~/.m2/repository/}}.
Before committing, more extensive CI is required:
- every test target, every release branch, every arch (amd and arm)
Side note… this type of work remains limited in testability, and is leading to
a fair amount of churn in the cassandra-builds repository. Local Jenkins
testing, and copying temporary Jenkins jobs in ci-cassandra.a.o, have been the
primary approach so far. But a more robust, repeatable, accessible approach
would be to create a test pipeline script using the Jenkins k8s operator. For
those with access to a k8s cluster, this would make it possible to set up
Jenkins from scratch, run a CI pipeline, and tear it down, all from a single
command line.
More info:
https://jenkinsci.github.io/kubernetes-operator/docs/getting-started/latest/deploy-jenkins/
> Parallelise docker container runs for tests in ci-cassandra.a.o
> ---------------------------------------------------------------
>
> Key: CASSANDRA-16604
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16604
> Project: Cassandra
> Issue Type: Task
> Components: Test/unit
> Reporter: Michael Semb Wever
> Assignee: Michael Semb Wever
> Priority: Normal
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0.x
>
>
> This was raised on the dev ML, where the consensus was to remove it:
> https://lists.apache.org/thread.html/r1ca3c72b90fa6c57c1cb7dcd02a44221dcca991fe7392abd8c29fe95%40%3Cdev.cassandra.apache.org%3E
> The idea is to then replace ant test parallelism with docker container
> parallelism.
> PoC patch:
> https://github.com/apache/cassandra-builds/compare/trunk...thelastpickle:mck/16587-2/trunk
> This is just a quick PoC, aimed at the ci-cassandra agents that have
> 4 cores and 16gb ram available to each executor, but I imagine instead
> something that spawns a number of containers based on system
> resources, like we currently do with get-cores and get-mem.
> Also worth noting is the overhead here: compared with the ant parallelism
> approach, docker builds everything in each container from scratch, but this
> too can be improved easily enough.
> Cleaning up any remnant `-Dtest.runners=` options is also part of this ticket.