Thank you, Mick, for all the work you are doing on the C* CI and beyond! I am especially excited about the addition of the upgrade tests. It’s also great that things got documented.
Best regards,
Ekaterina

On Sat, 8 Aug 2020 at 6:41, Mick Semb Wever <m...@apache.org> wrote:

> The following is a summary of changes, status, and suggestions to our
> community CI, ci-cassandra.apache.org
> Please reply with questions, as well as any input on CircleCI status anyone
> has to offer.
>
> This post will touch on…
> * Upgrade Tests
> * Build Times and Improvements
> * Pre Commit Builds
> * Stand-alone Pipeline Runs
> * Standardising our CI Build Scripts
> * JDK11
> * Nightly Build Artefacts
> * CI Documentation
> * 4.0 QA Status
>
>
> ** Upgrade Tests
>
> Both in-jvm and normal upgrade tests have been added to
> ci-cassandra.apache.org
>
> Those in-jvm upgrade tests have been included in the pipeline builds. The
> normal upgrade tests currently remain stand-alone. Trunk’s version is found
> here https://ci-cassandra.apache.org/job/Cassandra-trunk-dtest-upgrade/
>
>
> ** Build Times and Improvements
>
> DTests have been parallelised. By default DTest jobs are divided into 64
> splits now, with the exception of the Large DTest jobs which due to having
> far fewer tests have only 8 splits.
>
> This brings dtest runs from ~12 hours down to ~45 minutes. It brings whole
> pipeline builds from ~14 hours down to ~2.5 hours. Some patch (devbranch)
> builds have completed in 90 minutes.
>
> For more speed we can split more, but ci-cassandra currently has 36 agents
> (72 executors) and is now often saturated and build queues large. We can
> also look into Unit Tests which are only ever using one runner on both
> ci-cassandra and circleci. A problem here is that using more runners breaks
> some of the unit tests. Another possible improvement is to avoid the
> compiling in all the test stages, by re-using the built artefacts in the
> beginning of each pipeline run.
>
>
> ** Pre Commit Builds
>
> ci-cassandra.apache.org is less frequently used for pre-commit builds, for
> reasons of resource limits and access reserved to committers.
> More information on this can be found in
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=153815764
>
> Some trusted contributors have been given Jenkins API tokens, along with
> using the script found here
> https://the-asf.slack.com/archives/C0162JU2CKY/p1595703740008400
>
> These tokens do rotate, and how/when still remains a little unclear.
> Committers can generate tokens via their Jenkins profile pages. There’s
> also investigation as to whether we can create jenkins accounts with build
> permissions for trusted contributors. Again with the resources available,
> especially in contrast to additional testing that will probably appear in
> the 4.0 beta testing phase, we are restricted to what is feasible here.
>
> Notifications for all devbranch pipeline builds now go to
> #cassandra-builds-patches. These report the build url, the commit SHA and
> message, and the patch repository.
>
>
> ** Stand-alone Pipeline Runs
>
> It has been raised a number of times that it would be great if users
> (companies) could run the complete pipeline on their own resources/cloud,
> from a single command line, including the setup and teardown of the jenkins
> platform. This would be a big win for the community with standardised
> testing and test reports to share, and helping to test all the possible
> configuration combinations. There is some work involved to do this
> but we appear to be moving in that direction anyway. For it to happen
> all the stage jobs inside the pipeline need to be moved, from being generated
> in the dsl script, to being defined in the in-tree Jenkinsfile.
>
>
> ** Standardising our CI Build Scripts
>
> Today we have a lot of duplication of build scripts. Those in
> cassandra-builds/build-scripts/ and those embedded into each of the
> circleci config files in-tree.
>
> I would like to suggest we move the build-scripts in-tree, and start
> migrating circleci to re-use the same build scripts.
> There are differences
> from how test lists are split (round-robin `split` to circleci timings
> based splitting) to how parallelisation works (circle’s containers vs the
> jenkins matrix plugin), but I suspect by focusing on the easy stuff there’s
> a lot that can be standardised.
>
>
> ** JDK11
>
> JDK11 builds have been contributed, thanks to Shylaja. Trunk’s pipeline now
> builds both JDK 8 and 11 artefacts.
>
> Adding JDK11 test runs hit a hurdle with how the JDK labels are named and
> our tests are not friendly with directory names containing spaces.
>
>
> ** Nightly Build Artefacts
>
> Build artefacts of Tarballs, as well as Debian and RedHat packages, are
> attached to the artefact stages inside each pipeline, for both JDK8 and
> JDK11 builds.
>
> To download the latest successful JDK8 build of these, use the following
> links.
> Trunk:
> https://ci-cassandra.apache.org/job/Cassandra-trunk-artifacts/jdk=JDK%201.8%20(latest),label=cassandra/lastSuccessfulBuild/artifact/
> 3.11:
> https://ci-cassandra.apache.org/job/Cassandra-3.11-artifacts/jdk=JDK%201.8%20(latest),label=cassandra/lastSuccessfulBuild/artifact/
> 3.0:
> https://ci-cassandra.apache.org/job/Cassandra-3.0-artifacts/jdk=JDK%201.8%20(latest),label=cassandra/lastSuccessfulBuild/artifact/
> 2.2:
> https://ci-cassandra.apache.org/job/Cassandra-2.2-artifacts/jdk=JDK%201.8%20(latest),label=cassandra/lastSuccessfulBuild/artifact/
>
> Please note that these artefacts come with no guarantees at all, and should
> only be used for development purposes.
>
>
> ** CI Documentation
>
> There’s a draft page put together for our CI systems at
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=153815764
>
> It is work-in-progress, and it needs input from CircleCI users. Further
> discussion and validation is required before this page can come out of
> draft status.
>
>
> ** 4.0 QA Status
>
> Trunk is currently hovering around 10 failures for each pipeline run.
> There has been a doubling of failures in the past week. Maybe a bad commit,
> maybe much higher utilisation of ci-cassandra exposing more flakies. This is
> still an amazing improvement and effort from everyone considering we have
> ~18k tests.
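
As a side note on the round-robin splitting of test lists mentioned above under "Standardising our CI Build Scripts": the following is only a minimal sketch of the general idea, not the actual logic in cassandra-builds or in the circleci configs. The function name `round_robin_split`, its parameters, and the test names are hypothetical and chosen for illustration.

```python
from typing import List


def round_robin_split(tests: List[str], total_splits: int, split_index: int) -> List[str]:
    """Return the tests assigned to one split under round-robin distribution.

    split_index is 1-based. This is an illustrative assumption about how a
    round-robin split could work, not the real CI implementation.
    """
    if total_splits < 1:
        raise ValueError("total_splits must be at least 1")
    if not 1 <= split_index <= total_splits:
        raise ValueError("split_index must be between 1 and total_splits")
    # The test at 0-based position i lands in split (i % total_splits) + 1,
    # so each split takes every total_splits-th test from its own offset.
    return tests[split_index - 1::total_splits]


if __name__ == "__main__":
    # Hypothetical test names, for illustration only.
    tests = [f"test_{i:03d}" for i in range(10)]
    for index in range(1, 4):
        print(index, round_robin_split(tests, 3, index))
```

With the default of 64 splits described above, each job would ask for its own index out of 64. Round-robin keeps the splits equal in size to within one test, but it ignores how long each test takes, which is where timings-based splitting differs.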