💯! Amazing work - thanks so much for posting the details, Mick, and Josh
is right on. Kinda bummed I haven't been following C* CI dev, being more on
the ops side lately. Posting this up has me intrigued, so I may just have
to go poke around some and scratch an itch :)

Warm regards,
Michael


On Sun, Apr 28, 2024 at 9:08 PM Josh McKenzie <jmcken...@apache.org> wrote:

> A huge amount of work and time went into this and it's going to have a big
> impact on the project. I want to offer a heartfelt thanks to all involved
> for the focus and energy that went into this!
>
> As the author of the system David lovingly dubbed "JoshCI" (/sigh), I
> definitely want to see us all move to converge as much as possible on the
> CI code we're running. While I remain convinced something like
> CASSANDRA-18731 is vital for hygiene in the long run (unit testing our CI,
> declaratively defining atoms of build logic independently from flow), I
> also think there'd be significant value in more of us moving towards using
> the JenkinsFile where at all possible.
>
> Seriously - thanks again for all this work everyone. CI on Cassandra is a
> Big Data Problem, and not an easy one.
>
> On Sun, Apr 28, 2024, at 10:22 AM, Mick Semb Wever wrote:
>
>
> Good news.
>
> CI on 5.0 and trunk is working again, after an unexpected six-week
> hiatus (and a string of general problems since last year).
> This includes pre-commit for 5.0 and trunk working again.
>
>
> More info…
>
> From 5.0 we now have in-tree a Jenkinsfile that only relies on the in-tree
> scripts – it does not depend upon cassandra-builds and all the individual
> DSL-created stage jobs. This aligns how pre-commit and post-commit work.
> More importantly, it makes our CI repeatable regardless of the fork/branch
> of the code, or the jenkins installation.
>
> For 5.0+ pre-commit, use the Cassandra-devbranch-5 job and make sure your
> patch is after sha 3c85def.
> The jenkinsfile now comes with pre-defined profiles; it's recommended to
> use "skinny" until you need the final "pre-commit".  You can also use the
> custom profile with a regexp when you need just specific test types.
> See https://ci-cassandra.apache.org/job/Cassandra-devbranch-5/build
>
> For pre-commit on older branches, you now use Cassandra-devbranch-before-5.
>
> For both pre- and post-commit builds, each build now creates two new
> sharable artefacts: ci_summary.html and results_details.tar.xz.
> These are based on what Apple contributors were sharing from builds from
> their internal CI system.  The format and contents of these files are
> expected to evolve.
>
> Each build now archives all its results and logs under one location in
> nightlies.
>
> e.g. https://nightlies.apache.org/cassandra/Cassandra-5.0/227/
>
>
>
> The post-commit pipeline profile remains *very* heavy, at 130k+ tests.
> These were previously ramped up to include everything in their pipelines,
> given everything that's happening in both branches.   So they take time and
> saturate everything they touch.  We need to re-evaluate what we need to be
> testing to alleviate this.  There'll also be a new pattern of timeouts and
> infra/script-related flakies, as happens whenever there's such a
> significant change.  All the patience and help possible is appreciated!
>
>
>
> Now that the jenkinsfile can be used on any jenkins server for any
> fork/branch, the next work-in-progress is CASSANDRA-18145, to be able to
> run the full pipeline with a single command line (given a k8s context
> (~/.kube/config)).
>
> We already have most of this working – it's possible to make a clone of
> ci-cassandra.apache.org on k8s using this wip helm chart:
> https://github.com/thelastpickle/Cassius
> And we are already using this on an auto-scaling GKE k8s cluster – you
> might have seen me posting the ci_summary.html and results_details.tar.xz
> files to tickets for pre-commit CI instead of using the ci-cassandra.a.o or
> circleci pre-commit links.  Already, we have a full pipeline time down to
> two hours and less than a third of the cost of CircleCI, and there's
> low-hanging fruit to further improve this.  For serious pre-commit testing
> we still need repeatable test runs, ref CASSANDRA-18942.  On all this I'd like
> to give a special shout out to Aleksandr Volochnev who was instrumental in
> the final (and helm based) work of 18145 which was needed to be able to
> test its prerequisite ticket CASSANDRA-18594 – ci-cassandra.a.o would not
> be running again today without his recent time spent on it.
>
> On a separate note, this new jenkinsfile is designed in preparation for
> CASSANDRA-18731 ('Add declarative root CI structure'), to make it easier to
> define profiles, tests, and their infrastructural requirements.
>
>
> To the community…
>   We are now in a place where we are looking for and requesting further
> donations of servers to the ci-cassandra.apache.org jenkins cluster.  We
> can now also use cloud/instance credits to host auto-scaling k8s-based
> ci-cassandra.a.o clones that would be available for community pre-commit
> testing.
>   There's plenty of low-hanging fruit improvements available if folk want
> to get involved.  Performance and throughput of splits is an important area
> as it has a big impact on reducing costs of a whole pipeline run  (there's
> nothing like knowing you saved another $5 every time you clicked a
> button).  And if you can just start using the in-tree test scripts more,
> that helps a lot.
>
>
>
>
>
