Chained builds

Aaron Digulla Tue, 19 Mar 2019 03:48:34 -0700

Hello,

I'd like to set up chained builds. I understand chained builds as "multiple
projects which depend on each other and where changes have been pushed to
the same branch in each."

The typical case here is the master branch. People supply features.
Eventually, they are merged to the master branch. Now, all downstream
builds should run if they

1. have the same branch
2. depend on the same Maven version as the project just built

If I have a logical chain of projects A -> B -> C ("->" == "depends on")
and "C" is modified, I want to build B and then A.
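
To make the intended ordering concrete, here is a minimal Python sketch. It
assumes the dependency graph is available as a plain dictionary; the names
DEPENDS_ON and downstream_build_order are made up for illustration and are
not part of Jenkins or Maven.

```python
from collections import deque

# Hypothetical dependency map: project -> projects it depends on.
DEPENDS_ON = {"A": ["B"], "B": ["C"], "C": []}


def downstream_build_order(changed, depends_on):
    """Return the projects to rebuild after `changed`, upstream first."""
    # Invert the map: project -> projects that depend on it directly.
    dependents = {name: [] for name in depends_on}
    for project, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(project)

    # Collect everything reachable from `changed` through its dependents.
    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)

    # Order the affected projects so that each one comes after every
    # affected project it depends on (Kahn's algorithm).
    remaining = {
        p: {d for d in depends_on.get(p, []) if d in affected}
        for p in affected
    }
    order = []
    while remaining:
        ready = sorted(p for p, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle detected")
        for project in ready:
            order.append(project)
            del remaining[project]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order


print(downstream_build_order("C", DEPENDS_ON))  # ['B', 'A']
```

The sketch only computes the order; waiting for B to finish before A starts
would still have to be enforced by whatever triggers the builds.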

This includes:

- A should wait for the build of B since there is a chance that it might
fail otherwise
- A should only build when it has a SNAPSHOT dependency on C. If I have a
release 1.0 of C and 2.0-SNAPSHOT in the master branch but A depends on C
1.0, no build is necessary (but wouldn't hurt)
- Builds should not rely on some global Maven repo.
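
The SNAPSHOT rule from the list above could be checked roughly like this.
The sketch assumes the version is written literally in the <dependencies>
section of the POM; versions coming from properties, a parent POM or
dependencyManagement are not handled. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by Maven 4.0.0 POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def declared_version(pom_xml, group_id, artifact_id):
    """Return the version a POM declares for one dependency, or None."""
    root = ET.fromstring(pom_xml)
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        same_group = dep.findtext("m:groupId", namespaces=POM_NS) == group_id
        same_artifact = (
            dep.findtext("m:artifactId", namespaces=POM_NS) == artifact_id
        )
        if same_group and same_artifact:
            return dep.findtext("m:version", namespaces=POM_NS)
    return None


def needs_rebuild(pom_xml, group_id, artifact_id, built_version):
    """True if the POM depends on exactly the SNAPSHOT that was just built."""
    version = declared_version(pom_xml, group_id, artifact_id)
    return (
        version is not None
        and version.endswith("-SNAPSHOT")
        and version == built_version
    )
```

With C at 2.0-SNAPSHOT on master, a POM that declares C 1.0 yields False
and one that declares C 2.0-SNAPSHOT yields True.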

Reasons for the last point:

A global Maven repo is very much like a global variable. Changes there
always have side effects.

If a SNAPSHOT is pushed to a global Maven repo, everyone in the company
will get this new version first thing in the morning (first Maven build of
the day, when it updates SNAPSHOT dependencies). That can cause all kinds
of weird problems. So I'm very reluctant to publish SNAPSHOT dependencies
globally.

This becomes worse when feature branches are introduced. When several
feature branches are built concurrently, no one can tell which version ends
up in the global repo. When downstream builds start, they will randomly
fail.

I haven't found a good solution for this last point.

I could create a new Maven repo when the build of C starts and then pass
the path to B. If B gets a path to a Maven repo, it uses it; otherwise, it
creates a new Maven repo, and so on down the chain.
Problem: When the projects build on different nodes, this fails unless I
use a network filesystem, which adds brittleness. I'm also not 100% sure
how Maven handles concurrent access to a local repo.
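
A minimal sketch of the "use the repo I was given, otherwise create one"
idea follows. It relies on Maven's maven.repo.local property to point a
build at a specific local repository; the helper names, the temp directory
prefix and the default goals are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def resolve_repo(upstream_repo=None):
    """Use the repo handed over by the upstream build, else create one."""
    if upstream_repo:
        return Path(upstream_repo)
    # No upstream build passed a repo: start a fresh, empty one.
    return Path(tempfile.mkdtemp(prefix="m2-chain-"))


def maven_command(repo, goals=("clean", "install")):
    """Build a Maven command line that uses `repo` as its local repository."""
    return ["mvn", "--batch-mode", f"-Dmaven.repo.local={repo}", *goals]
```

The returned list could be handed to subprocess.run on the build node. The
sketch does nothing about the shared-filesystem or concurrent-access
concerns mentioned above.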

I could use Jenkins to pass the artifacts around but that means archiving
them on the master and then downloading them on the build node. Archiving is
somewhat slow and a burden on the master node (especially when hundreds of
projects build). But the main problem is to know what to download and how
to get it into the local Maven repo. I guess I could look at the current
job and find the upstream job and then just download all archived artifacts
and try to install them. Not sure whether that would work.

A more serious problem is when I find a problem in B and push a new commit
to the feature branch. Now the build starts with B. If I rely on C creating
the repo for me, the build will fail because the new code from C is not
visible anymore. B will download the latest master build of C from the
global Maven repo and fail. If I use the "copy archived artifacts" approach, I
have the same problem because there is no upstream "C" job anymore.

So I could create a local Maven repo using the branch name. That would help
with feature branches but raise new issues: When can I safely delete those?
If I delete too early (say, every night), starting a build with B will
randomly fail again.
It would also mean that a lot of projects would eventually build into the
repo with the name "master". There would be almost no way to clean that up.
Maybe I can use the global Maven repo for "master" + "release" builds and
local repos for everything else. Then people would have to remember that
when debugging build problems.
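
The "global repo for master and release, local repos for everything else"
split could look roughly like this. The base directory layout, the rule
that release branches start with "release", and the function name are all
assumptions made for the sketch.

```python
import re
from pathlib import Path


def branch_repo(base_dir, project, branch):
    """Return the per-branch local repo path, or None for shared branches.

    None means "do not override anything, use the global setup".
    """
    # Assumption: master and release branches go through the global repo.
    if branch == "master" or branch.startswith("release"):
        return None
    # Branch names such as "feature/login" are not safe directory names.
    safe_branch = re.sub(r"[^A-Za-z0-9._-]+", "_", branch)
    return Path(base_dir) / project / safe_branch
```

This only decides where a repo lives. It does not answer when a branch repo
can be deleted safely.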

One option would be to deny commits to master which contain SNAPSHOT
dependencies (so the project itself could use a SNAPSHOT version but all
the dependencies would have to be releases). That sounds like a good
solution at first but in our case, "A" is a client-specific project. For
some products ("B"), we have 20+ clients. Most of them stay at the latest
release build but a few are part of the next release. It would be a big
overhead to force a release of the product every time a new feature is
integrated into a client. Imagine having to do release builds 2-3 times a
day. We would prefer to have one release build per release cycle of a
product and keep the product and all involved clients at SNAPSHOT for the
whole cycle because that would allow us to notice early when some feature
for client X breaks client Y.

So my final design looks like this:

- Every project has one local Maven repo per branch (somehow shared between
nodes)
- When a build of C succeeds, it triggers B. A waits. B copies the whole
repo of C into its own.
- When B has been built, all the A's copy B's repo into their own and build.

That would allow a chained build to start at any point.
If the Maven repos get corrupted, we can delete them all and trigger a
build of C to recreate them.
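
The copy step in this design can be sketched as follows, assuming both
repos are reachable as ordinary directories from the node running the
build. seed_repo is a hypothetical helper, and dirs_exist_ok requires
Python 3.8 or later.

```python
import shutil
from pathlib import Path


def seed_repo(upstream_repo, own_repo):
    """Copy the upstream project's repo into this project's repo."""
    upstream_repo = Path(upstream_repo)
    own_repo = Path(own_repo)
    own_repo.mkdir(parents=True, exist_ok=True)
    # A missing upstream repo is tolerated so that a chained build can
    # start at this project without a preceding upstream build.
    if upstream_repo.is_dir():
        # Merge into the existing repo; files with the same path are
        # overwritten by the upstream copy.
        shutil.copytree(upstream_repo, own_repo, dirs_exist_ok=True)
    return own_repo
```

B would call this with C's repo before building, and each A would do the
same with B's repo.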

What I don't like here is the massive disk usage. Even for simple projects,
Maven downloads 100-200 MB of code for its plugins. So 100 repos (10
projects with 10 feature branches each) would need 10-20 GB of disk space.
It's also somewhat slow but in my tests, copying 200 MB of Maven repo took
< 10s, so it's bearable.

Also, I'm not sure how to solve "fragmented" chains when there is a feature
branch for A and C but not for B. In this case, the build of C needs to
trigger just A, skipping B. A may depend on the SNAPSHOT version of B from
the master branch.
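
One possible way to express the "skip projects without the branch" rule is
sketched below. It is an untested idea rather than a known solution: the
inputs (a map of direct dependents and the set of branches per project)
and the function name are hypothetical and would have to come from
somewhere, for example the SCM.

```python
def next_to_trigger(built, branch, dependents, branches):
    """Find the nearest downstream projects that have `branch`.

    Projects without the branch are skipped, and the search continues
    with their own dependents.
    """
    triggers = []
    seen = set()
    pending = list(dependents.get(built, []))
    while pending:
        project = pending.pop(0)
        if project in seen:
            continue
        seen.add(project)
        if branch in branches.get(project, ()):
            # This project has the branch: trigger it and stop here,
            # because its own build will trigger what follows.
            triggers.append(project)
        else:
            # No such branch here: look further down the chain.
            pending.extend(dependents.get(project, []))
    return triggers
```

For the chain A -> B -> C with a feature branch on A and C only, a build of
C on that branch gives ["A"], while a build of C on master gives ["B"].
Which version of B the triggered A build resolves is left open here.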

Any comments? Has someone already set up something like this? Does this
work reliably? How can I solve the disk space issue?

Regards,

--
Aaron Digulla
