Hi all,
Today I mean to revive a conversation from a few months ago and
hopefully reach consensus on this.
Abderrahim Kitouni and Valentin David put considerable effort into
this a while back, and I think it's time we closed the loop.
For some background, see the conversation on Valentin's merge request[0]
and Abderrahim's proposal[1] (consider this a re-proposal of
Abderrahim's proposal in my own words, now that I understand it better
than I initially did).
Problem statement
=================
Put briefly, when creating a derived work of a project through a
junction, we find ourselves either duplicating a lot of work, or
implementing some delicate workarounds which are not entirely safe.
This proposal aims to solve this problem elegantly without sacrificing
the encapsulation which allows projects to potentially interact with
clear stable APIs.
Perspective: flexibility vs longevity
-------------------------------------
To put things into perspective: other integration tools allow projects
to interact in a much more flexible manner, letting downstream projects
override fine grained details and fine tune a "body of work" to the
liking of the downstream consumer. This flexibility is very handy in
development stages, but it usually blows up the moment you want to
rebase your downstream work against a new and improved version of the
upstream. Having the ability to override discrete details usually means
those details cannot practically be maintained in an API stable
fashion, causing any rebase to fail miserably.
BuildStream has always taken a hard stance against this flexibility,
favoring encapsulation and catering instead to longer term product life
cycles, providing projects with a means to produce stable APIs so that
their downstreams can rebase more easily.
Yocto is of course a perfectly contrasting example: its layering
system allows one to override any variable with various priorities, and
to prepend, append or replace build instructions with fine granularity.
While this is very useful for its flexibility, it means rebasing your
derived work is extremely painful, as none of these minute details can
provide any stability guarantees moving forward.
Some compromise is needed
-------------------------
BuildStream's answer to this has thus far been that if a downstream
wants a feature done differently, or more specifically, if a downstream
wants the option to have an element built with a different version or
with a different set of configure time switches, they have two options:
* Fork the upstream and do things differently.
* Work with the upstream to add a project option to build things in
  the way they want. If the upstream accepts the feature then it must
  be useful to more than one downstream, and the maintenance effort
  can be shared.
Other hacks exist, like patching junctions, or overwriting shared
libraries in the upstream elements, such that the upstream elements get
linked at runtime against libraries they were not built against (which
can cause subtle and unexpected breakages, and does not work for
statically linked libraries).
None of these other hacks have been prescribed by the BuildStream
community, however; from our perspective, the two choices above are the
ones we prescribe.
Proposal: Allow replacing element declarations across junctions
===============================================================
The proposal is to move forward with a patch much like Valentin's
proposed patch[0], possibly with a minor format tweak, and with a lot
of test cases ensuring we've got it right.
I'll try to illustrate a bit below, but first...
Unaddressed use cases
---------------------
Before proceeding with the proposal, it's important to grasp where the
current options are insufficient in terms of use cases: where do we
fall short? And if we are making a compromise, where is the right place
to draw the line?
I think it's fair to say that if you are going to work with an upstream
BuildStream project and expect to consume upgrades moving forward, then
at least the majority of elements in that upstream are going to be
consumed "as is". In other words, if you are going to very significantly
modify the upstream elements, you are better off forking the project.
I think that on the other hand, if you are creating an appliance for
example, it would be typical to consume tens or hundreds of elements
from various upstream projects, and need only to build 3 or 4 elements
in custom ways. If I had a kiosk-like system, I might have a custom
fork of GTK+ or Qt to build (this has happened to me before). In such a
case I would want to know that all of the reverse dependencies in the
upstream junction were built against my forked platform library.
Of course, we also have the situation of differing versions, which is a
current problem in gnome-build-meta, where we want to be able to build
bleeding edge versions of a selection of elements, some of which reside
across the freedesktop-sdk.bst junction boundary.
Build graph illustration
------------------------
After applying Valentin's patch, a build graph in which an element of a
subproject has been overridden might look like this:
            |
  toplevel  |
  project   |
            |   app.bst       gtk-fork.bst
            |      |                |
            |      |                |
  ------------------------------------------
            |      |                |
            |       \               |
  upstream  |     otherlib.bst      |
  project   |           \           |
            |            \          |
            |           gtk+.bst    |
            |                \      |
            |                 \     |
            |                  glib.bst
            |
In the above, consider:
* The toplevel project's junction to the upstream project has
  declared that its local 'gtk-fork.bst' element *replaces*
  'gtk+.bst' in the upstream project.
* The reverse dependencies in the build graph, like 'otherlib.bst',
  will depend on the replacement element 'gtk-fork.bst' *instead* of
  depending on 'gtk+.bst'.
* The 'gtk-fork.bst' element still depends explicitly on 'glib.bst'
  in the upstream project.
* The regular circular dependency checks apply; however, we have the
  ability to push our own declared elements directly into the
  subproject and have the subproject's reverse dependencies build
  against our own forks.
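To make the intended semantics concrete, here is a minimal Python
sketch of the points above. It is not BuildStream's loader and does not
reflect Valentin's patch; the dictionary model and the function names
apply_overrides() and find_cycle() are assumptions made purely for
illustration, and junction prefixes on element names are left out for
brevity.

```python
# Hypothetical model: each element name maps to the names it depends on.
# This only illustrates the proposed semantics, not real BuildStream code.


def apply_overrides(graph, overrides):
    """Return a new graph where each overridden element is dropped and
    every dependency on it is redirected to its replacement."""
    result = {}
    for name, deps in graph.items():
        if name in overrides:
            continue  # the upstream declaration is replaced entirely
        result[name] = [overrides.get(dep, dep) for dep in deps]
    return result


def find_cycle(graph):
    """Depth-first search; return True if the graph contains a cycle."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the stack
        visiting.add(node)
        cyclic = any(visit(dep) for dep in graph.get(node, []))
        visiting.discard(node)
        done.add(node)
        return cyclic

    return any(visit(node) for node in graph)


# The graph from the illustration, before the override is applied.
graph = {
    "app.bst": ["otherlib.bst"],
    "otherlib.bst": ["gtk+.bst"],
    "gtk+.bst": ["glib.bst"],
    "glib.bst": [],
    "gtk-fork.bst": ["glib.bst"],
}
overrides = {"gtk+.bst": "gtk-fork.bst"}

resolved = apply_overrides(graph, overrides)
print(resolved["otherlib.bst"])  # ['gtk-fork.bst']
print(find_cycle(resolved))      # False
```

If 'gtk-fork.bst' were instead declared to depend on 'otherlib.bst',
the same check would report a cycle after the override is applied,
which is the case the regular circular dependency checks need to catch.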
Format addition to junction elements
------------------------------------
We already have precedent for overriding junction elements in
subprojects. This was mostly done in order to ensure that we can
synchronize subprojects which depend on common subprojects, i.e. to
explicitly ensure that both subprojects depend on the same version and
configuration of a subsubproject (diamond project dependency shapes).
That said, I think an elegant way forward would be to simply allow
elements to also be specified in the overrides, instead of only
allowing junctions to be specified in the overrides.
Following my illustration above, one might define the upstream.bst
junction element with the following:
  kind: junction
  overrides:
    gtk+.bst: gtk-fork.bst
Thoughts?
PS: I'd like to thank Abderrahim and Valentin for putting effort into
this proposal before, and I'm sorry I could not follow up on this
sooner.
[0]: https://gitlab.com/BuildStream/buildstream/-/merge_requests/1913
[1]: https://mail.gnome.org/archives/buildstream-list/2020-May/msg00013.html