Hi Tristan,

On Sat, Aug 15, 2020 at 4:22 PM Tristan Van Berkom <
[email protected]> wrote:

> Hi all,
>
> As a quick strawman proposal, I would like to propose that we remove
> the automatically set "%{project-name}" and "%{element-name}"
> variables, in the name of making progress towards resolving this issue
> which is on our 2.0 blocker list:
>

TL;DR, I'm not a big fan of this proposal, as I think the %{element-name} is
likely used beyond %{build-root}.  Furthermore, I think I've actually come
to
like the unique build root :).


>   Renaming of elements and projects affect cache keys
>   https://gitlab.com/BuildStream/buildstream/-/issues/1034
>
> At the same time, I would like to establish consensus that the
> alternative to removing the "%{element-name}" and "%{project-name}"
> automatically set variables, is to simply say that !1034 is not a bug,
> and to resolve it with WONTFIX.
>

I think this is a matter of setting a non-default %{build-root} which would
not
contain %{project-name} nor %{element-name}.  If a project defines this as a
fixed path then element-name and project-name do not affect the cache key.

In other words, I think I'm coming around to a Won't Fix.  Mostly because I
think we made the decision for a path per element for a reason; and I don't
think that reason no longer applies.

Also, selfishly, I'd like to stage source for multiple elements at the same
time*,
for the bazel plugin research discovery[1].


> Here is some history and some facts to explain why.
>
>
> History
> ~~~~~~~
> I have a vague recollection of a BeaverCon in manchester where we had a
> focus on IDE integration and debugging, and one of the hot topics was
> the ability to have your debuggers be able to trace back built binaries
> to their sources without any hassle.
>
> I.e. when compiling with debugging flags enabled, there is a high
> likelyhood that the resulting binaries would end up having their build
> directories hard coded into them.
>
> Running debuggers is otherwise a bit of a hassle, as one needs to
> inform the debugger of where the sources are for each library, if they
> are not their in their original location.
>

I think we actually had PoC VS Code with debugging integration going at the
time.
This did involve workspace and non-workspace element builds.


> Action: What we ended up doing is adding the "%{element-name}" and the
> "%{project-name}" variables which are automatically resolved by
> BuildStream, and using these to resolve the "%{build-root}" variable
> which is declared by the core project.conf and used by virtually every
> element (certainly every BuildElement).
>

Right.  And the %{element-name} variable is not dropping the".bst"-suffix if
present at the moment :).  I wanted to propose a change for that [2].
<https://gitlab.com/BuildStream/buildstream/-/commit/5395789b01761022cc4653574ebe517c45c7e260>


> Somewhat orthogonal but related to this, is we set out to implement
> caching of build trees, which together would get us much closer to good
> debugging environment (we would have the ability to have a `bst shell`
> environment where multiple build trees are staged and we can run a
> debugger, and everything "just works").
>
> Great.
>

:).

Consequences
> ~~~~~~~~~~~~
> The fact that we have "%{build-root}" defined to something that is
> unavoidably unique per element name/project name, means that the
> resulting artifact cache key needs to be different.
>

Yes.


> It might be that we have a bug where BuildElement *fails* to consider
> the "%{build-root}" in the cache key, but I think this is an
> unacceptable path to go down.
>
> BuildStream strives to produce reproducible artifacts, as such the
> cache key must capture anything which has a chance of affecting the
> binary output in an artifact.
>
> The path where sources were staged and the build was performed, has a
> high chance of affecting the artifact output, and as such it should
> certainly be considered in the artifact cache key.
>

Not sure about a high chance: it's very uncommon for software to always
be built from the same base directory, ie. most software is downloaded,
unpacked, then built from someone's home directory.  And those tend to
be unique-ish.
That said, I'll grant you that there is a chance.  Less-well behaved
software
may make assumptions about where it needs to be built from.  For such
elements you would likely have to very deliberately and specifically set the
build-root.

Possible side benefit
> ~~~~~~~~~~~~~~~~~~~~~
> Removing these variables which BuildStream automatically sets up
> unlocks the possibility of caching / reusing resolved variables which
> have the same set of inputs (any two elements for which all variables
> resolve exactly the same, can share resolved variables).
>
> I'm not sure how much this buys us in real-world large projects, it
> depends on whether people favor configuring an element's "variables" or
> whether people favor configuring the element specific "config" section
> more (both configuration approaches are valid, and variables are often
> an interesting thing to configure in an element).
>
> Mileage of this optimization may vary or may not prove worthwhile.
>

Indeed.

Summary
> ~~~~~~~
> If we remove these automated variables, we remove some measure of
> variance from the build process which might not get caught by cache
> keys, which is good.
>
> On the flip side, we will *not* have this "every element has their own
> build directory" behavior which we wanted to have, for the sake of
> hypothetically making debugging easier, which is bad.



> I'm not entirely sure *how* bad it will be, though. Given that
> BuildStream is in a position to know what the "%{build-root}" was for
> each artifact, it should not be very difficult to stage these cached
> build directories all into separate paths and stage them together in
> harmony. Even if these paths are not the same paths which were
> originally encoded into binary debugging information at build time, it
> seems to me a small hop away to just configure debuggers to look in the
> right place.
>
> I.e. it would not "just work", but if one were to actually sit down and
> work on the IDE/debugging UX, I don't think having a single hard coded
> build directory for all elements at build time would really be an
> obstacle.
>

I think it will require [a lot] more integration work that way, based on
that
we already had someone go down that path in the past and ended up
here.

Given that renames may happen but are likely not that common, I think
optimizing for it is not that necessary.  Especially not, if the defaults
can
be overridden to get the desired behavior for this use case.  Which may
in turn trade-off IDE integration.


> Cheers,
>     -Tristan
>

Cheers,

Sander

[1]:
https://github.com/sstriker/bst-plugins-bazel/blob/wip/src/bst_plugins_bazel/elements/bazel.py
[2]:
https://gitlab.com/BuildStream/buildstream/-/tree/sstriker-normalize-variable-element-name

Reply via email to