Re: Proposal: Extending dependency attributes

Douglas Winship Thu, 30 Jul 2020 03:27:25 -0700


On 29/07/2020 23:07, Sander Striker wrote:

Hi,


Took me a bit to get back to this, apologies.

TL;DR in general I find myself supportive, but I'm not convinced of the UI,
which turns out quite verbose.  As a selfish motivation I'm also looking
for ways to identify elements that do or do not match the same kind as the
plugin.  If this is an implicit annotation that could work.

More inline.

On Sun, Jul 26, 2020 at 9:14 AM Tristan Van Berkom <
[email protected]> wrote:

Hi all,

     This is a proposal for extending the `dependency` attributes on
element declarations such that elements can extend the dependency
attributes and participate in parsing them.

This is a proposal aimed at reconciling a few topics which we've been
looking into recently.

   * Plugins accessing private API[0]

     In Chandan's announcement we underlined the importance of being
     able to reduce redundancy in element declarations which stage
     elements into different sysroots.

     Any element which needs to parse an element name in it's
     configuration and later use Element.search() to find that element
     and treat it specially, is likely causing the user to enter
     redundant information by requiring the user to specify the element
     both as a dependency and also in the element configuration.

Indeed.  I'd like to be able to distinguish between types of dependencies,
e.g. toolchain vs libraries, to be able to stage them in different
locations.

   * Dependencies in a sysroot[1]

     This is a proposal from Valentin from two years ago, which
     underlines the usefulness of being able to stage elements in
     multiple locations (which BuildStream does already, except that it
     appears to not be an obvious thing for users).

     This proposal would not solve Valentin's proposal by itself, but it
     would provide a more convenient API for Valentin to go ahead with
     extending the BuildElement with a feature to satisfy his
     requirements.

     The current way to implement this in BuildStream would be to add
     new configuration to BuildElement allowing BuildElement to pick and
     choose which dependencies it decides to stage at which paths.

     After this proposal, it would allow the BuildElement abstract class
     to add syntactic sugar to the dependency objects in the `depends`,
     `build-depends` and `runtime-depends` lists, making the final user
     facing feature a bit more user friendly.

This sounds good.

   * Replace Element.search() completely.

     The Element.search() API has been broken[2] for a very long time,
     as it does not support addressing elements across junction
     boundaries.

     I created this merge request[3] this weekend to address this, since
     we now have a clearer story about project relative element paths,
     however there are some problems with the patch.

     Namely, in order to make Element.search() performant and avoid
     redundantly reloading .bst files from the disk later on, we need to
     cache a lot of data which we usually discard from the loader,
     increasing our memory footprint by around 12%.

     Considering that we've been wanting to provide a solution that is
     more tightly coupled to the dependency lists, I think it is best if
     we just let Element.search() die completely, as any utility it
     provides can be replaced by this proposal.

Ideally there is no need for Element.search(), as it seems to mostly be
used for supporting "depends"-like functionality outside of core depends.

Proposed format changes

=======================
The Dependency dictionary[4] would gain a single new `config`
attribute, which would be a dictionary who's semantics would be defined
by the owning element.

   kind: foo

   build-depends:
   - filename: payload.bst
     config:
       #
       # The plugin loaded as the `foo` kind can
       # now be told anything specific about this
       # dependency `payload.bst`.
       #
       frob-me: True

   # Configuration that is parsed by the plugin
   # loaded as the `foo` kind in this project
   # is specified already on the toplevel `config`
   # dictionary.
   #
   config:
     some-config: "Yes please"

I'm not sure I follow.  Maybe this is more relatable with a less abstract
example?

How about a plugin that takes input elements, and outputs them ascompressed archives?

(Perhaps not entirely realistic or necessary, but it works for asemi-concrete example).

kind: multi_archiverbuild-depends: - filename: foo.bst config:compression: tar.gz outputname: foo.tar.gz - filename: bar.bst config:compression: zip outputname: bar.zip

Proposed plugin API changes
===========================
The element would gain a new method for parsing dependency
configurations with a similar contract as Plugin.configure().

   * This new method would be called after Plugin.configure() and before
     Plugin.preflight()

   * This new method would be given tuples of (Element, MappingNode) for
     each direct dependency, regardless of whether it had specified a
     `config` node (so the MappingNode is allowed to be `None` for some
     entries).

     This allows the plugin to make a judgement about dependencies which
     no configuration was specified for.

     For example: an element like x86Image might require 2 build
     dependencies, one for the tooling to stage at `/` and another for
     the payload to deploy at `%{build-root}`, so it could raise an
     error if there is a third dependency specified which would be
     ignored.

   * This new method would be optional to implement, however it will be
     mandatory if the user ever specifies a `config` dictionary for any
     of the given plugin instance's dependencies.

     This is to say that BuildStream core would take care of simply
     raising a nice LoadError informing the user that `config` is an
     invalid configuration for the said element, if the said element
     does not implement any special parsing of dependency
     configurations.

Why?  I see no reason for BuildStream core to raise concerns when users
make anticipatory changes to an element definition to work with a (future)
version of a plugin.  Or when a user switches an element from one kind to a
similar kind that doesn't support these opaque configurations.  What am I
missing?

Example of Valentin's BuildElement proposal[1]

Example of Valentin's BuildElement proposal[1]

==============================================
I thought it would be good to include a small example of what the
BuildElement sysroot example might look like in YAML, based on the
`readline.bst` example in his original email:

   depends:
   - filename: glibc.bst
     config:
       sysroot: "%{sysroot}"
   - filename: ncurses.bst
     config:
       sysroot: "%{sysroot}"
   - filename: system-image.bst
     type: build
   - filename: sysroot-compiler.bst
     type: build

In this example, the BuildElement class would implement
Element.configure_dependencies() in such a way as to collect a list of
all of the dependencies which are supposed to go in specific sysroots,
and stage those in those sysroots instead of `/`.

I find the example a bit confusing, as what does it mean to have a
different sysroot config on a runtime dependency?  Does that only affect
these dependencies in their capacity as build-depends?  What is the
expected effect here?

Other than that, it's pretty verbose.  I was envisioning something like:
   build-depends:
   - liba.bst
   - compiler.bst [toolchain]
   - platform.bst [platform]

That would be a pretty useful format. It preserves the option of addingdependencies as a single line. (I'm assuming that multiple tags could beadded on the same row, to one dependency.)

It's much less powerful than having a dictionary though. A dict givesthe plugin designer the flexibility to make config as complex or assimple as it needs to be.

If we implemented both formats, then I image that "compiler.bst[toolchain]" would be a shorthand for:


  - filename: compiler.bst
    config:
      toolchain: true


And have these elements essentially get a list of tags associated, so that
one has a way to filter based on tags present/absent when iterating (or
have a specialized function for it).

Now, maybe something similar can be achieved if we would allow something
like the following, which avoids duplication of the repetitive verbose
lines:
   depends:
   - filenames:
     - glibc.bst
     - ncurses.bst
     config:
       sysroot: "%{sysroot}"
   - filenames:
     - system-image.bst
     - sysroot-compiler.bst
     type: build

I like the idea of being able to group dependencies like this. In alarge element declaration with a lot of dependencies, this could save alot of time and space. There are some concerns though:

First: 'filename' and 'filenames' are such similar keys. There's a lotof room there for typos. I'd suggest making 'filename' work with bothformats, single scalars or lists of scalars. If we add 'filenames', thenit should just be a synonym for filename.

Second: would grouping matter? For instance, would there be a differencebetween:


  depends:
  - filenames:
    - foo.bst
    - bar.bst
    config:
      compression: zip

and:

  depends:
  - filename: foo.bst
    config:
      compression: zip
  - filename: bar.bst
    config:
      compression: zip

Those two formats look quite different, semantically. If we allow bothformats, plugin-writers might expect to be able to use them to meandifferent things. For instance the first option fight be the format toproduce a single zip file, while the second option might be the formatto produce two zip files.

If we want to give configure_dependencies() the ability to distinguishbetween the two formats, then would that be easy to implement? is itcompatible with the data patterns that we currently use? If we don'twant to, then will it be intuitive that these two different formats meanthe same thing?

I've sadly stalled on progress on a bazel plugin[5] that I started for
research/discovery reasons as outlined in Chandan's summary message[0].
However, even in its current very WIP form, it raises some questions on how
to do certain things.  Like filter all dependencies of the same kind, to
stage them differently.  I imagine that some of this work would end up
being split between stage() and configure_dependencies() following your
proposal?

Implementation plan

===================
To implement this change I would go through 3 phases of implementation.


   * Implement the new format and method

     This would be done in the BuildStream core repository and
     would introduce bare bones plugins in the `tests/` directory
     for the purpose of testing the correct behaviors of the
     feature.

     At the same time, this would result in deprecation of
     Element.search().

   * Migrate existing plugins

     In this phase, we would update the ScriptElement, base
     class, the `script` convenience element, and any plugins
     in `bst-plugins-experimental` which use `Element.search()`
     to use the new `Element.configure_dependencies()` instead.

To support layout, right?

   * Remove Element.search()

     Once plugins have been migrated, we would remove
     Element.search() entirely.

That seems fair.

Assuming we can agree on the UI this seems reasonable :).

Any thoughts ? Please discuss :)
Cheers,

Sander

Cheers,
     -Tristan

[0]:
https://lists.apache.org/thread.html/r057a6fa1ed275fcbc8d92972753bf32ce1e16ab3740580da69be3f3c%40%3Cdev.buildstream.apache.org%3E
[1]:
https://mail.gnome.org/archives/buildstream-list/2018-August/msg00009.html
[2]: https://gitlab.com/BuildStream/buildstream/-/issues/931
[3]: https://gitlab.com/BuildStream/buildstream/-/merge_requests/1957
[4]:
https://docs.buildstream.build/master/format_declaring.html#expressing-dependencies

[5]:
https://github.com/sstriker/bst-plugins-bazel/blob/wip/src/bst_plugins_bazel/elements/bazel.py

Re: Proposal: Extending dependency attributes

Reply via email to