Hi Tristan,

First of all, thank you for the write-up, as well as for the progress
made so far.

I'll be honest: I am not sure I got the full picture of the
implications - maybe as a result of your attempt to outline it in
detail ;-).  But that's on me.

I have a couple of questions:
*Remote Execution*
The proposal seems skewed towards using only a single service.  There
are some aspects of Remote Execution that we haven't clearly
articulated or fully implemented; currently only Scenario 1 is
implemented.

Scenario 1: Full Remote Execution [implemented]
- BuildStream checks ActionCache
  - on cache hit, get ActionResult and blobs from CAS associated with
Remote Execution.  Note: the blob content fetching could be on-demand
  - on cache miss, upload any missing blobs for the Action to the CAS
associated with Remote Execution, then use the Execution service to
execute the Action.

Scenario 2: Remote Cache (read-only) [not currently implemented]
- BuildStream checks ActionCache
  - on cache hit, get ActionResult and blobs from CAS associated with
Remote Execution.  Note: the blob content fetching could be on-demand
  - on cache miss fall back to local execution

Scenario 3: Remote Cache (read-write) [not currently implemented]
- BuildStream checks ActionCache
  - on cache hit, get ActionResult and blobs from CAS associated with
Remote Execution.  Note: the blob content fetching could be on-demand
  - on cache miss fall back to local execution, and upload result blobs to
CAS associated with Remote Execution, finally write ActionResult to
ActionCache

Especially in the case of Scenario 2, it is not unimaginable for a
project to propose a read-only endpoint.  Implementing these scenarios
is currently tricky mostly because of the "fall back to local
execution" bit, as we currently choose up front whether to execute
remotely or locally.
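
To make the control flow of the three scenarios explicit, here is a
rough sketch (the names `Mode` and `run_action` are made up for
illustration, not BuildStream API):

```python
# Sketch of the three Remote Execution scenarios described above.
from enum import Enum

class Mode(Enum):
    FULL_REMOTE = 1       # Scenario 1 (implemented)
    CACHE_READ_ONLY = 2   # Scenario 2 (not implemented)
    CACHE_READ_WRITE = 3  # Scenario 3 (not implemented)

def run_action(mode, action_cache_hit):
    """Return the list of steps taken for one Action."""
    if action_cache_hit:
        # All three scenarios behave identically on a cache hit; the
        # blob content fetching could be on-demand.
        return ["fetch ActionResult and blobs from CAS"]
    if mode is Mode.FULL_REMOTE:
        return ["upload missing blobs to CAS",
                "execute via Execution service"]
    # Scenarios 2 and 3 fall back to local execution on a cache miss.
    steps = ["execute locally"]
    if mode is Mode.CACHE_READ_WRITE:
        steps += ["upload result blobs to CAS",
                  "write ActionResult to ActionCache"]
    return steps
```

The "fall back to local execution" branch is exactly where scenarios 2
and 3 diverge from what is implemented today.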

*Write access configuration*
With the current credential scheme it may make sense to only enable
push when auth credentials have been set up in the user configuration.
Having to also set `push: true` seems redundant if it is already set
in project.conf.  If you have credentials configured but do not want
to push as a user, you could set `push: false` in the user
configuration.
At some point in the future I can imagine credential prompting as
well, and I would expect bst to help me out by caching those
credentials or, e.g., storing a JWT for later use.
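
To make this concrete, a sketch of what I have in mind under the
proposed user configuration format (the URL and certificate paths are
purely illustrative):

  artifacts:
    servers:
    - url: https://artifacts.example.com:11001
      push: true    # arguably implied by the auth block below
      auth:
        server-cert: server.crt
        client-key: client.key
        client-cert: client.crt

With such credentials configured, push could default to enabled, and
`push: false` would remain available as an explicit opt-out.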

Any thoughts on how the above fits with the current changes?

Cheers,

Sander


On Thu, Jan 28, 2021 at 9:12 AM Tristan Van Berkom <
[email protected]> wrote:

> Hi all,
>
> Since there was no activity on this thread for a very long time, I
> decided to go ahead and take a crack at this.
>
> I have a good branch now that is ready for review. The MR is up here:
> https://github.com/apache/buildstream/pull/1453
>
>
> I'm sending a detailed email because it's a large proposal and I would
> like this to be visible, so that people can chime in in case we've
> missed an important use case.
>
> Cheers,
>     -Tristan
>
>
> Here is the design/changes I've come up with.
> =============================================
>
>
> The offending junction configurations
> -------------------------------------
> The junction configurations:
>
>   "cache-junction-elements"
>   "ignore-junction-remotes"
>
> are completely removed, reducing the worrisome ambiguity of what
> happens in what configuration.
>
> This is replaced by the enhanced user configuration.
>
>
> Authentication
> --------------
> All of the authentication-related properties (`server-cert`,
> `client-cert` and `client-key`) have been split out into a
> subdictionary named "auth" for any remote configuration.
>
> This may allow better extensibility for alternative authentication
> methods in the future, however right now it serves us very well to be
> able to document the "auth" dictionary in one central place in the
> documentation.
>
>
> Remote Execution Configuration
> ------------------------------
> As described in this thread, this is now only configurable with user
> configuration, and only one "remote-execution" block is ever
> considered.
>
> There is no longer any ambiguity here: "remote-execution" applies to an
> entire session, and one build session cannot be built across multiple
> different remote execution build clusters.
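>
> For illustration, a session would then have at most one block like
> the following in user configuration (the exact service keys shown
> here are an assumption for the sake of example):
>
>   remote-execution:
>     execution-service:
>       url: https://execution.example.com:50051
>       auth:
>         server-cert: server.crt
>     storage-service:
>       url: https://cas.example.com:11001
>     action-cache-service:
>       url: https://cache.example.com:50052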
>
>
> Artifact and Source cache configuration
> ---------------------------------------
> Projects are still allowed to provide recommendations for artifact and
> source cache servers.
>
> User configuration now has the ability to override them, i.e. disregard
> artifact and source cache servers declared in projects.
>
> Also, it is no longer possible to declare an artifact/source cache
> server as a dictionary, it MUST be a list.
>
> This choice is simply because the dict-or-list tactic here does not
> buy us any convenience whatsoever, and the clarity that it is in fact
> a list of dictionaries is more worthwhile.
>
> Consider:
> ~~~~~~~~~
>
>   artifacts:
>     url: https://pony.moose/zebra:4040
>     push: true
>
> Versus:
> ~~~~~~~
>
>   artifacts:
>   - url: https://pony.moose/zebra:4040
>     push: true
>
> It is exactly the same amount of typing, so there is no point in
> supporting both here.
>
>
> Project Configuration
> ~~~~~~~~~~~~~~~~~~~~~
>
>   #
>   # This is mostly unchanged, except for the `auth`
>   #
>   artifacts:
>   - url: https://pony.com:9999
>     type: both
>     push: false
>     instance-name: this-shard
>     auth:
>       server-cert: server.crt
>
>
> User Configuration
> ~~~~~~~~~~~~~~~~~~
> We can declare global artifact configuration, which either
> overrides or augments project recommended cache servers.
>
> When "augmenting", the user configuration is still at a higher priority
> than the project recommendations (as in: user configuration caches will
> be consulted *first* when interacting with remotes).
>
>
>   #
>   # Global artifact configuration
>   #
>   artifacts:
>
>     #
>     # Here we decide whether user configuration overrides
>     # project recommendations.
>     #
>     override-project-caches: true
>
>     #
>     # And we declare the global artifact configurations
>     # under the new "servers" sub-dictionary instead
>     #
>     servers:
>     - url: https://pony.com:9999
>       type: both
>       push: true
>       instance-name: this-shard
>       auth:
>         server-cert: server.crt
>         client-key: client.key
>         client-cert: client.crt
>
>
> We can still declare artifact configuration in the overrides, with
> exactly the same new configuration:
>
>   #
>   # Artifact configuration for project "foo"
>   #
>   projects:
>     foo:
>
>       artifacts:
>         #
>         # Let's completely override the cache for only project "foo"
>         #
>         override-project-caches: true
>
>         #
>         # And declare the servers here
>         #
>         servers:
>         - url: https://pony.com:9999
>           type: both
>           push: true
>           instance-name: this-shard
>           auth:
>             server-cert: server.crt
>             client-key: client.key
>             client-cert: client.crt
>
>
> Use case overview
> =================
> Here is an overview of the previously discussed desirable use cases.
>
> Inline responses to my initial proposal:
>
> [...]
> > Use cases we want
> > =================
> > Here I will try to provide a birds eye view of what our use cases are,
> > what does a BuildStream client application require from these
> > services ?
> >
> >   * The ability to store and retrieve artifacts on a remote artifact
> >     server.
>
> Of course.
>
> >   * The ability to store and retrieve staged source packages, indexed
> >     by source cache key, on remote source cache services.
>
> Of course.
>
> >   * The ability to farm out builds to a remote execution service
>
> Depending only on your ability to setup a remote execution build
> cluster, of course.
>
> >   * The ability to make requirements of worker instances on a remote
> >     execution service.
> >
> >     - Possibly also the ability to bail out early if the remote
> >       execution service knows that it cannot provide a worker with
> >       the properties which some of the project elements require.
>
> Nothing changes here thus far; that is not to say we are yet perfect
> in this regard, but this patch does not affect it.
>
> >   * Ability to have redundancies in configuration of remote servers, in
> >     case a service is down we usually allow configuration of services
> >     in list format.
>
> We still have this.
>
> >   * Ability to carry artifacts forward from a third party artifact
> >     cache which was recommended by project configuration across a
> >     junction boundary.
> >
> >     I.e. for better repeatability, it is often desirable to re-cache
> >     the artifacts from an upstream project on your own infra in order
> >     to ensure you have your own copy.
> >
> >     NOTE: This is currently only available in project data and not
> >           overridable by user configuration in the form of the
> >           `cache-junction-elements`[2] configuration, which I already
> >           pointed out was problematic in my original report[0].
>
> With this patch, we can achieve this use case by configuring a global
> cache server for pushing, and setting `override-project-caches` to
> `false`.
>
> The result will be:
>
>   * When pulling, we will:
>
>     - First try to pull from the globally defined cache servers
>     - Fall back on project defined cache servers
>
>   * When pushing, we will:
>
>     - First push to our globally defined cache servers
>     - Not push to the project defined cache servers
>
> Why will we not push to the project defined cache servers ?
>
> Well, it is currently only a consequence of the fact that people
> normally do not configure "push" servers in a project.conf, as that
> would likely imply that they are publishing the private key needed to
> push to their server along with their project, so that everyone and
> their dog can push anything they like to the artifact server.
>
> We could additionally police this in the code and completely disallow
> such nonsense configurations, but for now I've just left a fat notice
> in the project.conf documentation which points out that it is a very
> bad idea to configure a "push" remote from your project.conf (following
> the "let people shoot their own feet if they really want to" policy).
>
> >   * Ability to avoid downloading artifacts found on third party
> >     infrastructure.
> >
> >     I.e. for better trustability, you may want to ensure that all of
> >     the built artifacts you end up consuming were built on
> >     infrastructure you control, rather than downloaded from an upstream
> >     project's artifact server.
> >
> >     NOTE: This is currently only available in project data and not
> >           overridable by user configuration in the form of the
> >           `ignore-junction-remotes`[2] configuration, which I already
> >           pointed out was problematic in my original report[0].
>
> This is now possible by simply declaring `override-project-caches` to
> `true` in the global configuration, regardless of whether or not you
> have provided any remotes in your global configuration.
>
> >   * Ability to farm out any local caching work to a remote service, to
> >     reduce uploads and downloads for builds when configured on an RE
> >     service (by way of specifying the RE service's CAS here),
> >     configuring multiple build machines which run without RE may also
> >     be optimized by way of using the same remote CAS for this.
> >
> >     NOTE: This has not yet landed and is a part of Jürg's ongoing
> >     work[3].
>
> This orthogonal feature has yet to land in master.
>
> >   * Ability to clearly override the recommendation of any project.conf
> >     in the loaded pipeline using user configuration, which should
> >     always have the last word on any user configuration.
>
> This can be achieved on a per-project basis by setting the new
> `override-project-caches` attribute to `true` in the overrides section
> of the user configuration.
>