Re: Proposal: Sealing up the sandbox, final steps

Tristan Van Berkom Wed, 18 Nov 2020 00:12:18 -0800

Hi,

I'll try to bring this conversation forward a bit, we don't have all
the answers yet and this mail does not propose a precise solution for
every detail yet either...

On Thu, 2020-11-12 at 12:03 +0100, Jürg Billeter wrote:
> Hi,
> 
> On Tue, 2020-11-03 at 18:01 +0900, Tristan Van Berkom wrote:
> > This proposal consists of a handful of changes, so I will avoid
> > drilling too deeply into the finer details, but instead try to explain
> > what these changes are, how they are intertwined, and how they benefit
> > us.
> 
> +1 on the general direction of this proposal. We may have to look at
> these finer details, though, and how they impact real-world plugins
> before a final decision.

Yes of course.

Since there are more details than usual, it was useful to open the
conversation with a more general idea.

I do think we don't need to drill all the way down to specific API
design though, I think it will be okay to shape that during
implementation and review.

> > [...]
> > 
> > 
> > Early assembly of sandbox commands
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > This will require an API change such that commands should be
> > constructed early on, probably directly at Plugin.configure() time.
> > 
> > Element will gain some API for command list to be created early on, and
> > these commands will be used automatically by the Element base class to
> > feed into the cache key.
> 
> I'm definitely interested in the details of these API changes.
> 
> We should likely cover all of what `configure_sandbox()` currently does
> and also the output/collect path (currently the return value of
> `assemble()`). They should affect the cache key and having the output
> path early may allow optimizations with remote execution.

Indeed, it will be good to get rid of configure_sandbox() and combine
this with an earlier stage setup API.

What we would be left with instead, would be a handful of Element APIs
which would be expected to be called at `Element.configure()` time.

Perhaps, it would also make sense to have some of these APIs be on the
SandboxConfig object instead, if only as a matter of taste.

The Element `Plugin.configure()` implementations would then be
responsible for assigning the following things:

  * Resolved set of commands to run in the sandbox (if any)
    - This of course includes all commands which are to run, which will
      include commands which we currently label "integration commands".

  * Resolved environment variables to run in the sandbox
    - One setting for all commands, but ability to set environment
      variables for each individual command
    - One can consider the environment setting as an attribute
      of the commands themselves, these are tightly coupled.

  * The working directory in which commands will be run
    - Similarly to environments, it is possible to consider this
      as contextual to individual command
    - We don't have this flexibility in mind, but we should either
      encode this flexibility at the same time, or at least design
      the API so that such flexibility can be added later.

  * Configuration of the nature of paths in the sandbox
    - Will artifacts be staged at those paths (is it still a real
      requirement to know this in the API ?)
    - Will the paths in question need to be read-write
    - Currently we have a sandbox flag indicating whether the root
      should be readonly or not.

  * The output path

  * Creating any blobs which will be intended for the sandbox with
    the new API which will allow plugin authors to compose data
    programatically to stage into the sandbox.

In relation to the above, there is the problem of doing the integration
commands vs other build commands. I think this teaches us that the
configuration of whether paths are read-only or not while a given
command runs is also contextual to the command being run.

I don't know how this all ends up working with remote execution setups,
but we do have integration commands which absolutely must run with a
read-write rootfs, and the result of those permutations absolutely must
be carried forward to later commands.

Ideally we have a solution where the setup batched commands have some
control over which paths are read-only at which stage.

I'm sure Jürg has some valuable input on this part :)

> How will these commands interact with custom Python assemble logic?
> Will there be a public `Element.run_commands()` method, which is called
> by the default implementation of `Element.assemble()` but plugins can
> also call it in their `assemble()` override? Or is there no need for
> such combinations and plugins would choose between running sandbox
> commands or custom Python assemble logic but never both in the same
> plugin?
> 
> Also, will plugins still need access to the `Sandbox` object?

Good questions.

Basically the tasks which remain are:

  * Staging dependencies

  * Running the actual commands, including integration
    commands.

  * Staging any custom python crafted data into the sandbox.

As far as I can see, all of this could also be swallowed up by the APIs
which an Element must call at Plugin.configure() time, and we could
just do away with `Element.stage()` and `Element.assemble()` methods
altogether.

I'm not sure that is the direction to take (it might be ?).

I also like the idea of having a default implementation and having a
helper function to run commands (including integration commands).

I think if we want to keep the Element.stage()/Element.assemble()
methods around, I would personally prefer to keep the `Sandbox` object
public (with a reduced API surface), if and only if it makes for a more
presentable and human readable plugin facing API.

I.e. we could have Element level APIs all the way, but it might be less
elegant than having a Sandbox object.

Cheers,
    -Tristan

Re: Proposal: Sealing up the sandbox, final steps

Reply via email to