Hi!

On Wed, 2016-03-30 at 08:48:45 +0200, Helmut Grohne wrote:
> On Wed, Mar 30, 2016 at 01:29:15AM +0200, Guillem Jover wrote:
> > > b) Packages that do not "set -u" (nounset), can now prepend $DPKG_ROOT
> > >    to any file they operate on. With old versions $DPKG_ROOT will be
> > >    unset and with change a) $DPKG_ROOT will be empty. Thus this change
> > >    is backwards-compatible.
> > 
> > Right, and these can always be set like ": ${DPKG_ROOT:=}". We could
> > even recommend this as part of some doc/howto/spec for the initial
> > deployments, before packages can assume a recent enough dpkg.
> 
> I like the recommendation, but I note that only about 30 packages employ
> "set -u" (i.e. about 0.1%).

I've added this for now to the wiki spec.

  <https://wiki.debian.org/Teams/Dpkg/Spec/InstallBootstrap>
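
As a minimal sketch of the idiom above, this is roughly what a maintainer script that uses "set -u" could do. The target_path helper and the example path are made up for illustration; they are not something dpkg provides.

```shell
#!/bin/sh
set -u

# Default DPKG_ROOT to the empty string when an older dpkg did not set it,
# so that expanding it under "set -u" does not abort the script.
: "${DPKG_ROOT:=}"

# Hypothetical helper: prefix a path inside the installation with the root.
target_path() {
    printf '%s%s\n' "$DPKG_ROOT" "$1"
}

target_path /etc/example.conf
```

With DPKG_ROOT unset or empty this prints the path unchanged, which is what keeps the change backwards-compatible.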

> > > c) dpkg should gain a new force flag. I call it --force-remote-configure
> > >    for now. It is supposed to force dpkg into running maintainer scripts
> > >    without chroot even when the package in question did not declare that
> > >    its maintainer scripts support this mode of operation. Note that we
> > >    currently have no way to express whether a package supports running
> > >    maintainer scripts without chroot. The flag is being added by
> > >    0002-add-force-remote-scripts.patch and the behavior is implemented
> > >    by 0003-inhibit-chroot-when-force-remote-scripts.patch. Packages can
> > >    only reasonably support this mode after implementing b).
> > 
> > I don't quite like the name, as remote to me implies on some other
> > machine. Perhaps foreign, host or extern(al), although some of these
> > are a bit overloaded terms already.
> 
> Of course, I am not attached to the name. I just needed some string that
> remotely made sense. What about "chrootless-configure" to make it
> crystal clear? It should go hand in hand with d) if possible. What is
> your preference?

Hmm, that's always difficult. :) "configure" probably not, as this
involves several maintainer scripts, not all of them configure
related. "chrootless", while quite clear, is a bit long, but if there's
nothing better then that will do. Some things that come to mind, which
are also pretty long (other proposals welcome):

  maintscript-chrootless
  maintscript-jailbreak
  maintscript-detached

Another option could be to add:

  maintscript-chroot

and make that the default. The problem I can see with this is that it
feels like it is promoting a bit the chrootless case as perhaps safer
or similar.

> > Also I'm not sure this makes sense as a force option (instead of its own
> > proper option), as it is a behavior change that we want to always be safe
> > to use, in contrast to a force option that just forces the behavior.
> > Having a force option that would not chroot some times is a bit
> > strange. We might still want to have both though, and there was a
> > related bug report for that (#614126).
> 
> For the same reasons that you bring forward against making this a force
> option, I think it should be a force option: No package in the archive
> is currently known to be safe to configure from outside the chroot. So
> doing that is inherently unsafe. We generally use force options to do
> unsafe things. I see this force option as a development tool and not as
> switch being used regularly. I think it is similar to --force-depends:
> Proceed even though the package declared that it doesn't support the
> mode of operation. The declaration is the absence of a support field as
> in d). Does this make sense to you?

Yes, but as I mentioned, not as the primary user-facing interface.

> Possibly another switch is needed to enable this feature at all? I
> thought we could just make it default (for supported packages), but
> maybe that has downsides as well.

I don't think this behavior should be made the default, it'd be very
unexpected and can cause issues if the external environment is not
"compliant".

> > > d) Once a) is accepted and b) starts getting implemented, we need to
> > >    think about a way for packages to tell that they support "remote
> > >    scripts". One way to do so would be to add a header "Remote-Scripts:
> > >    yes" to the binary package stanza. Packages thus marked would be
> > >    required to honour DPKG_ROOT in all maintainer scripts. This flag
> > >    makes no provisions yet on what programs can be assumed to be
> > >    installed outside the chroot that is operated on.
> > 
> > Yes, ideally only packages marked as such would get a chroot-less
> > environment when requested with the new option, because expecting the
> > user to know when a package supports this mode and on what specific
> > version is a terrible interface IMO, as it requires the user to
> > analyze the .debs before unpacking them.
> 
> This also seems in support of the force option to override.
> 
> > The other thing, that ISTR you brought up on IRC are triggers. I'm not
> > sure how we'd handle those either. :/
> 
> I thought about this some more and the only sensible approach for
> triggers I could come up with is handling them on their own: The
> triggered package must declare support for this new mode or its trigger
> processing will proceed in the old way (unless forced). The major
> downside here is that you cannot tell beforehand whether you can
> configure a package without chroot, but I don't see that changing
> without breaking the flexibility of triggers.

Right.

> > > e) Once a) is accepted and b) starts getting implemented, we need to
> > >    think about what programs maintainer scripts can assume to be
> > >    available outside the chroot. Some ways to handle that:
> > >     * Packages may only assume "common unix functionality". Such a set
> > >       would have to be defined somehow and roughly equates what
> > >       debootstrap requires.
> > 
> > Defining this in terms of strict POSIX compliance (or a subset of the
> > utilities defined within) would be the easiest/best I think.
> 
> I brought this forward as an option, but it really is one that I don't
> like: Crucially, ldconfig is a glibc-ism and not POSIX. Thus the
> libc-bin trigger cannot be converted and the utility of the whole
> approach is fairly limited. In particular, it no longer addresses the
> motivating use cases (debootstrap and multistrap).

AFAIK ldconfig currently only updates a cache and it should not be
needed at all. The other non-caching action that ldconfig performs is
creating symlinks, but we require by policy that those should be
shipped by the package so I think not calling ldconfig in the trigger
should be safe?

In addition we can still call binaries from inside the chroot if they
allow for this kind of detached operation, but in many cases this might
require modifications to the upstream code too (including for several
dpkg tools). Of course that is a problem for foreign arch setups, but
not for same arch ones.

> > >     * A new set of headers Maint-{Depends,Conflicts,...} is added to
> > >       request tools to be installed. These new relations would be
> > >       checked outside the chroot (if any).
> > > 
> > >       A full Debian release or two needs to pass before such headers can
> > >       be used in the archive. This also poses the problem that a user
> > >       can remove packages required for removing other packages and thus
> > >       revoking the ability to remove certain packages. It is not clear
> > >       how the absence of Maint-Depends is supposed to be handled. It is
> > >       not clear whether dpkg needs to lock the dpkg database outside the
> > >       chroot.
> 
> I do have an answer to the absence of Maint-Depends now: Also add
> Runtime-Depends. Then Depends would simply mean both Maint-Depends and
> Runtime-Depends like Build-Depends means both Build-Depends-Arch and
> Build-Depends-Indep. I note that even without the rest of the changes,
> the splitting of Depends would make deity (or at least Don) a little
> happier.

I'm not sure why it would make deity happier, it would still need to
satisfy both when installing stuff. Also the rpm equivalent has
instances for pre and post, and in that scenario Pre-Depends might
also deserve splitting I guess, which means a myriad of new fields. :/

If going the full rpm way, this also might imply in many cases duplicate
information in multiple fields. Say you need package-x in postinst and
prerm, then we'd need to include it twice, in Maint-Depends-Postinst and
Maint-Depends-Prerm for example. This does not happen currently as you
list those packages once only in the weakest field necessary. Also I'm
not sure how rpm scriptlets really work, but in our case our rollback
mechanism in case of errors involves jumping from post to pre and the
other way around, so having such fine grained separation does not seem
worth it to me.

Even assuming a simple two-way split between Runtime and Maint
dependencies has other potential issues, such as triggers which are
out-of-band (and not always declarative). If the package manager
frontends allowed to remove packages which are only maintscript
dependencies then this would be a mess. Another similar case is
disappearing packages which are also out-of-band events, another package
might completely replace an existing one w/o the latter having any
previous knowledge of that fact, and that's file-based so not something
a frontend can predict. I can imagine that just removing maintscript-only
dependencies might cause dependency issues.

There are at least two main use cases for this split of the dependencies
as you've mentioned: to make running the maintscripts from an external
environment easier, and to be able to remove them in case of generating
stripped down embedded images.

The first one I think is better served by trying to:

  1) remove as many maintscripts as possible, via triggers for example,
     or simply by making them unnecessary.
  2) split the installation bootstrap logic into a different
     maintscript, as described in the InstallBootstrap spec.
  3) switch to a more declarative way of doing things.

Which I think would be a very welcome initiative by the project at
large.

The second depends on how much of a problem this really is. Do we know
if this would avoid 5 packages or 100, and how much those would weigh
in terms of space or transitive dependencies? Because depending on the
size of the problem this becomes a non-argument IMO. This one, in addition
to being helped by some of the previous changes, could also be handled
with simple informational annotations, via a new field such as:

  Package: core-package
  Depends: libcore, tool-a, tool-b
  Maint-Only-Depends: tool-b

Which of course also has the problem of duplicated metadata, but is at
least really non-intrusive with the dependency solvers and much of our
tooling.
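
To illustrate how a tool building stripped down images might consume such an annotation, here is a rough sketch. Maint-Only-Depends is only the field proposed above and does not exist in dpkg, the helper names are invented, and the parsing is deliberately simplified to plain package names (no versions, alternatives or arch qualifiers).

```shell
# Split a comma-separated dependency value into one package name per line,
# with surrounding blanks removed.
split_list() {
    printf '%s\n' "$1" | tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# Hypothetical helper: print the Depends entries ($1) that are not listed
# in the proposed Maint-Only-Depends value ($2).
runtime_depends() {
    maint_only=$(split_list "$2")
    split_list "$1" | while IFS= read -r dep; do
        # Keep the entry unless it appears verbatim in the maint-only list.
        if ! printf '%s\n' "$maint_only" | grep -qxF -- "$dep"; then
            printf '%s\n' "$dep"
        fi
    done
}

runtime_depends "libcore, tool-a, tool-b" "tool-b"
```

For the stanza above this prints libcore and tool-a, one per line. The dependency solvers would keep using the plain Depends field untouched.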

So I hope you understand that my overall ecosystem complexity alarms
have all gone off. Which at the same time always feels bad because it
seems like a reaction against progress (even if that might end up
being misplaced). :)

> > As I mentioned at the time I don't think this is a workable solution.
> > These are the problems I see right away with it:
> 
> At the same time, I think it is the only way to really improve the
> problems with debootstrap and multistrap.

I'm not sure I can agree here, it seems there are too many holes in
this part, and it's very flaky, or just unworkable as presented.

> >  * dpkg would need to lock the external database, which means multiple
> >    chroots could not be used concurrently.
> 
> dpkg-checkbuilddeps does not lock the external database. Why would dpkg
> have to?

I don't think this is the right question to counter that argument.
It's a valid question on its own though. The point is that these are
two separate universes. The installed system must preserve integrity
at all costs, when that is lost your running system is broken and it
might stop running at all, stop booting, etc. If the dependencies
disappear while you are building, at most you get a broken build. You
could always retry it and that does not affect the integrity of the
system as long as you don't try to install broken packages for
example.

Is it dangerous to change the package state while building? Certainly!
And we might want to perhaps run dpkg-checkbuilddeps after the build
is finished and abort if the deps are not satisfied. This still leaves
a big window in between where packages might have been removed and
added back though.
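
A rough sketch of that post-build check, under some assumptions: the build_and_recheck wrapper and the CHECK_BUILDDEPS override are invented for this example, and dpkg-checkbuilddeps is expected to be run from the unpacked source tree so that it finds debian/control.

```shell
# Run the given build command, then verify the build dependencies again.
# CHECK_BUILDDEPS is an assumption of this sketch so that the checker can
# be swapped out; it defaults to the real dpkg-checkbuilddeps tool.
build_and_recheck() {
    # Propagate the build command's own failure status unchanged.
    "$@" || return

    if ! ${CHECK_BUILDDEPS:-dpkg-checkbuilddeps}; then
        echo "build dependencies no longer satisfied, discarding build" >&2
        return 1
    fi
}
```

This only detects dependencies that are missing at the end of the build, so the window mentioned above remains.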

> >  * As you mention, the dependencies might disappear under dpkg's feet,
> >    while it is not running in this mode.
> 
> dpkg-checkbuilddeps does not guarantee that Build-Dependencies do not
> disappear during build (which can take much longer than package
> installation), so why would dpkg require this?

That's like asking why should dpkg right now check dependency
satisfiability and lock the database to guarantee that that integrity
is preserved when installing packages!

> What is the point in making different assumptions on dpkg and on
> dpkg-checkbuilddeps? Both construct something external to the current
> installation.

Because in the dpkg chroot-less case we are still operating on the
chroot contents so integrity is paramount. But see above.

> >  * It ties the dependency graph from one installation into another,
> >    which can seriously complicate upgrades and similar.
> 
> In my picture, those dependencies on the outer systems are nothing dpkg
> can do anything about. It just verifies them. If they aren't met, there
> are basically two options:
>  * Fail.
>  * Proceed configuring with chroot as usual.

The only sane option would be to fail; proceeding only ever makes
sense as part of a force option.

> Also from the point of creating small Debian installations, splitting
> Depends into pieces would be preferable. rpm already does that:
> 
> Requires(post): foo
> 
> This would allow us to strip an essential installation of packages only
> required for configuring packages.
> 
> You see, I have a strong preference for allowing arbitrary packages to
> be required from the outer installation.

This is still very Debian specific, as in it requires the external
environment to be a Debian system too. Even worse, ISTM it is even
suite specific! Say you depend on a package conf-a >= 2.0 in suite 1.0
from the maintscripts, but the external distribution with suite 2.0
contains conf-a 2:1.0 (even though this would probably even cause
problems on upgrades). Another actual case would be if the depends
would be on something like git, which has been different things in
Debian depending on the suite. And of course different derivatives, or
distributions based on dpkg but not necessarily transitively on Debian
do not share the Debian package-version namespace.

> > Yeah doing a) by itself now seems pretty innocuous to me as well, so
> > I'd be fine with including that, but the rest seems still pretty
> > undefined.
> 
> Good. Likely a) also needs some form of documentation then. Do you see a
> good place for that documentation? In the long term it would live in the
> policy, but only after agreeing on the support declaration d).

Part of this belongs in the dpkg(1) man page, at least the options and
envvar. How to use it belongs either in the spec above, or in a new
file under doc/ in the dpkg git repo to be installed alongside the
triggers.txt file, in a way a spec inside the dpkg tree.

> I'd hope that we could also merge the other patches soonish, because
>  * It seems that their functionality is non-controversial. The
>    controversial pieces are those related to dependencies and other
>    aspects, but changing the value of DPKG_ROOT and dropping the chroot
>    call appear to be safe.
>  * It enables exploration. Only using it practically will tell us where
>    the real problems are.
>  * It is safe as the new functionality is never enabled by accident.

As long as it is very clear that the force option is not the primary
user facing interface and that's clearly and prominently documented,
I'm fine with this.

Thanks,
Guillem
