Re: Reviving schroot as used by sbuild

2024-09-14 Thread Helmut Grohne
Hi Sam and others,

On Fri, Jun 28, 2024 at 07:08:20AM -0600, Sam Hartman wrote:
> I'll be honest, I think building a new container backend makes no sense
> at all.

I looked hard at this as it was voiced by many. I have to say, I remain
unconvinced of the arguments brought forward.

> There's a lot of work that has gone into systemd-nspawn, podman, docker,
> crun, runc, and the related ecosystems.

I consider myself an expert user of systemd-nspawn. One thing that it
really lacks on bookworm is unprivileged execution. If you run your
builds as root, there is debspawn. In future, systemd-nspawn shall work
unprivileged - if your image is dm-verity signed. Bummer. I do not see
it as meeting our technical requirements in any way.

podman is a much more sensible suggestion, and Simon gave a lot of
feedback on how to integrate it. Still, its architecture is limiting in
several central aspects. For one thing, podman works with a static set
of namespaces per container instance, but what we want here is to use
different network namespaces for installing build-depends and for
performing the build. Another aspect is that people are already
complaining about the tarball-unpack approach taken by sbuild
--chroot-mode=unshare being slow; podman would make it slower still, as
it requires the unpack to happen inside the user's $HOME. My initial
experiments indicate that we are in for a factor of two, whereas an
overlayfs approach - which we cannot shoehorn into podman - could bring
this down significantly. podman upstream insists on CAP_SYS_ADMIN being
a no-go while systemd upstream insists on it being a requirement. While
this is fine for building, we also want to run autopkgtests. Running
podman also requires a systemd-logind session - something that is not
usually available on a buildd, in an application container (where you
may also want to build a package) or when you su/sudo to a different
user. My conclusion is that morphing podman into something usable is
more work than writing a container runtime, and that does not even
account for the political disagreements involved.

Let me skip docker as it is very similar to podman in all of the aspects
above.

Then you mention crun and runc. These are vaguely API-compatible, and
they are the lower-level building blocks of both podman and docker. The
CAP_SYS_ADMIN issue mentioned for podman earlier can be resolved with
ease at this level (at the cost of having containers that do not
contain, which is the very reason podman refuses to do this). The
earlier note about network namespaces fully applies here though. By
going down to this level, we also lose quite a bit of the image
management benefits that the podman level included.

Your vague mention of related tools probably includes slirp4netns,
passt, uidmap and others. Tools at this level do not conflict with our
requirements, so I fully concur with reusing them.

Beyond all of this, I am taking issue with a fundamental design decision
of all the mentioned container runtimes. They all have an architecture
that allows an outside process to "join" a container (podman exec).
Whilst that is a useful feature, it relies on the setuid approach to
privilege transitions, which we have learned over the years to be
inherently vulnerable and which the systemd folks have been working hard
to replace with IPC mechanisms. As far as I understand it, a significant
portion of container runtime escapes work by exploiting this joining
architecture and the involuntary acquisition of host resources into a
container. If this were implemented via IPC, we could sidestep an entire
class of vulnerabilities.

> I think an approach that allowed sbuild to actually use a real container
> backend would be long-term more maintainable and would allow Debian's
> DevOps practices to better align with the rest of the world.

I have a hard time agreeing with this. I have been using rootless
containers since long before podman supported them, and I still feel
very limited whenever I am supposed to use podman, preferring to resort
to other tools that are more capable and performant.

> I have some work I've been doing in this space which won't be useful to
> you because it is not built on top of sbuild.
> (Although I'd be happy to share under LGPL-3 for anyone interested.)

You can. I'm not sure we'll have to stick to sbuild. If we end up
converting our official buildds to something else, so be it. However,
I'd like to get to a point where building packages just works in a way
that doesn't require root privileges by default. We don't have this "it
just works" experience now.

> But I find that I disagree with the idea of writing a new container
> runtime for sbuild so strongly that I can no longer use sbuild for
> Debian work, so I started working on my own package building solution.

Please bear in mind that effectively, sbuild has gained its own
container runtime already and that what I am looking into here is
extracting it into a separate package interfacing w

Re: Should OpenSSL/ libssl3 depend on brotli?

2024-09-09 Thread Helmut Grohne
Hi Sebastian,

On Sat, Sep 07, 2024 at 12:12:58AM +0200, Sebastian Andrzej Siewior wrote:
> Is it okay for libssl3 do depend on libbrotli? It would increase minimal
> installs by ~900KiB on amd64.

Thanks for reaching out. From a purely architecture bootstrap centric
view, I approve your request. brotli has few dependencies and needs to
be built during architecture bootstrap already for curl and freetype. It
can be built before openssl at no extra effort.

I make no claims about other aspects of the proposed change.

Helmut



Re: DEP18 follow-up: What would be the best path to have all top-150 packages use Salsa CI?

2024-08-21 Thread Helmut Grohne
Hi Otto,

On Tue, Aug 20, 2024 at 06:35:52PM -0700, Otto Kekäläinen wrote:
> In short:
> I would very much like to see all top-150 packages run Salsa CI at
> least once before being uploaded to unstable. What people think is a
> reasonable way to proceed to reach this goal?
> 
> 
> Background:
> We have had several cases recently where an upload to Debian unstable
> causes widespread failure in unstable, and it could have been easily
> prevented if Salsa CI pipeline had run for the package and revealed
> the problem before upload to archive for everyone to suffer.

I'd like to rephrase your quest. What you really want is unstable to be
less unstable. Whilst a number of people disagree with that notion, I
sympathise with that view. We do use unstable to discover problems
before hitting testing, but in order for this to be effective, the bugs
to find should mostly be integration problems and it shouldn't be too
bumpy to let actual users ride unstable and report non-obvious problems.

Your proposal here is to improve the situation using salsa-ci and it is
not the worst of ideas given that salsa-ci works well for large numbers
of packages.
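For reference, enabling the stock pipeline is conventionally a one-file
change. A sketch of the usual setup follows; check the salsa-ci-team
pipeline documentation for the currently recommended include:

```yaml
# debian/salsa-ci.yml - conventional setup (sketch); the recipe URL is
# the one documented by the salsa-ci-team/pipeline project
include:
  - https://salsa.debian.org/salsa-ci-team/pipeline/raw/master/recipes/debian.yml
```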

One complaint I've seen about this workflow, and one that I agree with,
is the waiting time. Checking salsa-ci before uploading incurs an extra
context switch. What we'd really like is the equivalent of clicking
"Auto-merge" (merge if the pipeline passes). As I write this, Sean or
Ian will likely come along and say tag2upload. Working in this direction
and enabling an "upload if CI passes" flow would bring the salsa-ci
experience to another level.

Let me suggest that there are more ways to do this. Freexian is putting
a ton of effort into https://debusine.debian.net. It can already perform
many of the same tasks as salsa-ci (with less flexibility). Extending it
to act as an upload proxy that forwards your upload to the archive if
the builds pass could be another option for improving unstable's
quality. In earlier times, debomatic.debian.net was used as a pre-upload
QA tool if I remember correctly.

Then, the top-150 packages tend to be packages with unusual aspects. For
instance, the git repositories for gcc-VER, glibc and linux all lack
upstream sources. For linux, there is a pipeline, but in order to
complete in a timely manner, it enables the pkg.linux.quick build
profile, and the pipeline is elaborate, with a complex extract-source
stage. It's not a matter of just enabling the pipeline for our core
packages, but one of spending a lot of time fiddling with the settings
until it works. I guess that sending a working pipeline configuration
for these packages could improve the situation. Would it also make sense
to procure dedicated gitlab runners for heavy core packages to further
reduce the feedback time and the impact of enabling CI there?

Given this, I want to echo what others have already said. This seems
more like something to put effort into and make work in practice than
something that needs further discussion. Sending a survey mail to the
relevant maintainers to figure out where to best direct that effort also
seems sensible to me.
More than once, I've experienced that it's not the technically best
solution that ends up working well, but the one that has the best
support crowd. To be blunt, I don't think the /usr-move we do in DEP17
is the technically best solution, but most people seem to be happy with
the level of support.

Thanks for working on making unstable more enjoyable!

Helmut



Re: Removing more packages from unstable

2024-08-21 Thread Helmut Grohne
Hi Johannes and Bastian,

On Tue, Aug 20, 2024 at 10:35:47AM +0200, Bastian Venthur wrote:
> On 20.08.24 07:55, Johannes Schauer Marin Rodrigues wrote:
> > Hi,
> [...]
> > if I remember correctly, a package can also become a key package by having a
> > high-enough popcon value. If that is correct, maybe there should also be the
> > inverse. Looking at your list, about 85% of those packages have a popcon lower
> > than 100. Taking the popcon value into account would also kinda make your
> > hand-curated list of exceptions obsolete as your current list has popcons well
> > above 100, for example. If the popcon is taken into consideration, that would
> > also give a little bit of insurance that only very few users will be affected.
> 
> That's what I thought too: we should somehow incorporate the popcon value.

I considered adding popcon to the criteria before hitting send. In the
end, I opted for not including it based on my own cost/benefit analysis.
While popcon may be a signal for the benefit-of-keeping aspect, it
provides little value for the cost-of-keeping part that feels most
important to me. As you point out, popcon is partially considered via
the key package constraint. As others (e.g. Niels) point out, the cost
of a package largely is a function of our ability to modify it and long
lasting RC bugs are a relatively high quality signal indicating that a
package is difficult to modify. Either some of those many users
(according to popcon) eventually get interested in doing the hard work,
or we should put the package on the chopping block. Even mailing the RC
bug would reset its last-modified timer.

If there ends up being consensus for adding popcon as a signal, so be
it. I've explained my reasons and am not too strongly attached to
excluding popcon.

Helmut



Removing more packages from unstable

2024-08-19 Thread Helmut Grohne
Hi fellow developers,

(modified resend, as first attempt didn't arrive)

please allow me to open a can of worms. Package removal from unstable.
Deciding when it is time to remove a package from unstable is difficult.
There may still be users, and it is unclear what cost keeping the
package imposes. In this mail I want to argue for more aggressive package
removal and seek consensus on a way forward.

What does a package cost?

There are various QA-related teams looking at packages from other
maintainers. When a package trips a check, that often costs time for a
QA person investigating a report or failure. Examples:
 * Lucas Nussbaum, Santiago Vila and a few more regularly perform
   archive rebuilds and report failures. They review a significant
   fraction of reports before sending, but there are also CPU resources
   involved in performing all those builds.
 * Reproducible builds folks actively investigate packages that fail to
   build reproducibly (or fail to build in the first place) and file bug
   reports often accompanied by patches.
 * Some cross build folks regularly send patches for cross build
   failures and also occasionally real FTBFS. About one such patch per
   month gets closed by ftp master package removal without ever having
   been applied.
 * DEP17 support folks send patches. Many of the remaining packages have
   accumulated RC bugs such as FTBFS.
 * As packages fail to migrate to testing for a long time, a release
   team member eventually looks at the package.
 * There are many more people doing various forms of QA and sending
   patches.

By virtue of being part of Debian, a package receives attention from a
significant number of developers. Assigning a number is non-trivial, but
we can say for sure that it is significant. Especially developers doing
/usr-move NMUs (e.g. Chris Hofstaedtler and Michael Biebl) now wonder
how much effort to put into the remaining packages. Removing more
packages would reduce the number of NMUs required there.

I suggest that we are keeping too many packages in unstable and that
they incur a non-trivial cost. It is not clear at all where to draw the
line, but maybe we can shift the line more towards removal?

What does package removal cost?

Before a package can be removed, it needs to be reviewed for reverse
dependencies and if there are any, they have to be switched to something
else or removed as well. The actual package removal first and foremost
is carried out by an ftp master. There may still be people actively
using the package, and they have to find some replacement for their task
at hand. Sometimes, packages are reintroduced. Doing so incurs a pass
through NEW (and review by the ftp team). Closed and archived bugs need
to be reopened and reviewed. Sometimes, it is quicker to just NMU a
particular problem than to review a package for removal.

When to remove a package?

I looked at UDD and came up with a suggested query.

SELECT s.source, s.maintainer, b.id, b.title
FROM sources AS s JOIN bugs AS b ON s.source = b.source
WHERE s.release = 'sid'
AND NOT exists(SELECT 1 FROM sources AS t WHERE s.source = t.source
               AND t.release IN ('bookworm', 'trixie'))
AND NOT exists(SELECT 1 FROM key_packages AS k WHERE k.source = s.source)
AND b.affects_unstable = true
AND b.severity >= 'serious'
AND b.last_modified <= now() - interval '1 year'
AND s.source NOT IN ('check-all-the-things', 'debbugs', 'firefox',
                     'gcc-snapshot', 'gitlab', 'hurd', 'openjdk-19',
                     'openjdk-20', 'singularity-container', 'virtualbox',
                     'wine-development')
ORDER BY s.source, b.id;

A very similar query is achievable using the web interface:

https://udd.debian.org/bugs/?release=sid&notbookworm=only&nottrixie=only&merged=ign&keypackages=ign&flastmod=ign&flastmodval=366&rc=1&sortby=id&sorto=asc&format=html#results

Human readable explanation:
 * Packages in sid
 * not in bookworm or trixie
 * not a key package
 * affected by an RC bug that has been last modified more than a year ago
 * not among a few selected exceptions

These results yield 360 or 351 bugs respectively. I am including a
package list from the SQL for those who prefer following offline, but
including more would trip the spam filter.

What do you think about the proposed criteria and suggested set of
source packages? Is it reasonable to remove these packages from
unstable? In a sense, it is extending the idea of the testing auto
remover to unstable. Similarly, a package can be temporarily saved by
mailing the respective bug.

Let us assume that we agree on there being a set of packages to be
removed. What is a reasonable process? Is it ok to just file a pile of
RoQA bugs or is more warning warranted? Should we maybe use a process
similar to salvaging where there is an "ITR" (intent to remove) bug that
is reassigned to ftp after a reasonable timeout?

My personal motivation for looking into this actually is the /usr-move
work and the cross build support work. Plea

Re: Reviving schroot as used by sbuild

2024-07-06 Thread Helmut Grohne
Hi Philipp,

Let me go into some detail that is tangential to the larger discussion.

On Mon, Jul 01, 2024 at 09:18:19AM +0200, Philipp Kern wrote:
> How well does this setup nest? I had a lot of trouble trying to run the
> unshare backend within an unprivileged container as setup by systemd-nspawn
> - mostly with device nodes. In the end I had to give up and replaced the
> container with a full-blown VM. I understand that some of the things compose
> a little if the submaps are set up correctly, with less IDs allocated to the
> nested child. Is there a way to make this work properly, or would you always
> run into setup issues with device nodes at this point?

Technically speaking, nesting is possible. The individual container
implementation may limit you, but that's an implementation limit and not
a fundamental one. I'm assuming that you want to nest a rootless
container in a rootless container as that tends to be the most difficult
one. Roughly speaking your unprivileged container wants access to your
user id and a 64k allocation of subuids. This applies to the nested
container. If your outer container maps two 64k ranges (one to 0 to
65535 and the other to whatever your user has in its contained
/etc/subuid), your contained user should actually be able to spawn a
podman container unless I am missing something important. Devices
usually are not a problem (for rootless containers) as you cannot create
them anyway so you end up bind mounting them and the bind mounting
technique nests well.

A typical Debian installation only allocates a single 64k range to each
user. Your first step here is growing that range or adding another one.
(Yes, you may have multiple lines for your user in /etc/subuid.) Then
the podman-run documentation hints at --uidmap and it says that you can
specify it multiple times to map multiple ranges. This is how you
construct your outer container. Then inside, nesting should just work.
Admittedly, I've not tried this.
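The /etc/subuid bookkeeping above can be sketched as follows. These are
hypothetical helper functions for illustration, not part of any existing
tool; they only check the arithmetic of the allocation, not podman's
actual --uidmap semantics:

```python
def subuid_total(subuid_text: str, user: str) -> int:
    """Sum the subuid counts allocated to `user`, given text in
    /etc/subuid syntax: one `user:start:count` allocation per line
    (multiple lines per user are allowed)."""
    total = 0
    for line in subuid_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _start, count = line.split(":")
        if name == user:
            total += int(count)
    return total


def can_nest(subuid_text: str, user: str, size: int = 65536) -> bool:
    """True if `user` has enough subuids for an outer container mapping
    two 64k ranges: one for container uids 0..65535 and one for the
    contained user's own /etc/subuid allocation."""
    return subuid_total(subuid_text, user) >= 2 * size


# A typical Debian installation allocates a single 64k range, which is
# not enough for nesting; adding a second line fixes that.
single = "alice:100000:65536\n"
double = single + "alice:1000000:65536\n"
print(can_nest(single, "alice"), can_nest(double, "alice"))
```

Running this prints `False True`: the stock single-range allocation
cannot nest, while the grown allocation can.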

The takeaway should be that if your outer container is constructed in
the right way, you should be able to nest other containers (e.g. podman,
mmdebstrap, sbuild unshare, ...) without issues. It's not like this just
works out of the box, but it should be feasible.

Helmut



Re: Reviving schroot as used by sbuild

2024-06-27 Thread Helmut Grohne
Hi Simon,

Thanks for having taken the time to do another extensive writeup. Much
appreciated.

On Wed, Jun 26, 2024 at 06:11:09PM +0100, Simon McVittie wrote:
> On Tue, 25 Jun 2024 at 18:55:45 +0200, Helmut Grohne wrote:
> > The main difference to how everyone else does this is that in a typical
> > sbuild interaction it will create a new user namespace for every single
> > command run as part of the session. sbuild issues tens of commands
> > before launching dpkg-buildpackage and each of them creates new
> > namespaces in the Linux kernel (all of them using the same uid mappings,
> > performing the same bind mounts and so on). The most common way to think
> > of containers is different: You create those namespaces once and reuse
> > the same namespace kernel objects for multiple commands part of the same
> > session (e.g. installation of build dependencies and dpkg-buildpackage).
> 
> Yes. My concern here is that there might be non-obvious reasons why
> everyone else is doing this the other way, which could lead to behavioural
> differences between unschroot and all the others that will come back to
> bite us later.

I do not share this concern (but other concerns of yours). The risk of
behavioural differences is fairly low, because we do not expect any
non-filesystem state to transition from one command to the next. Much to
the contrary, the use of a pid namespace for each command ensures
reliable process cleanup, so no background processes can accidentally
stick around.

I am concerned about behavioural differences due to the reimplementation
from first principles aspect though. Jochen and Aurelien will know more
here, but I think we had a fair number of FTBFS due to such differences.
None of them was due to the architecture of creating namespaces for each
command; most were due to not having gotten containers right in general.
Some were due to broken packages, e.g. ones skipping their tests upon
detecting schroot.

Also note that just because I do not share your concern here does not
imply that I'd be favouring sticking to that architecture. I expressed
elsewhere that I see benefits in changing it for other reasons. At this
point I more and more see this as a non-boolean question. There is a
spectrum between "create namespaces once and use them for the entire
session" and "create new namespaces for each command" and more and more
I start to believe that what would be best for sbuild is somewhere in
between.

> For whole-system containers running an OS image from init upwards,
> or for virtual machines, using ssh as the IPC mechanism seems
> pragmatic. Recent versions of systemd can even be given a ssh public
> key via the systemd.system-credentials(7) mechanism (e.g. on the kernel
> command line) to set it up to be accepted for root logins, which avoids
> needing to do this setup in cloud-init, autopkgtest's setup-testbed,
> or similar.

Another excursion: systemd goes beyond this and also provides the ssh
port via an AF_VSOCK (in case of VMs) or a unix domain socket on the
outside (in case of containers) to make safe discovery of the ssh access
easier.

> For "application" containers like the ones you would presumably want
> to be using for sbuild, presumably something non-ssh is desirable.

I partially concur, but this goes into the larger story I hinted at in
my initial mail. If we move beyond containers and look into building
inside a VM (e.g. sbuild-qemu) we are in a difficult spot, because we
need e.g. systemd for booting, but we may not want it in our build
environment. So long term, I think sbuild will have to differentiate
between three contexts:
 * The system it is being run on
 * The containment or virtualisation environment used to perform the
   build
 * The system where the build is being performed inside the containment
   or virtualisation environment

At present, sbuild does not distinguish the latter two and always treats
them as equal. When building inside a VM, we may eventually want to create
a chroot inside the VM to arrive at a minimal environment. The same
technique is applicable to system containers. When doing this, we
minimize the build environment and do not mind the extra ssh dependency
in the container or virtualisation environment. For now though, this is
all wishful thinking. As long as this distinction does not exist, we
pretty much want minimal application containers for building as you
said.

> If you build an image by importing a tarball that you have built in
> whatever way you prefer, minimally something like this:
> 
> $ cat > Dockerfile <<'EOF'
> FROM scratch
> ADD minbase.tar.gz /
> EOF
> $ podman build -f Dockerfile -t local-debian:sid .

I don't quite understand the need for a Dockerfile here. I suspect that
this is the obvious way that works reliably, but my impression was that

Re: Reviving schroot as used by sbuild

2024-06-25 Thread Helmut Grohne
Hi Simon,

On Tue, Jun 25, 2024 at 02:02:11PM +0100, Simon McVittie wrote:
> Could we use a container framework that is also used outside the Debian
> bubble, rather than writing our own from first principles every time, and
> ending up with a single-maintainer project being load-bearing for Debian
> *again*? I had hoped that after sbuild's history with schroot becoming
> unmaintained, and then being revived by a maintainer-of-last-resort who
> is one of the same few people who are critical-path for various other
> important things, we would recognise that as an anti-pattern that we
> should avoid if we can.
 
This is a reasonable concern. I contend that while unschroot.py is very
Debian-specific, the underlying plumbing layer is not. I would not have
started working on this if what I wanted to do had been doable with
existing code; then again, maybe it was not that the code couldn't do
it, but that I wasn't using the existing code correctly.

Please allow me to point out that right now, sbuild contains a custom
container framework that is subject to eventually becoming a starving
single-maintainer project and I am trying to extract and separate this
existing container framework from sbuild into more reusable components.
Likewise, mmdebstrap contains another custom container framework that is
similar but not equal to the one in sbuild.

> At the moment, rootless Podman would seem like the obvious choice. As far
> as I'm aware, it has the same user namespaces requirements as the unshare
> backends in mmdebstrap, autopkgtest and schroot (user namespaces enabled,
> setuid newuidmap, 65536 uids in /etc/subuid, 65536 gids in /etc/subgid).

I concur, the privilege requirements for rootless podman are exactly the
ones I am interested in. Indeed, podman was the thing investigated most
thoroughly, but evidently not thoroughly enough.

> Podman uses the same OCI images as Docker, so it can either pull from a
> trusted OCI registry, or use images that were built by importing a tarball
> generated by e.g. mmdebstrap or sbuild-createchroot. I assume that for
> Debian we would want to do the latter, at least initially, to avoid
> being forced to either trust an external registry like hub.docker.com
> or operate our own.

At least for me, building container images locally is a requirement. I
have no interest in using a container registry. Faidon's pointer at
--rootfs goes further in this direction.

> podman is also supported as a backend by autopkgtest-virt-podman, Toolbx
> (podman-toolbox in Debian) and distrobox. autopkgtest's
> autopkgtest-build-podman does not yet support starting from a tarball
> as described above, but it easily could (contributions welcome).

Thank you for pointing at these. I need to familiarize myself with them.

> Or, if Podman is too "not invented here" for Debian's use, using rootless
> lxd/Incus is another option - although that introduces a dependency
> on projects and formats that are rarely used outside the Debian/Ubuntu
> bubble, which risks them becoming another schroot (and also requires us to
> decide whether we follow Canonical's lxd or the community fork Incus
> post-fork, which could get somewhat political).

lxd/incus also were on my list, but my understanding is that they do not
work without their system services at all, and that being able to
operate containers (i.e. being in incus-admin or the like) roughly
becomes equivalent to being full root on the system, defeating the
purpose of the exercise. If anything is "not invented here", that'd be
unschroot rather than podman.

> > There are two approaches to
> > managing an ephemeral build container using namespaces. In one approach,
> > we create a directory hierarchy of a container root filesystem and for
> > each command and hook that we invoke there, we create new namespaces on
> > demand. In particular, there are no background processes when nothing is
> > running in that container and all that remains is its directory
> > hierarchy. Such a container session can easily survive a reboot (unless
> > stored on tmpfs). Both sbuild --chroot-mode=unshare and unschroot.py
> > follow this approach. For comparison, schroot sets up mounts (e.g /proc)
> > when it begins a session and cleans them up when it ends. No such
> > persistent mounts exist in either sbuild --chroot-mode=unshare or
> > unschroot.py.
> 
> Persisting a container root filesystem between multiple operations comes
> with some serious correctness issues if there are "hooks" that can modify
> it destructively on each operation: see 
> and . As a result of that, I think the
> only model that should be used in new systems is to have some concept of
> a session (like schroot type=file, but unlike schroot type=directory)
> so that those "hooks" only run once, on session creation, preventing
> them from arbitrarily reverting/overwriting changes that are subsequently
> made by packages installed into the chroot/container (for example dbus'
> creation

Reviving schroot as used by sbuild

2024-06-25 Thread Helmut Grohne
Hi,

sbuild is our primary tool for constructing a build environment to build
Debian packages. It is used on all buildds and for a long time, the
backend used with sbuild has always been schroot. More recently, a
number of buildds have been moved away from schroot towards
--chroot-mode=unshare thanks to the work of at least Aurelien Jarno and
Jochen Sprickerhof, plus a few more people working too much behind the
scenes for me to spot them directly.

In this work, limitations with --chroot-mode=unshare became apparent,
and that led to Johannes, Jochen and me sitting down in Berlin pondering
ideas on how to improve the situation. That is a longer story, but
eventually Timo Röhling asked the innocuous question of why we cannot
just use schroot and make it work with namespaces.

That led me to sit down and write a proof of concept. As a result, we
now have a little script called unschroot.py that vaguely can be used as
a drop-in replacement for schroot when used with sbuild. In trixie and
bookworm-backports it can now be plugged into sbuild by setting $schroot
= "path/to/unschroot.py" thanks to Johannes. It's not that long and can
be viewed at
https://git.subdivi.de/~helmut/python-linuxnamespaces.git/tree/examples/unschroot.py.
It is vaguely close to reaching feature-parity with sbuild
--chroot-mode=unshare and operates in a very similar way. As it is now,
it doesn't bring us any benefits beyond separating the containment
aspect from the build aspect into different tools.
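For reference, the hook-up mentioned above looks roughly like this in
sbuild's configuration. This is a sketch; the path shown is a
hypothetical location for wherever you saved the script:

```perl
# ~/.sbuildrc - sketch; requires the sbuild from trixie or
# bookworm-backports, and the path below is hypothetical
$schroot = "$ENV{HOME}/bin/unschroot.py";
```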

The split into different tools is important in my view. I argue that it
allows easier experimentation, and its architecture may enable features
that were difficult to implement using sbuild --chroot-mode=unshare, as
sbuild is increasingly becoming a container runtime of its own, and that
is where things start to get messy.

Is this a path worth pursuing further? Would we actually consider moving
back from sbuild --chroot-mode=unshare to sbuild --chroot-mode=schroot
with a different schroot implementation?

Related to that, what would be compelling features to switch?

Let me go a bit further into detail. There are two approaches to
managing an ephemeral build container using namespaces. In one approach,
we create a directory hierarchy of a container root filesystem and for
each command and hook that we invoke there, we create new namespaces on
demand. In particular, there are no background processes when nothing is
running in that container and all that remains is its directory
hierarchy. Such a container session can easily survive a reboot (unless
stored on tmpfs). Both sbuild --chroot-mode=unshare and unschroot.py
follow this approach. For comparison, schroot sets up mounts (e.g. /proc)
when it begins a session and cleans them up when it ends. No such
persistent mounts exist in either sbuild --chroot-mode=unshare or
unschroot.py.

The other approach is using one set of namespaces for the entire
session. Practically, this implies having a background process keeping
this namespace alive for the duration of the session and talking to it
via some IPC mechanism. We may still spawn a new pid namespace for each
command to get reliable process cleanup, but the use of a persistent
mount namespace enables the use of fuse2fs, squashfuse, overlayfs and
bindfs to construct the root directory of the container by other means
than unpacking a tar into a directory. In particular, the use of bindfs
allows sharing e.g. the user's ccache with the build container in
principle (with proper id shifting). At the time of this writing, this
second approach is wishful thinking and not implemented at all. I merely
believe that it is implementable with the schroot API already
implemented by unschroot.py above.
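A rough sketch of the session flavour (again util-linux tools and
unprivileged user namespaces assumed; a real implementation would talk
to the keeper over a proper IPC mechanism rather than via nsenter):

```shell
# A background "keeper" process holds one set of namespaces for the
# whole session; the tmpfs marker stands in for a root constructed
# via fuse2fs/overlayfs/bindfs.
unshare --user --map-root-user --mount \
    sh -c 'mount -t tmpfs session-demo /tmp && exec sleep 30' &
KEEPER=$!
sleep 1   # crude synchronisation; real code would use IPC
# Every subsequent command joins the same mount namespace and thus
# sees the same constructed filesystem:
nsenter --preserve-credentials --user --mount --target "$KEEPER" \
    sh -c 'grep -q session-demo /proc/mounts && echo "joined session"'
kill "$KEEPER" 2>/dev/null
```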

Another possible extension is a hooking mechanism. Regular schroot has
hooks already and I've seen requests for sbuild to use package-specific
chroots. For instance, one may have a separate Haskell or Rust container
that already has a basic set of ecosystem-specific dependencies to speed
up the installation of Build-Depends. On-demand updating of chroots has
also been requested. However, it is not yet clear to me what a useful
hooking interface in e.g. unschroot.py could look like, and I invite
you to provide more use cases for such hooking. Also sketching
how you imagine interfacing with this would be helpful. For instance,
you may explain what kind of configuration files or options you'd like
to use and how you imagine them to work.
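To make the request concrete, here is a purely hypothetical sketch of
what a schroot-style hook could look like. None of this interface exists
in unschroot.py; the stage argument and the CHROOT_PATH variable merely
mirror schroot's setup.d convention:

```shell
# Hypothetical hook: prepare ecosystem-specific caches in the session
# root at setup time. Everything here is invented for illustration.
hook() {
    case "$1" in
        setup-start)
            mkdir -p "$CHROOT_PATH/var/cache/ccache"
            # a real hook might bind-mount the user's ccache here
            ;;
        setup-stop)
            : # nothing to clean up in this sketch
            ;;
    esac
}

# Demo invocation with a throwaway session root:
CHROOT_PATH=$(mktemp -d)
hook setup-start
ls -d "$CHROOT_PATH/var/cache/ccache"
```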

I note that this is not a promise that I am going to implement your
wishes. I intend to do more work on this and barring really useful
extensions, my next goal would be moving to that other approach.

Please allow me to thank Freexian for supporting part of this work
financially even though it has been my initiative and is not otherwise
influenced by Freexian at the time of this writing.

Let me also explain the relation between "unschroot.py" and the
containing repository "python-linuxnamespaces". linuxnamespaces is a
(probably) distribution-agnostic 

Re: Seeking consensus on file conflict between python3-proto-plus and nanopb

2024-06-11 Thread Helmut Grohne
Hi Laszlo and Yogeswaran,

I'm explicitly adding Laszlo to Cc to increase the chances of him
chiming in.

On Mon, Jun 10, 2024 at 06:40:02PM -0400, Yogeswaran Umasankar wrote:
> There is a file conflict between python3-proto-plus and nanopb. The
> conflict arises due to both packages has a file at
> /usr/lib/python3/dist-packages/proto/__init__.py [0]. I am maintaining
> python3-proto-plus, and I am seeking guidance.

Thank you for going the extra mile and resolving this constructively and
consensually.

> The module name "proto" is an integral part of the python3-proto-plus
> package. Renaming the "proto" module in python3-proto-plus would
> significantly impact future dependent packages.

I agree with this assessment. At the same time, I note that "proto" is a
fairly generic name. Even though it seems unlikely that upstream would
want to change it, I think telling them would be useful still. It
definitely is conceivable that another project would later try to also
use this module name and therefore it is best to avoid it.

> It appears that nanopb's use of the module name "proto" does not align
> with the conventional identification of a Python module. Given this, it
> might be plausible to make this module private within the nanopb
> package. This adjustment could potentially resolve the conflict without
> affecting the dependent packages.

Yes. In particular, I could not locate external uses of nanopb's proto
module. chromium and firefox also use this module name, though my
impression is that they have yet another conflict on this name, again
arguing in favour of not using it at all.

> I have attempted to reach out to the nanopb maintainer to discuss this
> issue, but I have not yet received a response. In case the maintainer is
> MIA, should I proceed with renaming the "proto" module in nanopb to
> "nanopb-proto"? As one of the team members, I am willing to implement
> this change if it is deemed the best solution.

I recommend that you send a patch to the bug and give Laszlo two weeks
before proceeding to NMU in the absence of a reply as the proposed
change is a bit intrusive.

Helmut



Re: Enabling some -Werror=format* by default?

2024-06-10 Thread Helmut Grohne
On Mon, Jun 10, 2024 at 04:06:13PM +0500, Andrey Rakhmatullin wrote:
> Do you think it makes sense to add this a flag that enables -Werror=format
> to dpkg-buildflags(1), before, or after a test rebuild, before, or after
> the MBF if we do one?

I think that a test rebuild and the MBF are reasonable preconditions to
extend the default build flags (and with default I mean changing
hardening=+all).

As multiple people pointed out, the effects of the flags are hard to
predict and they can even cause misbuilds (via failing configure
checks), so these flags do have a non-trivial cost (and benefits).

Ideally, we'd not just do a rebuild with the flags, but also do a
rebuild without and then compare the binary .debs. In the event that we
misguide configure, we expect the .debs to differ and otherwise to be
equal due to the work of the reproducible builds folks. That equality
has a
really annoying difference in practice though: Build ids are dependent
on the compiler flags, so the comparison would have to magically ignore
changes in build id and this is where things become quite difficult.
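The comparison machinery itself is simple to sketch. Assumption: the
detached debug files under /usr/lib/debug/.build-id/ are the ignorable
part; for the build id embedded in the ELF binaries themselves one would
additionally normalize with objcopy --remove-section=.note.gnu.build-id
before hashing.

```shell
# Hash every file in an unpacked .deb tree, ignoring detached
# build-id debug files.
tree_hash() {
    (cd "$1" && find . -type f ! -path './usr/lib/debug/.build-id/*' \
        -exec sha256sum {} + | sort -k 2)
}

# Demo: two throwaway trees that differ only in their build-id file.
A=$(mktemp -d); B=$(mktemp -d)
for t in "$A" "$B"; do
    mkdir -p "$t/usr/bin" "$t/usr/lib/debug/.build-id/ab"
    printf 'payload\n' > "$t/usr/bin/tool"
done
printf 'id-one\n' > "$A/usr/lib/debug/.build-id/ab/cd.debug"
printf 'id-two\n' > "$B/usr/lib/debug/.build-id/ab/cd.debug"
[ "$(tree_hash "$A")" = "$(tree_hash "$B")" ] && echo "equal modulo build ids"
```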

> Another related question: if not via dpkg-buildflags, how do we do
> rebuilds with changed default flags?

If you export DEB_CFLAGS_APPEND=-Werror=format=2 and
DEB_CXXFLAGS_APPEND=-Werror=format=2 (not to be confused with
DEB_*_MAINT_APPEND which is often set in d/rules), you should get most
packages to pick up these flags.

Possibly debusine.debian.net can be used to perform such a rebuild
rather than using your own resources. Steering it to do this is a
non-trivial task at present, but my impression is that you will
receive support in doing so and it can do native armhf builds
(eliminating the need for cross builds). Your mileage will vary.

In any case, my impression is that the first step towards changing
hardening flags is actually performing test builds in whatever form.

Helmut



MBF: Move remaining files into /usr

2024-06-10 Thread Helmut Grohne
As many were so happy with the upload of the debootstrap set, we want to
direct your attention to the long tail of the /usr-move transition that
we want to see fixed in trixie: Moving aliased files in all remaining
packages to /usr. More precisely, the transition should be fully
completed in trixie before we enter the transition freeze likely in
January 2025. Dragging it, including the restrictions on package splits
and moving files, into forky would cause a lot of extra effort.

At this time, packages needing work mostly fall into three minimally
overlapping classes. Two of them already have bugs filed. This MBF is
about filing bugs for the biggest one.

 * "dh-sequence-movetousr": adding dh-sequence-movetousr to
   Build-Depends moves all files. We want to file bugs for these now.    
   191 packages.
 * "ftbfs#NNN": package currently FTBFS. Automatic analysis was not
   possible. Most of the packages have been failing to build for quite
   a while. We'll also look into removing these packages from unstable.
   28 packages.
 * "dep17#NNN": package already has a bug report on how to move. Often
   with a patch.
   78 packages.

We intend to use the following bug template:

==
Source: $SOURCEPKG
Version: $SOURCEVERSION
Severity: important
Tags: patch trixie sid
User: helm...@debian.org
Usertags: dep17m2 dep17dhmovetousr

This package is part of the /usr-move (DEP17) transition, because it
contains files in aliased locations and should have those files moved to
the corresponding /usr location. The goal of this move is eliminating
bugs arising from aliasing, such as file loss during package upgrades.

The following files in the following binary packages are affected.

...

You may add dh-sequence-movetousr to Build-Depends to perform the move.
This is an easy and readily applicable measure that has been verified
for this package using a test build. The main advantage of this method
is the low effort and it just works when backporting to bookworm.
However, it is more of a stop-gap measure as eventually the installation
procedure should refer to the files that are actually used for
installation. This often means updating debian/*.install files but also
changing flags passed to a configure script or similar measures. In case
you do not anticipate your package being uploaded to bookworm-backports,
please prefer a manual move, but generally prefer moving over delaying
any further.

After having done this move, please keep in mind that the relevant
changes need to be reverted for bookworm-backports, with these
exceptions:
 * dh-sequence-movetousr and dh_movetousr cancel themselves.
 * dh_installsystemd and dh_installudev revert to the aliased location.
 * The pkg-config variables systemdsystemunitdir in systemd.pc and
   udevdir in udev.pc revert to the aliased locations.

Please keep in mind that restructuring changes may introduce problems
after moving. A change is considered restructuring if aliased files
formerly owned by one package are later to be owned by a package with
a different name. Such uploads should be done to experimental and be
quarantined there for three days before moving to unstable. This way, automatic
analysis (https://salsa.debian.org/helmutg/dumat) can detect problems
and file bugs. Such bugs shall include support for resolving the
problems.

The severity of this bug shall be raised to RC on August 6th.

For additional information, refer to
https://wiki.debian.org/UsrMerge and
https://subdivi.de/~helmut/dep17.html.
==

Additionally, we intend to upgrade all existing dep17* usertagged bugs
to important severity at the time of the MBF.  We intend to upgrade
these bugs to RC severity on August 6th, too.

Please find the dd-list attached. An irregularly updated version can be
found at: https://subdivi.de/~helmut/usrmove.ddlist

You may opt for not receiving a bug report by performing the requested
change before the bugs are filed.

Does anyone object to this MBF or wants an aspect about it changed?

Kind regards

Chris and Helmut
Please fix your packages for the /usr-move aka DEP17. Legend:
 * "upload" means that a source-ful upload fixes all relevant /usr-move issues
   (in Arch:all packages)
 * "dh-sequence-movetousr" means that adding dh-sequence-movetousr to
   Build-Depends moves all files. You may move them manually if you prefer.
 * "ftbfs#NNN" identifies a FTBFS bug that prevents automatic analysis.
 * "dep17#NNN" identifies a relevant bug report. Please consult this bug and
   fix it if possible. Most of them have a patch.

A. Maitland Bottoms 
libiio dh-sequence-movetousr

Adam Borowski 
btrfs-progs ftbfs#1071297
ndctl dh-sequence-movetousr
powermgmt-base dh-sequence-movetousr

Adrian Alves 
grokmirror dh-sequence-movetousr

Adrian Vondendriesch 
booth dh-sequence-movetousr
corosync dh-sequence-movetousr
fence-agents dh-sequence-movetousr
sbd dh-sequence-movetousr

Alastair McKinstry 
csh dh-sequence-movetousr

Alberto Bertogli 
dnss dh-seque

Re: Enabling some -Werror=format* by default?

2024-06-10 Thread Helmut Grohne
On Fri, Jun 07, 2024 at 12:19:28AM +0500, Andrey Rakhmatullin wrote:
> We recently increased the time_t size on certain architectures and some
> packages started failing to build because they were using a format
> specifier too narrow for the wider time_t, e.g. #1067829.
> But the only reason those package FTBFS is they enable either -Werror or
> some more specific non-default switch like -Werror=format=2, so I suspect
> much more packages contain similar code but gained only a warning. Isn't
> this a bad thing? Should we enable at least some combination of -Wformat*
> switches by default? Should we at least add a new flag to dpkg-buildflags
> and do some test rebuilds with it enabled?

It wasn't quite clear to me what -Werror=format=2 actually means.
According to the gcc documentation[1], -Wformat=2 currently means:

-Wformat -Wformat-nonliteral -Wformat-security -Wformat-y2k.

Of these, we already enable -Werror=format-security, but not the other
ones. It is not clear to me which of these actually catches the
type mismatches. Would you do more research here?

It also is unclear how this impacts the archive and yes, I'd recommend a
rebuild. Note though that we likely need this rebuild both on a 64bit
architecture and a 32bit architecture that is not i386 (due to how t64
works). A partial archive rebuild may work to gauge the size of the
problem.

I note that this kind of change affects cross builds, so performing
cross builds for armhf on amd64 will likely show many of these failures
(in addition to all the regular cross build failures).

I recommend doing more research before moving forward with this. In
particular a MBF about introduced problems would be prudent before doing
the switch and even if we don't switch, such a MBF bears value on its
own.

Helmut

[1] https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html



DEP17 /usr-move: debootstrap set uploaded

2024-06-06 Thread Helmut Grohne
Hello,

I have just uploaded
 * base-files
 * bash
 * dash
 * glibc
 * util-linux
to unstable. These were the last remaining packages shipping aliased
files inside the package set relevant to debootstrap.

From the moment the first of these packages has been built until the
last of them has been built, debootstrap (and other implementations)
will fail to work. I performed these uploads simultaneously to keep the
disruption minimal. While the changes are already part of Ubuntu noble
and I have
extensively tested them locally, I cannot entirely rule out mistakes and
will handle them in the next days. If you spot problems that look
related to these uploads, please X-Debbugs-Cc me in your bug reports or
contact me on IRC (e.g. #debian-usrmerge).

For trixie, there currently are migration blocks to prevent partial
migration of these changes. The release team will lift them once we are
satisfied with the situation in unstable, such that the time during
which debootstrapping trixie is broken is also minimized.

Thanks for bearing with me and also thanks to all the people (release
team and affected package maintainers in particular) who support this
work.

Helmut



Re: Another usrmerge complication

2024-03-17 Thread Helmut Grohne
Hi Simon and Simon,

On Sun, Mar 17, 2024 at 12:08:21PM +, Simon McVittie wrote:
> On Sun, 17 Mar 2024 at 11:23:28 +0900, Simon Richter wrote:
> > When /bin is a symlink to usr/bin,
> > and I install two packages, where one installs /bin/foo and the other
> > installs /usr/bin/foo
> 
> My reading of Policy is that this situation is already a Policy violation:
> 
> To support merged-/usr systems, packages must not install files in
> both /path and /usr/path. For example, a package must not install
> both /bin/example and /usr/bin/example. —§10.1
> 
> and in the case of /{usr/,}{s,}bin in particular (which is the most likely
> place for this to happen), doubly so:
> 
> Two different packages must not install programs with different
> functionality but with the same filenames —also §10.1
> 
> (I'm interpreting that as "install programs into the PATH" which I hope
> is the intended reading.)

That is also my interpretation.

> So I think the precise way in which the system goes wrong in this
> situation is unimportant, because the situation already shouldn't
> exist?

Yes, and I can also tell you why such a situation does not exist.

> Until we reach the point where every package's data.tar contains only
> non-aliased paths (files below /usr, /etc and /var, plus additional
> top-level paths in base-files), it seems to me like the best way to
> handle this would be a QA tool that detects any such situations that might
> exist in the archive, and makes sure they have appropriate Conflicts to
> stop the bad scenario from occurring in practice.

https://salsa.debian.org/helmutg/dumat detects and files these issues.
Since we typically handle file conflicts with Breaks+Replaces and
Replaces may be rendered ineffective due to aliasing, it is important
that we correctly declare all Replaces relations (that involve paths
subject to aliasing). For this very reason, dumat causes reports (rc
bugs) for all such conflicts disregarding aliasing. Say package foo
contains /lib/foo and package foo2 contains /usr/lib/foo without a
package relation nor diversion, an rc bug would be created (and of
course this also holds for bin and sbin).

I think we're good in this regard.

> But, after we reach the point where every data.tar contains only
> non-aliased paths, by definition this situation cannot arise, because
> there will be no remaining packages with files /bin/foo, /sbin/foo
> or /lib*/foo. It seems like we are quite close to that point (mainly
> thanks to Helmut's efforts in that direction) after which this will be
> a non-issue, so maybe providing such a QA tool would be a wasted effort.

At that point we get an unpack error from dpkg rather than silent file
overwrite, yes, but that still warrants a QA tool. Andreas Beckmann
seems to file such issues fairly reliably.

Helmut



Re: Bug#1065022: libglib2.0-0t64: t64 transition breaks the systems

2024-03-01 Thread Helmut Grohne
On Thu, Feb 29, 2024 at 06:53:56AM +0100, Paul Gevers wrote:
> Well, officially downgrading isn't supported (although it typically works)
> *and* losing files is one of the problems of our merged-/usr solution (see
> [1]). I *suspect* this might be the cause. We're working hard (well, helmut
> is) to protect us and our users from losing files on upgrades. We don't
> protect against downgrades.

As much as we like blaming all lost files on the /usr-move, this is not
one of them. If you downgrade to a package that has formerly been
replaced, you have always lost files and you always had to reinstall
the package.

While t64 has quite some interactions with the /usr-move, I am closely
monitoring the situation and have been filing multiple bugs per day
recently about the relevant peculiarities. I don't think any of the
fallout we see here is reasonably attributable to /usr-move. The most
recent practical issues I've seen was related to image building tools
such as live-build and grml. When it comes to lost files, we're not
addressing them based on user reports (as there are practically none),
but on rigid analysis preventing users from experiencing them in the
first place.

Helmut



On merging bin and sbin

2024-02-28 Thread Helmut Grohne
Hi cacin,

I see that you are working on merging /bin and /sbin, for instance via
brltty bug #1064785. Again Fedora is pioneering this matter and their
documentation is at
https://fedoraproject.org/wiki/Changes/Unify_bin_and_sbin.

Please allow me to push back on this one as well by raising a few
concerns.

Fundamentally, turning sbin into a symlink pointing to bin is causing
aliasing problems that we currently have a hard time fixing up for the
/usr-merge. If doing this, I think we need a different technical
approach. Doing the aliasing mess again does not sound like a valid
option to me. In the /usr-merge discussion, alternatives have been
proposed. For instance, there was a proposal that would manage a symlink
farm via dpkg triggers until the aliased directory would become
unpopulated (by packages) and then turn the farm into a symlink. If
doing the symlink first, I think we need changes to dpkg before
creating such a symlink to make this approach viable.

Apart from the implementation side, this is a more user-visible change.
When you tab-complete program names in a user shell, more programs
become available. This can be good, but it can also be seen as a
pollution of
your shell completion. I note that Fedora seems to have added /sbin to
the user $PATH by default, which is not what Debian has done. I do not
think we have consensus on this and would raise an objection of my own.

That said, I appreciate your work on analyzing the situation as it also
uncovers tangential problems e.g. where different packages put programs
with different functionality into bin and sbin. It is up to
interpretation of Debian policy whether that should be considered an
RC-bug (10.1 "same filenames"). In general, I think that having each
program name on either bin or sbin but not both is a desirable property
and it should be easier to gain consensus on this. As we've seen with
arcstat (zfs vs nordugrid), doing so will take a long time. Where we
expect downstreams to not have hardcoded paths to programs
/usr/sbin/foo, dropping a symlink from sbin/foo -> ../bin/foo probably
is reasonable, but needs to be reviewed case by case.
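Such a per-program compatibility link avoids directory-level aliasing
entirely. The following demonstrates the idea with a throwaway tree
(foo is of course a made-up program):

```shell
# A relative symlink sbin/foo -> ../bin/foo keeps hardcoded
# /usr/sbin/foo paths working after the program moved to bin,
# without turning the sbin *directory* into an alias.
ROOT=$(mktemp -d)
mkdir -p "$ROOT/usr/bin" "$ROOT/usr/sbin"
printf '#!/bin/sh\necho "foo running"\n' > "$ROOT/usr/bin/foo"
chmod +x "$ROOT/usr/bin/foo"
ln -s ../bin/foo "$ROOT/usr/sbin/foo"
"$ROOT/usr/sbin/foo"    # the old path still resolves
```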

As we see, this is not a single change to be working on, but multiple
related and interdependent topics. Disentangling these matters and
making the intention clear is key to moving this forward.

Regardless of whether we (as a project) want sbin merged into bin or
not, reducing conflicts (different functionality but same name) between
sbin and bin is a hard prerequisite, difficult to achieve, and probably
an agreeable goal.

Helmut



Splitting collectd dependencies [Was: Re: Another take on package relationship substvars]

2024-02-26 Thread Helmut Grohne
None of this is relevant to the substvars discussion, but the collectd
side is worth looking at on its own.

On Sat, Feb 24, 2024 at 01:36:33PM +0100, Gioele Barabucci wrote:
> On 24/02/24 11:26, Bernd Zeimetz wrote:
> > Absolutely. collectd for example - otherwise you would install *all*
> > plugin dependencies with collectd, which would be a big waste of space.
> > 
> > The other option would be to make one packe per plugin as redhat does,
> > but do we really want 20 packages with a single file?
> 
> Yes, please. So that installing package collectd-foo ensures that all the
> required dependencies are installed, instead of having to hunt them down (a
> task better left to the package maintainers rather than the end users).

There is a balance to be struck here. Adding one package per plugin is a
lot of plugins and you often install multiple plugins together. It is
not obvious that the benefit of splitting is worth the associated cost.

I think there is a middle ground here. Having one package per plugin
definitely does have advantages. However, consider the option of turning
those packages virtual. So you'd have tons of collectd-plugin-foo
packages provided from collectd initially. Then, multiple plugins tend
to use the same dependencies and some plugins tend to use no additional
dependencies. The latter can just be left in the main package together
with their provides. The former can be grouped together to say
collectd-plugins-hardware or something that you wouldn't want on a
virtual machine. Together with the plugins, you'd also move the provides
and recommends (maybe upgraded to depends then). In particular, you can
later restructure the plugins provided that downstreams only depend on
your virtual collectd-plugin-* packages rather than the underlying
physical packages.
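In debian/control terms, the grouping could look roughly like this
(package names, dependencies and descriptions are invented for
illustration and not taken from the real collectd packaging):

```
Package: collectd
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}
Provides: collectd-plugin-cpu, collectd-plugin-load, collectd-plugin-df
Description: statistics collection daemon (core and dependency-free plugins)

Package: collectd-plugins-hardware
Architecture: any
Depends: collectd (= ${binary:Version}), ${shlibs:Depends}, ${misc:Depends}
Provides: collectd-plugin-sensors, collectd-plugin-smart
Description: statistics collection daemon (hardware-related plugins)
```

Downstream packages would then depend on collectd-plugin-sensors rather
than collectd-plugins-hardware, leaving room to regroup the physical
packages later.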

An example where this has successfully been implemented with Depends and
a small installation footprint is lighttpd.

Helmut



Re: Bug#1063329: libselinux1t64: breaks system in upgrade from unstable

2024-02-07 Thread Helmut Grohne
Hi Andreas,

On Wed, Feb 07, 2024 at 03:47:37PM +0100, Andreas Metzler wrote:
> Package: libselinux1t64
> Replaces: libselinux1
> Provides: libselinux1 (= 3.5-2.1~exp1)
> Breaks: libselinux1 (<< 3.5-2.1~exp1)
> 
> Afaiui libselinux1t64 must not fulfill dpkg 1.22.4's dependency on
> "libselinux1 (>= 3.1~)". dpkg needs to be rebuilt and the rebuilt
> version gets a dep on "libselinux1t64 (>= 3.5)".

The *t64 libraries only break ABI on some architectures. Notably, on all
64bit architectures, i386 and x32, the ABI will not change. On the next
upload after the transition, library dependencies will move to the t64
variants, yes.

> Will ${t64:Provides} stop expanding to "libselinux1 = ${binary:Version}"
> for real t64-builds? (The ones in experimental are not.) If that is the
> case, this bug and this way of testing does not make sense.

No, the t64:Provides will remain that way for all architectures that do
not break ABI. In theory, we could have skipped the rename on those
architectures, but having architecture-dependent package names is
annoyingly hard. Hence, we rename them on e.g. amd64 as well even though
nothing changes there.

Hope this explains

Helmut



Re: Bug#1063329: libselinux1t64: breaks system in upgrade from unstable

2024-02-07 Thread Helmut Grohne
Hi Guillem,

On Wed, Feb 07, 2024 at 04:32:45AM +0100, Guillem Jover wrote:
> Yes, I'm not sure I understand either. This is what symbol versioning
> makes possible, even providing different variants for the same symbol,
> see for example glibc or libbsd.

I think symbol versioning is subtly different and glibc does not use
symbol versioning for e.g. gettimeofday selection. With symbol
versioning, you select a default version at release time and stick to
it. In other words, building against the updated libselinux does not
allow you to use the older 32bit variant of the symbol even if you opt
out of lfs and time64 and you always get the 64bit symbol. What glibc
does is a little more fancy than my simplistic #define in that it uses
asm("name") instead. Still this approach allows for selecting which
symbol is being used via macros (e.g. _FILE_OFFSET_BITS). Please correct
me if I am misrepresenting this as my experience with symbol versioning
is fairly limited.

> In any case, if going the bi-ABI path, I think upstream should get
> involved, and the shape of this decided with them. In addition
> the library should also be built with LFS by the upstream build
> system, which it does not currently, to control its ABI.

I agree that involving upstream is a good idea and my understanding is
that someone from Canonical is doing that already, which is why the
schedule was delayed.

My real question here though is what's the downsides of providing two
variants of this symbol (whether with symbol versioning or name
redirection). From my pov, this effectively is your option 3 and what I
sketched is the most stupid implementation of it. My sketch did assume
that libselinux would be built with LFS support everywhere including
i386. Enabling that on the upstream side definitely is even better,
because it gets us to not have a Debian-specific ABI.

> I think there are only three ways to go about this, excluding the t64
> attempt:

Thanks for confirming that I've reported a real problem.

> If you'd like assistance with trying to get a proposal for 3 to
> present upstream I could look into that. But I think they should be
> involved early on to see what they'd like to see and what they might
> outright reject.

From my naive point of view, this option 3 is the clear winner. Though
it all depends on what upstream says. If upstream cooperates on any
option, that's better still as we avoid ABI deviation.

Going from here, I also looked a bit into whether we could additionally
use an upstream-cooperating approach for other packages central to
Debian to avoid t64 bumps.

pam seems difficult:
| extern time_t pam_misc_conv_warn_time; /* time that we should warn user */
| extern time_t pam_misc_conv_die_time; /* cut-off time for input */

We cannot symbol-version these in a reasonable way. All we could do is
ask upstream for a real soname bump. We have a slight advantage here: On
little endian (such as armhf), we can extend this to 64bit and 32bit
accesses will continue to work for small values. However, doing this to
m68k would break horribly. I also couldn't find any in-Debian users of
these symbols (super merely vendors pam source), so just bumping it and
accepting breakage (Guillem's option 1) might be worth a go?

For libaudit1, I fail to understand why we bump it at all. Both reports
look fine to me:
https://adrien.dcln.fr/misc/armhf-time_t/2024-02-01T09:53:00/compat_reports/libaudit-dev/base_to_lfs/compat_report.html
https://adrien.dcln.fr/misc/armhf-time_t/2024-02-01T09:53:00/compat_reports/libaudit-dev/lfs_to_time_t/compat_report.html
This does not extend to libauparse0 where the report gives a reason, but
libaudit1 is the one that interacts with /usr-move and libauparse0 does not,
so can we skip the dance for libaudit1?

For libtirpc, it is only about rpcb_gettime, which returns time via a
time_t* and can indicate success/failure via return. It seems fairly
simple to implement ABI duality here and libtirpc already does symbol
versioning. Maybe we can also approach upstream about this?

For libfuse2, I think the ABI analysis is broken. The base-to-lfs report
supposedly is ok
https://adrien.dcln.fr/misc/armhf-time_t/2024-02-01T09:53:00/compat_reports/libfuse-dev/base_to_lfs/compat_report.html
and then going lfs-to-time changes ino_t
https://adrien.dcln.fr/misc/armhf-time_t/2024-02-01T09:53:00/compat_reports/libfuse-dev/lfs_to_time_t/compat_report.html
while I would have expected ino_t to change with lfs already.  Are we
sure about this? In any case, this is more of an academic question as
adding ABI-duality would be more involved here. Moreover, I don't see
any ACC report for libfuse3-dev. Did that fail to analyze?

libiw30 only has one affected symbol:
iw_print_timeval(char *buffer, int buflen, struct timeval const *time,
struct timezone const *tz)
Providing ABI duality for this seems doable. Moreover, libiw30 already
has soname 30, so maybe upstream is open to bumping it again? The
resulting library transition is f

Bug#1063329: libselinux1t64: breaks system in upgrade from unstable

2024-02-06 Thread Helmut Grohne
Package: libselinux1t64
Version: 3.5-2.1~exp1
Severity: grave
X-Debbugs-Cc: vor...@debian.org, debian-devel@lists.debian.org

Hi,

I was looking into performing an upgrade test of libselinux1 with
piuparts and that didn't go well. I spare you the piuparts stuff and go
into crafting a minimal reproducer using mmdebstrap:

mmdebstrap --variant=apt unstable /dev/null \
  "deb http://deb.debian.org/debian unstable main" \
  "deb http://deb.debian.org/debian experimental main" \
  --chrooted-customize-hook="apt-get -y install libselinux1t64"

This looks fairly innocuous. We create a minimal sid chroot and install
libselinux1t64 using apt. What could possibly go wrong? Well, apt thinks
that it would be a good idea to avoid coinstalling breaking packages and
first removes libselinux1 before proceeding to install libselinux1t64.
Unfortunately, libselinux1 is transitively essential and dpkg links it,
so this is what you get:

| Reading package lists... Done
| Building dependency tree... Done
| The following packages will be REMOVED:
|   libselinux1
| The following NEW packages will be installed:
|   libselinux1t64
| 0 upgraded, 1 newly installed, 1 to remove and 0 not upgraded.
| Need to get 75.2 kB of archives.
| After this operation, 4096 B of additional disk space will be used.
| Get:1 http://deb.debian.org/debian experimental/main amd64 libselinux1t64 amd64 3.5-2.1~exp1 [75.2 kB]
| Fetched 75.2 kB in 0s (6067 kB/s)
| debconf: delaying package configuration, since apt-utils is not installed
| dpkg: libselinux1:amd64: dependency problems, but removing anyway as you requested:
|  util-linux depends on libselinux1 (>= 3.1~).
|  tar depends on libselinux1 (>= 3.1~).
|  sed depends on libselinux1 (>= 3.1~).
|  libpam-modules-bin depends on libselinux1 (>= 3.1~).
|  libpam-modules:amd64 depends on libselinux1 (>= 3.1~).
|  libmount1:amd64 depends on libselinux1 (>= 3.1~).
|  findutils depends on libselinux1 (>= 3.1~).
|  dpkg depends on libselinux1 (>= 3.1~).
|  coreutils depends on libselinux1 (>= 3.1~).
|  base-passwd depends on libselinux1 (>= 3.1~).
| 
| (Reading database ... 6230 files and directories currently installed.)
| Removing libselinux1:amd64 (3.5-2) ...
| /usr/bin/dpkg: error while loading shared libraries: libselinux.so.1: cannot open shared object file: No such file or directory
| E: Sub-process /usr/bin/dpkg returned an error code (127)

At that point stuff is fairly broken and we cannot easily recover as
both dpkg and tar are now broken. This is pretty bad. To make matters
worse, the situation arises from the combination of Breaks + Provides
and there is nothing libselinux1t64 could do in maintainer scripts to
prevent this from happening, because no libselinux1t64 maintainer script
has been run by the time damage has happened.

I also looked into whether I could reproduce a similar failure with
other packages such as libpam0t64 or libaudit1, but in no other case was
I able to construct a comparable outcome.

I also looked into why libselinux was being time-bumped. Do I understand
correctly that libselinux is entirely unaffected by time64?
https://adrien.dcln.fr/misc/armhf-time_t/2024-02-01T09:53:00/compat_reports/libselinux1-dev/lfs_to_time_t/compat_report.html
It still is affected by LFS due to using ino_t in the public ABI of
matchpathcon_filespec_add:
https://adrien.dcln.fr/misc/armhf-time_t/2024-02-01T09:53:00/compat_reports/libselinux1-dev/base_to_lfs/compat_report.html
Since we also complete the LFS transition here, not bumping it would
result in an ABI break regarding this symbol. If we were to opt
libselinux out of the LFS transition (e.g. by removing the flags in
debian/rules), then other packages being rebuilt against libselinux-dev
with these flags enabled would be ABI-incompatible though.

An option I see here is to provide ABI-duality for libselinux:

-extern int matchpathcon_filespec_add(ino_t ino, int specind, const char *file);
+typedef unsigned long libselinux_ino_t;
+typedef uint64_t libselinux_ino64_t;
+extern int matchpathcon_filespec_add(libselinux_ino_t ino, int specind, const char *file);
+/* sizeof is not usable in #if; compare ULONG_MAX (from <limits.h>) instead */
+#if defined _FILE_OFFSET_BITS && _FILE_OFFSET_BITS == 64 && ULONG_MAX < 0xffffffffffffffff
+extern int matchpathcon_filespec_add64(libselinux_ino64_t ino, int specind, const char *file);
+#define matchpathcon_filespec_add matchpathcon_filespec_add64
+#endif

Looking at the implementation, it would be quite possible to implement
this. Of course, doing so comes at its own cost: we would be extending
the libselinux1 ABI in a Debian-specific way, and programs built on
Debian would thus not run on non-Debian systems.

Another option of course is doing a proper soname bump of libselinux1 to
a Debian-specific soname.

I really hope I am missing something.

Helmut



Re: build profile proposal: nogir (second try)

2024-01-24 Thread Helmut Grohne
On Wed, Jan 24, 2024 at 06:30:02PM +, Alberto Garcia wrote:
> - Are packages that ship gobject-introspection files supposed to have
>in the relevant build dependencies (gir1.2-*-dev,
>   gobject-introspection ?), or is the build profile handling this
>   automatically?

This is not automatic. Please annotate relevant Build-Depends manually.

> - Packages using dh_install may have a line with usr/share/gir-1.0 in
>   their debian/libfoo-dev.install. This will fail if the .gir files
>   are not generated. What's the recommended way to handle this?

There is no silver bullet. Options:

 * You may use dh-exec. When doing so, you may annotate lines with build
   profiles. For example, samba's debian/winbind.install uses this
   approach.

 * You may conditionally run dh_install from debian/rules passing
   affected files as arguments.

 * You may split the affected files into a separate binary package to
   avoid this annoyance.

Helmut



Re: build profile proposal: nogir (second try)

2024-01-21 Thread Helmut Grohne
Hi Simon,

On Sun, Jan 21, 2024 at 03:24:25PM +, Simon McVittie wrote:
> > How annoying would it actually be to split this to a
> > different source package?
> 
> Really quite annoying. [...]

You gave more than sufficient reason. I won't argue.

> If porters are interested in making bootstrap automatic despite cycles
> like this one, I think a better route would be to be able to have
> a list of suggested bootstrap steps and build-order considerations,
> either centralized in some sort of cross infrastructure or distributed
> among packages. I'd be fine with adding something like this to glib2.0,
> for example, if it helped:
> 
> Bootstrap-Before: dbus, gobject-introspection
> Bootstrap-Build-Profiles: nogir, nocheck, noinsttest

We effectively tried the approach of encoding bootstrap information into
individual packages with stage profiles, and that turned out to be a bad
idea. Which stages are needed can (and does) change. For instance, we no
longer need glibc's stage1 profile and go to stage2 directly. Hence, we
increasingly try to use profiles that change one particular aspect of a
package in an obvious and isolated way, and externally maintain how
these are to be combined into a successful bootstrap.

> Or, if we separated the nogir build profile that I'm proposing here into
> two, something like this:
> 
> nogir-changing-content
> can change content: Y ("unreproducible"/"unsafe" profile)
> can change package set: Y
> nogir
> can change content: N ("reproducible"/"safe" profile)
> can change package set: Y
> 
> would that allow automatic bootstrapping infrastructure to figure out
> that it was both safe and desirable to build glib2.0 with nogir?

I've considered this option for other profiles already and did not find
it appealing. Oftentimes you are interested in enabling the profile
without caring about whether it changes package contents, but such a
split would require you to figure out which of the two profiles you need
(or simply both?).

More and more, I think that merely documenting which instances of these
profiles are reproducible would be a better approach. I've had this
floating around as a vague idea for a while:

XS-Reproducible-Profiles: nogir

It's a promise that a source package can issue about a subset of the
profiles it supports. It bears some similarity to "Multi-Arch: foreign"
in the sense that both are promises on how the interface behaves. In
particular, such a declaration would be machine-checkable. We could
simply run an autobuilder that verifies whether such declarations are
practically correct (on amd64).
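The QA autobuilder idea above could work roughly like this: build the
package with and without the profile and compare the contents of the
packages produced in both cases. This is only a sketch of the idea; real
tooling would use sbuild and debdiff, and two directories stand in for
the two builds here. Note that XS-Reproducible-Profiles is a proposal,
not an existing dpkg field.

```shell
#!/bin/sh
# Mock of the reproducibility check: two "build trees" represent the
# packages built with and without the nogir profile.
set -e
rm -rf build-default build-nogir
mkdir -p build-default build-nogir
echo "libfoo payload" > build-default/libfoo.so
echo "libfoo payload" > build-nogir/libfoo.so
# The nogir build may omit gir1.2-* packages entirely; the promise only
# concerns packages that are built in both configurations.
if diff -r build-default build-nogir >/dev/null; then
    verdict="nogir is reproducible"
else
    verdict="nogir changes package contents"
fi
echo "$verdict"
```

A declaration would be flagged as incorrect as soon as any commonly
built package differs between the two builds.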

Bootstrappers do not really need that separation into two different
profile names that you propose. Having the information of which profiles
are reproducible in which source packages (and which packages get
disabled when enabling the profile), is what is needed.

So this is what I prefer, but it still comes at a cost. We would be up
for changing lots of packages to declare these headers, and for setting
up QA to verify them. I fear I cannot provide the capacity to do all of
this, and hence I have not pushed this forward.

Manually ordering glib2.0 in the bootstrap tooling may be annoying, but
that's about it. It still is way less work than any of the alternatives.

> (I infer that there must be some sort of infrastructure that knows that
> it's safe to build packages with "nocheck,noinsttest", otherwise glib2.0
> and dbus are already in a cyclic dependency for their test suites.)

Not really. nocheck and noinsttest are issued by default and simply
assumed to do the right thing in all cases.

> [...] I'm sorry if that's
> causing extra work for your use-case.

Yes, that's causing extra work on my side, but that extra work is really
low compared to the extra work on your side for the alternative. That
makes the choice rather obvious to me. Also having this advance warning
further lowers the cost on my side. You answered my question in way more
detail than expected. Thank you.

Helmut



Re: build profile proposal: nogir (second try)

2024-01-21 Thread Helmut Grohne
On Wed, Jan 17, 2024 at 11:38:09PM +, Simon McVittie wrote:
> On Wed, 17 Jan 2024 at 23:15:03 +0100, Matthias Geiger wrote:
> > Does this mean we should split out the .gir XML files from existing
> > source packages into a separate gir1.2-foo-dev (in the long run)?
> 
> That's a good question, and I don't have an easy answer for it. The
> tradeoff is:
> 
> - having the GIR XML in libfoo-dev means fewer binary packages and
>   therefore smaller Packages metadata;
> 
> - having a separate (non-virtual) gir1.2-foo-1-dev means we can "cleanly"
>   turn off GIR/typelibs in cases when they're not needed, and means
>   libfoo-dev is a bit smaller and with fewer dependencies

Really, I think the main advantage of splitting them out into real
packages is the additional QA that we get. With the Provides mechanism,
consumers will often miss the additional gir1.2-*-dev build dependency
that is required, and adding those back will be a permanent duty of
cross build porters.

> It's analogous to the choice between one big -dev package (libcairo2-dev,
> libwayland-dev) or multiple smaller -dev packages (libportal*-dev) for a
> source package with more than one shared library.

The QA aspect is different there.

> The larger, more widely-used and lower-level the library is, the more I
> would be inclined to opt for the approach with extra binary packages
> - for example splitting out gir1.2-gtk-4.0-dev from libgtk-4-dev
> seems more desirable than splitting out gir1.2-shumate-1.0-dev from
> libshumate-dev. Separating out the GIR XML is more interesting for
> packages that are involved in bootstrapping, or for packages that someone
> will frequently want to cross-compile, particularly the ones you'll want
> to cross-compile early in the development of a new port when tools like
> qemu-user might not be able to target it.

In essence, you are arguing for deciding on a case-by-case basis, and I
concur with that. The Provides mechanism seems easier for maintainers,
so I'd recommend doing that and then changing to the split mechanism
where we deem it useful.

> In the case where the GIR XML is in libfoo-dev, asking for it to have
> Provides: gir1.2-foo-1-dev means that dependent packages can depend on the
> systematic gir1.2-foo-1-dev name, and then will work correctly either way.

The real question becomes how we can continuously ensure that packages
correctly depend on these virtual facilities. I fear the simplest way is
actually splitting the binary packages. Does anyone have a better idea?
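One way to catch missing dependencies on the virtual facilities without
splitting packages might be an archive-wide lint: warn whenever a
package ships GIR XML but declares no gir1.2-*-dev build dependency.
This is only a sketch of that heuristic; the control data below is a
made-up example and the check itself is my assumption, not existing QA
tooling.

```shell
#!/bin/sh
# Heuristic lint: a package shipping GIR XML should build-depend on some
# gir1.2-*-dev package (real or virtual via Provides).
bdeps="debhelper-compat (= 13), libglib2.0-dev"
ships_gir=yes   # i.e. usr/share/gir-1.0 appears in the package contents
if [ "$ships_gir" = yes ]; then
  case "$bdeps" in
    *gir1.2-*-dev*) check="ok" ;;
    *) check="missing gir1.2-*-dev build dependency" ;;
  esac
fi
echo "$check"
```

Such a check would find the common omission, though it cannot verify
that the *right* gir1.2-*-dev package is depended upon.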

> The only package where I'm sure that I intend to separate out the GIR
> XML in the short term is src:glib2.0, where for historical reasons
> gir1.2-glib-2.0 has been built by src:gobject-introspection until
> now. I'm most of the way through preparing a version of glib2.0
> 2.79.x for experimental that takes over gir1.2-glib-2.0{,-dev}
> from src:gobject-introspection, and I definitely don't want to fold
> gir1.2-glib-2.0-dev into libglib2.0-dev, because GLib's position at the
> bottom of the GNOME stack makes it particularly important that we can
> still bootstrap and cross-compile it.

Thank you. How annoying would it actually be to split this into a
different source package? glib2.0 is involved in bootstrap at this time,
and that works fully automatically *because* it is not involved with
gir. When you add gir, builders have to add the nogir profile (and thus
manually order glib2.0). If you were to split this into two distinct
source packages, you'd remove the need for applying a build profile, and
automatic bootstrapping would continue to work. Of course, I cannot tell
how that impacts the implementation, but given that it was formerly part
of src:gobject-introspection, it cannot be unworkable. To be clear, such
a split is definitely not a requirement though.

Helmut



Re: 64-bit time_t: updated archive analysis, proposed transition plan with timeline

2024-01-06 Thread Helmut Grohne
On Fri, Jan 05, 2024 at 12:23:00AM -0800, Steve Langasek wrote:
> I am also attaching here the dd-list output for the packages that will need
> to be sourcefully NMUed for the transition, for your review.

I could readily identify a number of packages (the list is incomplete)
that are also affected by DEP17. Whenever you face files in aliased
locations (other than systemd units), please go via experimental to let
dumat judge your upload. Check the bookworm package for files in aliased
locations, not the unstable one.

>uhd

DEP17-affected, probably harmless

>btrfs-progs

DEP17-affected


>pacemaker (U)

DEP17-affected

>libmtp

DEP17-affected, probably harmless

>openafs (U)

DEP17-affected, probably harmless

>samba (U)

DEP17-affected, probably harmless

>apt

DEP17-affected, probably harmless

>ceph

DEP17-affected

>util-linux (U)

Do not upload to unstable directly. Will need mitigations.

>libapogee3

Do not upload to unstable directly. Will need mitigations.

>libfishcamp

Do not upload to unstable directly. Will need mitigations.

>libplayerone

Do not upload to unstable directly. Will need mitigations.

>libricohcamerasdk

Do not upload to unstable directly. Will need mitigations.

>libsbig

Do not upload to unstable directly. Will need mitigations.

>boinc

DEP17-affected

>gcc-13

If this affects libgcc-s1, do not upload to unstable as you risk
deleting libgcc_s.so.1. Will need mitigations in that case.

>libosmo-sccp

DEP17-affected, probably harmless

>libgpod

DEP17-affected, probably harmless

>libselinux

Do not upload to unstable directly. Will need mitigations.

>zfs-linux

Do not upload to unstable directly. Will need mitigations.

>libguestfs (U)

DEP17-affected, probably harmless

>libtirpc

Do not upload to unstable directly. Will need mitigations.

>fuse
>fuse3

Do not upload to unstable directly. Will need mitigations.

>audit

Do not upload to unstable directly. Will need mitigations.

>zlib

Do not upload to unstable. Will cause file loss unless mitigated.

>readline

Do not upload to unstable. Will cause file loss unless mitigated.

>openrc (U)

Do not upload to unstable directly. Will need mitigations.

>krb5

DEP17-affected

As predicted, this is going to have annoying interactions, and the list
here is definitely incomplete.

Helmut



Re: /usr-move: Do we support upgrades without apt?

2024-01-04 Thread Helmut Grohne
On Wed, Jan 03, 2024 at 08:07:53PM +0100, Wouter Verhelst wrote:
> Presumably the reason for this requirement in policy is that without it,
> debootstrap cannot function. That is, debootstrap first unpacks all
> Essential packages, without running any preinst or postinst scripts, and
> *then* runs all the maintainer scripts. If an Essential package would
> not function without its maintainer scripts being run, then debootstrap
> could fail halfway through.

The requirement you reference above probably is 3.8:

Essential is defined as the minimal set of functionality that must
be available and usable on the system at all times, even when
packages are in the “Unpacked” state.

I note that this does not apply to bootstrap as is later clarified:

Since dpkg will not prevent upgrading of other packages while an
essential package is in an unconfigured state, all essential
packages must supply all of their core functionality even when
unconfigured after being configured at least once.

The "at least once" was added precisely, because packages are not
required to work before having been configured at least once. What
happens during debootstrap is rather unspecified by policy. The
requirement really aims at upgrade scenarios where the other packages
are being configured when an essential package is unpacked but not yet
configured. This is precisely the situation we break here (if using dpkg
directly in unfortunate ways).

> Running debootstrap cannot trigger the issue, because it does not
> involve upgrades; and I do not believe that apt will special-case
> Essential packages other than that it refuses to remove them unless
> the user enters The Phrase[1], so we can consider that if it's something
> that would work for a regular package, it will work for an Essential
> one, too.

I agree: the file loss cannot be encountered with bootstrapping tools,
and as long as we are interacting via apt (or some apt-using tool) and
there is no mutual conflict, we cannot create the broken situation
(there actually is no proof of this, just hope and having tried to break
it).

> Perhaps if the above assumptions are correct, policy should be updated
> such that the requirement is relaxed to only apply for initial
> installation?

Policy has been updated via #1020267 to *not* apply to the bootstrapping
scenario.

Helmut



Re: /usr-move: Do we support upgrades without apt?

2024-01-03 Thread Helmut Grohne
Thanks for the feedback. Given the replies, I conclude that most people
expect upgrades to be performed with apt (or some apt-using tool).
Upgrades using dpkg directly are at least partially unsupported. In
more detail:

On Thu, Dec 21, 2023 at 10:41:57AM +0100, Helmut Grohne wrote:
> ## Options (combinations possible)
> 
> When mitigating P3, we can avoid the mutual conflicts. For molly-guard
> that has been more involved, but it seems manageable. For other
> packages (that do not need to access diverted files), it becomes
> simpler.

We'll be doing this. It is implemented in molly-guard and submitted for
gzip #1059533 / zutils #1059534. Hence, upgrades with apt-dependent
tools will not experience the failure mode.

> We can restore lost files in a postinst. For this to work, we must
> duplicate (e.g. hard link) affected files in the data.tar.
> Example: #1057220 (systemd-sysv upgrade file loss)
> Note that this approach is not policy compliant for essential packages
> as they must work when unpacked and this is relevant for gzip being
> diverted by zutils for instance.

We'll be doing this anyway. It is implemented in systemd-sysv.postinst
and proposed in the gzip patch above. Yes, we are technically violating
policy for gzip then, but I don't really see a technical way not to
violate policy. I expect that we do not consider fixing this (unfixable)
policy violation release-critical.

> We can introduce "barrier" packages (one or more) and have them enforce
> conflicting packages removed before the conflictor being unpacked
> (thanks Julian).

We'll keep this as an option for later, but avoid implementing it now.

> We can - and this is the crux of the matter - argue that upgrading with
> bare dpkg is unsupported and you get to keep the pieces if you do so
> anyway.

release-notes already recommend upgrading with apt. In addition we'll:
 * Extend release-notes to do advise something like `dpkg --verify` post
   upgrade.
 * Mitigate file loss in postinst (such that it becomes temporary).

If you have any objections to these choices, please tell.

Helmut



Re: /usr-move: Do we support upgrades without apt?

2023-12-22 Thread Helmut Grohne
Hi Matthew,

On Thu, Dec 21, 2023 at 02:42:56PM +, Matthew Vernon wrote:
> On 21/12/2023 09:41, Helmut Grohne wrote:
> 
> > Is it ok to call upgrade scenarios failures that cannot be reproduced
> > using apt unsupported until we no longer deal with aliasing?

Let me thank David for clarifying what "using apt" means in exactly the
way I intended it.

As a result, I think the only "no" reply I've seen thus far is from
Matthew here.

> I incline towards "no"; if an upgrade has failed part-way (as does happen),
> people may then reasonably use dpkg directly to try and un-wedge the upgrade
> (e.g. to try and configure some part-installed packages, or try installing
> some already-downloaded packages).

I'm inclined to agree that the scenario you depict can reasonably
happen. I also think that David made a good case for it being unlikely
to manoeuvre oneself into the buggy situation that way. And then the
consequence is that you lost some possibly important files. If you ended
up fiddling with dpkg in a failed upgrade, would it be too much to ask
to run dpkg --verify? In the event you see missing files, you may
reinstall the affected packages and thus cure the symptoms on your
installation.

Say we extended the release-notes to say that you should run dpkg
--verify after the upgrade (more so if you happened to use dpkg directly
in the process) and review the output. Would that address your concern?
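The post-upgrade check could boil down to filtering the verifier output
for missing files. The sketch below imitates the format of `dpkg
--verify` output (a leading "missing" keyword marks files dpkg cannot
find); the sample lines and the reinstall hint are illustrative, not
output from a real system.

```shell
#!/bin/sh
# On a real system, pipe `dpkg --verify` into this filter instead of
# using the hardcoded sample.
sample='??5??????   c /etc/default/foo
missing     /usr/sbin/halt'
lost=$(printf '%s\n' "$sample" | awk '$1 == "missing" { print $2 }')
echo "lost files: $lost"
# Affected files can then be cured by reinstalling the owning package,
# e.g. (hypothetical): apt install --reinstall molly-guard
```

Anything the filter prints would warrant reinstalling the package that
owns the file.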

> It may be that the mitigations necessary are worse than the risk, but I
> think the behaviour as described in #1058937 is definitely buggy.

I hope we all agree this is buggy. That's not the question. The question
at hand is whether this is a bug worth fixing or mitigating. We face a
lot of bugs in Debian and assign different severities. Here, the
preliminary analysis assigned a rc-severity which generally means it is
worth fixing. That's the thing I'm questioning here.

Also keep in mind that probably the majority of bullseye -> bookworm
upgrades have been performed already. In all those upgrades, nobody ran
into the issue and reported it. As David pointed out, it was encountered
by actively trying to make it break. It's the silent kind of failure, so
it may just have happened without people noticing.

Maybe we can all run dpkg --verify on our installations (in particular
those upgraded to bookworm or later) and report if they show anything
suspicious. Then we can better quantify how likely these issues happen
in practice.

I note that dpkg --verify does not currently work with --path-exclude.
I'm not sure whether that's a bug. Being a user of --path-exclude, I
note that I ran dpkg --verify on 5 very different systems and didn't
spot unusual things. This is anecdotal evidence and cannot prove the
absence of problems though. I'd be very keen to see at least one user
reporting such problems in a real upgrade rather than me trying to find
problems.

Helmut



/usr-move: Do we support upgrades without apt?

2023-12-21 Thread Helmut Grohne
Hi,

this installment serves a dual purpose. Let me first give an update of
the status quo and then pose a consensus question on how we want to deal
with a particular problem.

I Cc d-release@l.d.o as upgrades are an integral part of releases.
I Cc d-ctte@l.d.o for advisory feedback with experience due to earlier
decisions on merged-/usr.

# Status

As I detailed earlier, diversions have been proving more difficult than
anticipated. I spent significant time on molly-guard to get to a working
mitigation and thanks to Francois and Daniel, all of the diverters of
/sbin/halt and others have been updated in experimental for wider
testing. This is looking promising and passing all testing that has been
performed thus far.

Meanwhile Chris Hofstaedler and kind folks in Cambridge worked a lot on
M-A:same shared file loss (DEP17 P7) and got us down to one
(reintroduced) issue.  Pending further reintroductions, this aspect is
done. Cool! I've since uploaded debhelper and dh_installudev will now
also install to /usr. udevdir in udev.pc has been changed in a NEW
upload to experimental as well and is expected to hit unstable before
too long (thanks Michael and Luca).

Earlier, I requested a pause of /usr merges. Since we have a better
understanding and solutions that seem to be working now, I am happy for
you to move stuff again more widely. For moves involving diversions in
any way, consider having me review your change ahead of upload.

At the time of this writing, there are 1237 source packages in unstable
that still ship something aliased. This is the number we need to get
down to 0 for trixie. Of these, 860 involve a systemd unit, and of
those, 761 only have systemd units aliased; many of these can be
converted by a no-change upload thanks to the changed debhelper and
systemd.pc behaviour.

# The problem with conflicts

The idea in DEP17 was to use Conflicts as a mitigation strategy in
agreement with a naive reading of Debian policy. As it turns out, that
doesn't exactly match reality (#1057199 debian-policy) and there are
situations where files can be lost despite Conflicts having been
declared. In theory, this subtlety should be irrelevant and
unobservable, but aliasing (which breaks dpkg's assumptions) makes this
observable.

We move a file from / to /usr in $PKGA.

AND one of

The file is also being moved to a different package (causing DEP17 P1).

OR

The file is being diverted (causing DEP17 P3).

AND

The mitigation involves declaring a Conflict for unpack ordering (i.e.
M7 for P1 or M18 for P3).

AND one of

The upgrade is being performed using a direct dpkg invocation
without apt in a way that unpacks the package declaring the conflict
before the conflicted package is removed. Example: #1058937 (Ben's
libnfsidmap1 bug)

OR

The involved packages declare a mutual conflict (or mutual conflicts
+ breaks) and therefore apt invokes dpkg as in the earlier point.
Example: An earlier version of the molly-guard mitigation declared
versioned Breaks for systemd-sysv.

This condition is complex, so let me try to break it down into something
simpler. We'll have somewhere between 20 and 100 instances of P1 + P3 I
guess and we aimed for mitigating most of them using Conflicts (i.e.
first two conditions). The horny part is the last one. It basically says
that as long as we only ever use apt and avoid mutual conflicts, the
issue is not practically observable.

That mutual conflict condition is delicate on its own. There are
basically two ways to trigger it. The way my molly-guard patch did it
was having two versioned Conflicts or Breaks declarations. I checked the
archive and there is no instance of any package combination doing this.
Hypothetically, another way to trigger this is an unversioned Conflicts
combined with a package that drops a Provides in a later version (thanks
David), but we haven't seen any practical instance and I haven't figured
out a good way to gauge this problem yet.

## Options (combinations possible)

When mitigating P1, we can opt for protective diversions (M8) instead of
Conflicts (M7), though that is more fragile.

When mitigating P3, we can avoid the mutual conflicts. For molly-guard
that has been more involved, but it seems manageable. For other
packages (that do not need to access diverted files), it becomes
simpler.

We can restore lost files in a postinst. For this to work, we must
duplicate (e.g. hard link) affected files in the data.tar.
Example: #1057220 (systemd-sysv upgrade file loss)
Note that this approach is not policy compliant for essential packages
as they must work when unpacked and this is relevant for gzip being
diverted by zutils for instance.
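The restore-in-postinst approach amounts to shipping a duplicate of each
at-risk file and re-linking it if dpkg lost the primary copy. The sketch
below demonstrates that logic with made-up file names under a scratch
directory; a real postinst would operate on / and the duplicate would be
a hard link shipped in data.tar, as described above.

```shell
#!/bin/sh
set -e
rm -rf demo-root
mkdir -p demo-root/usr/sbin demo-root/usr/share/foo
# The duplicate the package ships alongside the real file:
printf '#!/bin/sh\nexit 0\n' > demo-root/usr/share/foo/halt.dup
# Postinst logic: if the unpack-order issue ate the primary copy,
# restore it from the shipped duplicate (hard link, same content).
if [ ! -e demo-root/usr/sbin/halt ]; then
    ln demo-root/usr/share/foo/halt.dup demo-root/usr/sbin/halt
fi
if [ -e demo-root/usr/sbin/halt ]; then
    state=restored
fi
echo "$state"
```

The restoration is idempotent, so running it on an unaffected system
changes nothing.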

We can introduce "barrier" packages (one or more) and have them enforce
conflicting packages removed before the conflictor being unpacked
(thanks Julian).

We can - and this is the crux of the matter - argue that upgrading with
bare dpkg is unsupported and you get to keep the pieces if you do so
anyway.

##

Re: Pause /usr-merge moves

2023-12-04 Thread Helmut Grohne
Hi developers,

On Fri, Dec 01, 2023 at 10:04:12PM +0100, Helmut Grohne wrote:
> Before we go, let me express sincere thanks to so many people that
> helped me track this down. In particular, the input of David
> Kalnischkies, Guillem Jover and Julian Andres Klode was invaluable.

I got more feedback mainly from David Kalnischkies and Enrico Zini this
time. Thank you for wrapping your head around this!

> Julian Andres Klode proposes adding a "barrier package" that we may call
> usrmerge-support (or repurpose usr-is-merged). Affected Conflicts can be
> moved to the barrier package and the conflicting package would then
> express Pre-Depends on the barrier package. When the barrier's postinst
> runs, any conflicting package definitely has been removed and due to
> using Pre-Depends, the conflicting package definitely has not been
> unpacked yet.

David Kalnischkies made me aware that we need to treat unversioned
Conflicts separately from versioned Conflicts here. With unversioned
ones, we typically manage the provider of a virtual facility. The
barrier package approach can be used here, but it requires each provider
to have its own barrier package rather than one central barrier package.
Also when changing providers of a facility, apt will usually remove one
provider explicitly before installing another without the
--set-selections dance that triggers the problematic behaviour. I have
not seen any way of invoking apt yet that would cause the problematic
behaviour, but that can be due to me not trying hard enough.

For versioned conflicts (where the conflicted version generally is in
bookworm and not in trixie), a single central barrier package might do,
but we can also start with one binary package providing multiple
barriers and split it up later. For instance,
the files diverted by molly-guard would require a barrier package that
also handles bfh-container and progress-linux-container on one side and
systemd, finit, kexec-tools, runit and sysvinit on the other. Such
grouping can significantly reduce the number of barrier packages for the
versioned case without severely limiting the options to the solver.
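For illustration, a barrier might look like this in debian/control. All
package names and versions here are made up; the point is only the
shape of the relationships: the Conflicts moves to the barrier package,
and the real package Pre-Depends on it, so the conflicted package is
guaranteed to be removed before the real package is unpacked.

```
Package: usrmerge-support
Conflicts: molly-guard (<< 0.8-1~)

Package: systemd-sysv
Pre-Depends: usrmerge-support
```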

> Another option is duplicating affected files (e.g. using hard links) in
> the data.tar and then restoring lost files during postinst.

I note that in case of systemd (and finit and runit), all of the
affected files are symbolic links, so we can pull this off without
actually changing data.tar and just restoring the links in postinst. We
wouldn't actually need that many backup copies.  I'll look into drafting
a patch.

> Depending on what problem we are solving, we may also move to protective
> diversions (DEP17 M8).

If we have a choice between introducing a barrier package and protective
diversions, I think the protective diversions are the lesser evil. A big
chunk of the not-yet-conflicts can probably be done with protective
diversions instead. The thing that cannot be solved with protective
diversions is diverting an aliased location. We cannot fix diversions
with diversions unfortunately. Still that would lower the number
significantly.

It also is a little unclear how much effort we want to spend on avoiding
this kind of breakage. When using apt, we want things to work, but when
using dpkg directly and issuing dpkg --set-selections maybe that's used
rarely enough that we can point out the problems in release notes and
call it a day? In order to have apt not trigger this scenario, it seems
sufficient if molly-guard from sid were usable with (aliased)
systemd-sysv from bookworm (i.e. not having Breaks for bookworm's
systemd-sysv). Doing that allows apt to simply upgrade molly-guard
before systemd-sysv and then things would just work. You could still
reproduce the file loss if you poked hard enough, but we could call
those artificial cases then and ask people to reinstall affected
packages after having experienced the file loss. How do others feel
about reducing our support promise here?

The total number of cases is also still very much vague. Reasons:
 * Unversioned conflicts incur a barrier package per provider while
   versioned conflicts can be grouped. The grouping can be changed if we
   use virtual barrier packages.
 * We do not know in advance how many affected packages are restructured
   and when they are, we may choose to mitigate those cases with
   protective diversions rather than Conflicts.

I understand that this doesn't really answer Bastien's questions, but I
don't have better answers. At this time, I'm looking for more feedback
on what our preferred trade-off is, aiming to reach consensus on that
question. The list of issues (including trixie ones) was already
attached though.

Helmut



Pause /usr-merge moves

2023-12-01 Thread Helmut Grohne
Hi developers,

I have unfortunate news regarding /usr-merge. I uncovered yet another
problem that we haven't seen mentioned earlier. We do not yet know how
to deal with it and it may take some time to come up with a good
compromise. As a result, please pause further moves from / to /usr.
Exceptions:
 * With more uploads, more systemd units will move. While such moves may
   trigger the new problem, I expect that to be rare.
 * Continue fixing RC bugs, in particular those that are due to
   dh_installsystemd or systemd.pc having moved to /usr.
 * Continue applying DEP17P7 mitigations for udev rules. Patches for
   these have been sent by Christian Hofstaedler and a few people from
   the Cambridge miniconf. These are unrelated.

The rest of this mail is lots of funky details for those interested in
understanding what went wrong here. Others are encouraged to do
something more joyful :)

Before we go, let me express sincere thanks to so many people that
helped me track this down. In particular, the input of David
Kalnischkies, Guillem Jover and Julian Andres Klode was invaluable.

Fundamentally, Conflicts do not reliably prevent concurrent unpacking of
packages as policy §7.4 may suggest. I have reported this as #1057199.
Consequently, what we look at here is situations where Conflicts are
used to mitigate file loss in the face of aliasing changes. Debian
policy §6.6 is more precise and details that when unpacking a package,
conflicting packages may be deconfigured and removed after the unpack.
In theory, the difference should not be noticeable, because dpkg
accurately tracks ownership of files with respect to packages. Aliasing
changes this and can cause file loss. The situation arises when
installing or upgrading a package to a version that happens to be in
conflict with another package to be removed. A simple example is
upgrading a bookworm system with molly-guard and systemd-sysv to sid and
in the process deleting molly-guard. A similar issue happens when
upgrading a bookworm system with busybox-static to sid and in that
process installing busybox and thus removing busybox-static. The
situation is hard to come by, because apt tends to remove the package
that goes away early when it can. I have implemented a reproducer
without apt for systemd-sysv #1057220. There are also situations where
apt reproduces this available from the policy bug mentioned earlier. In
particular, when one package has versioned Conflicts for another and the
other has versioned Breaks for the former, this reproduces with apt.
This essentially breaks DEP17 proposed mitigations M7 and M18.
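The interaction can be simulated without dpkg. The following sketch uses
made-up paths and a plain symlink standing in for the /usr-merge alias; it
mimics the unpack-then-remove ordering permitted by policy §6.6 and shows
the resulting file loss:

```shell
# Simulate DEP17 P1 file loss. File names are hypothetical; a symlink
# plays the role of the merged-/usr alias.
set -eu
root=$(mktemp -d)
mkdir -p "$root/usr/lib"
ln -s usr/lib "$root/lib"       # /lib is an alias for /usr/lib

# The old package "a" shipped its file via the aliased path; dpkg
# records the file as /lib/f.
echo old > "$root/lib/f"

# The new conflicting package "b" is unpacked first (policy 6.6 allows
# removing the conflictor after the unpack). It ships /usr/lib/f.
echo new > "$root/usr/lib/f"

# Package "a" is removed afterwards: dpkg deletes its recorded path
# /lib/f, which through the alias is the very file "b" just unpacked.
rm "$root/lib/f"

[ -e "$root/usr/lib/f" ] || echo "file lost"   # prints "file lost"
```

Note that dpkg's per-path bookkeeping is not wrong at any step here; it
is the aliasing that makes two distinct recorded paths denote one file.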

I have also locally extended dumat to produce a report of affected
Conflicts and am attaching it to this mail. The only packages that have
not yet migrated and have this problem are systemd-sysv,
busybox/busybox-static and resolvconf and I have filed RC bugs for them.
There are other instances in trixie already.

I welcome ideas for solving these problems. Let me summarize those I
already am aware of.

Julian Andres Klode proposes adding a "barrier package" that we may call
usrmerge-support (or repurpose usr-is-merged). Affected Conflicts can be
moved to the barrier package and the conflicting package would then
express Pre-Depends on the barrier package. When the barrier's postinst
runs, any conflicting package definitely has been removed and due to
using Pre-Depends, the conflicting package definitely has not been
unpacked yet.
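In debian/control terms, the proposal might look roughly like this. The
package names, version bounds and the moved Conflicts are illustrative,
not an implementation:

```
# Barrier package that takes over the problematic Conflicts:
Package: usrmerge-support
Architecture: all
Conflicts: molly-guard (<< 1.13-1~)

# The package that previously declared the Conflicts instead
# pre-depends on the barrier:
Package: systemd-sysv
Pre-Depends: usrmerge-support
```

dpkg must fully configure usrmerge-support, and therefore remove
whatever it conflicts with, before it may even unpack systemd-sysv, so
the problematic unpack/remove overlap cannot occur.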

Another option is duplicating affected files (e.g. using hard links) in
the data.tar and then restoring lost files during postinst.

Depending on what problem we are solving, we may also move to protective
diversions (DEP17 M8).

It also is not clear how easy it is to reproduce this bug class in an
actual upgrade. It took a long time to find this issue for a reason. Depending
on what files go missing, we may get away with asking users to dpkg
--audit and then apt reinstall affected packages.

That barrier package approach sounds relatively promising to me, but
there is no implementation of that approach as of this writing.

If you want to support finding a solution, please contribute to this
email thread or join #debian-usrmerge on OFTC.

Helmut


ineffective_conflicts.yaml
Description: application/yaml


Re: DEP17 - /usr-merge - what has happened - what will happen - what you can do to help

2023-11-16 Thread Helmut Grohne
Hi Holger,

On Thu, Nov 16, 2023 at 01:22:05PM +, Holger Levsen wrote:
> feel free to reply in public (incl. quoting me). or reply in private. :)
> (well, or don't reply though that would make me a bit sad. :)

I think your question is relevant to others.

> On Thu, Nov 16, 2023 at 11:27:36AM +0100, Helmut Grohne wrote:
> > I now change focus away from systemd units towards the essential set.
> > This is a tricky affair as it risks breaking bootstrap and the
> > debian-installer. As such, I intend to send patches for affected
> > packages and have started doing so already. There are quite a few more
> > patches to come. 
> 
> I don't understand, the essential set involves 23 source packages,
> why do you expect "quite a few" patches and not a certain amount?
> the next paragraph also seems to suggest you're talking about
> a "bigger essential set" than what I have in mind.

What I actually meant was the set of packages used by debootstrap, but I
wrote essential. In essence, this is "Priority: required". I'm not sure
about "Priority: important" yet. debootstrap seems to reliably configure
all required packages before unpacking important packages and that may
be sufficient to be safe. Rule of thumb: If your package is in the
"Priority: required" set and has an aliased file, do expect me to send a
patch.

> > Within three months we need to reach the point where
> > essential is fully converted with the exception of those packages that
> > cannot be converted without breakage. At that point, I'll coordinate a
> > synchronized NMU of the remaining packages. Affected maintainers have
> > received a mail while ago.
> 
> a mail or a bug? is there a user tag?

A d-devel thread Cced to all relevant maintainers.
https://lists.debian.org/20230912181509.ga2588...@subdivi.de
We're talking about:
 * base-files
 * bash
 * coreutils maybe
 * dash
 * libc6
 * util-linux

> many thanks for all your work on this!

You are welcome.

Helmut



DEP17 - /usr-merge - what has happened - what will happen - what you can do to help

2023-11-16 Thread Helmut Grohne
Hello developers,

yeah, I know, this is annoying to many. Still I hope that we can close
this chapter by trixie with your help.

# What has happened?

Since the unstable buildds have been updated to be merged-/usr, the file
move moratorium has been officially delegated to
https://wiki.debian.org/UsrMerge and in that course has been reduced.

The debootstrap uploads to bullseye p-u and bookworm p-u have been
reviewed, accepted, and will be part of the next stable point release.

usr-is-merged now enforces a merged layout in trixie and unstable. If
you are faced with failures from debootstrap consider updating your
debootstrap from bullseye-updates or bookworm-updates.

dh_movetousr and dh-sequence-movetousr are now available and can be used
to convert packages.

dh_installsystemd and dh_installinit now install units to /usr. This
renders 11 packages rc-buggy and patches for all instances have been
filed. On the flip side, more than 500 source packages will complete the
transition in their next upload or binNMU.

While systemd.pc still points to the unit directory below /lib, the change
is prepared and patches have been filed for all resulting bugs. The
change is still being deferred, because it would cause 19 rc bugs as of
this writing.

systemd (255) has removed support for the split layout in unstable.

I have sent patches moving shared libraries in essential from /lib to
/usr/lib.

# What will happen next?

I have spent quite some effort on ensuring that most of systemd units
would move with little impact. As a result, I hope that we can resolve
more than half of them using no-change uploads or binNMUs. These will
not be scheduled now, because there is little urgency yet and your
regular uploads will make such binNMUs unnecessary in many cases.

I now change focus away from systemd units towards the essential set.
This is a tricky affair as it risks breaking bootstrap and the
debian-installer. As such, I intend to send patches for affected
packages and have started doing so already. There are quite a few more
patches to come. Within three months we need to reach the point where
essential is fully converted with the exception of those packages that
cannot be converted without breakage. At that point, I'll coordinate a
synchronized NMU of the remaining packages. Affected maintainers have
received a mail a while ago. And then we hopefully return to the simpler
pre-/usr-merge bootstrap protocol where packages describe what makes up
Debian. I hope to have this completed by the end of March 2024.

My next focus will be difficult cases. There are two problem categories
that I already know will require non-trivial patches. One is udev rules
in Multi-Arch: same packages and the other is diversions. These probably
all need patches and extensive testing.

What remains is converting the remaining packages and we ideally get
that done by trixie. This is the point where we consider binNMUs for
systemd units. As we progress, we'll also encounter more and more
problems caused by concurrent file move and restructuring (DEP17 P1) and
will get a better understanding of how the approach using Conflicts
impacts upgrades from bookworm to trixie. Possibly, we'll have to
convert some of those Conflicts (DEP17 M7) into protective diversions
(DEP17 M8) in order to unbreak upgrades.

As much as I'd like to say we're done then, there will still be a number
of tasks remaining. The release-notes will likely need an update.
External repositories need help adapting to Debian's changes.
Derivatives will need help setting up their own monitoring. We'll also
notice some broken pieces. For most of us though, problems should fade
away.

# What you can do to help

Please do apply /usr-merge related patches in a timely manner. I try to
give you time by sending them in a useful order and leaving at least
two weeks before they become urgent.

Please move files from / to /usr yourself.
https://wiki.debian.org/UsrMerge has instructions on whether your
package is eligible and what to do. I will not be able to send patches
for each and every package: neither funding nor the hours in my day
suffice. Getting this done must be a collective effort.

Please upload restructuring changes to experimental. If you rename
binary packages (e.g. adding a "t64" suffix for the 2038 transition) or
move a file from one package to another, please upload to experimental.
This advice is valid for the entire trixie cycle. In doing so, dumat is
enabled to spot /usr-merge related problems in your package and report
them to you rather than you having to check for them. Alternatively,
run[1] dumat locally before upload.

If you want to support this effort beyond your own packages, help is
appreciated with writing patches. Especially the ones for udev rules are
relatively mechanical and just need someone to do them. I'm happy to
get you started and to review your patches. Consider stopping by in
OFTC#debian-usrmerge.


I hope that this all makes sense to you. In case it does no

Re: New Essential package procps-base

2023-11-14 Thread Helmut Grohne
Hi Craig,

On Tue, Nov 14, 2023 at 05:29:01PM +1100, Craig Small wrote:
> Hello,
>   For quite some time (since 2006!) there has been a discussion at[1] about
> changing from the sysvinit-utils version of pidof to the procps one. A
> quick scan of the various distributions shows that only Debian and Ubuntu
> (and I assume most other downstreams) use the sysvinit-utils version.
> 
> So to rehash some old drafts, here's the proposal.
> 
> What:
> Create a new package procps-base. This uses the existing procps source
> package and just enable building of pidof. procps-base will be an Essential
> package and only contain pidof.

I welcome the effort in general. Like Andreas, I question whether having
pidof remain essential is useful. A quick codesearch
https://codesearch.debian.net/search?q=%5Cbpidof%5Cb&literal=0 suggests
that we have fewer than 500 source packages that even mention it. Many
uses are in test suites or documentation, so the final number will be
lower still.

If we agree that pidof should not be essential, the next question is
whether we need that procps vs procps-base split. Andreas suggests "no".
I don't have a strong opinion on that one.

Let me suggest an alternative transition plan. We extend sysvinit-utils
with a new virtual package "pidof". Then we MBF packages using pidof to
add a dependency on pidof. Once a significant portion of those bugs is
fixed, we move pidof out of sysvinit-utils and have it drop that virtual
package. procps or procps-base can then add pidof (with Breaks+Replaces
for sysvinit-utils and Provides: pidof) moving it out of the essential
set in the process. Any remaining bugs would be bumped to rc-severity at
that point.
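Sketched in debian/control terms, with all version numbers as
placeholders, the end state might look like this:

```
# Step 1: sysvinit-utils advertises the virtual package while still
# shipping /bin/pidof:
Package: sysvinit-utils
Essential: yes
Provides: pidof

# Step 2 (after the MBF): the procps side takes over the file:
Package: procps-base
Provides: pidof
Breaks: sysvinit-utils (<< 3.08-1~)
Replaces: sysvinit-utils (<< 3.08-1~)
```

Packages bugged in the MBF would declare "Depends: pidof", which either
provider satisfies throughout the transition.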

> Why:
> This would bring the pidof variant in line with other distributions.
> sysvinit-utils would no longer need to be Essential (though that's a
> separate issue) and would only have init-d-script, fstab-decode, and
> killall5.

I fear sysvinit-utils being essential is not separate (see below). It
really needs to be done together, so additionally there would have to be
another MBF for those other tools asking to add dependencies.

> The majority of usage of pidof is in init or pre/post scripts, which really
> should be using the LSB pidofproc function. That function in turn
> optionally uses pidof if the pidfile parameter is not given. That's
> probably a way forward for sometime in the future to not need procps-base
> Essential, but it is a way off.

For as long as sysvinit-utils contains /lib/lsb/init-functions, it'll
have to include pidof or depend on it. Therefore the pidof provider can
only become non-essential once sysvinit-utils is non-essential. If you
see the change in implementation as more urgent than making all of it
non-essential, then procps-base is needed indeed.

> sysvinit-utils requires only libc6 while procps-base require libproc-2 but
> this is the same library used for the ps,top,w etc tools which are
> installed on most systems.

Yeah, please don't increase the essential set. The addition would be
very unwelcome to embedded systems.

So in essence, you asked for changing the pidof implementation and
Andreas and I are trying to turn this into a much bigger quest of making
it non-essential. While these matters are related, they can be done
independently in principle, and if you do not want to take on the
non-essential part, I fear I see few alternatives to that procps-base
proposal.

Pulling procps-base into the essential set also adds it to the bootstrap
set. That also adds numactl to the bootstrap set. I'd rather not have it
grow if possible. Both are currently cross buildable, so it's not the
end of the world.

Helmut



techniques for moving systemd units from / to /usr and RFH

2023-10-16 Thread Helmut Grohne
Hi,

This also is part of the larger /usr-merge + DEP17 context, but it goes
more into the direction of brain storming and request for help, so if
you're short on time, you should probably skip this entirely.

To get us started, let me get some numbers. All of them concern
unstable.
* 1443 source packages that produce binary packages that install at
  least one file into an aliased directory. This is the total count of
  packages affected by the transition.
* 1035 source packages include a systemd unit. There are two reasons to
  focus on systemd units:
  + They constitute a quite large fraction of the problem.
  + We can move them in all circumstances already (including essential
    and udeb).
* At least 596 source packages are easy. We'll upload debhelper with a
  modified dh_installsystemd and friends, upload systemd with changed
  systemd.pc and then binNMU or no-change NMU these packages and they'll
  get their units moved. As of this writing, there are still 33 future
  FTBFS bugs short of moving forward.
* For 80 source packages, I have recorded bugs (either FTBFS or systemd
  unit conversion patches).
* That leaves around 400 source packages that don't just work. These
  are the packages of interest to this mail.

In most of these cases, we can take the short-cut: Wait for
dh_movetousr/dh-sequence-movetousr, enable it explicitly and be done.
This is easy and will be backportable. However there are a few
alternatives worth exploring:

Some configure scripts have --with-systemd-unit-dir or similar option
and we tend to pass the literal value `/lib/systemd/system` here.
Changing this to `$(pkgconf --variable=systemdsystemunitdir systemd)`
will make packages binNMUable. For some configure scripts that value is
also autodetected if passing `yes`. Beware that if your source packages
produce multiple binary packages, this method becomes annoying as you
also have to interpolate this path in `debian/*.install` files.
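A hypothetical debian/rules fragment for the pkgconf variant (the
configure option name varies per upstream, and this assumes systemd.pc
is available at build time):

```make
# Ask systemd.pc for the unit directory at build time, so a plain
# binNMU picks up a changed systemd.pc automatically.
SYSTEMD_UNIT_DIR := $(shell pkgconf --variable=systemdsystemunitdir systemd)

override_dh_auto_configure:
	dh_auto_configure -- --with-systemd-unit-dir=$(SYSTEMD_UNIT_DIR)
```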

Oftentimes the upstream source merely contains a unit and does not
install it. Packages tend to add `src/foo.service /lib/systemd/system`
to `debian/foo.install`. Again, we face the need for interpolation. I
see two possible alternatives to this method:
 * An `execute_before_dh_installsystemd` can copy the unit to the
   `debian` directory. Then `dh_installsystemd` will pick it up.
 * In 3.0 source packages, `debian/foo.service` can be a symbolic
   link pointing to the relevant upstream file. `dh_installsystemd` will
   resolve this link and install the actual file. Unfortunately,
   `debdiff` will also resolve the link and show a file addition.
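The first of these alternatives could be sketched in debian/rules as
follows (the unit name and upstream path are made up):

```make
# Copy the upstream unit into debian/ so that dh_installsystemd picks
# it up and installs it to whichever directory the current debhelper
# considers correct.
execute_before_dh_installsystemd:
	cp -v src/foo.service debian/foo.service
```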

While working on this, I also stumbled into another bug class. If a
package installs (via dh_auto_install or dh_install) a unit to
/lib/systemd/system and then also installs the same unit via
dh_installsystemd, that currently causes the latter to win. Once
dh_installsystemd installs to /usr, this will cause a policy violation,
because we then have a unit both in /lib/systemd/system and
/usr/lib/systemd/system. I filed patches for those cases I readily found
using codesearch, but I probably missed some. The reverse problem
probably exists with systemd.pc.
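A rough check for this bug class on a package's staging tree (e.g.
debian/tmp or an unpacked .deb; the function name is mine) might be:

```shell
# check_dup DIR: list units present both below DIR/lib/systemd/system
# and DIR/usr/lib/systemd/system, which becomes a policy violation once
# dh_installsystemd installs to /usr. Only meaningful on an unmerged
# staging tree, where lib is a real directory rather than a symlink.
check_dup() {
    for f in "$1"/lib/systemd/system/*; do
        [ -e "$f" ] || continue
        unit=${f##*/}
        [ -e "$1/usr/lib/systemd/system/$unit" ] && echo "$unit"
    done
    return 0
}
```

One would run something like `check_dup debian/tmp` after both
dh_auto_install and dh_installsystemd have placed their files.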

Given this, I plan to upload debhelper with dh_movetousr sooner rather
than later and defer the dh_installsystemd change until it causes fewer
RC bugs.

Questions:

Do you agree that avoiding dh_movetousr in favour of such alternatives
is worth spending time on? Especially when files other than systemd units
are involved, dh_movetousr is probably more robust.

Is that `debian/*.$UNIT` as a symlink approach reasonable? It has quite
some prior art (incomplete list follows), so I think it is:
 debug-me docker.io galera-4 input-remapper ipmiutil mender-connect
 osmo-ggsn osmo-msc pagure smartdns smartmontools sopel squid vblade
 vnstat writeboost xpra

What other ways do you see for making packages put their units to /usr
in trixie and / when backported to bookworm?

Is there anyone who'd help with sending patches for the missing
packages? If you send patches moving files from / to /usr by one of the
means, please usertag them helm...@debian.org + dep17m2 to avoid
duplication of work. I'm attaching a ddlist of packages needing manual
work that happen to not ship any other aliased files.

https://udd.debian.org/cgi-bin/bts-usertags.cgi?user=helmutg%40debian.org&tag=dep17m2

Helmut
A. Maitland Bottoms 
   libiio

Adam Borowski 
   ndctl

Adam Majer 
   openqa (U)

Adrian Alves 
   grokmirror (U)

Adrian Vondendriesch 
   corosync (U)
   fence-agents (U)
   patroni (U)
   sbd (U)

Alberto Bertogli 
   chasquid (U)
   dnss (U)

Alberto Garcia 
   filetea

Alessio Treglia 
   rtkit (U)

Alexander GQ Gerasiov 
   clickhouse

Alexander Sack 
   connman (U)

Alexandre Mestiashvili 
   gearmand

Alexandre Rossi 
   uwsgi (U)

Alexandre Viau 
   syncthing (U)

Alf Gaida 
   connman (U)

Alkis Georgopoulos 
   epoptes (U)

Aloïs Micard 
   syncthing (U)

Anders Waananen 
   nordugrid-arc (U)

Andrea Bolognan

Re: /usr-merge and DEP17 update: what happens next and how you can help

2023-10-09 Thread Helmut Grohne
Hi Andrea,

On Mon, Oct 09, 2023 at 02:10:27PM +0200, Andrea Bolognani wrote:
> For libvirt, the upstream build system actually installs systemd
> units under /usr/lib, and we move things around in debian/rules so
> that they end up under /lib in the Debian package:
> 
>   SRV_MONOLITHIC = libvirt-guests virtlogd virtlockd \
>libvirtd libvirtd-tcp libvirtd-tls virt-guest-shutdown
> 
>   set -e; for f in $(SRV_MONOLITHIC); do \
>   dh_install -p libvirt-daemon-system \
>  usr/lib/systemd/system/$${f}* \
>  lib/systemd/system/; \
>   done
> 
> I wouldn't be surprised if other packages did something similar.

This definitely is more common, yes.

> In this case, instead of throwing dh_movetousr into the mix, wouldn't
> it be more sensible to drop the rename part and just follow the
> upstream build system?

In the long run, I definitely agree. In the short term, there are
downsides.

> I guess this could theoretically be problematic for backports, as the
> dh_movetousr approach would guarantee that units still end up in /lib
> on bookworm and older but this wouldn't. On the other hand, hasn't
> systemd been able to load units both from /lib and /usr/lib for
> several releases now? So I would expect that to work somewhat
> transparently.

This is correct. systemd has handled both locations for a very long time.

> Am I missing something? I have to admit that, while I've tried to
> keep tabs on the discussion and all the great work you and other have
> been doing to push things forward, I never quite managed to fully
> absorb the problem space.

Yes, you are and what you are missing really is not obvious, so thanks
for asking!

For one thing, dh_installsystemd generates maintainer scripts for
restarting services. Before version 13.11.6, it did not recognize the
/usr location. If you were to backport such a package, bookworm's
debhelper would not generate the relevant maintainer scripts. You can
mitigate this by issuing "Build-Depends: debhelper (>= 13.11.6~)". Thus,
you'll be using a backported debhelper (unless the backporter carelessly
deletes this dependency).

For another, we have this generic file loss problem (DEP17 P1). If - in
addition to moving units to /usr - you also restructure your package
between bookworm and trixie (move units between binary packages), then
an upgrade scenario may delete those files even in the presence of
correct Breaks+Replaces. As long as you are sure that you do not rename
any binary packages nor move any units between packages from bookworm to
trixie, this won't apply. Such renames or moves are hard to predict
though.

So if you understand these limitations and are prepared to handle them
for backports, cleaning things up now is fine. If you are not, deferring
that cleanup until after trixie and using dh_movetousr in the interim,
may be the simpler option.

Helmut



/usr-merge and DEP17 update: what happens next and how you can help

2023-10-08 Thread Helmut Grohne
Hi,

Quite a bit has happened and we're more and more moving from discussion
into action. I'd like to use this opportunity to thank all the invisible
voices who've given me useful feedback. Your private messages, BoF
feedback, and other forms have reached me even if I did not answer all
of them individually. Please bear with one more mail that also is too
long. ;)

The file move moratorium issued by the CTTE still is in effect
technically. It is issued as a recommendation and that means, we
technically can violate it given reasons. I believe there are reasons
and hope we can formally lift it soon. A major blocker to lifting it has
been that buildd chroots have been unmerged, but those will hopefully be
merged as you read this, see
https://lists.debian.org/debian-devel/2023/10/msg00024.html.

On a practical level, just lifting the moratorium would break lots of
stuff. Instead, it needs to be lifted progressively. So the moratorium
will no longer be a CTTE moratorium, but a transition one, and will
cover fewer and fewer aspects over time. Its state shall be tracked on
https://wiki.debian.org/UsrMerge.

Since my last episode, I've focused on moving systemd units. The reason
is that they constitute a big chunk of the problem (if you judge it by
the number of affected packages) and moving them poses few technical
problems in the face of unmerged buildds and unmerged d-i. In
particular, debhelper supports generating maintainer scripts for units
installed to /usr since version 13.11.6. As for moving files, I propose
three concurrent ways:
 * If your units are stored as debian/*.service and others and installed
   by dh_installsystemd or dh_systemd_start, please just wait. I want to
   change these helpers, see
   https://salsa.debian.org/debian/debhelper/-/merge_requests/113. Most
   affected packages can be converted using a binNMU.
 * If your units are installed to the location given by
   `pkg-config --variable systemdsystemunitdir systemd`, please also
   wait. I want to change systemd.pc, see
   https://salsa.debian.org/systemd-team/systemd/-/merge_requests/218,
   but this currently causes too many FTBFS.
 * For many other cases, I propose leaving the upstream install layout
   as is and performing the conversion using a new debhelper component
   that will be called dh_movetousr and can be enabled by depending on
   dh-sequence-movetousr, see
   https://salsa.debian.org/debian/debhelper/-/merge_requests/112.
   Note that this helper intentionally is opt-in. It should not be
   blindly applied to all packages, but used where we understand that it
   does not cause breakage. In a number of cases, using it will require
   mitigations from DEP17 to be implemented as well.

These changes have two significant implications for affected packages
that you should be aware of. For one thing, all of these violate the
CTTE moratorium and when affected packages are restructured, they may
incur the file loss scenario (DEP17 P1) that the moratorium was meant to
prevent. Indeed, the upload of dhcpcd5 to bookworm accidentally did
that. To avoid this from causing file loss, please upload restructuring
changes (i.e. those where packages are renamed or files move between
packages) to experimental first and wait three days. In case of
problems, https://salsa.debian.org/helmutg/dumat shall automatically
file an RC bug for your package. At the time of this writing, I still
proof read every bug before filing, but the intention is to let this
proceed without human intervention.

The other aspect is that such moves may break backports. Therefore
backports should continue shipping their units in the same location as
the relevant base suite (most probably not /usr). To ease this,
backports of debhelper shall revert the change in installation location
for dh_installsystemd and dh_systemd_start and they shall also turn
dh_movetousr into a noop. Also the systemd.pc change shall not be
backported. So if you use one of these mechanisms, backports should just
work. However, if you change e.g. --prefix=/ to --prefix=/usr to move
all files, please pay attention to revert this for backports.

I know you love dd-lists, so I have prepared some:
 * nochange.ddlist lists 406 packages where a rebuild is sufficient.
   This is almost half of the source packages that ship files in aliased
   locations!
 * movetousr.ddlist lists packages that probably need to run
   dh_movetousr for some reason:
   + There are other files than systemd units shipped in aliased
     locations.
   + debian/*.install places units to /lib. In this case, consider
     installing units by placing them to debian/*.service if possible
     to benefit from the automatic conversion.
   + An upstream build system hard codes the systemd unit location to
     /lib.
   + ...

Once we got the ball rolling for systemd units, my next focus will be
transitively essential packages and risky corner cases. I shall be
sending patches for affected packages and move all files in the
es

[PATCH] schroot: created merged-/usr chroots for trixie and beyond

2023-10-07 Thread Helmut Grohne
The CTTE has ruled that from trixie onward, maintainers may rely on
systems being merged-/usr. This includes the build environment.
---
 modules/schroot/files/setup-dchroot | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Hi DSA,

would you consider applying the patch to dsa-puppet.git? It is meant to
convert trixie and newer chroots to merged-/usr. We also meant to
implement this change by updating debootstrap in bullseye and bookworm
via -pu, but the presence of --no-merged-usr spoiled this plan. As such
I propose moving forward with this.

d-devel Cced to let people know what's happening and to keep a public
record of this patch.

Helmut

diff --git a/modules/schroot/files/setup-dchroot 
b/modules/schroot/files/setup-dchroot
index d6e61f5a8..0f2330506 100755
--- a/modules/schroot/files/setup-dchroot
+++ b/modules/schroot/files/setup-dchroot
@@ -342,6 +342,16 @@ case "$suite" in
 ;;
 esac
 
+mergedusr=--merged-usr
+case "$suite" in
+  jessie|stretch|buster|bullseye|bookworm)
+mergedusr=--no-merged-usr
+;;
+  trusty|utopic|vivid|wily|xenial|yakkety|zesty|artful|bionic|cosmic)
+mergedusr=--no-merged-usr
+;;
+esac
+
 bindir=$(mktemp -d)
 cleanup+=("rm -r $bindir")
 cat > "$bindir/wget" << 'EOF'
@@ -357,7 +367,7 @@ PATH="$bindir:$PATH" \
 --include="$include" \
 --variant=buildd \
 --arch="$arch" \
---no-merged-usr \
+"$mergedusr" \
 "$suite" "$rootdir" "$mirror" "$script"
 echo "$tuple" > "$rootdir/etc/debian_chroot"
 echo "force-unsafe-io" > "$rootdir/etc/dpkg/dpkg.cfg.d/force-unsafe-io"
-- 
2.42.0




Re: debvm for autopkgtests with multiple host?

2023-09-29 Thread Helmut Grohne
Hi,

Quick followup given new insights.

On Sun, Sep 24, 2023 at 05:51:47PM +0200, Helmut Grohne wrote:
> Hi Johannes,
> 
> On Sun, Sep 24, 2023 at 10:27:37AM +0200, Johannes Schauer Marin Rodrigues wrote:
> > There is really not much magic. The core of it is to pass this to your
> > mmdebstrap or debvm-create invocation:
> > 
> > --setup-hook='for f in /etc/apt/sources.list /etc/apt/sources.list.d/* 
> > /etc/apt/preferences.d/*;
> >   do [ -e "$f" ] && { echo; sed "s| file://| copy://|" 
> > "$f"; } | tee "$1/$f" >&2; done'
> > --hook-dir=/usr/share/mmdebstrap/hooks/file-mirror-automount
> 
> This sounds simple, but reality is a little more elaborate.
> 
> For one thing, there also is
> /usr/share/mmdebstrap/hooks/copy-host-apt-sources-and-preferences. This
> hook directory is similar but subtly different from the above setup
> hook:
>  * It does not perform the translation of file:// uris into copy://uris.

The gist is that accessing file:// URIs from within the mmdebstrap
chroot won't work out of the box. One can either turn them into copy://
URIs or use the file-mirror-automount hook to issue bind mounts for
them. According to Johannes, the latter is to be considered more
reliable.
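The rewrite itself is a single sed substitution; here is a
self-contained sketch using a made-up sources.list line:

```shell
# Turn an apt file:// URI into a copy:// URI so apt copies the mirror
# contents into the chroot instead of expecting the path to exist there.
line='deb [trusted=yes] file:///srv/local-mirror unstable main'
rewritten=$(printf '%s\n' "$line" | sed 's| file://| copy://|')
echo "$rewritten"   # deb [trusted=yes] copy:///srv/local-mirror unstable main
```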

> What seems to work is this:
> 
> debvm-create
> --skip=usrmerge
>   ...
>   --
>   --hook-dir=/usr/share/mmdebstrap/hooks/file-mirror-automount
>   
> --hook-dir=/usr/share/mmdebstrap/hooks/copy-host-apt-sources-and-preferences
>   --hook-dir=/usr/share/mmdebstrap/hooks/maybe-merged-usr
>   ""
> 
> That final empty string supplies the apt sources. Does this sound about
> right? If yes, I'd like to add this as a non-flaky autopkgtest to debvm.

This is subtly wrong. The file-mirror-automount hook must come after
copy-host-apt-sources-and-preferences or it may miss URIs to mount and
it must come before maybe-merged-usr or it won't have done its job in
time. So rather use this pattern:

debvm-create
--skip=usrmerge
...
--
--hook-dir=/usr/share/mmdebstrap/hooks/copy-host-apt-sources-and-preferences
--hook-dir=/usr/share/mmdebstrap/hooks/file-mirror-automount
--hook-dir=/usr/share/mmdebstrap/hooks/maybe-merged-usr
""

The sbuild autopkgtest uses something roughly like this and debvm's
autopkgtests also now use this (and actually pass that way).

Beware of one thorny detail. When mmdebstrap fails resolving dependencies
(and that can happen during debci), it kills its process group as a
mechanism to get rid of its children. This works fine if your
autopkgtest does not have needs-root. If it does, this failure mode can
currently damage debci infrastructure (yes, really). So if you use this
together with needs-root, please also wrap it in "setsid -w" to keep
debci in a healthy state.
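The wrapping itself is trivial: `setsid -w` puts the child into its own
session (so a process-group kill issued by the child cannot reach the
test harness) while still waiting for it and forwarding its exit status.
A minimal sketch, with a harmless command standing in for the real
mmdebstrap/debvm invocation:

```shell
# Run the command in a separate session; -w makes setsid wait and
# return the child's exit status, so failures stay observable.
status=0
setsid -w sh -c 'exit 7' || status=$?
echo "exit status: $status"   # exit status: 7
```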

And with these instructions, I think I've also resolved #1036919!

Helmut



Re: debvm for autopkgtests with multiple host?

2023-09-24 Thread Helmut Grohne
Hi Johannes,

On Sun, Sep 24, 2023 at 10:27:37AM +0200, Johannes Schauer Marin Rodrigues wrote:
> There is really not much magic. The core of it is to pass this to your
> mmdebstrap or debvm-create invocation:
> 
> --setup-hook='for f in /etc/apt/sources.list /etc/apt/sources.list.d/* 
> /etc/apt/preferences.d/*;
>   do [ -e "$f" ] && { echo; sed "s| file://| copy://|" "$f"; 
> } | tee "$1/$f" >&2; done'
> --hook-dir=/usr/share/mmdebstrap/hooks/file-mirror-automount

This sounds simple, but reality is a little more elaborate.

For one thing, there also is
/usr/share/mmdebstrap/hooks/copy-host-apt-sources-and-preferences. This
hook directory is similar but subtly different from the above setup
hook:
 * It does not perform the translation of file:// URIs into copy:// URIs.
 * It is more accurate in terms of following non-standard locations for
   the various configuration items.
 * Neither of these clear the sources.list created by mmdebstrap by
   default.
 * The latter one verifies that you have the same package versions
   inside and outside.

Did I accurately represent the differences? Which one would you prefer
in which situation?

Adding any of this to debvm-create will not just work, because
debvm-create also adds the maybe-merged-usr hook and any pass-through
arguments you add come later. Therefore the maybe-merged-usr hook would
come before this hook and it fails if you pass an empty sources.list,
which would be most useful. As a workaround, you may add

--skip=usrmerge -- --hook-dir=/usr/share/mmdebstrap/hooks/maybe-merged-usr

to the debvm-create invocation to reorder the hooks to actually work.

What seems to work is this:

debvm-create
--skip=usrmerge
...
--
--hook-dir=/usr/share/mmdebstrap/hooks/file-mirror-automount

--hook-dir=/usr/share/mmdebstrap/hooks/copy-host-apt-sources-and-preferences
--hook-dir=/usr/share/mmdebstrap/hooks/maybe-merged-usr
""

That final empty string supplies the apt sources. Does this sound about
right? If yes, I'd like to add this as a non-flaky autopkgtest to debvm.

Helmut



Re: debvm for autopkgtests with multiple host?

2023-09-23 Thread Helmut Grohne
Hi Ian,

On Sat, Sep 23, 2023 at 11:19:27AM +0100, Ian Jackson wrote:
> To summarise that discussion: at that time the best available solution
> that worked in ci.d.n seemed to be to write an ad-hoc script to run
> the tests in qemu; three packages had done that, each separately, with
> complex scripts with many moving parts.

In principle, debvm is supposed to target that particular use case.
There are two limitations that currently make this infeasible.

> I saw debvm, and wondered if it was suitable for this purpose.
> But, then I looked at its debian/test/control and I see that the tests
> are marked as flaky.[2]  So maybe it isn't reliable enough :-/.

The reliability of tests is ok. The reason for marking them flaky is
that they currently test the "wrong" packages. ci.d.n sets up chroots in
a delicate way to combine particular packages to see which combinations
cause breakage. Then debvm just creates an unstable system and tests
that. In effect, it currently tests unstable (inside those virtual
machines) rather than what it is supposed to be testing.

Johannes solved this problem on the mmdebstrap side and mmdebstrap's
tests no longer are flaky in this way. Therefore this should be solvable
on the debvm side. I just haven't gotten to figuring out the right runes
thus far. Roughly speaking, the host's apt configuration, pinning and
sources.list entries should be used inside the created virtual machine.

> I have other questions too, particularly to do with the way I would
> need autopkgtest to be able to influence package selection in the
> nested testbeds.

Exactly. That's currently the missing piece to remove the flakiness
annotation.

There is another practical problem. None of the autopkgtest nodes
support kvm. Emulation will always use tcg. For one thing, tcg is slow.
It can be so slow on some architectures that RCU becomes unhappy as its
grace periods become too long. For another, tcg is buggy. It has
emulation bugs even on release architectures that make some expected
functionality fail. For instance, gdb reliably segfaults when run in
s390x tcg emulation.

That kvm aspect kinda seems like an unresolvable blocker. While most
autopkgtest machines are physical machines, they use kvm for running the
actual autopkgtest nodes and then lxc for individual test isolation.
We'd have to use nested kvm here and somehow get it through lxc.

> Everyone else: has there been any other progress on the multi-node
> autopkgtest problem ?

Disregarding these two aspects, debvm should get you quite far. You
probably need to take network management into your own hands. I expect
that vde2 would be a good way to implement this in an unprivileged way.
Your debvm invocations may become a little longer that way, but that
should be fine given that you store all of that in scripts.
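A hedged sketch of what such a vde2 setup could look like. The vde_switch
options and qemu's vde network backend are standard; debvm-run's --image
option and the "--" pass-through of trailing qemu arguments are assumptions
to verify against debvm-run(1):

```shell
#!/bin/sh
# Sketch only: one shared vde2 switch, each VM attached to it through
# qemu's vde backend. Nothing here is executed against real VMs; the plan
# is printed so the commands can be reviewed.
sock=/tmp/testnet.ctl
plan="vde_switch --sock $sock --daemon
debvm-run --image node1.img -- -nic vde,sock=$sock
debvm-run --image node2.img -- -nic vde,sock=$sock"
echo "$plan"
```

Note that such a switch provides no DHCP; either run slirpvde alongside it
or configure static addresses inside the guests.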

Helmut



Re: [idea]: Switch default compression from "xz" to "zstd" for .deb packages

2023-09-18 Thread Helmut Grohne
Hi,

On Sat, Sep 16, 2023 at 10:31:20AM +0530, Hideki Yamane wrote:
>  Today I want to propose you to change default compression format in .deb,
>  {data,control}.tar."xz" to ."zst".
> 
>  I want to hear your thought about this.

I am not very enthusiastic about this idea. I skip over those arguments
already raised by others and add one that I haven't seen thus far. zstd
is quite optimized for 64bit CPUs and for amd64 in particular. amd64 is
the only architecture for which zstd provides a Huffman implementation
in assembly.

> ## More CPUs
> 
>  2012: ThinkPad L530 has Core i5-3320M (2 cores, 4 threads)
>  2023: ThinkPad L15 has Core i5-1335U (10 cores, 12 threads)
> 
>  
> https://www.cpubenchmark.net/compare/817vs5294/Intel-i5-3320M-vs-Intel-i5-1335U
>   - i5-3320M: single 1614, multicore 2654
>   - i5-1335U: single 3650, multicore 18076 points.

While the majority of CPUs in active deployments is amd64, I'd also like
to see numbers for 32bit CPUs and non-x86 ones. While I personally find
the trade-off by zstd fit for a number of my use cases, I was also
surprised just how slow it decompresses on armhf.

I found some arm board with some linux kernel package sized 36MB.

 algo     | compressed size | decompression time
----------+-----------------+--------------------
 xz       | 36MB            | 14.7s
 zstd     | 52MB            |  5.2s
 zstd -9  | 48MB            |  5.2s
 zstd -11 | 47MB            |  5.4s
 zstd -19 | 41MB            |  5.7s

Not as slow as I remembered apparently, but it still has a more than 10%
size overhead. The size ratio is consistent with Robert Edmonds's
numbers, but we no longer see that 10-fold speedup. And this did not
look at decompression memory requirements.
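The measurement is easy to repeat on whatever hardware is of interest. A
sketch using a synthetic compressible input (absolute numbers will differ,
which is exactly the point about armhf):

```shell
#!/bin/sh
# Compress the same input with xz and zstd and compare sizes; on the target
# hardware you would additionally wrap the decompression calls in time(1).
# Degrades to a skip notice when the compressors are not installed.
seq 1 200000 > sample.txt
if command -v xz >/dev/null 2>&1 && command -v zstd >/dev/null 2>&1; then
    xz -9 -c sample.txt > sample.xz
    zstd -19 -q sample.txt -o sample.zst
    ls -l sample.txt sample.xz sample.zst
    # On the target hardware, compare e.g.:
    #   time xz -dc sample.xz >/dev/null
    #   time zstd -dc sample.zst >/dev/null
    xz -dc sample.xz | cmp -s - sample.txt && \
        zstd -qdc sample.zst | cmp -s - sample.txt && outcome=verified
else
    outcome="compressors not installed; skipped"
fi
echo "$outcome"
```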

I am decompressing a *lot* of .debs (dedup.d.n, multiarch hinter,
crossqa.d.n, dumat). All of these applications would benefit from zstd
compressed .debs in terms of decompression speed. Yet, that has never
been the bottleneck to me. To me, download speed matters more and
swapping out a 1GBit link for a faster one isn't that easy.

I'd vote against this given the data we have now.

Can we defer the discussion until there are more convincing numbers?

Helmut



Re: /usr-merge and filesystem bootstrap

2023-09-17 Thread Helmut Grohne
Hi Aurelien,

On Fri, Sep 15, 2023 at 12:02:35AM +0200, Aurelien Jarno wrote:
> Answering for the glibc package.

Thanks.

> On 2023-09-12 20:15, Helmut Grohne wrote:
> > Once the Priority:required set only has that exception set left
> > unconverted, I will prepare patches for the entire exception set and
> > upload it coherently in one dinstall window.
> > 
> > That exception set is:
> >  * base-files
> >  * bash
> >  * coreutils maybe
> >  * dash
> >  * libc6
> >  * util-linux
> 
> Do you mean you plan to upload source+binaries for all the above
> packages and for all architectures? How do you plan to handle ports
> architectures? 

My initial idea was doing source-only uploads and letting buildds
perform all of the builds. Of course that leaves the possibility of
buildds producing their packages "late" for the next dinstall. If that
happens, debootstrapping unstable will fail on the relevant architecture.
That is unfortunate, but it does happen occasionally and I think it is a
reasonable risk to accept here. Once all relevant builds are done,
debootstrap will work again. There are a number of things I can do to
minimize the risk. For one thing, I can ask DSA when the cronjob for
updating buildd chroots happens and align the uploads closely after such
a cronjob. For another, I can coordinate with the buildd people and ask
for help (e.g. prioritizing builds) on their side. Then I can mail
d-devel and announce a concrete point in time asking developers to not
upload packages on that particular day (e.g. using the delayed queue) to
temporarily reduce the buildd load. Quite probably, debootstrap will
temporarily break on some architectures. I hope that this is acceptable
and that minimizing the downsides is good enough.

Does that answer your question?

> >  * Are you fine in principle with me NMUing your package after having
> >reviewed the promised patch?
> 
> Yes, with the condition that help is provided to fix the bugs resulting
> from moving files from / to /usr in the glibc packages.

It is sad to see that this no longer goes without saying. Yes, I will
actively look for possible fallout and allocate time for dealing with
it.

> >  * Do you readily see any flaw in the proposed transition already?
> 
> I haven't looked at the details besides the changes you described above.

Thank you. We'll get into details once there are patches.

Unfortunately, testing patches right now is difficult, because this work
depends on all other (required) packages having been converted, which in
turn is blocked on the buildds being /usr-merged. Hoping that this is less
work if done later, I've prioritized other matters for now and am reaching
out to you now as a means of informing and gathering consent.

There will certainly be a few more mails about this, and there will be
time for discussion after I have sent patches and provided details on how
I tested them.

Helmut



/usr-merge and filesystem bootstrap

2023-09-12 Thread Helmut Grohne
Dear maintainers of relevant essential packages,

this /usr-merge transition spanning multiple releases has reached a point
where consensus has formed about completing it by moving files
from / to /usr. The chosen approach also affects filesystem bootstrap
and an earlier discussion of this matter has resulted in consensus for
not changing the bootstrap protocol from the pre-/usr-merge state and
rather returning to that state. To this end, debootstrap has been
updated in trixie and unstable such that it performs the initial unpack
before performing the merge (mirroring how usrmerge does it on existing
systems). This change is being backported to bookworm and bullseye by
Simon McVittie. In order to remove the usrmerge package eventually, we
want base-files to install the aliasing symlinks via its data.tar.
Unfortunately, we cannot just do that, because doing so would break
debootstrap or cdebootstrap/mmdebstrap or both.

The way we get there is to first convert all packages from the
Priority:required set but some exceptions to install no files into
aliased locations such as /bin, /sbin or /lib*.  Getting there will
consume months. I hope we can reach this state in early 2024.

Once the Priority:required set only has that exception set left
unconverted, I will prepare patches for the entire exception set and
upload it coherently in one dinstall window.

That exception set is:
 * base-files
 * bash
 * coreutils maybe
 * dash
 * libc6
 * util-linux

base-files will install /bin, /sbin and /lib as symbolic links in its
data.tar. libc6 will install multilib /lib* where needed as symbolic
links in its data.tar:
 * /lib64: amd64, loong64, mips64el, ppc64, ppc64el
 * /libx32: x32
Since the relevant architectures share their libc soname, these packages
will remain multiarch co-installable. Multilib symlinks that are not
essential to a Debian architecture are not installed in a data.tar and
managed using maintainer scripts instead. For instance, /lib64 will not
be a symlink in libc6-amd64:i386, because /lib64 is not essential to
i386 nor is libc6-amd64:i386 essential to i386.

If we were to upload base-files before other packages, then a
bootstrapping tool could first unpack that other package thereby
creating e.g. /bin as a directory and then extracting the symbolic link
from base-files' data.tar would fail.
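This failure mode can be re-enacted in a scratch directory with nothing but
tar:

```shell
#!/bin/sh
# Re-enactment of the ordering constraint: once an unpack has created bin/
# as a non-empty directory, extracting an archive in which "bin" is a
# symlink fails, because tar refuses to replace a non-empty directory
# with a symlink.
mkdir -p demo/root/bin demo/src/usr/bin
touch demo/root/bin/sh        # some package already shipped /bin/sh
ln -s usr/bin demo/src/bin    # base-files wants /bin as an aliasing symlink
tar -C demo/src -cf demo/alias.tar bin
if tar -C demo/root -xf demo/alias.tar 2>/dev/null; then
    outcome="unexpectedly succeeded"
else
    outcome="failed as expected"
fi
echo "extraction $outcome"
```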

If we were to upload any other package from that set before base-files,
the aliasing links would be absent and merging via usrmerge.postinst
would not work due to missing the dynamic linker, /bin/sh, /bin/bash,
or /bin/cp. Running util-linux.postinst before usrmerge.postinst could
fail for the absence of /bin/more.

Therefore changes need to be uploaded concurrently. I intend to perform
these uploads in coordination with you. I request permission to NMU your
package for the purpose of completing the transition in this way.
Before actually performing such a NMU, I will prepare and send the
to-be-NMUed patches to all affected maintainers for review. The purpose
of having these NMUed is meeting the concurrency requirement. If you
insist on performing the upload yourself, you could arrange handing me a
signed .dsc.

These packages also need to migrate to testing together or we will
temporarily break bootstrappability of testing. We can either ensure
that by temporarily adding suitable Breaks or using special release team
powers.

I will coordinate a suitable time avoiding e.g. a glibc transition or a
time64 transition.

I will try to remove coreutils from the exception set by changing
usrmerge to no longer require a particular location of cp.

I request that affected maintainers reply to this mail:
 * Are you ok with the proposed changes in principle?
   + Moving all files from / to /usr leaving no files in aliased
 locations
   + Installing aliasing symbolic links in base-files and libc6
 * Are you fine in principle with me NMUing your package after having
   reviewed the promised patch?
 * Do you readily see any flaw in the proposed transition already?

Thanks for your cooperation

Helmut



Re: Enabling branch protection on amd64 and arm64

2023-08-31 Thread Helmut Grohne
Hi Guillem,

On Thu, Aug 31, 2023 at 02:12:51AM +0200, Guillem Jover wrote:
> So this happened, and Johannes reported that this seems to be breaking
> cross-building. :(
> 
> The problem, which is in fact not new, but is made way more evident
> now, is that the flags used are accepted only per arch, so when
> passing for example CFLAGS (the host ones) into CC_FOR_BUILD, then
> that one will not know about them and fail. (We have had this problem
> up to now as we set flags per arch as some are broken in some arches,
> but it being a problem depends on the host and build arches involved.)

I agree that the problem is not new. In general, stuff we compile with
the build architecture compiler is not installed into any .deb. We only
build such things for running them during build. Most flags do not
influence the behaviour of the resulting executables. time64 may be a
notable exception here. So for a lot of cases, you can just pretend that
for build tools you don't need any Debian-specified compiler flags. All
you need to do here is deleting the flags when invoking build tools.
I've hit such cases in the past and done just that.

> I'm thinking about uploading later today a workaround to disable these
> flags for now when cross-building. And then for the next release after
> that support for <flag>_FOR_BUILD which can then take into account
> these arch differences. I think some upstream code can already make
> use of these, but this might need going over packaging or upstream
> build systems to adapt/fix stuff. :/

I'd rather not. These have always been bugs and some of them have
patches or have been uploaded. I don't expect them to be that many. It's
rather few packages that use the build architecture compiler at all. Of
those, a portion manages to keep host flags out. Let's just fix the
others.

In case you really want to pass the correct flags, you may use

CFLAGS_FOR_BUILD=$(dpkg-architecture -a$(DEB_BUILD_ARCH) -f -c dpkg-buildflags --get CFLAGS)

Guillem pointed out that these are still affected by
DEB_<flag>_MAINT_<operation> and DEB_<flag>_<operation>, so we should eventually
rely on dpkg providing more support here, but I don't see the lack of
such support as a blocker here.

> And until that's done I don't think the workaround can be lifted,
> and cross-compiling will generate different binaries than
> non-cross-compiling. Another option would be to revert this change
> until we can add it safely, but that would also be unfortunate.
> OTOH, upstream code that uses stuff like CFLAGS with things like
> CC_FOR_BUILD are already broken in regards cross-building, so perhaps
> this can be an opportunity to flush them out?

Such cross vs native differences are very bad from my point of view,
because we have very little tools to detect them. It's an area where we
lack QA. Let's not make that worse.

In the grand scheme of things breaking cross builds, I think this is a
drop in the bucket.

Helmut



Re: /usr-merge status update + next steps

2023-08-22 Thread Helmut Grohne
Control: forwarded -1 
https://salsa.debian.org/debian/debhelper/-/merge_requests/108
Control: tags -1 + patch

On Sun, Aug 20, 2023 at 11:19:56PM +0200, Michael Biebl wrote:
> Related to that:
> dh_installsystemd (and the old, deprecated dh_systemd_enable) currently only
> consider systemd unit files that are installed to lib/

Thank you Michael and Niels (who privately pointed at the same thing).
This is the kind of review that I was hoping for.

> One could trick dh_installsystemd by running dh_usrmerge after
> dh_installsystemd, but this approach obviously doesn't work, if you change
> your package to build with --prefix=/usr, so the files are already in the
> canonical location when dh_installsystemd runs.
> 
> So this would need a corresponding change in dh_installsystemd. I guess for
> the time being, it would make sense if the tool looked in both paths, at
> least as long as the transition is ongoing.

You are spot-on. Even before we released bookworm, we had a group of
people (including Sebastian Ramacher and myself) advocating for doing
this change. As far as I understand the discussion, the main argument
against it was that it could encourage people to violate the moratorium.
In reality, our refusal to fix this earlier did cause "reverse
violations" of the moratorium where files previously shipped below
/usr/lib in bullseye would be moved to /lib in bookworm. That happened
to boinc-client, cfengine3, nvme-cli, podman, and powerman (see
https://subdivi.de/~helmut/dumat.yaml). So I argue that the reasoning
was wrong even back then.

Keep in mind that Niels clarified that he wasn't really objecting the
change, but didn't want to handle its fallout if any.

Speaking of fallout, we now have DEP17 and dumat which allow us to
quantitatively estimate what may break.
 * P1 is the main category we see here. This problem arises if we
   restructure packages and move files between / and /usr. Since we are
   rather early in the release cycle, not much restructuring has
   happened yet and all of the restructuring that would cause P1-style
   file loss, happened for bullseye->bookworm with nothing yet for
   bookworm->trixie (as of this writing). And since dumat.yaml is
   updated four times a day, we learn about such problems quickly.
 * P2 is a problem, but I've filed patches for all in-archive instances
   already. No key packages are affected, so we can upgrade those bugs
   to RC-severity when problems arise.
 * P3 has had one instance that Luca Boccassi removed before bookworm,
   so for systemd units, no P3 problems are left in trixie and beyond.
 * P4/P5/P6/P7/P8/P9/P10 do not apply.

If we were to lift the moratorium just for systemd units right now,
we're likely to run into P1 problems due to later package restructuring,
but there is little else that may go wrong. Due to these P1 problems, we
still have the moratorium and I have repeatedly argued for an opt-in
approach to moving files from / to /usr.

Let me also put this into numbers. Across all suites, we have around
2200 binary packages shipping files in aliased locations. If you
disregard systemd units, we're left with 1030 packages. In other words,
more than half of the binary packages shipping files in aliased
locations do so only via systemd units.

I recognize that various people have repeatedly asked me to consider an
opt-out approach and to look at these numbers. Thanks for your
persistence. Does that also convince others to treat systemd units
separate from the rest? It seems plausible to move systemd units in an
opt-out fashion while moving other files in an opt-in fashion. The main
benefit here is that we could use binNMUs to canonicalize 1/3 of the
archive. (This is less than half, because a number of packages shipping
systemd units are Arch:all.)

To me, the risks and cost savings for forcefully moving systemd units
bear a trade-off that is worth considering (despite me earlier having
argued otherwise). Unfortunately, evaluating risk is a subjective
process to some extent and I know that we have quite some disagreement
on how severe these risks are. How can we move forward here? In this
instance, I welcome +1 and -1 style responses and you may send them
directly to me if you want to save the list from such traffic.

In any case, I implemented the changes to debhelper to recognize units
in /usr. The change does not yet move units to /usr (as that is still
prohibited by the moratorium and I don't think we have consensus on that
aspect just yet). I am willing to handle the fallout of this change and
have implemented the dumat service to quickly diagnose such fallout.

Nevertheless, I welcome reviews of the debhelper MR referenced above.

Niels already replied to the MR. He'll not interact
(review/merge/upload) with the MR and authorized me to do those things
(provided I handle possible fallout). Thank you.

Helmut



/usr-merge status update + next steps

2023-08-19 Thread Helmut Grohne
Hi,

Yeah, I know we have too many /usr-merge discussions. Still, there is
reason to continue posting. My last summary/status was
https://lists.debian.org/20230712133438.ga935...@subdivi.de from July
12th and I'm giving an update of what happened since then here and
explain how I want to move forward.

# DEP17

I've updated the DEP17 MR at
https://salsa.debian.org/dep-team/deps/-/merge_requests/5 and continue
to provide a rendered version at https://subdivi.de/~helmut/dep17.html.
Notable changes:
 * We learned that in bookworm and beyond, the rootfs of the
   debian-installer is not /usr-merged. If we start moving files in
   packages, we'd likely break the installer. (This is DEP17-P10.) The
   kinda obvious mitigation is performing the merge there (DEP17-M22)
   and I've submitted a MR to that end.
   https://salsa.debian.org/installer-team/debian-installer/-/merge_requests/39
 * The heated discussion around debootstrap and the bootstrap protocol
   was tracked down to be rooted in a misunderstanding. The proposal
   changing debootstrap (DEP17-M19) has gained consensus and has been
   uploaded to unstable as debootstrap/1.0.130 by Luca Boccassi. Thanks.
   Please test this version of debootstrap and report bugs as we intend
   to backport this change and the change merging --variant=buildd in
   trixie and beyond to bookworm and bullseye.
 * The issue of losing empty directories (DEP17-P6) has been clarified
   to also lose such directories in plain upgrades when they are
   canonicalized. Therefore all empty directories in aliased locations
   are affected regardless of whether other packages ship files within.
 * The issue of losing empty directories (DEP17-P6) has formalized two
   known mitigations.
   M20: Restore empty directories in maintainer scripts
   M21: Add placeholder files
 * I have recorded the perceived consensus in a new "Proposal" section.

# Archive work

 * I currently manually file RC bugs for file conflict bugs that affect
   the /usr-merge. If you also file such bugs, please add the
   debian...@lists.debian.org usertag "fileconflict" and ensure that you
   set appropriate affects or assign the bug to multiple packages.
 * I have filed bugs with patches and MRs for deleting empty directories
   (DEP17-P6) that I was as unused. After all, a directory that is not
   installed, cannot be lost. In case of lib32lsan0 and libx32lsan0,
   this resulted in deletion of the entire package (thanks Matthias).
 * I have filed bugs for trigger interests on aliased locations
   (DEP17-P2) and asked maintainers to duplicate them (DEP17-M12).

# Continuous monitoring of problems

The dumat service introduced earlier continues to detect problems
related to /usr-merge. Its detection has been slightly improved and it
also associates reported bugs now.  If bugs are correctly assigned,
usertagged and have the right affects, dumat associates them and
displays them in the output file https://subdivi.de/~helmut/dumat.yaml.
At the time of this writing 40 of 111 issues are associated with bug
reports. MRs are not associated.

# Next steps

## Continuous MBF

I intend to implement automatic bug filing in dumat for some classes of
bugs. The bug association mentioned above has been implemented to avoid
filing duplicates. I intend to extend the service to automatically (with
no human intervention) file bugs for issues that are not yet associated
with a bug number. If you close such a bug without fixing the cause, a
new one will be filed. I intend to implement this one category at a time
and most of these categories will result in bugs of severity serious. In
general, no bugs will be filed for "risky" issues (52 at the time of
this writing) and bugs will only be filed for versions in unstable or
experimental. In order to limit possible damage, I intend to limit the
service to filing one bug per run (i.e. maximum 4 bugs per day). The
categories that shall receive automatic rc bugs are:
 * undeclared file conflicts (all existing ones filed manually)
 * P1: ineffective replaces (none at present due to the moratorium)
 * P2: ineffective trigger interest (all existing ones filed manually)
 * P3: ineffective diversions (none at present due to the moratorium)
 * P6: loss of empty directories (some filed)
 * P7: loss of m-a:same shared files (none at present due to the
   moratorium)

The first is usertagged under the qa umbrella:
https://udd.debian.org/cgi-bin/bts-usertags.cgi?user=debian-qa%40lists.debian.org&tag=fileconflict
The others will be usertagged with my email helm...@debian.org and tag
name dep17pN.

Why are undeclared file conflicts on this list? They typically are
mitigated with Breaks+Replaces or diversions. Both of these mechanisms
have potential breakage with aliasing. Therefore, an undeclared file
conflict may hide a problem related to /usr-merge.
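For reference, a declared file takeover conventionally carries matching
Breaks and Replaces in debian/control (package names and version are
illustrative); it is exactly this Replaces mechanism that aliasing can
render ineffective (DEP17-P1):

```
Package: foo
Architecture: any
Breaks: bar (<< 1.2-3~)
Replaces: bar (<< 1.2-3~)
Description: takes over a file formerly shipped by bar
```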

The empty directory loss issues are currently not filed at RC severity
even though they tend to be reproducible on bookworm. The moratorium
does not pre

Re: another problem class from /usr-merge [Re: Bug#1036920: systemd: please ship a placeholder in /usr/lib/modules-load.d/]

2023-08-15 Thread Helmut Grohne
Hi,

I fear we are not done with empty directory loss yet. This is a
technical update for future reference.

On Wed, May 31, 2023 at 11:59:58AM +0200, Helmut Grohne wrote:
> On Tue, May 30, 2023 at 11:53:00AM +0200, Helmut Grohne wrote:
> > In effect, this bug report is an instance of a bug class. I am in the
> > process of quantifying its effects, but I do not have useful numbers at
> > this time. As an initial gauge, I think it is about 2000 binary packages
> > that ship empty directories (which does not imply them to be affected,
> > rather this is to be seen as a grossly imprecise upper bound).
> 
> I did some more analysis work here and have to admit that I know my data
> model has a weakness that may result in false negatives. I'd have to do
> a complete reimport of packages and eventually will, so for now I'm
> dealing with incomplete data here. I note that content indices do not
> cover empty directories, so you really have to download loads of .debs
> to find these.

We're much further in understanding the problem now.
> Anyway, to gauge the problem, we're effectively looking for a
> combination of packages A and B such that:
> 
>  * A ships an empty directory.
>  * That empty directory is a path affected by aliasing (either in /usr
>or /).
>  * B also ships that directory (e.g. non-empty) in the "other"
>representation of that path.

This earlier representation is incomplete. Moving a file from / to /usr
(which currently is prohibited by the moratorium) as part of a simple
package upgrade also triggers the loss behaviour.

In any case, https://subdivi.de/~helmut/dumat.yaml now reports the
affected instances (both when multiple packages are involved and upgrade
scenarios) and it's not that many.

> So yeah, this bug class is clearly not one to panic about. As we move
> files from / to /usr, I expect this bug class to gain more occurrences. I
> am not aware of a generic solution and it seems diversions won't cut it.
> If you can propose any generic workaround or recipe for this situation,
> I'm all ears. The placeholder file sounds ugly, but might work.

We've progressed somewhat since. At this time, my favourite mitigation
is deleting affected directories and I've sent a number of patches for
applicable situations. Unfortunately, some empty directories exist with
reason and we need another mitigation for them.

The options seem to be as follows:

# M9

While dpkg does not allow diverting a directory, you can trick it into
doing that by temporarily moving the directory out of the way. So
diversions are an option here though they're really ugly.

# M17

A way to not lose these directories is to keep them in both the aliased
and the canonical location. While this works fairly reliably (until we attempt
to remove the aliased location), it is incompatible with other
mitigations such as shipping the aliasing symlinks in a package (M11).

# M20

We can also accept the temporary loss of these directories and restore
them using maintainer scripts. In upgrade scenarios, this is trivial as
the postinst script runs after the loss event. In the package removal
scenario, the package being removed can activate an interest-noawait
trigger to restore the directory.

I've implemented and attached a prototype for this approach.
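A minimal shape of that approach (directory name, trigger declaration and
argument handling are illustrative; see the attached test.sh for the
actual prototype):

```shell
#!/bin/sh
# Sketch of an M20-style postinst fragment: recreate a directory that
# canonicalization may have dropped. In a real package, dpkg supplies "$1"
# and a debian/foo.triggers file would declare e.g.
#     interest-noawait /lib/firmware/b43
# DPKG_ROOT defaults to a scratch directory here so the sketch is runnable.
root="${DPKG_ROOT:-demo-root}"
case "${1:-configure}" in
    configure|triggered)
        mkdir -p "$root/lib/firmware/b43"
        ;;
esac
echo "ensured $root/lib/firmware/b43 exists"
```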

# M21

The obvious way to avoid loss of empty directories is to make them
non-empty. Adding a placeholder file is sufficient here.


All of these mitigations can be selected per-occasion (though selecting
M17 once precludes a number of other ones). I don't think there is a
clear winner here. We can leave the choice of mitigation to the affected
maintainers.

Rough outline on a per-package basis:
 * cockpit-tests: deleted #1043322
 * firmware-b43-installer: /lib/firmware/b43 triggerless M20
 * firmware-b43legacy-installer: /lib/firmware/b43legacy triggerless M20
 * fwupd: delete #1041752
 * gretl: deleted #1041835
 * lib32lsan0 : deleted #1042482
 * libjte-dev: deleted #1041753
 * libmpeg3-dev: deleted #1041756
 * libswe-dev: delete #1041757
 * libx32lsan0: deleted #1042482
 * netplan-generator: multiple, TBD
 * openrc: TBD, maybe triggerless M20?
 * pcp: delete #1041754
 * pkg-config/pkgconf/pkgconf-bin: TBD, maybe M21?
 * printer-driver-foo2zjs: /lib/firmware/hp triggerless M20
 * python3-expeyes: deleted #1041755
 * systemd,udev: multiple, some deleted, others TBD

The takeaway here is that the majority of cases are handled via
deletion and we'll be finding solutions with the maintainers of those
four packages.

If you want to help mitigate these instances, you may add placeholder
files or delete empty directories at any time. Adding diversions,
maintainer scripts or triggers is something we need to consider at a
later stage.

Helmut


test.sh
Description: Bourne shell script


Re: Request for review of debootstrap change [was: Re: Second take at DEP17 - consensus call on /usr-merge matters]

2023-08-11 Thread Helmut Grohne
Hi Holger,

On Fri, Aug 11, 2023 at 09:28:51AM +, Holger Levsen wrote:
> On Fri, Aug 11, 2023 at 09:38:02AM +0100, Luca Boccassi wrote:
> > > This is implemented in
> > > https://salsa.debian.org/installer-team/debootstrap/-/merge_requests/96
>  
> what about cdebootstrap?

cdebootstrap (and mmdebstrap) never implemented a merging step[1] and to
this date rely on the usrmerge package doing it at postinst time. Once
base-files ships the aliasing symlinks, both will produce /usr-merged
trees without any modifications. The reason that we need a change to
debootstrap is that its current merging implementation breaks when
base-files ships aliasing symlinks.

So the main reason for doing this change to debootstrap is that it
enables us to continue supporting cdebootstrap and mmdebstrap without
any changes there.

Helmut

[1] For mmdebstrap there is a merged-usr hook that can do it. Johannes
will migrate it to the same post-merging approach I am proposing for
debootstrap here.



Request for review of debootstrap change [was: Re: Second take at DEP17 - consensus call on /usr-merge matters]

2023-08-10 Thread Helmut Grohne
Hi,

This is picking up on the debootstrap matter and is kinda crucial.

On Thu, Jul 13, 2023 at 01:31:04AM +0100, Luca Boccassi wrote:
> > After having sorted this out, what part of your safety concerns with 3C
> > do remain?
> 
> Nothing, as that stemmed from a misunderstanding of what the
> implementation would have required, and that's cleared now.

So we have finally resolved the misunderstanding with Luca, and I infer
that this also removes Sam's concern (as he was inheriting the
misunderstanding from Luca).

Let me briefly recap the most important pieces. The proposal at hand is
changing debootstrap in unstable, testing, stable and oldstable. Rather
than merging /usr before the initial unpack, it will merge after the
initial unpack but before running maintainer scripts. Therefore
base-files can ship aliasing symlinks without triggering tar errors from
debootstrap and once it does, the merging step in debootstrap
automatically becomes a noop. With this change in place, we can move
forward without changing either cdebootstrap or mmdebstrap.
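To make the ordering concrete, here is a rough, simplified sketch (not the actual MR code) of merging after the unpack; once base-files ships the aliasing symlinks, the `-L` test makes every step a no-op:

```shell
# Simplified sketch of post-unpack merging: move each aliased top-level
# directory under /usr and replace it with a symlink. Directories that
# are already symlinks (or absent) are skipped.
set -e
target=$(mktemp -d)            # stand-in for the debootstrap target dir
mkdir -p "$target/bin"
printf '#!/bin/sh\n' > "$target/bin/sh"   # stand-in for unpacked files
for d in bin sbin lib; do
    if [ -d "$target/$d" ] && [ ! -L "$target/$d" ]; then
        mkdir -p "$target/usr/$d"
        cp -a "$target/$d/." "$target/usr/$d/"
        rm -rf "$target/$d"
        ln -s "usr/$d" "$target/$d"
    fi
done
readlink "$target/bin"         # prints usr/bin
```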

This is implemented in
https://salsa.debian.org/installer-team/debootstrap/-/merge_requests/96
and reviewed by Luca Boccassi and Simon McVittie. Thank you two. I have
tested this change for bootstrapping buster, bullseye, bookworm and
trixie on amd64 without hitting regressions.

Do we have any more disagreement with this approach or implementation?
If you review the MR, don't hesitate to leave a positive or negative
comment on it. We want to make sure that this doesn't break stuff as its
exposure is high.

I intend to merge and NMU this change before too long and Simon McVittie
intends to prepare stable and oldstable uploads with this change and the
change to make --variant=buildd /usr-merged for trixie and beyond.
Having these changes in oldstable (and thus affecting buildds) is a
precondition for lifting the moratorium, so we'd like to move forward
soon.

Helmut



Re: /usr-merge: continuous archive analysis

2023-08-10 Thread Helmut Grohne
Hi Andreas,

On Sun, Aug 06, 2023 at 06:44:47PM +0200, Andreas Metzler wrote:
> Somehow related: If I introduce a new systemd unit should I work
> around dh_installsystemd and ship it in /usr/lib/systemd/system/?

Doing this is extra work now. If done correctly, it is compatible with
the file move moratorium. Some packages declare a trigger interest for
the aliased location and will have their triggers missed as you move to
/usr, but I've already filed bugs for all affected packages so this is
temporary at best. In general, I am in favour of this.

> At first glance it seemed like a good idea (not adding to the problem)
> but doubt there is real benefit. - Another binary package in the same
> source already ships a unit that will need to be moved so we will need
> to use $magic anyway. FWIW I would have used something like this:

I also agree with this with a little caveat. Quite a number of available
mitigations incur a cost per file. So by moving that secondary unit now,
you may be lucky and avoid a mitigation for it later.

> override_dh_installsystemd:
> dh_installsystemd
> mv debian/foo/lib/systemd/system \
> debian/foo/usr/lib/systemd/

Consider execute_after_dh_installsystemd. Other than that, this is the
way to go. If you were to move before dh_installsystemd you'd miss
maintainer scripts activating/starting your unit.
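Under that advice, the quoted fragment could be written as follows (sketch only; "foo" is a placeholder package name, and this assumes a debhelper new enough to support `execute_after_` hook targets):

```make
# debian/rules sketch: move the units after dh_installsystemd has run,
# so its maintainer-script snippets are generated against the unit.
execute_after_dh_installsystemd:
	mkdir -p debian/foo/usr/lib/systemd
	mv debian/foo/lib/systemd/system debian/foo/usr/lib/systemd/
```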

> (I am assuming dh_installsystemd would not start installing stuff into
> /usr/lib without a dh_compat bump.)

We don't have consensus on this yet, but I agree with you here. My
preferred way of implementing the merge in debhelper is adding a new
dh_usrmerge that would perform the merge. It would come with a sequence
addon "usrmerge" which would be enabled in a new compat level. Once the
moratorium is lifted, you can:
 * opt in: Explicitly call dh_usrmerge
 * opt in: Build-Depends: dh-sequence-usrmerge
 * opt out: Bump compat level and pass --without=usrmerge to dh

The downside of this approach (and why people disagree with it) is that
we need at least one upload with source changes for every affected
package. Yes, this does mean 2000 uploads.

This is not backed by code yet, but you may disagree with it already.

Helmut



Re: Potential MBF: packages failing to build twice in a row

2023-08-10 Thread Helmut Grohne
Hi Wookey,

On Wed, Aug 09, 2023 at 02:30:43PM +0100, Wookey wrote:
> I have never tried Helmut's suggestion of removing this stuff in the
> clean target. It does seem to me that removing it from the tarball
> makes a lot more sense than cleaning it later.

I do see all the advantages of repacking that you and Simon presented.
We don't have to argue about them. Simon also pointed at a severe
limitation though: When repacking, the upstream signature becomes
useless and external parties can no longer verify it at ease. Including
that upstream signature increases trust in the source shipped by Debian
being good.

For cases where we repack anyway (e.g. for licensing reasons), we have
broad consensus that we should also delete generated files at the
repacking stage. I also see a shift here where we may recommend
repacking just for deleting unused files in the absence of an upstream
signature. The arguments are convincing to me.

Does anyone see a way to enable upstream signature verification with
repacked sources? This seems technically incompatible: In order to
verify the signature, we really have to ship the original tar and thus
get into the licensing mess. So the best we might do here is point at
the original tar and signature (hoping that it does not go away) and
providing a tool that verifies the signature and establishes that the
repacked source really corresponds to the verified tar. Is anyone aware
of such tooling?
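For illustration, a hedged sketch of what such a tool could do: after verifying the upstream signature on the pristine tarball (skipped here), check that the repacked tarball is a strict subset of it, i.e. that it only deletes files. Tiny stand-in tarballs are constructed inline for demonstration.

```shell
set -e
work=$(mktemp -d)
mkdir "$work/orig"
printf 'source\n'    > "$work/orig/keep.c"
printf 'generated\n' > "$work/orig/gen.c"
tar -C "$work/orig" -czf "$work/orig.tar.gz" .
cp -r "$work/orig" "$work/strip" && rm "$work/strip/gen.c"
tar -C "$work/strip" -czf "$work/repack.tar.gz" .
# real use: gpg --verify orig.tar.gz.asc orig.tar.gz
mkdir "$work/a" "$work/b"
tar -C "$work/a" -xzf "$work/orig.tar.gz"
tar -C "$work/b" -xzf "$work/repack.tar.gz"
# every file present in the repack must be bit-identical to the original
status=subset-ok
for f in $(cd "$work/b" && find . -type f); do
    cmp -s "$work/b/$f" "$work/a/$f" || status=mismatch
done
echo "$status"   # prints subset-ok
```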

In the absence of such tooling, I continue to see clean-before-build as
a valid strategy for dealing with generated files and vendored sources.

Helmut



Re: Potential MBF: packages failing to build twice in a row

2023-08-08 Thread Helmut Grohne
On Sat, Aug 05, 2023 at 05:29:34PM +0100, Simon McVittie wrote:
> I think it's somewhat inevitable that code paths that aren't frequently
> exercised don't work. If a majority of maintainers are doing all of
> their builds with git-buildpackage, or dgit --clean=git, or something
> basically equivalent to one of those, then `debian/rules clean` will
> never actually be run against a built tree. For teams with a strongly
> preferred workflow (like the Perl, Python and GNOME teams consistently
> using git-buildpackage), this seems particularly likely.

As a minor data point, I also do not rely on `debian/rules clean` to
work for reproducing the original source tree, because too many packages
fail it.

Let me point out though that moving to git-based packaging is not the
property that is relevant here. I expect that most developers use either
sbuild or pbuilder for the majority of their builds. Both tools create a
.dsc, copy that .dsc into a chroot, unpack, build, and dispose of it. So
we effectively have at least three ways of cleaning source packages:

a) `debian/rules clean`
b) Some VCS (and that's probably just git)
c) Copy the source before build and dispose the entire copy

That last approach may be annoying for large source packages, but it
works reliably for the entire archive.
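Strategy (c) can be sketched in a few lines (file names invented):

```shell
# Copy the source, build in the copy, dispose of the copy. No clean
# target is needed at all.
set -e
src=$(mktemp -d)                       # stand-in for the unpacked source
printf 'int main(void){return 0;}\n' > "$src/hello.c"
work=$(mktemp -d)
cp -a "$src/." "$work/"                # copy the source...
( cd "$work" && touch hello.o hello )  # ...stand-in for the build...
rm -rf "$work"                         # ...and throw the copy away
ls -A "$src"                           # prints hello.c: source pristine
```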

> For me, the main purpose of `debian/rules clean` is being able to do
> incremental builds while debugging something - but if I want to do
> incremental builds, it's quite likely that I'll also be using
> `debuild -b -nc` to make the builds genuinely incremental (and then a fully
> clean build from first principles at the end, to verify that whatever issue
> I'm debugging is really fixed).

I see that the purpose of `debian/rules clean` is evolving and that we
should clarify which of the purposes we as a project consider important.
Given the state of discussion, I think we should drop the idea of using
it to construct a source package after build.

> One way to streamline dealing with these generated files would be
> to normalize repacking of upstream source releases to exclude them,
> and make it easier to have source packages that genuinely only contain
> what we consider to be source. At the moment, devref §6.8.8.2 strongly
> discourages repacking tarballs to exclude DFSG-but-unnecessary files
> (including generated files, as well as source/build files only needed on
> Windows or macOS or whatever[1]), and Lintian strongly encourages adding
> a +dfsg or +ds suffix to any repacked tarball, which makes it less
> straightforward to track upstream's versioning. Is it time for us to
> reconsider those recommendations?

With this you touch another purpose of `debian/rules clean`: Removing
generated files. Since we currently discourage repackaging and
`dpkg-source -b` is vaguely happy about deleted files, a common
technique for dealing with generated files is really shipping them in
the source tree and then deleting them via `debian/rules clean` while
relying on build tools (and our buildds do this) to clean before build.
From my point of view, this is the main purpose of the clean target at
this time.
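In debian/rules this pattern commonly looks like the following (illustrative file names; the point is that clean removes generated files that the build regenerates):

```make
override_dh_auto_clean:
	dh_auto_clean
	rm -f configure aclocal.m4
```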

Do others see this strategy of dealing with generated files as viable
and is it compatible with git-based workflows?

Are we ready to call for consensus on dropping the requirement that
`debian/rules clean; dpkg-source -b` shall work or is anyone interested
in sending lots of patches for this?

Helmut



Re: autodep8 test for C/C++ header

2023-08-08 Thread Helmut Grohne
Hi Sune,

On Tue, Aug 08, 2023 at 06:46:38AM -, Sune Vuorela wrote:
> I don't think this is a important problem that some headers might have
> special conditions for use. I'd rather have our developers spend time
> fixing other issues than satisfying this script.

A while ago, I've performed a similar analysis for Python and given my
experience there, I disagree with you here. As far as I understand both
you and Peter, you argue that such an autodep8 integration would fail for
some package for various (valid) reasons. What Benjamin seems to propose
is adding support for an automated check that is opt-in (not opt-out).
As a developer, you have to explicitly enable it in order to use it.
Given the numbers presented by Benjamin and the examples presented by
both Peter and you, I expect that Benjamin's script would just work for
at least half of the packages. And for those where it just works, I see
it as a useful safety measure.

> Is it a problem that Qt -dev packages also ships windows, mac or android
> specific addons headers? I don't think so. Rather the opposite. When
> doing cross platform work, it is nice that grepping across the includes,
> I also see some of the platformspecific functions in separate files.
> 
> Is it a problem that a header file is also shipped that provides
> integration with other-big-thing but 99% of developres/downstream users
> don't care about and no Depends is in place? I don't think that's a
> problem. I'd rather have the header available for the 1% than having to
> create an extra -dev package just for that.
> 
> Debian development resources is a finite resource, so let's not waste
> it.

This goes both ways. The team at Canonical is currently dealing with
lots of failures that are tangential to time64. Let's not waste their
time either. I'm experiencing a similar issue with my work on
/usr-merge. In order to complete that transition in a safe way, I need
file conflicts to be precisely declared, but people frequently introduce
new ones and some even argue about their severity. That's also "wasting"
my time.

So from my point of view, we need to strike a balance here. Benjamin is
offering an opt-in tool to reduce his wasted time. Having half of the
packages opt into it would already reduce QA work significantly, so
that sounds like a very good measure to me.

Can we agree on moving forward with this while not forcing it onto each
and every package?

Helmut



Re: /usr-merge: continuous archive analysis

2023-07-31 Thread Helmut Grohne
Hi Alexandre,

On Mon, Jul 31, 2023 at 01:37:12PM +0200, Alexandre Detiste wrote:
> [systemd-cron]: after a carefull review, I took a third option:
> these scriptlets belong neither in /lib nor /usr/lib but in /usr/libexec .
> 
> This is now implemented this way in the upstream repository.
> 
> The debian postinst will be adapted to
> remove the old override in a next upload.

This is halfway good and halfway bad.

In moving crontab_setgid from lib to libexec, you effectively evade the
moratorium and are entitled to also move from / to /usr. This is an
action you can do right now. The move from /lib to /usr/libexec prevents
the file loss scenario that spurred the moratorium.

In
https://github.com/systemd-cron/systemd-cron/commit/45e82678f62f523417b0c7f84d40ec7fcb1b864d,
you move the generators from / to /usr.  This action is prohibited by
the moratorium and will have to be temporarily reverted in the
packaging (or the upload of the upstream release needs to be delayed).

Helmut



New "fileconflict" usertag for debian...@lists.debian.org

2023-07-28 Thread Helmut Grohne
user debian...@lists.debian.org
# discodos
usertags 966115 + fileconflict
affects 966115 + mono-devel
# firebird-utils
usertags 1040321 + fileconflict
affects 1040321 + firebird3.0-server
# kodi-addons-dev
usertags 1040319 + fileconflict
affects 1040319 + kodi-addons-dev-common
# libocct-data-exchange-dev
usertags 1035009 + fileconflict
affects 1035009 + liboce-modeling-dev liboce-visualization-dev
# libreoffice-uiconfig-report-builder
usertags 1041899 + fileconflict
affects 1041899 + libreoffice-report-builder
# libsequoia-octopus-librnp
usertags 1041832 + fileconflict
affects 1041832 + thunderbird
# nex
usertags 1022957 + fileconflict
affects 1022957 + nvi
# nfs-ganesha-ceph
usertags 1040362 + fileconflict
affects 1040362 + nfs-ganesha
# python3-notebook
usertags 1036996 + fileconflict
affects 1036996 + cadabra2
thanks

Hi Andreas and Ralf,

On Tue, Jul 18, 2023 at 09:02:08PM +0200, Helmut Grohne wrote:
> Is this convincing enough to move forward with the generic
> debian...@lists.debian.org usertag fileconflict rather than something
> more detailed? Is this also convincing enough to extend it to cover
> non-file conflicts or do you want a different tag for that? Should the
> tag also cover m-a:same file conflicts?

Given the lack of further input, I went ahead and documented the new
fileconflict usertag at:

https://wiki.debian.org/qa.debian.org/usertagging

| fileconflict: bugs identifying a file conflict between packages. Such
| bugs can be filed against multiple packages if the causing package is
| not known. Otherwise, the other packages should be listed as affected.
| This covers all kinds of conflicts including symlink vs directory
| conflicts and Multi-Arch: same file conflicts. It also covers file moves
| between packages that lack suitable Replaces.

Let me know if you want a change to this.

The above list of tags is the subset that affects the /usr-merge
transition.

Helmut



Re: debci / salsa ci: support for qemu runner

2023-07-28 Thread Helmut Grohne
Hi,

On Tue, Jul 25, 2023 at 09:37:35PM +0200, Paul Gevers wrote:
> For ci.d.n, the issue is not money, but the required work to integrate it
> into the infrastructure. We need volunteers (or pay people to do the work),
> but unless they can and want to figure out everything from source [1], the
> bottleneck remains that the current volunteers would need to help those
> people understand the setup and guide them coming up with good solutions.

I second this on another level. While the lxc backend is exercised very
often, the qemu backend evidently sees little use. The default
--ram-size is 1G and that happens to be too little for a number of
packages already. This soon will be configurable (#1037245). I expect
that there are more aspects where qemu and lxc differ in a way that
causes test failures as most of the existing tests only ever ran on lxc.
Some of these aspects will have to be fixed in tests, but others (like
the --ram-size) will need addressing in infrastructure. Please expect
more work in this area.

Helmut



Re: /usr-merge: continuous archive analysis

2023-07-21 Thread Helmut Grohne
Hi,

TL;DR: dpkg-statoverride detection cannot be automated, but there are
only 5 affected packages.

On Wed, Jul 12, 2023 at 03:34:38PM +0200, Helmut Grohne wrote:
>  * DEP17-P5: dpkg-statoverrides not matching the files shipped.
>Possibly, I can extend dumat to cover unconditional statoverrides.

In retrospect, this feels like a lie. As usual, the story is more
complex than it initially seems. A really big chunk of users just
queries a path for a (local) override. We cannot capture these by
looking how a chroot was modified during maintainer scripts. Another
significant chunk is conditional statoverrides that depend on either
debconf answers or failure to apply filesystem capabilities. Observing
the intended outcome in these cases is next to impossible. Actually
parsing shell scripts and extracting those calls is what I tried for
diversions first, but that runs afoul variable interpolation and the
halting problem before too long. So really, I don't see a good way to
implement the promised detection without a high error rate.

Then on the flip side, there's about 1500 maintainer scripts matching
dpkg-statoverride found by binarycontrol.d.n. Since most have postinst
and prerm and most are in all suites, that's about 250 packages. Of
these, the vast majority only ever deals with canonical paths or paths
unaffected by the /usr-merge.  Checking all of these manually as a
one-shot effort definitely sounds more plausible to me. To validate this
claim (after having made a wrong one), I actually performed the analysis
for unstable and found only five affected packages. I intend to move
this forward by supplying the necessary patches.

Changes needed:
 * fuse (queries only, can be duplicated now)
 * fuse3 (queries only, can be duplicated now)
 * ntfs-3g (queries only, can be duplicated now)
 * systemd-cron (needs to be updated when moving files)
 * yp-tools (needs to be updated when moving files)

Nontrivially unaffected:
 * nfs-common (removes an aliased statoverride)

Unaffected:
 * activemq
 * amavisd-new
 * apt-cacher-ng
 * asterisk
 * asterisk-config
 * autofs-ldap
 * ax25-apps
 * backuppc
 * balboa
 * biboumi
 * bird
 * bird2
 * boinc-client
 * boxbackup-server
 * bucardo
 * ca-certificates
 * cado
 * ceph-base
 * ceph-common
 * ceph-mds
 * ceph-mgr
 * chrony
 * clamav-unofficial-sigs
 * cockpit-ws
 * corekeeper
 * coturn
 * courier-authdaemon
 * courier-authlib-ldap
 * courier-authlib-mysql
 * courier-authlib-postgresql
 * courier-base
 * courier-faxmail
 * courier-imap
 * courier-ldap
 * courier-mta
 * courier-pop
 * courier-webadmin
 * cron
 * cron-daemon-common
 * cubemap
 * cups
 * cups-daemon
 * cups-tea4cups
 * cups-x2go
 * cw
 * cyrus-common
 * davfs2
 * davmail-server
 * dbus
 * deluged
 * dodgindiamond2
 * dokuwiki
 * dovecot-core
 * durep
 * ejabberd
 * eviacam
 * exim4-base
 * exim4-config
 * fdutils
 * ferm
 * forked-daapd
 * fping
 * gammu-smsd
 * ganglia-monitor
 * ganglia-webfrontend
 * geki2
 * geki3
 * geoclue-2.0
 * gerbera
 * glhack
 * gmetad
 * gnokii-cli
 * gnunet
 * graphite-api
 * graphite-carbon
 * graphite-web
 * gravitywars
 * groonga-httpd
 * groonga-server-common
 * gvmd
 * gweled
 * h2o
 * haserl
 * hplip
 * i2p
 * icinga2-common
 * icingadb
 * icingaweb2-common
 * ilisp
 * im
 * inadyn
 * incron
 * john
 * json2file-go
 * kea-common
 * kgb-bot
 * kismet
 * knot
 * libvirt-daemon-system
 * libx2go-server-db-perl
 * libzeroc-ice3.7
 * lldpd
 * lmarbles
 * logdata-anomaly-miner
 * login-duo
 * lprng
 * lyskom-server
 * man-db
 * mandos
 * mandos-client
 * matrix-sydent
 * matrix-synapse
 * mgetty-fax
 * milter-greylist
 * minidlna
 * mlocate
 * mon
 * monsterz
 * mpd
 * mpd-sima
 * mpdscribble
 * muse
 * nagios4-cgi
 * nagios4-common
 * nagvis
 * nbsdgames
 * netdata-core
 * nethack-common
 * netselect
 * nginx-common
 * notus-scanner
 * nsca-ng-server
 * nsd
 * onak
 * open-infrastructure-compute-tools
 * opendkim
 * opendmarc
 * opendnssec-common
 * opensmtpd
 * openssh-client
 * opentracker
 * openvas-scanner
 * pacemaker-common
 * pawserv
 * pconsole
 * pdns-ixfrdist
 * phog
 * php-common
 * phpmyadmin
 * pkexec
 * plocate
 * pmount
 * polkitd
 * polkitd-pkla
 * postfix
 * powermanga
 * prayer
 * prometheus
 * prometheus-alertmanager
 * prometheus-apache-exporter
 * prometheus-bind-exporter
 * prometheus-blackbox-exporter
 * prometheus-haproxy-exporter
 * prometheus-ipmi-exporter
 * prometheus-mysqld-exporter
 * prometheus-node-exporter
 * prometheus-postfix-exporter
 * prometheus-postgres-exporter
 * prometheus-process-exporter
 * prometheus-pushgateway
 * prometheus-redis-exporter
 * prometheus-smokeping-prober
 * prosody
 * puppet-agent
 * puppetdb
 * puppetserver
 * pure-ftpd-common
 * pyracerz
 * qpsmtpd
 * radosgw
 * radvd
 * redis-sentinel
 * redis-server
 * redis-tools
 * roundcube-core
 * rtpengine-daemon
 * rtpengine-recording-daemon
 * samba-common
 * sasl2-bin
 * sendmail-bin
 * shibboleth-sp-utils
 * smstools
 * smtpprox-loopprevent

Re: usertagging file conflicts [Was: Re: /usr-merge: continuous archive analysis]

2023-07-18 Thread Helmut Grohne
Hi Andreas and Ralf,

On Mon, Jul 17, 2023 at 04:08:48PM +0200, Ralf Treinen wrote:
> > Moving the usertag to the qa namespace sounds like a good idea.
> 
> I agree

Thank you

> Sounds like a good idea. However, I do not think that usertags support 
> a hierarchy of tags. So maybe different specific usertags with a common
> prefix, like
> 
> fileconflict-installation (error occurs when one tries to install two
>   packages together)
> fileconflict-upgrade (error occurs when upgrading, due to missing
>   breaks/replaces)
> fileconflict-directory (error occurring due to /usr merge)

Can either of you elaborate on the need to further classify the kind of
conflict (file / directory / symlink / ...) or the kind of cause
(installation / upgrade / ...)?

Are you ok with explicitly excluding issues that only arise as a result
of /usr-merge? These have a temporary cause and will vanish before too
long. Due to the automatic bug filing that I hope to be doing, I need
very precise tagging for them.

Often times, it is initially not trivial to figure out whether a
conflict only arises from installation or upgrades. Rather I propose to
have a grab-bag tag for all of them. That allows us to move forward with
less complexity and makes it easier to understand for everyone. Most of
these issues result in an unpack error one way or another, but the
symlink vs something else conflicts may result in unpack-dependent
behaviour.

I think we have consensus on using the debian-qa list, but I've seen
file-overwrite and fileconflict-* as proposals with varying
subclassification now. While we don't have a tag hierarchy on a
technical level, Paul indicated that we may establish a hierarchy using
processes. Using fileconflict makes it easy to establish a
fileconflict-* subclassification later (by having the qa bot
automatically add the super tag when it sees a sub tag).

Is this convincing enough to move forward with the generic
debian...@lists.debian.org usertag fileconflict rather than something
more detailed? Is this also convincing enough to extend it to cover
non-file conflicts or do you want a different tag for that? Should the
tag also cover m-a:same file conflicts?

I certainly won't object to doing a subclassification and I'm happy to
add the subclass tags if doing so does not incur unreasonable
implementation cost, but I don't want to participate in designing them
nor updating existing tags. My need here is automatically ignoring
detected issues that already are reported and the generic variant is
sufficient for doing that.

Helmut



usertagging file conflicts [Was: Re: /usr-merge: continuous archive analysis]

2023-07-16 Thread Helmut Grohne
Hi,

On Wed, Jul 12, 2023 at 03:34:38PM +0200, Helmut Grohne wrote:
> ## Usertagging bugs
> 
> In order to avoid filing duplicates, I need a usertagging scheme for
> bugs. Are there opinions on what user I should use for this? In the
> simplest instance, I can use my DD login. Roughly speaking every issue
> type shall translate to an individual usertag. Is there a common usertag
> for undeclared file conflicts to reuse?

I did not see any replies towards this aspect. I researched the
situation and found that a longer while ago a...@grinser.de and later
a...@debian.org used qa-file-conflict. I also discovered that Andreas uses
debian...@lists.debian.org together with replaces-without-breaks, which
is not what I'm looking for, but closely related.

Then I found trei...@debian.org using edos-file-overwrite. That latter
one seems like what I need here. Should we move it to the qa space and
drop the edos part? I suggest debian...@lists.debian.org usertags
file-overwrite.  Otherwise, Ralf are you ok with me reusing your tag?

What are the precise semantics of the tag? I imagine that it should only
be filed against binary packages (one or more) and never with source
packages. In case the causing package is known, it should be filed with
the causing package and a version while other binary packages should be
listed in affects. Otherwise, the bug should be filed against all
involved binary packages. It is ok to group related conflicts by filing
against multiple binary packages. These bugs should normally be release
critical.

These semantics allow machine consumption and facilitate avoiding
duplicates in an automatic bug filing (to be agreed to).

Does anyone have any objections to using this tag with these semantics?
It would be most useful if other people filing such bugs would start
using this usertag of course. :)

I'm adding as possible metadata update at the end of this mail. It only
handles conflicts involving possibly aliased paths though as those are
my primary interest here.

Helmut

user debian...@lists.debian.org
# android-libnativehelper/bullseye-backports vs 
android-libnativehelper-dev/bullseye
usertags 1040323 + file-overwrite
affects 1040323 + android-libnativehelper-dev
# cadabra2/bullseye vs python3-notebook
usertags 1036021 + file-overwrite
affects 1036021 + python3-notebook
# discodos/unstable vs mono-devel
usertags 966115 + file-overwrite
affects 966115 + mono-devel
# firebird-utils/experimental vs firebird3.0-server
usertags 1040321 + file-overwrite
affects 1040321 + firebird3.0-server
# kodi-addons-dev/bullseye-backports vs kodi-addons-dev-common/bullseye
usertags 1040319 + file-overwrite
affects 1040319 + kodi-addons-dev-common
# occt vs oce mess
usertags 1037067 + file-overwrite
affects 1037067 + liboce-modeling-dev liboce-visualization-dev 
# rawloader
usertags 1041299 + file-overwrite
affects 1041299 + libplucene-perl graphicsmagick-imagemagick-compat
# qt6-base-dev/experimental vs libqt6opengl6-dev
usertags 1041300 + file-overwrite
affects 1041300 + libqt6opengl6-dev
# nex vs nvi
usertags 1022957 + file-overwrite
affects 1022957 + nvi
# nfs-ganesha-ceph/bullseye-backports vs nfs-ganesha/bullseye
usertags 1040362 + file-overwrite
affects 1040362 + nfs-ganesha



Re: /usr-merge: continuous archive analysis

2023-07-13 Thread Helmut Grohne
Hi Ted,

On Wed, Jul 12, 2023 at 10:23:08PM -0400, Theodore Ts'o wrote:
> For those packages that are likely to be backported, would ti be
> possible provide some tools so that the package maintainers can make
> it easy to have the debian/rules file detect whetther it is being
> built on a distro version that might have split-/usr, or not, or
> whether we the package needs to do various mitigations or not?

Please allow me to go into a little more detail as to why we get into a
problem for backports and then circle back to your question.

I currently imagine (and this has been vaguely circulated on d-devel a
number of times) to facilitate the canonicalization using debhelper. We
have minor disagreements on how exactly that should work. Let me give my
preferred version while keeping in mind that this is not yet consensus:

debhelper gains a new addon. It could be called usrmerge or something.
If you enable usrmerge, debhelper would perform the path
canonicalization for you. Your dh_auto_install could install to
canonical and aliased paths, but the .deb would be canonicalized. Thus,
you can easily opt into it by saying Build-Depends:
dh-sequence-usrmerge. We may also add this addon to a new compat level
as we expect that most packages in trixie will need it. Thus we're
changing it from opt-in to opt-out.

While you can merge like that, a number of packages will notice that you
can simplify your packaging by e.g. changing --prefix=/ to --prefix=/usr
or something similar and doing that canonicalization at dh_auto_install
time. In doing this, they lose the information about how files were
previously being split to / and /usr. For instance systemd needs extra
effort to support the split layout and that support is going to be
deleted soon. I expect this to happen for most packages. And this is the
part that makes backporting hard in a way that honours the moratorium
for bookworm-backports.

I'm sorry for not having considered the use case of using a single
debian/ directory tree for multiple distributions and releases, but it
is fairly obvious in hindsight. Is checking for the presence of
usr-is-merged good enough for your case?

What I imagine you doing here is generally supporting split-/usr in
e2fsprogs (for as long as you want to support building e2fsprogs on any
system that needs such support) and then telling debhelper to enable the
usrmerge addon whenever you don't need to support split-/usr. A fairly
obvious candidate check would be checking for the presence of
usr-is-merged, but while bookworm always contains that, we effectively
want it to support split-/usr to facilitate upgrades. Some of the
mitigations require the addition of a usrmerge-support package whose
preinst will unconditionally reject unmerged systems. Would that be a
suitable condition?
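As a hedged sketch (not an agreed interface), such a condition could be probed like this; the helper name and the /bin fallback are invented here:

```shell
# Detect whether a root tree is /usr-merged, preferring the dpkg
# database and falling back to probing /bin.
is_usr_merged() {
    root=$1
    if dpkg-query --admindir="$root/var/lib/dpkg" -W -f '${Status}' \
            usr-is-merged 2>/dev/null | grep -q 'install ok installed'
    then
        return 0
    fi
    # on a merged system, /bin is an aliasing symlink to usr/bin
    [ -L "$root/bin" ]
}
if is_usr_merged ""; then echo merged; else echo split; fi
```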

> The point is before we lift the freeze, perhaps we can provide some
> tools that make it easier for package maintainer to only "make
> split-/usr support vanish" conditionally, so as to make life easier
> for people who are doing the bookworm and bullseye backports?

If going with the debhelper addon and keeping split-/usr support in the
particular package otherwise, the one backporting can simply pass
--without usrmerge to dh and be done. If using the usrmerge-support
package as condition (could even be done inside debhelper), that would
become automatic.
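A hypothetical sketch of that backporting tweak, assuming the proposed (not yet implemented) usrmerge addon exists:

```make
# debian/rules for a bookworm-backports upload: keep split-/usr by
# disabling the hypothetical usrmerge addon from this proposal.
%:
	dh $@ --without usrmerge
```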

> I don't mind keeping some buster and bullseye and bookworm schroots
> around, and doing test-builds of the packages I build, and then making
> minor adjustments as necessary to make sure things still work.
> Combined with some test automation so that we can test to see whether
> a package about to be uploaded to bullseye-backports won't break on a
> split-/usr machine, and this should be quite doable.

The real problem I see with such backports is a different one though.
Consider the case where we reorganize a package (move files between
packages) during the trixie cycle. In the normal scheme of things we
have this sequence:

 * bookworm v1: split-/usr + original file layout
 upgrade
 * trixie in progress v2: merged-/usr + original file layout
 upgrade
 * trixie in progress v3: merged-/usr + reorganized package
 upgrade
 * trixie in progress v4: merged-/usr + reorganize again

That reorganization may trigger the need for applying a mitigation and
the main plan is to only apply such mitigations as-needed. Now when you
backport this, you'd revert the merged-/usr part, so instead you end up
with this:

 * bookworm v1: split-/usr + original file layout
 upgrade
 * bookworm-backports v2~bpo: split-/usr (reverted) + original file layout
 upgrade
 * bookworm-backports v3~bpo: split-/usr (reverted) + reorganized once
   package
 upgrade
 * trixie v4: merged-/usr + reorganized again

This upgrade sequence may require a different mitigation, because we
swapped the order of canonicalization and reorganization. I have not yet
come up with an actual test case where this breaks, so maybe I'm really
wrong to worry here.

Re: /usr-merge: continuous archive analysis

2023-07-13 Thread Helmut Grohne
Hi Luca,

On Thu, Jul 13, 2023 at 01:38:16AM +0100, Luca Boccassi wrote:
> On Wed, 12 Jul 2023 at 14:35, Helmut Grohne  wrote:
> > "risky" ones from becoming practically relevant). There is one kind of
> > issue that may be actionable right now and that's the class of "empty
> > directory" issues. If you notice that such an empty directory actually
> > is not necessary (which probably is the case in the majority of cases),
> > please go ahead and delete it from your package[2] or file a bug with
> > a patch. Also, please declare Conflicts or Replaces as usual as some
> > forms of undeclared file conflicts also show up in the analysis. Another
> > thing that helps now is cleaning up really old and unversioned Replaces
> > as those may result in false negative detections.
...
> > [2] List of packages that *may* be actionable: fwupd, gcc-snapshot,
> > gretl, lib32lsan0, libjte-dev, libmpeg3-dev, libswe-dev, libx32lsan0,
> > pcp, pkg-config, pkgconf, pkgconf-bin, python3-expeyes, systemd
> 
> Does this mean I can nuke that empty directory from the systemd
> package right now, without waiting for the rest of your proposal to be
> implemented?

Thanks for asking. We have empty directories in binary packages in lots
of cases. Some of them are there for some technical need. In other
cases, the empty directory is an oversight and entirely unnecessary. I am
fairly convinced that the lib*-dev packages shipping an empty
/usr/lib/pkgconfig fall in that latter category and can delete their
/usr/lib/pkgconfig at no loss of functionality. On the flip side,
my understanding is that at least pkgconf and systemd ship their empty
directories on purpose as intentional integration points and that these
are not unused and should therefore not be deleted. So while dumat can
tell whether a particular empty directory participates in a /usr-merge
issue, it cannot tell you what the right action is. In some of the
cases, the action is delete and that can happen at any time while in
other cases the action is to protect them from accidental deletion,
which is something we have to defer until there is consensus on the
chosen mitigation.

Helmut



/usr-merge: continuous archive analysis

2023-07-12 Thread Helmut Grohne
Hi,

I'm doing yet another /usr-merge thread even though we already have too
many, I know.

The first one was about general discussion and problem analysis. In that
first thread, I posted a number of scripts for analyzing problems and
snapshot analysis data. In the second thread, we tried to gather
consensus around some of the views expressed.

# Continuous monitoring for problems

This thread hopefully becomes more of a FYI than a discussion. I've
turned those hacky scripts into some Python code that continuously (4
times a day) analyzes the archive for some of the problems summarized in
DEP17. Interested parties may find the code at
https://salsa.debian.org/helmutg/dumat. The results of the analysis[1]
(around 2000 lines) are updated at
https://subdivi.de/~helmut/dumat.yaml. While it says "issue" there, most
of them are *not* actionable right now. Please don't panic. Keep in mind
that the moratorium is still active (and that it prevents any of the
"risky" ones from becoming practically relevant). There is one kind of
issue that may be actionable right now and that's the class of "empty
directory" issues. If you notice that such an empty directory actually
is not necessary (which probably is the case in the majority of cases),
please go ahead and delete it from your package[2] or file a bug with
a patch. Also, please declare Conflicts or Replaces as usual as some
forms of undeclared file conflicts also show up in the analysis. Another
thing that helps now is cleaning up really old and unversioned Replaces
as those may result in false negative detections.
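The real dumat is Python and works on archive-wide data, but the core of the undeclared-file-conflict class it reports can be illustrated with a toy sketch: canonicalize aliased prefixes in each package's file list, then look for paths owned by two packages at once (package names and paths below are made up).

```shell
# Toy illustration of one issue class dumat detects: two packages shipping
# the same path, one via the aliased location and one via the canonical one.
tmp=$(mktemp -d)
printf '%s\n' /usr/bin/tool /usr/share/doc/pkg-a/copyright > "$tmp/pkg-a.list"
printf '%s\n' /bin/tool /usr/share/doc/pkg-b/copyright > "$tmp/pkg-b.list"

# Rewrite aliased top-level prefixes to their canonical /usr locations.
canon() { sed -E 's,^/(bin|sbin|lib),/usr/\1,' "$1" | sort; }
canon "$tmp/pkg-a.list" > "$tmp/a.canon"
canon "$tmp/pkg-b.list" > "$tmp/b.canon"

# Paths present in both sorted lists are (potential) file conflicts that
# need Conflicts/Replaces declarations.
conflicts=$(comm -12 "$tmp/a.canon" "$tmp/b.canon")
echo "$conflicts"    # prints /usr/bin/tool
```

A stale unversioned Replaces between pkg-a and pkg-b would make such an overlap look legitimate, which is why cleaning those up reduces false negatives.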

What does dumat not detect?
 * DEP17-P4: Disagreeing alternatives are not a problem as long as we
   don't canonicalize alternatives (DEP17-M13). I think we can defer
   this without causing extra pain.
 * DEP17-P5: dpkg-statoverrides not matching the files shipped.
   Possibly, I can extend dumat to cover unconditional statoverrides.
 * DEP17-P8: Filesystem bootstrap. The test matrix is really small, so
   we'll probably notice when it gets broken.
 * DEP17-P9: Loss of aliasing symlinks. We can reliably address this
   centrally e.g. via DEP17-M4.

# A rough outlook

Let me give a rough idea on how I would like to move forward with this
and hope you agree with it.

For one thing, we need an agreement on the mitigations that we apply.
Except for the bootstrapping aspect, this seems relatively clear. That
discussion will likely continue and conclude eventually.

## A proposed process

Some of the mitigations are non-trivial to implement and cannot be done
e.g. by the janitor, but the dumat.yaml file tells us that the number
of occasions where we need them will be fairly low. It's the exceptional
case that goes wrong badly. Since the matter is fairly complex and since
breakage is rare, I would like to move forward in a way where we do not
ask maintainers to pay attention to /usr-merge problems proactively,
but reactively. That works with two mechanisms:
 * We generally ask maintainers to upload some classes of changes to
   experimental first. Those include:
+ Moving files from / to /usr.
+ Moving files between packages.
+ Changing diversions.
+ Changing path-based trigger interest.
+ When in doubt.
 * You allow me to turn the dumat tool into an automatic rc bug
   filing service.
The big benefit of this approach is that it lifts the mental load of the
matter from individual maintainer's brains. You have a simple rule (use
experimental when in doubt) and if your change poses any issue, you
receive an rc bug (for that experimental package if you uploaded there
or possibly for sid where it acts as migration blocker). My expectation
is that such bugs are rare events. If you don't receive an rc bug within
two days, assume that your change is fine.

I am aware that having an automatic bug filing service with no human
supervision (ahead of filing) is something we never had before. Much
less for rc bugs. Before enabling this, you definitely want to see a bug
template. Are there general objections to this idea? I already talked to
Paul Gevers and he sees this as the preferred interface to implement a
migration blocker.

## Usertagging bugs

In order to avoid filing duplicates, I need a usertagging scheme for
bugs. Are there opinions on what user I should use for this? In the
simplest instance, I can use my DD login. Roughly speaking every issue
type shall translate to an individual usertag. Is there a common usertag
for undeclared file conflicts to reuse?

# Other aspects

## Informative MBF

When I discussed this in Hamburg, there was a request to actively reach
out to affected maintainers by filing a minor bug against every source
package that is somehow involved. The bug would ask maintainers to move
their files from / to /usr, explain how to do so, note the need to do
this via experimental first, and mention the chance of getting an rc bug
in the process.
Do people who have not been to Hamburg Reunion 2023 confirm this?

## No good solution for bookworm-b

Re: Second take at DEP17 - consensus call on /usr-merge matters

2023-07-11 Thread Helmut Grohne
Hi Luca,

On Tue, Jul 11, 2023 at 12:27:04AM +0100, Luca Boccassi wrote:
> You have said in the original mail and on the writeup that this option
> requires all the affected packages to be upgraded at the same time,
> and in the correct order, or things will break. What happens if any of

This definitely is a misunderstanding. At this time, I am not sure where
that misunderstanding originated and I don't even think it is worth
figuring out, but I won't mind a reference either.

I meant to say that the option requires the involved (~10) packages to
be *uploaded* (rather than upgraded) at the same time. And that
requirement originates from the bootstrapping aspect rather than an
upgrading aspect. Bootstrapping will be broken from the point in time of
the first upload of that set until the last package from that set has
been built.

I would not dare propose a solution that'd require upgrades to happen in
a specific order or way that is not expressible to apt. My impression is
that we have significant consensus on smooth upgrades being a core
feature of Debian. This should have failed your plausibility filter.

> those packages are held back, for whatever reason? This is the
> fragility aspect that I am worried about, and that is not an issue at
> all if we just fix mmdebstrap to do the right thing as debootstrap
> already does.

There are two mechanisms that shall ensure that any order (valid
according to declared relationships) and any held-back packages (up to
not performing the operation) will work. One is the Pre-Depends on
usrmerge-support. If you pin that package as absent or otherwise force
its absence, apt will simply refuse to upgrade anything else and your
system will be stuck, not upgrading at all. If you hold back any other
package, it may keep
shipping files in aliased locations. The protective diversion mechanism
(DEP17-M4) will ensure that this does not cause the aliasing links to
disappear if you upgrade it later.
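For concreteness, a sketch of what the protective diversions (DEP17-M4) could look like. The usrmerge-support package and the `.usr-is-merged` divert-to names are proposals from this thread, not an existing interface; the dpkg-divert flags themselves are real. The commands are printed rather than executed so the sketch stays side-effect free.

```shell
# Hypothetical usrmerge-support preinst fragment: register diversions for
# the aliasing symlinks, assigned to base-files, so no package removal
# can make dpkg delete /bin, /sbin or /lib.
plan=""
for link in /bin /sbin /lib; do
    plan="$plan
dpkg-divert --package base-files --no-rename --divert $link.usr-is-merged --add $link"
done
printf '%s\n' "$plan"
```

Because the diversions are owned by base-files, they survive until base-files itself records the aliasing symlinks in the dpkg database, at which point they can be dropped again.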

After having sorted this out, what part of your safety concerns with 3C
do remain?

I see that Sam and Guillem dislike my proposal of abusing diversions in
this way, but I honestly see little alternatives for doing this in a
safe way. In essence, we are introducing a symlinks vs directory
conflict for the aliasing symlinks and if anything goes wrong you may
end up without /bin/sh or the ELF loader. While diversions were not
meant for this situation, reasoning about them is relatively straight
forward for the purpose of what we need here. Critically, we don't need
any properties about the renamed location and we only need the property
that dpkg leaves the diverted location unmodified. I fully acknowledge
that I propose using diversions outside their specification. However, we
already use dpkg outside its specification in general (due to having
merged /usr). As such, we already have entered the land of relying on
implementation-defined behavior. Raphael's mitigation of making dpkg
more careful about deleting aliased files (DEP17-M3) also is about
temporarily extending dpkg in such an implementation-defined way. The
reason why I see diversions as favourable here is that they are a
similarly ugly mechanism that is readily available in the upgrade to
trixie and that all of them are of varying temporary nature:
 * DEP17-M4: We need these from the point in time where we start moving
   files out of aliased locations until the point in time where
   base-files has recorded all aliasing links in the dpkg database. From
   an unstable pov, this is probably less than a year. From a
   bookworm->trixie pov, the diversions will be added and removed during
   the upgrade.
 * DEP17-M6: The diversion of dpkg-divert is not a protective diversion
   and is probably needed in trixie and forky.
 * DEP17-M8: These protective diversions are short lived during an
   individual package upgrade from preinst to postinst.
 * DEP17-M9: These protective diversions are longer lived. They probably
   need to be present in trixie. I hope that the majority of cases can
   rather delete an unnecessary empty directory than set up a diversion.
 * DEP17-M10: These protective diversions are short lived during an
   individual package upgrade from preinst to postinst.
 * DEP17-M14: The diversion of update-alternatives is not a protective
   diversion and is probably needed for trixie and forky.

You see that the majority of these diversions are short-lived. Since I
propose introducing them on-demand rather than automatically, their
number should be low. If it were not the case that this abuse of
diversions were temporary, I would be opposed to it. What makes it
attractive to me is that the alternatives also seem to be abusing dpkg
and the diversion abuse works right now.

Helmut



Re: proposal: dhcpcd-base as standard DHCP client starting with Trixie

2023-07-10 Thread Helmut Grohne
On Sun, Jul 09, 2023 at 05:58:07PM +0100, Luca Boccassi wrote:
> On top of that, a minimal installation chroot doesn't need a
> fully-featured dhcp client. As Simon said already, busybox is there
> for any reason for a minimal one. For the rest - installer and whatnot
> - the installer and tasklets should pull in the required stack as
> needed.

I contend that currently a debootstrap includes a dhcp client and this
is more of a migration from one dhcp implementation to another. Since
dhcp is the most common way of configuring a network, supporting it in
ifupdown by default also seems like a reasonable choice.

> So I think not only we should not bump the priority of dhcpd-base, but
> we should also change ifupdown's down to optional.

I don't quite see consensus on this yet, but I already see significant
interest in changing the default network configuration method. I hope
that it is out of question that we'd demote the priority of the
recommended dhcp client when demoting the priority of ifupdown. Demotion
of ifupdown needs to come with a proposed replacement and/or with
changes to the debian-installer. I do hope that we can get that
discussion going and implemented before trixie. However, this is about
changing the default dhcp client for use with debootstrap and moving the
priority from one package to another seems like an incremental
improvement that is not blocking the bigger goal of changing the default
network configuration tool in any way.

I expect that dhcpcd will not be important in trixie, but for now that
move makes sense to me, because it is as easily reverted as it is
implemented. This is an instance of "The perfect is the enemy of the
good."

And yeah, please work on changing that ifupdown default. I'm faced
with having to uninstall it from more and more systems. In case you
do a straw poll, I vote for systemd-networkd, which happens to be
installed by default. Would there be any volunteers doing the d-i
integration?

Helmut



Re: Second take at DEP17 - consensus call on /usr-merge matters

2023-07-09 Thread Helmut Grohne
 Is that really less complex and less risky?

> * Your debootstrap changes seem overly complicated and would in and of
>   themselves push me against 3C.  First, you don't seem to be thinking
>   about buster, which also needs to bootstrap usrmerged, doesn't it.

I'm not sure in which way you think about buster here. Do you refer to
using a buster system for bootstrapping a trixie chroot? I think we do
not have to support this. Very likely, a buster kernel will not run a
trixie glibc at all. Do you concur here? Do you refer to using a
trixie/sid system for bootstrapping buster? That is supposed to work in
the very same (merged) way as bootstrapping bullseye and bookworm (by
merging after unpack).

>   Second, is there a way we could simply change how debootstrap calls
>   tar?

I captured this possibility above due to your question here. Thank you.
Additionally, I dug into the history of debootstrap to figure out why
and when it was added. The first commit mentioning it was
https://salsa.debian.org/installer-team/debootstrap/-/commit/6b79352a205a96cee441ae0c6247c4616097a517

Pass -k to tar when extracting packages

When installing with a merged /usr, the symlinks in / should not be
replaced with real directories when extracting the packages.

in 2016. As far as I understand it, dropping -k for any of buster,
bullseye or bookworm would be broken. In the absence of -k, tar would
replace the aliasing symlinks with actual directories. In the new world
proposed by 3C, this aspect no longer is a problem as no package
installs any directory onto an aliasing link anymore. So we must pass -k
as long as any (essential) package ships any aliased location and we
must not pass -k whenever base-files ships the aliasing symlinks in its
data.tar.  While it would probably be possible to detect this somehow, it
feels very wrong.  The alternative of merging after unpack works with -k
both before and after.
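The -k behavior described above can be demonstrated self-contained (requires GNU tar; the file names are made up): extracting a package tree that ships bin/ as a real directory over a merged-/usr root keeps the aliasing symlink, and the files land in /usr/bin through it.

```shell
# Demo: tar -k preserves an aliasing symlink when a data.tar ships the
# aliased directory, which is exactly what debootstrap relies on.
work=$(mktemp -d)

# A package's data.tar shipping /bin as a real directory with a file in it.
mkdir -p "$work/tree/bin"
echo '#!/bin/sh' > "$work/tree/bin/tool"
tar -C "$work/tree" -cf "$work/pkg.tar" bin

# A merged-/usr root: /bin is an aliasing symlink to usr/bin.
mkdir -p "$work/root/usr/bin"
ln -s usr/bin "$work/root/bin"

# With -k, tar keeps the existing symlink instead of replacing it with a
# directory; new files are created through the symlink.
tar -C "$work/root" -xkf "$work/pkg.tar" 2>/dev/null
test -h "$work/root/bin" && echo "symlink preserved"
test -f "$work/root/usr/bin/tool" && echo "file landed in /usr/bin"
```

Conversely, once base-files ships /bin itself as a symlink in its data.tar, -k would prevent that symlink from ever being recorded over an existing directory, which is the conflict described above.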

Do you see some other way to fix debootstrap for this that I don't see
here?

>   I think asking debootstrap to not create the symlinks before is a big
>   ask.

Would you be able to go into detail as to why you think so? The way I
currently see it, this is a very logical consequence of shipping the
symlinks in base-files, which in turn gained quite some agreement.

And really, you got me hooked as to how hard it could be. So rather than
arguing about the feasibility of modifying debootstrap in the proposed
way, a patch seems to be the easiest way to settle that question. Hence
I'm attaching one and note that the post-merging approach also removes
the complexity of having a per-architecture list of aliased directories.

At this point, I'm really interested in understanding that additional
complexity and the involved risk that is attributed to 3C. The more I
dig into this approach, the more it seems to be the safest approach
that also removes complexity in my (biased) view.

Helmut
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,11 @@
+debootstrap (1.0.128+nmu3) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * Implement merged-/usr by merging after initial extraction to allow
+shipping the aliasing symlinks in a binary package's data.tar.
+
+ -- Helmut Grohne   Sun, 09 Jul 2023 22:13:37 +0200
+
 debootstrap (1.0.128+nmu2) unstable; urgency=low
 
   * Non-maintainer upload.
--- a/functions
+++ b/functions
@@ -1358,15 +1358,40 @@
esac
 }
 
-# Find out where the runtime dynamic linker and the shared libraries
-# can be installed on each architecture: native, multilib and multiarch.
-# This data can be verified by checking the files in the debian/sysdeps/
-# directory of the glibc package.
-#
-# This function must be updated to support any new architecture which
-# either installs the RTLD in a directory different from /lib or builds
-# multilib library packages.
-setup_merged_usr() {
+merge_usr_entry() {
+   local entry canon
+   canon="$TARGET/usr/${1#"$TARGET/"}"
+   test -h "$canon" &&
+   error 1 USRMERGEFAIL "cannot move %s as its destination exists as a symlink" "${1#"$TARGET"}"
+   if ! test -e "$canon"; then
+   mv "$1" "$canon" >/dev/tty 2>&1
+   return 0
+   fi
+   test -d "$1" ||
+   error 1 USRMERGEFAIL "cannot move non-directory %s as its destination exists" "${1#"$TARGET"}"
+   test -d "$canon" ||
+   error 1 USRMERGEFAIL "cannot move directory %s as its destination is not a directory" "${1#"$TARGET"}"
+   for entry in "$1/"* "$1/."*; do
+   # Some shells return . and .. on dot globs.
+   test "${entry%/.}" != "${entry%/..}&

Re: Second take at DEP17 - consensus call on /usr-merge matters

2023-07-08 Thread Helmut Grohne
Hi Sam,

On Fri, Jul 07, 2023 at 08:50:49AM -0600, Sam Hartman wrote:
> 
> TL;DR:
> It looks like if we do not want to encode merged /usr into the bootstrap
> protocol, we must  keep some aliases and cannot move all files in
> data.tar.

Reading both of your recent mails left me very confused. It is now
obvious to me that we have a misunderstanding (at least one) and I am
not exactly sure how we can get to a point where we talk about the same
things.

In your second mail, you classify 3C (the one about not changing the
bootstrap protocol by adding aliasing symlinks to base-files) as a
category 1 solution (where we leave some files in their aliased location
to facilitate bootstrap). In reality, 3C is fully incompatible with
category 1 as the premise of 3C is that every essential package has all
of its files moved. This directly contradicts your TL;DR here.

Let me try to ignore much of the past conversation and instead explain
the bootstrap-relevant part of the transition plan that I see as
favourable (i.e. 3C). Consider this an opinionated presentation for one
of multiple ways to move forward.

We first move towards a category 1 solution where we move files to their
canonical locations as much as possible without breaking the current way
of bootstrapping that relies on either pre-creating the symlinks
(debootstrap) or usrmerge.postinst (mmdebstrap, cdebootstrap). This
process has lots of non-obvious details that I'll skip for the sake of
the bootstrap topic here. Let us for a moment assume that we'd manage to
get to this category 1 solution where most files (in essential packages)
have moved their files to canonical locations and we're left with some
packages (e.g. libc6, dash, util-linux, ...) that could not move their
files.

There is one aspect, I want to get into more detail as it is partially
relevant to bootstrapping and this is protecting the aliasing symlinks
(DEP17-P9). Here, I am selecting DEP17-M4 as the relevant mitigation.
That amounts to adding a usrmerge-support package that creates
protective diversions for the aliasing symbolic links and assigns them
to base-files. Both base-files and any package that moves files out of
aliased locations has to gain Pre-Depends: usrmerge-support in order for
this method to be effective. So we'll likely see debhelper's
${misc:Pre-Depends} that we added for multiarch-support return.
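Illustratively, a package moving files out of aliased locations would end up carrying something like the following in its control stanza. This is a sketch: usrmerge-support is the package proposed above, and in practice the dependency would be injected via ${misc:Pre-Depends} by a debhelper addon rather than written by hand.

```
Package: dash
Pre-Depends: usrmerge-support
```

The Pre-Depends guarantees the protective diversions exist before the moved files are unpacked.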

Until this point, we can still decide whether we do 3B or 3C and I am
explaining what happens in the 3C choice now.

Concurrently with the earlier changes, we modify debootstrap.
debootstrap requires uploads to bookworm and bullseye anyway, because we
will have to change --variant=buildd to become merged for trixie and
forward while currently debootstrap always creates --variant=buildd as
unmerged. On top of this necessary change, we add a change relevant to
3C. Currently, debootstrap creates the aliasing symbolic links prior to
the initial package unpack. I want to swap these operations. In doing it
afterwards, debootstrap cannot just create the symlinks but may have to
perform an actual merge in much the same way that usrmerge does now
except for dropping the atomicity requirement (as we don't need that in
the bootstrap setting). Other than that, this change is fully backwards
compatible (since bootstrapping as unmerged and then installing usrmerge
also works) and will continue to debootstrap a merged bullseye, bookworm
and trixie in the same way as before. I'll get into why we do this in a
moment.
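The "actual merge" step can be sketched in a few lines, in the spirit of the merge_usr_entry patch posted elsewhere in this thread but without its error handling (the tree below is a made-up toy, not a real chroot):

```shell
# Toy post-unpack merge: move top-level aliased directories into /usr and
# leave aliasing symlinks behind, without any atomicity guarantees.
root=$(mktemp -d)
mkdir -p "$root/bin" "$root/usr/bin" "$root/lib"
echo dash > "$root/bin/sh"
echo util > "$root/usr/bin/mount"

for d in bin lib; do
    mkdir -p "$root/usr/$d"
    # Move unmerged contents over (guard against empty-directory globs),
    # then replace the directory with the aliasing symlink.
    if [ -n "$(ls -A "$root/$d")" ]; then
        mv "$root/$d/"* "$root/usr/$d/"
    fi
    rmdir "$root/$d"
    ln -s "usr/$d" "$root/$d"
done

test -f "$root/usr/bin/sh" && echo "sh canonicalized"
test -h "$root/bin" && echo "/bin now aliases usr/bin"
```

Dropping the atomicity requirement is what makes this so much simpler than usrmerge proper: during bootstrap nothing is running from the tree, so a partially merged state is harmless.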

Then, we assume that updating debootstrap has happened, usrmerge-support
exists in trixie and the number of essential packages shipping files in
aliased locations is about 10. We now move away from category 1. At this
point my original categorization becomes a little difficult, so I defer
explaining what we move to. We prepare changes to all those remaining
packages (only essential ones) to canonicalize their paths and
also prepare a change to base-files to change the type of the aliasing
symlinks from directories to symlinks. While it technically is a
directory-to-symlink conversion from a dpkg point of view, that
conversion has already happened on the filesystem level, so this
practically is a change to the dpkg database only. Now we upload all of
these packages concurrently. This is when mmdebstrap and cdebootstrap
temporarily break. Once all of the binary packages have been built and
installed into the archive, things should work again.

Due to having modified debootstrap, we arguably have moved into category
4. Since that change is backwards-compatible and has been uploaded to
bullseye and bookworm, we can kinda pretend that the bootstrap protocol
never was different and therefore say that we move into category 2, but
without the reasons originally given for why this cannot work. Had we
skipped that change to debootstrap, unpacking base-files would make
debootstrap fail, because it passes -k to tar and when tar would try to
create the /bin -> usr/bin symlink include

Re: Second take at DEP17 - consensus call on /usr-merge matters

2023-07-07 Thread Helmut Grohne
Hi Sam,

On Thu, Jul 06, 2023 at 01:51:04PM -0600, Sam Hartman wrote:
> BUT I don't think it matters.
> If we have a consensus we're unwilling to wait for a patch, it doesn't
> matter whether that's because:

Indeed, this is not how I looked at it. It also is a view that I don't
subscribe to, because I think commercial backing of the issue changes
the equation. So if I were to see changing dpkg as a viable way now (I
did earlier), I would be willing to wait for it, because we have
recently made significant progress on defining what those semantics
should be.  While you try to remove the reasoning for this point of view
from the consensus, the "willing to wait" language implies a reason of
"too slow" already.

> 1) Some of us think the patch would be a bad idea

Additionally, people disagree on why it is a bad idea.

> 2) Some of us think the patch will not happen because of politics
> 
> 3) Some of us think the patch won't happen because no one cares enough
> to write it
> 
> 4) Some of us think the patch will eventually get done
> 
> 5) Some of us think the problem is too constrained and if we really
> wanted to make progress we could incrementally move toward it.

We also have quite some disagreement on what "the patch" is in terms of
what semantics would help here.

> Helmut effectively asked us to agree with 1.

I disagree here. For reference, I am quoting the proposed consensus
item:

| The primary method for finalizing the /usr-merge transition is
| moving files to their canonical locations (i.e. from / to /usr)
| according to the dpkg fsys database (i.e. in data.tar of binary
| packages).  dpkg is not augmented with a general mechanism
| supporting aliasing nor do we encode specific aliases into dpkg.

I carefully avoided adding reasoning to the proposed consensus as I was
seeing our disagreement on reasoning. I now see how that second sentence
could be read as precluding a dpkg patch in general. Would it help to
add "+For the purpose of finalizing the transition,+ dpkg is not
augmented ..." to leave open the possibility to add such a change but
still implying that we are not waiting for it to happen?

> And I don't think there is a consensus on this.

Even though I disagree on that's what I asked for, I agree that we don't
have consensus on patching dpkg being a bad idea in general.

So how do we get towards an agreeable consensus item? Evidently, the one
I proposed does not work out and the "willing to wait" variant you
proposed garners rough consensus, but not more. Would someone else be
able to propose wording that passes muster?

> 
> 
> I have reviewed the  DEP 17 draft at
> https://subdivi.de/~helmut/dep17.html
> 
> 
> Helmut asked for consensus on   the problems and mitigations or at least
> I think he did.
> I think we don't need that.
> I think we need consensus on decisions and confirmation that everyone
> feels heard.

Heh. I see how you get there. I agree with that latter part and tried
to use the agreement on problems and mitigations as a vehicle for
ensuring that everyone feels heard. Evidently, that does not work out
either.

In any case, the rough consensus on moving forward without major changes
to dpkg (for reasons we disagree about) paves the road for selecting a
subset of mitigations and proposing that as decision. The major missing
piece to perform this selection is an agreement on how we protect
aliasing symlinks from deletion (aka P9), because we still have quite
strong disagreement on the mitigations selected there.

> WRT the problems, I confirm that the list of problems does (in my
> reading) accurately describe the problems people have brought up.

Thank you!

> I don't think we have (or should try to get) a consensus on which
> problems need to be fixed except in so far as that affects our consensus
> on a proposal.

I was trying to imply that we need to address (more or less) all of
these nine problems. I say address rather than fix, because we may
choose to only fix them in certain environments and skip others (e.g.
derivatives, addon repositories, backports, skip upgrades etc.).

> I will admit that even though I've followed the discussion fairly
> closely, I don't have a good feeling about the mitigations.
> 
> I think that once a reasonable set of the mitigations have been applied,
> we'll be in a reasonably good place.
> 
> My concern is about upgrades and about unstable.
> I would like to see a set of instructions that I could follow for moving
> files in my packages in the data.tar to their canonical locations.

To me, that set of instructions is a later step, because those
instructions strongly depend on the selection of the mitigations and the
selection varies wildly with our disagreement of symlink protection. If
I were to present instructions now (one way or another), people would
rightfully disagree (different ones depending on the selection).

> I'd like instructions that clearly allowed me to reas

Re: Replaces without Breaks or Conflicts harmful?

2023-07-06 Thread Helmut Grohne
Hi Thorsten,

On Thu, Jul 06, 2023 at 05:26:43PM +, Thorsten Glaser wrote:
> Helmut Grohne dixit:
> 
> >   openjdk-8 (U)
> 
> Should be convered by the Depends lines in the respective
> binary packages, e.g:
> 
> Depends: openjdk-8-jre (>= ${source:Version}),
>   openjdk-8-jdk (>= ${binary:Version}),
>   ${misc:Depends}
> Replaces: openjdk-8-jdk (<< 8u20~b26-1~)

Yes, this is the kind of false positive I was expecting.

> >   rng-tools-debian
> 
> Also false positive:
> 
> Replaces: intel-rng-tools, rng-tools
> Breaks: rng-tools (>= 5migratf), rng-tools (<< 5migrate)
> Conflicts: intel-rng-tools

This is *not* a false positive, but a real issue. It replaces any
rng-tools, but breaks only a subset. This would have to be fixed to
either drop the version constraint from Breaks (probably wrong) or add
it to Replaces. Can you handle that?
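A sketch of the fix meant here, mirroring the version constraints from Breaks in Replaces so that the package only replaces the versions it also breaks (version strings copied verbatim from the quoted stanza):

```
Replaces: intel-rng-tools, rng-tools (>= 5migratf), rng-tools (<< 5migrate)
Breaks: rng-tools (>= 5migratf), rng-tools (<< 5migrate)
Conflicts: intel-rng-tools
```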

Helmut



Re: Second take at DEP17 - consensus call on /usr-merge matters

2023-07-02 Thread Helmut Grohne
Hi Simon,

On Fri, Jun 30, 2023 at 08:06:15PM +0900, Simon Richter wrote:
> I think "backports" are missing as a user story.

I fully agree. What a serious omission. As a first step, I have updated
DEP17 to indicate which mitigations happen to work when being
backported. For instance, changing Replaces to Conflicts is something
that happens to just work for backports. Diverting dpkg-divert less so.

So in effect, the most likely outcome is that backports become very
fragile, because they essentially have to undo all the moves performed
in unstable. What we can do with relative ease is detect the problems
after they have been introduced.

I note that the quality of backports already leaves something to be desired. For
instance, kodi-addons-dev takes over /usr/bin/dh_kodiaddon_depends in
bullseye-backports from kodi-addons-dev-common in bullseye without
conflicts nor replaces. Likewise, nfs-ganesha-ceph takes over
/usr/lib/ganesha/libganesha_rados_*.so in bullseye-backports from
nfs-ganesha in bullseye.
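Such takeovers can in principle be spotted mechanically by joining the Contents indices of the two suites and flagging paths whose owning package changed. A minimal self-contained sketch (the sample file contents below are reduced to the kodi example; real Contents files are much larger and need sorting first):

```shell
# Two tiny Contents-style files: "path package", sorted by path.
cat > contents-bullseye <<'EOF'
usr/bin/dh_kodiaddon_depends kodi-addons-dev-common
usr/bin/other unchanged-pkg
EOF
cat > contents-backports <<'EOF'
usr/bin/dh_kodiaddon_depends kodi-addons-dev
usr/bin/other unchanged-pkg
EOF
# Join on the path and print entries whose owner differs between suites.
join contents-bullseye contents-backports |
    awk '$2 != $3 { print $1, $2 "->" $3 }'
```

Any line printed is a candidate for a missing Breaks/Replaces pair.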

Is making backports more fragile a reasonable trade-off for moving
forward?

> Most packages should be harmless, and the Contents file for
> bullseye-backports doesn't have too much in any of the affected directories.

Yeah, the number of affected cases should be relatively low, but when
things go wrong, it's bad. Is that ok if we can automatically detect it
(after it happened)?

> but the list of packages installing files into /lib is longer and includes
> all the kernel backports, so I guess that is another potential source of
> problems.

Without having looked, I'd expect the majority of practically affected
files below /lib to be systemd units and udev rules. These should not be
moved in backports, and as long as debhelper is being used, that might
just work.

> There might be an easy solution here, I have not investigated this very
> deeply because it is a workday and 11 hours out of every day are already
> spoken for.

I don't think there is an easy solution, but maybe it happens to work by
chance 90% (number made up) of the time and we shall be able to detect
the remaining 10%.

> > Stating a goal has been quite difficult, but I think that most of the
> > participants agree that we want to eliminate the file move moratorium
> > without getting problems in return.
> 
> I'd even widen that to "no more special handling needed in any place for the
> transition", with the moratorium being an example of that.

What other kind of special handling do you have in mind? I probably
agree.

> > When we get into mitigations, consensus is hard to come by. My
> > impression is that we have roughly two camps. One is in favour of major
> > changes to dpkg and the other is not.
> 
> It's difficult to summarize the situation in one sentence, because neither
> group is really objecting to dpkg changes, so I'd put the fault line at
> whether the transition should be facilitated through dpkg or not.

It's not clear to me whether you'd consider M3 (a minor and revertible
mitigation in dpkg) to be covered by this or not.

> >   * Even if dpkg were fixed in trixie, trixie's libc6 would still be
> > unpacked using bookworm's dpkg. At least for this package, we cannot
> > rely on dpkg changes in trixie. Therefore we need workarounds or
> > spread the transition to forky. For other packages, even a
> > Pre-Depends on dpkg is not a guarantee that a changed dpkg will
> > perform the unpack.
> >   * Changes to dpkg will not fix bootstrapping.
> 
> The dpkg changes will fix bootstrapping, but we can't finish the transition
> until forky this way, because we need to be able to bootstrap with a libc6
> package that can be installed from bookworm.

Can you elaborate why? As far as I can see, debootstrap may perform the
initial unpack without help from dpkg. Then we invoke the unpacked dpkg
to configure the essential packages. If dpkg plays any role in setting
up the aliasing symbolic links, their creation comes too late for
running dpkg itself. Maybe I'm missing something?

> We need to be careful here to not conflate the goal of the transition with
> the method for reaching it. We have consensus on the goal (basically,
> data.tar and filesystem matching is the definition of "done" I use for this
> transition).

I agree that we need to be careful about that conflation, because I
think you are conflating them. I disagree that the goal is having
data.tar and the filesystem match up. You earlier argued that we are
done when special handling is no longer necessary, and I see that as the
goal. Having all the files moved is one method to get there. We also
seem to have rough consensus that this is the preferred method.
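Whether a package's data.tar still ships files in aliased locations is something one can check mechanically. A standalone sketch (the package contents are made up for the demo) that flags members under aliased top-level directories:

```shell
set -e
# Build a fake data.tar with one file in an aliased and one in a
# canonical location (contents are invented for this demo).
mkdir -p demo/bin demo/usr/bin
echo '#!/bin/sh' > demo/bin/sh
echo '#!/bin/sh' > demo/usr/bin/env
tar -cf data.tar -C demo .
# Flag members under aliased top-level directories (bin, sbin, lib*).
tar -tf data.tar | grep -E '^\./(bin|sbin|lib[^/]*)/.'
```

Only ./bin/sh is reported; the file under ./usr/bin passes.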

> We do not have consensus on the technical implementation because there are
> people who believe the technical implementation proposed is not actually
> feasible. In my opinion, it is a 95% solution, which is very tempting but we
> need the remaining 5% as well. To a large extent, we are having this
> discussion because the usrmerge package l

Re: Second take at DEP17 - consensus call on /usr-merge matters

2023-06-29 Thread Helmut Grohne
Hi Russ,

On Thu, Jun 29, 2023 at 11:51:57AM -0700, Russ Allbery wrote:
> I think I fall into the category that Sam is thinking of.  I don't agree
> that aliasing support in dpkg is useful only for this transition.  I think
> there are other cases where two directories could end up being aliases for
> the same directory.  However, I have been convinced that changing dpkg to
> properly support this will take longer than I'm willing to wait given the
> problems that the /usr-merge transition is causing right now, and
> therefore I agree with a consensus that we shouldn't wait for dpkg
> aliasing support (even though I disagree, I think, about the long-term
> usefulness of that support).

To me, this leaves more question marks than earlier. What applications
of aliasing do you envision that would benefit here? Anything concrete?

Why did you see waiting for a dpkg patch as a reasonable approach
earlier? I think we have roughly three categories of dpkg changes on the
table:
 * Minimal hacks that would paper over some of the effects without
   causing a generally applicable aliasing support
 * A mechanism where dpkg is being told about aliasing symlinks
   explicitly and thus would work with them
 * An extension of the dpkg filesystem database wherein it would be able
   to determine the filetype of every filesystem object. In particular,
   this would allow it to tell directories apart from symlinks and
   handle the overlap in a meaningful way.
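The third category can be sketched as a toy model (this is my own illustration, not dpkg's actual database format): the database records the filesystem type of every tracked object, so directories and aliasing symlinks can be told apart.

```python
from enum import Enum

# Hypothetical per-path type records; dpkg's real database layout differs.
class FType(Enum):
    REGULAR = "regular"
    DIRECTORY = "directory"
    SYMLINK = "symlink"

filedb = {
    "/usr/bin": FType.DIRECTORY,
    "/usr/bin/sh": FType.REGULAR,
    "/bin": FType.SYMLINK,  # aliasing symlink pointing to usr/bin
}

def safe_to_rmdir(path: str) -> bool:
    # With type information, dpkg could refuse to treat an aliasing
    # symlink like an empty directory that may simply be removed.
    return filedb.get(path) == FType.DIRECTORY

print(safe_to_rmdir("/bin"))      # False: the symlink must be preserved
print(safe_to_rmdir("/usr/bin"))  # True: a real directory
```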

Which of these do you consider "aliasing support"? I expect that you
would not consider the first category. The third category is quite
involved. Since it changes the internal layout of dpkg's database, a
prerequisite for doing this is removing direct accesses to the database
by other tools. Guillem has been doing that work and you can find more
information about it at
https://wiki.debian.org/Teams/Dpkg/Spec/MetadataTracking. Note that this
work goes beyond what is needed for aliasing as it also considers
extending the metadata format in binary packages to be able to represent
e.g. extended attributes. Did you expect Guillem to implement this
faster? The second category seems fairly narrow and tailored to the
/usr-merge application. Do you consider it to support aliasing in the
generic way that you had in mind? Since Guillem is working on the third
category, whom did you imagine to be working on the second one?

> I am very disappointed that we have not successfully added aliasing
> support to dpkg, which to me is the obviously correct way of addressing
> this problem architecturally.  To me, this says sad and depressing things
> about the state of the project that we have not been able to do this.
> What Sam is trying to say, I think, is that a different phrasing offers a
> way to avoid that discussion and agree to disagree on the best
> architecture in the abstract by pointing out an additional constraint: how
> long we're willing to wait and how uncomfortable we're willing to make
> things for ourselves to try to force the dpkg changes to appear.

Please bear in mind that a general form of aliasing support entails all
sorts of crazy corner cases. In the earlier thread, Simon Richter
highlighted some of them. We'd need to come up with reasonable semantics
for the case where a package wants to install a symbolic link to a
location that is already occupied with a directory for instance. We also
need to figure out what happens when we remove that package. What
happens when the symbolic link is diverted? What happens if the target
of an aliasing symlink itself is an aliasing symlink? In assuming that
files have one and only one canonical location, we are saved from this
complexity on multiple levels. Why do you see adding such complexity as
obviously correct?
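To make the baseline concrete, here is a toy model (my own sketch, not dpkg code) of what aliasing does to filename identity; the corner cases above are precisely where such a static table stops being sufficient:

```python
# Fixed table of aliased prefixes, as introduced by the /usr-merge
# symlinks (illustrative subset).
ALIASES = {"/bin": "/usr/bin", "/sbin": "/usr/sbin", "/lib": "/usr/lib"}

def canonicalize(path: str) -> str:
    """Rewrite an aliased prefix to its canonical /usr location."""
    for alias, target in ALIASES.items():
        if path == alias or path.startswith(alias + "/"):
            return target + path[len(alias):]
    return path

# Two distinct dpkg-visible filenames denote the same file on disk,
# violating dpkg's one-name-one-file assumption:
assert canonicalize("/bin/sh") == canonicalize("/usr/bin/sh")
```

Diverted symlinks, symlinks pointing at further symlinks, or a symlink colliding with a shipped directory are all cases where no single static rewrite of this kind yields well-defined semantics.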

The more I read your and Sam's mails, the more I have the impression
that I miss some important aspect.

Helmut



Re: Second take at DEP17 - consensus call on /usr-merge matters

2023-06-29 Thread Helmut Grohne
Hi Sam,

On Wed, Jun 28, 2023 at 02:55:28PM -0600, Sam Hartman wrote:
> I have read the mail, not the full updated DEP, so I cannot yet ack
> this.

Hmm. Do you intend to do that? If you are short on time, I think the
problem section is more important than the mitigation section.

> Helmut> When we get into mitigations, consensus is hard to come
> Helmut> by. My impression is that we have roughly two camps. One is
> Helmut> in favour of major changes to dpkg and the other is not.
> 
> I think you might get a more clear consensus if you phrase that in terms
> of whether people are willing to wait for major changes in dpkg.
> If the dpkg maintainer were to merge aliasing support, I haven't seen
> anyone objecting strong enough to try and override that maintainer
> action for example.

I think this is a misrepresentation. There is no readily mergeable patch
for aliasing support. The most complete one is the tree from Simon
Richter, and he still considers that work incomplete. At this time, it
is not really a question of merging or not.

From my point of view, the question really is whether we want to
permanently extend dpkg's features in a way that we only really see
useful for one transition while still having to maintain it for
backwards compatibility.

In any case, your reply demonstrates why it is so difficult to get to a
good consensus item on this.

> Ah, this was really helpful, because it allowed me to understand that at
> least you and I haven't even been thinking about the problem space the
> same.

Cool. I've also learned something about how debootstrap works
differently than I originally thought from Ansgar. Both of these are
indications that this discussion is constructive.

> Let's explore whether debootstrapping tools need to have release
> knowledge at all.

Yes, that's an important question. I implied it in #3a vs #3b/#3c.

> Support for creating a merged /usr installation was first added in
> debootstrap 1.0.83 in September of 2016.
> It was enabled by default in  1.0.85 in October of 2016.
> So, everything since stretch has supported merged /usr.

In essence, you are proposing to permanently make the setup of symbolic
links part of the bootstrap protocol. Given that installing usrmerge has
even worked on older releases, we could simply treat them as
retroactively merged.

> 1) Do we actually still need to support boostrapping things older than
> stretch at all using modern bootstrap tools?

A customer of mine (the one that makes me work on /usr-merge :) still
works with jessie and developing updates for jessie still tends to
happen on unstable systems. So I argue that around 10 DDs probably still
have a need for doing this.

> If you're really installing something that old, can't you use a
> container image to use an older bootstrapping tool?

So after having answered the question of whether some people still need
this, I think installing an older debootstrap would be far more annoying
than the workaround you propose:

> 2) Even if we do, I think it's okay to say that you need to specify
> --no-merged-usr when installing something older than stretch, just as
> you need to specify that if you want a buster, stretch, or bullseye
> version that is not merged /usr.

While that workaround seems simple enough, plugging an option into
debootstrap can be difficult at times. I have also been working on
setting up autopkgtests and learned that debci wraps the debootstrap
call in quite a few layers. Stuffing --no-merged-usr through those
layers is a non-trivial effort. I know this because I recently sent
multiple
MRs for debci to get --keyring through those layers.

Quite obviously, I am in a really special situation of actually working
with jessie and working on autopkgtests for jessie. No sane person would
do that and I cannot expect that Debian goes through extra hoops just
for me. That said, it's not like your strategy is without cost. It just
happens elsewhere.

> So my proposal is to modify the bootstrap protocol, and unless an
> administrator specifically requests a non merged /usr system, then merge
> /usr.

This sounds as if we'd just have to patch mmdebstrap and cdebootstrap
(and remove multistrap) while keeping debootstrap the way it is. I think
that we will have to touch debootstrap in any case. If you specify
--variant=buildd, you get an unmerged chroot even when you do it on
trixie or unstable. This already surprised some users and probably needs
changing. So debootstrap is the thing that definitely needs an update
(even in stable) while the others may not need an update if we end up
picking #3c.

> My initial analysis is that you're making this more complicated than it
> needs to be.

I disagree with both my personal and my Freexian collaborator hats on,
but I see how you get there, and my use cases are anything but
representative. I find it plausible that we'd get a majority for the
opinion that having to specify --no-merged-usr for old releases is an
acceptable workaround.

> My ass

Re: Second take at DEP17 - consensus call on /usr-merge matters

2023-06-29 Thread Helmut Grohne
Hi Luca,

On Thu, Jun 29, 2023 at 12:49:16PM +0100, Luca Boccassi wrote:
> Essentially, this boils down to risks vs benefits. The risk of going
> 3c is that every single Debian installation in existence breaks in
> some interesting ways, as fixing the bootstrapping corner case is
> delegated to the package upgrade workflow. The sole benefit is that
> one of the two bootstrapping tools in widespread use keeps its
> internal code a bit 'cleaner' from the point of view of some
> technically unnecessary and self-imposed design constraints (yes
> there's 2 more tools as pointed out yesterday, but they appear to be
> at least under-maintained judging from tracker.d.o).

I'm not sure you understand what 3c is about. I think it is safe to say
that you are in favour of moving all of the files to their canonical
location (i.e. from / to /usr). This is half of the picture for 3c. The
other half is shipping the symbolic links in base-files rather than
having them created in some way not tracked by dpkg. If you plug these
two together, you have made /usr-merged bootstrapping work without
having changed the protocol.

So what is the risk involved here? I think there are three main risk
categories at play:

1. The risk of effects from moving files from / to /usr. This is a risk
   that you clearly see as worth taking regardless of the bootstrap
   case.

2. The risk of effects from shipping the symbolic links in base-files. I
   see that you'd rather not do this, but not shipping them in any
   package poses a deletion risk of its own, so shipping them
   effectively is a risk mitigation and is what allows us to drop
   protective diversions eventually. It still risks breaking
   debootstrap's behind-the-back approach of merging, so we'll likely
   have to do a stable upload of debootstrap.

3. The risk of unstable becoming temporarily non-bootstrappable. This is
   where I see the main fragility of the approach. As is evident from
   your next paragraph, you don't really care about this either.

Given this, it seems rather evident that you have a different risk in
mind that I do not see at this time. Can you elaborate?

Then, software maintainers tend to say "no" when a feature poses a
non-trivial cost to permanent maintenance. We see this all the time and
you shrug it off, because it's not your package. However, when people do
the reverse (e.g. diverting systemd's units poses a non-trivial
maintenance cost to systemd), you take it for granted that you can
unilaterally say "no". Why is it ok for you to say "no", but not for
other maintainers to say "no"?

> I don't see how it's worth the risk. This is essentially a problem in
> the bootstrapping tools, so solving it in the bootstrapping tools is
> not only the safest approach - worst case scenario, creating a new sid
> chroot might not work for a couple days, not a big deal given it
> happens all the time for various reasons as we've seen this week -
> it's also the right approach.

You seem to have missed Johannes' reasoning entirely. He sees Debian as a
component-based system. He argues that this is not a problem in the
bootstrapping tools, but a problem in the components being bootstrapped.
In effect, the usrmerge binary package currently implements it in a
component-oriented way. Since it is a problem with that component,
solving it there makes the most sense, no? That alone makes it obvious
that this is not a problem limited to bootstrapping. We have now duplicated
this mechanism to usrmerge and debootstrap and you are proposing to
duplicate it again. I argue that a maintainable implementation should
centralize this aspect into (preferably only) one component.

Helmut



Re: Replaces without Breaks or Conflicts harmful? [was: Re: Policy consensus on transition when removing initscripts.]

2023-06-29 Thread Helmut Grohne
Hi Bas,

On Thu, Jun 29, 2023 at 08:19:51AM +0200, Sebastiaan Couwenberg wrote:
> On 6/28/23 21:49, Helmut Grohne wrote:
> > Debian GIS Project 
> > postgis
> > qgis
> 
> Why is postgis on this list?
> 
>  $ grep -c Replaces debian/control*
>  debian/control:0
>  debian/control.in:0

Thanks for asking. You identified another source of false positives that
slipped my mind when doing the analysis. The underlying data source did
not use unstable, but every suite from bullseye to experimental
including -security and -backports. As it happens, bookworm's
postgresql-15-postgis-3-scripts has versioned Replaces that are not
matched with Breaks or Conflicts. I don't think we are going to fix that
in bookworm and you've fixed it in unstable. So yeah, this list has more
false positives than originally assumed.

I could improve the numbers, but to me, the numbers I've given seem good
enough as a tight upper bound, and lintian.debian.org will give us
precise and current numbers once my patch is merged. Does that seem
sensible to you as well?

Helmut



Second take at DEP17 - consensus call on /usr-merge matters

2023-06-28 Thread Helmut Grohne
ies. Please indicate whether you want to stay anonymous in this
case.

I also hope that this mail results in detailed disagreements that I can
use to refine DEP17 and to base further research on.

Helmut

dep17.mdwn follows:

[[!meta title="DEP-17: Improve situation around aliasing effects from 
`/usr`-merge"]]

Title: Improve situation around aliasing effects from `/usr`-merge
DEP: 17
State: DRAFT
Date: 2023-03-22
Drivers: Helmut Grohne 
URL: https://dep.debian.org/deps/dep17
Source: 
https://salsa.debian.org/dep-team/deps/-/blob/master/web/deps/dep17.wdwn
License: CC-BY-4.0
Abstract:
 This document summarizes the problems arising from our current `/usr`-merge
 deployment strategy. It goes on to analyze the proposed solutions and
 their effects, and finally proposes a strategy to be implemented.

Introduction


Debian has [chosen](https://lists.debian.org/8311745.knc49ya...@odyx.org) to 
implement merged `/usr` by introducing symbolic links such as `/bin` pointing 
to `usr/bin`.
In the presence of such links, two distinct filenames may refer to the same 
file on disk.
We say that a filename aliases another when this happens.
The filename that contains a symlink is called the aliased location and the 
filename that does not is called the canonical location.

At its core, `dpkg` assumes that every filename uniquely refers to a file on 
disk.
This assumption is violated when aliasing happens.
As a result, we exercise undefined behavior in `dpkg`.
This is known to cause problems such as unexpected file loss and is currently 
mitigated by a [file move 
moratorium](https://lists.debian.org/debian-devel/2021/10/msg00190.html).

We currently prohibit most situations that may provoke problematic behavior 
using policy.
This mitigation is not without cost and we want to eliminate it.
Shipping files in their canonical locations tends to simplify packaging.
Once files are moved to their canonical locations, a number of aliasing 
problems are effectively mitigated.
The goal of this work is to reduce the impact of these matters to the typical 
package maintainer.
It aims to remove the cognitive load of having to keep in mind which files 
must be installed to aliased locations and which to canonical locations.

Regardless of what strategy we end up choosing here, we will likely keep some 
of the temporary changes even in the `forky` release to accommodate external 
repositories and derivatives.

Problems


P1: File loss during canonicalized file move


When moving a file from its aliased location to a canonical location in the 
`data.tar` of a binary package and moving this file from one binary package to 
another, `dpkg` may unexpectedly delete the file in question during an upgrade 
procedure.
If the replacing package is unpacked first, the affected file is installed in 
its canonical location before the replacing package is upgraded or removed.
`dpkg` may then delete the affected file by removing the aliased location - not 
realizing that it is deleting a file that still is needed.

This problem was originally observed in 
[#974552](https://lists.debian.org/974552) and is the one that motivated the 
issuance of the moratorium.
Since the moratorium came into effect and file moves have been prevented, no 
new cases surfaced.
Had the moratorium been lifted for the bookworm release, we know that problems 
would have been caused in a small two-digit number of cases.
[For instance](https://lists.debian.org/20230426223406.gb1695...@subdivi.de), 
`/lib/systemd/system/dbus.socket` could have been canonicalized while it has 
been moved from `dbus` to `dbus-system-bus-common`.
There is an [artificial test `case1.sh` demonstrating the 
problem](https://lists.debian.org/20230425190728.ga1471...@subdivi.de).

P2: Missing triggers


When packages declare a `dpkg` file trigger interest in a location that is 
subject to aliasing without also declaring interest in the other location, a 
trigger may not be invoked even though that was expected behavior.
No issue arises when a file trigger is declared on a canonical location and all 
packages are shipping their files in that canonical location.
However, when the trigger is declared for an aliased location and packages move 
their files to the canonical location, triggers can be missed.

This problem is also currently being prevented by the moratorium.
Had the moratorium been lifted for the bookworm release, we know that problems 
would have been caused in two cases.
The `runit` and `udev` packages declare an interest to aliased locations and 
would start missing trigger invocations when canonicalizing files in other 
packages.
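A mitigation available to affected packages (paths are illustrative; the real interest lines depend on the package) is to declare interest in both spellings of the location in `debian/triggers`:

```
interest-noawait /lib/systemd/system
interest-noawait /usr/lib/systemd/system
```

Once all packages ship files only in canonical locations, the aliased interest line can be dropped again.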

P3: Ineffective diversions
--

When a package uses `dpkg-divert` to displace a file from another package, the 
diverted location may have become aliased due to the 

Re: booststrapping /usr-merged systems

2023-06-10 Thread Helmut Grohne
Hi Sven,

On Sat, Jun 10, 2023 at 08:35:44AM +0200, Sven Joachim wrote:
> > Unfortunately, any
> > external package that still ships stuff in /bin breaks this. In effect,
> > any addon repository or old package can break your system.
> 
> You lost me.  We have converted /bin to a symlink already, have many
> packages that ship files there and yet our systems do not break.  Could
> you please elaborate?

I'm sorry. I see how I am mixing up use cases all the time. What is
broken here is smooth upgrades (or package removal). Let me add detail.

dpkg knows two kinds of filesystem resources: owned objects and shared
objects. A regular file usually is owned by one and only one package. A
directory is often shared between multiple packages. A regular file can
also be shared between multiple (Multi-Arch: same) instances of the same
package. Whenever dpkg removes a shared object from a package (due to
upgrading or removing that package), it checks whether the shared object
is now unreferenced. If so, it actually deletes it from the filesystem.
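This shared-object bookkeeping can be sketched as a toy refcount model (my own, heavily simplified illustration, not dpkg internals; the package names and file lists are made up):

```python
# Map of installed packages to the filesystem objects they ship.
db = {
    "base-files": {"/bin"},
    "coreutils": {"/bin", "/bin/cat"},
}

def remove_package(name: str, deleted: set) -> None:
    """Drop a package; delete objects no remaining package references."""
    for obj in db.pop(name):
        if not any(obj in files for files in db.values()):
            deleted.add(obj)

gone = set()
remove_package("coreutils", gone)
print(sorted(gone))  # /bin survives: base-files still ships it
remove_package("base-files", gone)
print(sorted(gone))  # last reference dropped -> /bin is deleted too
```

This is exactly the mechanism that deletes the /bin symlink once no installed package's data.tar lists /bin anymore.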

So we kinda need to distinguish the actual filesystem view from the dpkg
database view in this discussion. While the filesystem can now (since
bookworm) be assumed to always have the symlinks, dpkg has a (shared)
object there. It doesn't track the type yet (though Guillem is
working[1] on that).

Now we imagine a situation where we managed to get past this transition
somehow and the end state is that no package in trixie ships /bin other
than base-files, which ships it as a symlink. Or maybe we finished the
transition by having no package ship /bin and we modified the bootstrap
protocol to create the symlinks in another way. There are two use cases
that are at risk now:

 * You have some old bookworm package around that still ships a file in
   /bin. You no longer need this package and remove it. Since this was
   the last package (on your system) to contain /bin (in data.tar), dpkg
   observes that /bin can go away and deletes your symlink. Boom.

 * You have some external repository that contains a package which still
   ships something in /bin. At some point the vendor got the message
   about moving files and moves them to /usr/bin and this - again - is
   when your /bin symlink vanishes during the package upgrade.

So at this time, I think we basically have three ways of dealing with
this:

 1. Add a protective diversion for the affected locations (and keep that
until forky at least).
 2. Ship the affected symlinks as directories in some essential package
until we are sure that no package ships these directories (even in
external repositories).
 3. Modify dpkg in some way to handle this case.

I hope this made things more clear. Also note that this mail is purely
concerned with dpkg package operations and entirely ignores the
bootstrap use case.

My takeaway here is that while I see the protective diversion as the
"obviously superior solution", this clearly is not consensus at this
time. It also means that when rewriting DEP 17, I need to spend quite a
bit of text on rationale. Thank you.

Helmut

[1] https://wiki.debian.org/Teams/Dpkg/Spec/MetadataTracking



Re: booststrapping /usr-merged systems (was: Re: DEP 17: Improve support for directory aliasing in dpkg)

2023-06-09 Thread Helmut Grohne
Hi,

On Fri, Jun 09, 2023 at 09:57:21PM +0200, HW42 wrote:
> Did you consider just having one package keep one dummy file in /bin?
> While this isn't elegant it sounds much less complex than diversions and
> tricky pre-depend loops, etc.

The dummy file is not necessary. Debian packages can ship empty
directories. Having any package ship /bin (empty or not) is fully
sufficient to prevent dpkg from removing it.

> I might be very well missing something here (for example maybe it's
> really essential that no files remain in /bin, even not a dummy file).
> But in the other branch of this thread you welcomed "dumb" questions, so
> here you go ;]

Yeah, I consider the property that nothing ships anything in aliased
locations an important one. So let us walk through the consequences of
not doing that.

So some package will keep shipping /bin. It doesn't really matter which,
but clearly this package must be part of the essential set (otherwise
you could remove it and with it /bin would be deleted). This is cool for
upgrades, but less so for bootstrapping tools.

One of the approaches to making bootstrapping work was adding the
symlinks to some data.tar. That has been category 2 from my earlier
mail. We definitely cannot add /bin as a directory to one package and
/bin as a symlink to another (unless using diversions), because the
resulting behaviour is dependent on the unpack order when used with
dpkg. Also any bootstrap tool that unpacks with tar -k (such as
debootstrap) requires changes to support this. So this pretty much
precludes completing the transition in a way that just unpacking all
data.tar of essential packages gives you a working chroot. In effect,
this requires a proposal to change the bootstrap protocol (category 4)
in order to make sense.

There is a loophole that I ignored here. While /bin cannot be both a
directory and a symlink at the same time, we can upgrade it. So if we
somehow managed to get one and only one package to contain /bin as a
directory, we could upgrade that to a symlink. Unfortunately, any
external package that still ships stuff in /bin breaks this. In effect,
any addon repository or old package can break your system.

The other way of seeing us keep /bin as a directory is to not
canonicalize (i.e. category 1). Then we'd simply keep (wlog) /bin/sh in
/bin and not move it to /usr.

Can you elaborate in what way you see protective diversions as adding
complexity? It can be as simple as:

dpkg-divert --add /bin --divert someplacewedontcare --package base-files --no-rename

We'd add this to one package and everyone else can issue Pre-Depends.
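As a minimal sketch (the diversion target and the choice of base-files are assumptions, not an agreed implementation), the owning package's preinst might register the diversion like this:

```shell
#!/bin/sh
# Hypothetical preinst fragment for the package owning the protective
# diversion; names here are illustrative, not the real base-files code.
set -e

register_diversion() {
    # --no-rename: /bin stays in place; the diversion merely shields it
    # from being taken over or deleted via other packages' file lists.
    dpkg-divert --add --package base-files --no-rename \
        --divert /bin.usr-is-merged /bin
}

case "${1:-}" in
    install|upgrade) register_diversion ;;
esac
```

The matching postrm would remove the diversion again once it is no longer needed.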

The benefit we gain from keeping /bin is not clear to me (beyond
avoiding a diversion). At this time, it seems to me that doing that
either requires changing all bootstrapping tools (in yet unspecified
ways) or never canonicalizing all paths (according to the dpkg
database).

Now I'm wondering what I am missing here.

Helmut



Re: booststrapping /usr-merged systems (was: Re: DEP 17: Improve support for directory aliasing in dpkg)

2023-06-09 Thread Helmut Grohne
Hi Richard,

On Fri, Jun 09, 2023 at 02:42:25PM -0500, Richard Laager wrote:
> Is the broader context here that this is an alternative to teaching dpkg
> about aliasing? That is, we just arrange the transition correctly such that
> we get out of the aliased situation as part of upgrading to trixie?

Yes, I think the idea that we are mostly exploring now is not teaching
dpkg about aliasing and rather move all the files to their canonical
location such that there no longer is any aliasing that dpkg would have
to deal with.

I don't think this is complete consensus at this time, but the majority
of discussion participants appear to favour this approach.

> Because you want to support non-usr-merged systems, e.g. for derivatives?

dpkg is used in many different contexts. A very simple example of a
non-merged system would be Debian stretch. For another, dpkg really is
being used for things that are not based on Debian. While it is the
Debian package manager, it has uses beyond Debian, and dpkg has (thus
far) stayed away from imposing policy on the filesystem layout.

> They aren't going to want to delete /bin either, so I don't see how a
> special-case preventing deletion of /bin would be problematic.

Indeed. However, if you actually manage to trigger this, it can be very
surprising. Your hard coded list would also contain /lib32, /libx32 and
/libo32. Then you install some mipsen packages, remove them and wonder
why /libo32 does not go away. piuparts is definitely unhappy at this
point. I am quite sure that this behaviour would break something. What I
am not sure about is whether accepting such breakage is a reasonable
trade-off.

> Am I understanding the problem correctly?

I confirm.

> What would happen if, for trixie only, bin:libc6 shipped two identical
> copies of ld-linux-x86-64.so.2, one in each of /lib64 and /usr/lib64?

That's an interesting idea. Do note that we don't actually have to ship
ld-linux in both locations. We can actually move it in a safe way
(unless we also move it between packages, which we don't). So let me
change that to: We keep /lib64 (the directory) in addition to
/usr/lib64. Keeping the directory prevents dpkg from deleting the
symlink (as it doesn't know about the filetype).

> Then at step 2, /lib64 does not get deleted and nothing breaks.

Confirmed (with the simplified variant).

> Later, whatever replaces /lib64 with a symlink needs to deal with this, but
> that's not significantly different than whatever it was going to do anyway,
> right? Just do this:
> 
> 1. Whatever safety checks are appropriate.
> 2. Unless already verified to be identical by #1, hardlink
> /lib64/ld-linux-x86-64.so.2 to /usr/lib64/ld-linux-x86-64.so.2. This might
> be just a particular instance of the more general case of hardlink
> everything from /lib64 into /usr/lib64.
> 3. Unlink everything from /lib64.
> 4. Unlink /lib64.
> 5. Symlink /lib64 to /usr/lib64

I think we start from the premise that /lib64 already is a symlink and
as long as libc6 actually ships /lib64 (even if empty), dpkg won't
delete it. What we will not get here is getting rid of the aliasing and
we will also be unable to ship /lib64 as a symlink in any data.tar
(since that would be a directory vs symlink conflict, which has
unpack-order-dependent behaviour, which is bad).

> However, note that this cannot be a shell script, as then step 3 would
> delete /lib64/ld-linux-x86-64.so.2 and everything after that would fail.

Non-issue since we assume that bookworm is merged already.

> At that point, everything is fine, EXCEPT that dpkg now thinks it has a
> /lib64/ld-linux-x86-64.so.2 file installed, but really that is aliasing
> /usr/lib64/ld-linux-x86-64.so.2. When bin:lib6:amd64 is later upgraded (e.g.
> in forky) to a version that stops shipping /lib64/ld-linux-x86-64.so.2, dpkg
> will unlink /lib64/ld-linux-x86-64.so.2 and then everything breaks.

Simplified: dpkg thinks that it has /lib64 while it should not. When we
drop that in forky, stuff breaks.

> The fix to that is either whatever separate general case fix is being done
> for aliasing, or if the whole point is we are trying to avoid having that
> sort of thing at all, then just put in a special case that dpkg will not
> unlink /lib64/ld-linux-x86-64.so.2.

Yes, if we add that special casing to dpkg, we can remove /lib64 in forky.

> So we end up with something roughly like this in dpkg (please excuse
> syntax/pointer errors):
> 
> Wherever file deletions are handled, make this change:
> - unlink(pathname);
> + special_unlink(pathname);
> 
> to use this:
> 
> const char *SPECIAL_PATHS[] = {
>     "/bin",
>     "/lib",
>     "/lib64",
>     "/lib64/ld-linux-x86-64.so.2",
>     "/sbin",
>     NULL,
> };
> 
> void special_unlink(const char *pathname) {
>     const char **special;
>     for (special = SPECIAL_PATHS; *special; special++) {
>         if (strcmp(pathname, *special) == 0) {
>             return;  /* never unlink protected paths */
>         }
>     }
>     unlink(pathname);
> }

Might work, but the list of SPECIAL_PATHS

Re: bootstrapping /usr-merged systems (was: Re: DEP 17: Improve support for directory aliasing in dpkg)

2023-06-09 Thread Helmut Grohne
Hi Richard,

On Fri, Jun 09, 2023 at 01:07:13PM -0500, Richard Laager wrote:
> On 2023-06-09 11:26, Helmut Grohne wrote:
> > When upgrading (or
> > removing that package), dpkg will attempt to remove /bin (which in its
> > opinion is an empty directory and the last consumer is releasing it).
> > However, since dpkg has no clue about file types, it doesn't actually
> > know that this is a directory and takes care of the /bin -> /usr/bin
> > symlink using unlink(). And this is where /bin vanishes. Oops.
> 
> This might be a dumb question, but could we just special-case this? That is,
> dpkg would simply not remove /bin specifically? If the list of directories
> is small, known, and relatively fixed (e.g. /bin, /usr/bin, /lib), that
> might be workable.

Even if this was a dumb question, it's these kinds of questions that -
surprisingly often - lead to new insights. So thanks for asking.

I caution that this protection mechanism of symlinks is a property of
the installation and not of dpkg. Depending on what dpkg is operating
on, we expect it to handle this or not. So we'd need a way to tell
whether an installation needs this kind of special handling. Anyway,
let's for now just assume that dpkg would magically save those
symlinks when we want to save them.

Now any package that moves files from / to /usr needs to ensure that
the dpkg doing that move is recent. That's a dependency we cannot
express in theory. David Kalnischkies spent some time going into
detail[1] about this aspect. I think the bottom line is that for all
practical purposes we're probably fine if everything that moves also
gains a Pre-Depends on dpkg. Except that dpkg Pre-Depends on libc6,
which would now Pre-Depend on dpkg, and we're back to our Pre-Depends
loop. So now we say "screw it" and let libc6 get a pass without this
Pre-Depends, because so many packages already have a Pre-Dependency on
dpkg, it'll probably get upgraded early and what could possibly go
wrong? On amd64, we'd upgrade libc6 before dpkg and then /lib64 would go
missing, because libc6 already is the last package that ships files in
/lib64. So really, libc6 is one of the few packages that really must
depend on that fixed dpkg. It also is one of the few packages that
really cannot.

As we cannot get out of this loop, we consider stretching the transition
over two releases. For trixie, we just update dpkg without moving files
and then for the trixie -> forky upgrade, we know (since we forbid skip
upgrades) that dpkg is fixed and then it actually works out without this
mess of Pre-Depends on dpkg.

My impression is that we'd like to have this done sooner rather than
later. At this time, the protective diversion seems like a fairly easy
and reliable mitigation with little downsides (except for having a new
transitively essential package) that helps us move forward faster.

Please don't stop asking. The chances that something about this is wrong
or missing something is significantly non-zero. I hope that this kind of
peer-review will get us to a solution that actually works in practice.

Helmut

[1] https://lists.debian.org/20230503130026.ixu4zlymo4fykdru@crossbow



Re: bootstrapping /usr-merged systems (was: Re: DEP 17: Improve support for directory aliasing in dpkg)

2023-06-09 Thread Helmut Grohne
Hi Johannes,

On Fri, Jun 09, 2023 at 05:47:56PM +0200, Johannes Schauer Marin Rodrigues wrote:
> if I understand that plan correctly, the usrmerge-support package setting up
> diversions is only necessary because you want to avoid having to do the move 
> to
> /usr of *all* affected packages in the essential set in a single dinstall? Is
> that correct?

This is not correct. In the bookworm -> trixie upgrade scenario, we
intend to move all the files from / to /usr. Now we look into how this
happens and focus on one particular symlink; without loss of
generality, choose /bin.
Since /bin is no longer in the dpkg database at the end of the upgrade,
some package must be the last one to contain /bin. When upgrading (or
removing that package), dpkg will attempt to remove /bin (which in its
opinion is an empty directory and the last consumer is releasing it).
However, since dpkg has no clue about file types, it doesn't actually
know that this is a directory and takes care of the /bin -> /usr/bin
symlink using unlink(). And this is where /bin vanishes. Oops.
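The mechanism can be reproduced with plain shell in a throwaway
sandbox, no dpkg involved (the paths below are made up for the demo):

```shell
# Throwaway sandbox demonstrating the mechanism: on a merged system,
# "bin" is a symlink, so removing it with unlink() takes out the alias
# itself; the real directory survives, but the alias has vanished.
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir -p usr/bin
ln -s usr/bin bin                  # the aliasing symlink, as on merged /usr

# dpkg believes "bin" is an empty directory it may release ...
rmdir bin 2>/dev/null \
    && echo "rmdir removed bin" \
    || echo "rmdir fails: bin is a symlink, not an empty directory"

# ... so it falls back to unlink(), which removes the symlink itself:
rm bin
[ -e bin ] || echo "the bin alias is gone"
[ -d usr/bin ] && echo "usr/bin (the real directory) survives"
```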

So the idea here is to add a protective diversion for /bin such that
removing /bin instead removes some path we don't care about. The
important thing now is that every package that moves stuff from /bin to
/usr/bin needs to ensure that this diversion exists. We can achieve that
in one of two ways. Either some package (and with that I mean every
package that ships stuff in /bin, because we cannot predict which
package will be last) gains a preinst that sets up this diversion (on
behalf of base-files), or it Pre-Depends on some package that handles
setting up this diversion. It seems rather obvious that we might just
have a versioned "Pre-Depends: base-files (>= version that introduces
the diversion)", but then we get a pre-dependency loop from base-files
via an awk implementation to libc6 and then (via this new Pre-Depends)
back to base-files. So base-files cannot be the package that we list in
Pre-Depends here. And this is where usrmerge-support comes into the
picture. Any package that moves stuff out of one of the aliased
directories gains a Pre-Depends: usrmerge-support to protect the
aliasing symlinks from deletion.
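A sketch of what such a preinst fragment could look like - the
".usr-is-merged" divert-target suffix and the exact list of aliased
directories are assumptions for illustration, not the actual
implementation, and "run" only prints the commands so the sketch can be
read and executed without dpkg being present:

```shell
# Hypothetical preinst fragment: set up protective diversions on behalf
# of base-files for the aliasing symlinks. "run" prints instead of
# executing, so no dpkg is required to inspect what would happen.
run() { printf '+ %s\n' "$*"; }

for d in /bin /sbin /lib /lib64; do
    run dpkg-divert --package base-files --no-rename \
        --divert "$d.usr-is-merged" --add "$d"
done
```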

Please note that this hasn't been obvious to me at all. I totally didn't
see this pre-dependency loop coming until dpkg told me when I actually
tried this.

So this usrmerge-support package very much is not for reducing that set
that we have to upload in one dinstall, but for making smooth upgrades
work at all.

This really is an important detail and I'm sorry for having missed it in
my previous mail. Thanks for asking.

> If yes, how many source packages have to be modified apart from
> base-files, dash, bash, libc6, and util-linux?

Given the above, I think this no longer is relevant.

> Would it be too much to prepare patches for all of these, test that everything
> works with some QA setup and then NMU all 22 source packages with pre-approved
> patches in a single dinstall? Would that avoid having to temporarily go via a
> usrmerge-support package?

I have considered this approach and if it gains us something I
definitely see it as something to consider, but given the above, it
doesn't save us from usrmerge-support.

The other thing that we need usrmerge-support for is the dpkg-divert
wrapper. Any package that contains an aliased diversion or moves a
diverted file from an aliased location will likewise have to gain a
pre-dependency on usrmerge-support. Again, we cannot do this in
usr-is-merged, because that would kick usrmerge out of the default
essential set and thus break mmdebstrap.

Helmut



Re: bootstrapping /usr-merged systems (was: Re: DEP 17: Improve support for directory aliasing in dpkg)

2023-06-09 Thread Helmut Grohne
Hi Raphaël,

On Thu, Jun 08, 2023 at 10:46:24AM +0200, Raphael Hertzog wrote:
> In the same spirit, I'd like to throw an idea... could we decide that
> base-files is the first package to be configured as part of the bootstrap
> protocol and change base-files maintainer's scripts into statically linked
> executables so that they can work even if we don't have the library loader
> on the ABI-compliant path?

Thanks for putting effort into this question. You already figured that
this poses a problem to dpkg calling the maintainer script. Let me add
two observations to further understand the solution space here.

dpkg has a --root flag and can be called externally. This is something
mmdebstrap already uses in some modes. That way, we avoid the issue you
presented for dpkg itself. Unfortunately, we cannot assume presence of
dpkg outside the chroot as debootstrap supports running on non-Debian
systems, so the --root flag doesn't actually help us here.

The other aspect is that maintainer scripts that are not interpreted
break chrootless foreign architecture bootstrap as the
base-files.preinst would be an executable that the processor cannot
execute.

In a vague reply to the other messages as well: I repeatedly got the
feedback that I have not sufficiently exploited the solution space. I
hear you. My lack of replies here shall indicate that I'm not done.

I have a vague sketch that seems to kinda work out for everything, but
maybe it still has some problems that I don't see yet. Let me summarize
it even though this very much is unfinished. Given my earlier
categorization of the solution space, this is a category 2 solution
addressing many of the problems mentioned there.

Update debootstrap (in bookworm and unstable) to create the symbolic
links after unpacking rather than before while still doing it before
running any maintainer scripts. This enables us to ship the symbolic
links in some data.tar while keeping bootstraps of bookworm and earlier
working as before.

Add a new package usrmerge-support (or whatever). It is a bit similar to
multiarch-support: It must not have any dependencies or
pre-dependencies. It will not have files, but maintainer scripts. Those
scripts set up protective diversions on behalf of base-files for the
symbolic links that cause aliasing. Then base-files will issue a
Pre-Depends on usrmerge-support (but not yet ship symlinks). I initially
thought, this could be part of usr-is-merged, but then base-files would
pull that and standard mmdebstrap would no longer pull usrmerge and
break. So it really needs to be a separate package. Anyway, once we have
protective diversions, we can move files without risking that dpkg
deletes the symbolic links.

Then we can actually perform that move of files to their canonical
locations except for a small set of locations including dash, bash,
libc6, and util-linux (maybe not exhaustive). [There is a lot of missing
detail about non-bootstrap aspects here.]

Once all essential packages (but the exceptions) have no files left in
aliased locations, we can upload base-files adding the symlinks together
with the packages previously kept unmodified in one dinstall. Before
that dinstall, things will continue to work normally. The protective
diversions will not affect unpacking, because dpkg only performs exact
matches on diversions. After that dinstall, base-files will create the
symlinks and things will hopefully work (because the patched debootstrap
only creates them after the initial unpack).

This still is a lot of wishful thinking. I've prototyped parts of this,
but not the entire story. I'm pretty sure it'll not work out as written
here, but maybe some adaption of it will unless insurmountable issues
pop up. For instance, debootstrap --variant=buildd (which currently
implies --no-merged-usr) will need a second thought.

You may now tell me why this is utter nonsense and why it cannot work at
all. Thanks.

Helmut



Re: 64-bit time_t transition for 32-bit archs: a proposal

2023-06-08 Thread Helmut Grohne
Hi Steve,

On Tue, Jun 06, 2023 at 12:45:42PM -0700, Steve Langasek wrote:
> I have a different read on the consensus here.  While there has been a lot
> of discussion about whether to continue supporting i386 as a host arch,
> almost everyone participating in the thread who said they want this is not a
> voting member of Debian.  The lone exception that I can recall from the
> thread was Guillem, who, as dpkg maintainer, is certainly a stakeholder in
> this decision (and since we don't really have an "i386 porting team",
> probably the most important individual stakeholder).

I concur. Given Simon's analysis and the replies even when combined with
earlier messages, I now see significantly more voices for the opinion:

i386 primarily exists for running legacy binaries and binary
compatibility with legacy executables is more important than correct
representation of time beyond 2038.

I'm inclined to call this consensus now and therefore ask those that do
not agree with it to reply here - even if your reply is only stating
that you disagree. As such, I think we can skip the GR part unless we
get (5?) disagreeing replies here.

Guillem, I understand that you see things differently, but that now
seems like a minority opinion to me. Are you ok with moving forward with
the proposed consensus-or-GR process? My understanding is that you
disagree with the opinion stated above, correct?

While Gunar also raises the question of whether i386 should continue as
a full or partial architecture, I do not think this influences the
time_t bits decision. The default for now is keeping it as a full
architecture and the time_t migration does not benefit from changing
this. Therefore, I propose restricting the potential GR to the binary
way that Simon presented.

> Since my read is that Guillem was in the "rough" of "rough consensus", I
> asked him directly how we should move forward on a decision.  A GR is one
> option, and I think it's definitely a better option than going through the
> TC: while there is a decision to be made here about a "technical" detail of
> what dpkg-buildflags will do, you're right to point out that it's really a
> decision about what we want to support as a project.

Yes, dragging this question on is - as usual - the worst of options.

> Hmm, I don't share this particular concern.  PIE is a change to compiler
> behavior.  32-bit time_t is a change to defines that modify types (and
> prototypes) used in header files.  Maintaining a compiler is hard,
> maintaining a library ABI is "easy" - glibc has avoided breaking ABI for 25
> years so far.

The similarity is that both are changing flags. I expect that some
packages will need special handling for time64 and some of them may fail
to handle i386 correctly when they only match on bits. If we can get
maintainers to match on the resulting dpkg-buildflags rather than bits,
that's a non-issue probably.
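As a sketch of that recommendation - the stub below stands in for
`dpkg-buildflags --get CPPFLAGS`, and its sample output is an assumption
for the demo - a package could key its special handling on the emitted
flags instead of the architecture's word size:

```shell
# Match on what dpkg-buildflags emits rather than on 32-bit-ness, so an
# architecture like i386 (32-bit, but kept time32) is classified correctly.
# Stub standing in for: dpkg-buildflags --get CPPFLAGS
dpkg_buildflags() { echo "-D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64"; }

case " $(dpkg_buildflags) " in
    *" -D_TIME_BITS=64 "*) echo "building with 64-bit time_t" ;;
    *)                     echo "building with 32-bit time_t" ;;
esac
```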

> I am not keen to try to drive a GR on this, but if you raised one I'm likely
> to second it.

Cool. From my point of view consensus is better than GR is better than
deferring or invoking the CTTE. Hope consensus works out. :)

Helmut



Re: 64-bit time_t transition for 32-bit archs: a proposal

2023-06-06 Thread Helmut Grohne
Hi Steve,

On Tue, May 16, 2023 at 09:04:10PM -0700, Steve Langasek wrote:
> * … but NOT on i386.  Because i386 as an architecture is primarily of
>   interest for running legacy binaries which cannot be rebuilt against a new
>   ABI, changing the ABI on i386 would be counterproductive, as mentioned in
>   https://wiki.debian.org/ReleaseGoals/64bit-time.

I've been reading the discussion around i386 a bit and found the
direction it has taken a little unproductive. I hope we can agree that
there is no consensus on keeping or changing the time ABI for i386 while
there is quite some consensus for your plan on changing the time ABI for
all other 32bit architectures in roughly the way you brought forward.

While the i386 discussion seemed a little unproductive at times, I think
there is one major argument that I feel is missing here. If keeping the
32bit time ABI for i386, that effectively becomes a divergence from
every other architecture. i386 will be the one and only architecture to
be time32. As it happens, I have some experience with such divergence
from how bootstrapping interacted with other transitions such as PIE.
Maintaining this kind of divergence has a non-trivial cost. Over time it
becomes more and more difficult and less and less people are interested
in doing it. As such, I see the addition of this kind of divergence as a
way of killing i386.

Judging from the conversation, killing i386 quite obviously is desired
by some participants, but evidently not by all. How quickly we want to
kill it is not obvious to me. However, I think it is fair to say that
keeping time32 on i386 will kill it rather sooner than later. With
time32, we cannot reasonably extend i386 beyond forky as we'd be running
too close to the final deadline.

Some of you may have been aware of that Debian Reunion in Hamburg
recently. There was a BoF on how Debian should decide about non-trivial
matters and one result of that BoF was "maybe we should GR more often".
I think the decision of what to do with time32 is not a really important
one despite some people being very opinionated about it. How about
settling it using a GR anyway? We perceive GRs as painful and there is a
saying that if something is difficult, let's do it more often. How about
trying to do GRs more often with this decision? I think it is pretty
clear that neither answer is wrong. It's a choice that we have to make
and then stick to. And we can learn something about whether GRs really
are painful. I think the worst of outcomes we could get here is going
into much further detail in a GR and adding lots of competing proposals
there. If that were to happen, I'd consider the experiment as failed.
Leaving the details to those who put up with the work (and that quite
obviously is Steve et al here) is important in my book. So unless we can
do it as simple as "i386 should keep being time32" vs "i386 should
become time64 by default", we probably shouldn't GR it.

Helmut



Re: another problem class from /usr-merge [Re: Bug#1036920: systemd: please ship a placeholder in /usr/lib/modules-load.d/]

2023-05-31 Thread Helmut Grohne
Hi,

On Tue, May 30, 2023 at 11:53:00AM +0200, Helmut Grohne wrote:
> In effect, this bug report is an instance of a bug class. I am in the
> process of quantifying its effects, but I do not have useful numbers at
> this time. As an initial gauge, I think it is about 2000 binary packages
> that ship empty directories (which does not imply them to be affected,
> rather this is to be seen as a grossly imprecise upper bound).

I did some more analysis work here and have to admit that I know my data
model has a weakness that may result in false negatives. I'd have to do
a complete reimport of packages and eventually will, so for now I'm
dealing with incomplete data here. I note that content indices do not
cover empty directories, so you really have to download loads of .debs
to find these.
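One way to do that scan, sketched here with a made-up file listing in
place of real `dpkg-deb -c` output (extracting the path column from
actual dpkg-deb output is left out): sort the shipped paths and flag any
directory that is not a prefix of its successor.

```shell
# Given a package's file list (one path per line, directories ending in
# "/"), print the directories that contain no further entries -- the
# candidates for the empty-directory problem.
list_empty_dirs() {
    LC_ALL=C sort | awk '
        { paths[NR] = $0 }
        END {
            for (i = 1; i <= NR; i++) {
                p = paths[i]
                if (p !~ /\/$/) continue                 # only directories
                if (i < NR && index(paths[i + 1], p) == 1) continue  # has contents
                print p
            }
        }'
}

# Sample listing, made up for the demo:
printf '%s\n' ./usr/ ./usr/bin/ ./usr/bin/tool ./usr/lib/modules-load.d/ \
    | list_empty_dirs
# -> ./usr/lib/modules-load.d/
```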

Anyway, to gauge the problem, we're effectively looking for a
combination of packages A and B such that:

 * A ships an empty directory.
 * That empty directory is a path affected by aliasing (either in /usr
   or /).
 * B also ships that directory (e.g. non-empty) in the "other"
   representation of that path.

While we have lots of empty directories in Debian, that third condition
trims down the numbers rapidly. A lot of empty directories (on amd64)
are one of the following:
 * /lib
 * /usr/bin
 * /usr/lib
 * /usr/lib/x86_64-linux-gnu
 * /usr/sbin

I've ignored these, because all of these are shipped in some essential
package and thus are not at risk of removal. /lib is kinda special in
this list as the idea of fixing this up actually is removing /lib (the
directory according to the dpkg database) and replacing it with a link,
but we'll have to treat that special anyway, so not relevant here.

What remains is:
 * /usr/lib/modules-load.d is empty in systemd and aliased by 6
   packages. This is the original instance that Andreas filed. If we
   were not having this moratorium, the obvious fix were to move all
   those 6 files.

 * /usr/lib/pkgconfig is empty in gretl libjte-dev libmpeg3-dev
   libswe-dev pcp pkg-config pkgconf pkgconf-bin and aliased in
   multipath-tools. Again, if it were not for the moratorium, we'd want
   to fix multipath-tools. However, in this instance, we can "bypass"
   the moratorium by moving /lib/pkgconfig/libdmmp.pc to
   /usr/lib/pkgconfig/libdmmp.pc. It also seems to bundle a
   shared library improperly. Chris Hofstaedtler confirmed this on IRC
   and reminded us to never link any of those. The only package in the
   archive that tries to do that (qemu) has its multipath integration
   disabled, so this is not presently a problem. Probably, a better
   solution is to not ship any header nor .pc file in multipath-tools
   at all, as that avoids accidental linking.

 * /usr/lib/systemd/system is empty in amazon-ec2-net-utils and aliased
   in lots of other packages. This probably is a regression caused by
   #1034212 and that directory simply needs to be deleted.

 * /lib/udev/rules.d is empty in python3-expeyes and aliased in lots. I
   think this practically is a non-problem, because python3-expeyes
   Depends: udev and udev ships that directory in that representation.
   It will become a problem once udev canonicalizes paths. Jochen
   Sprickerhof pointed out that python3-expeyes really needs this empty
   directory in its postinst script.

 * /usr/lib32 is empty in lib32lsan0 and aliased in 5 packages. I think it
   can be dropped there. This also bears another problem. Since removing
   lib32lsan0 deletes /usr/lib32, we are left with a dangling /lib32
   link.

 * /usr/libx32 is empty in libx32lsan0 and aliased in libc6-x32. I think
   it can be dropped there. Likewise, /libx32 can become dangling
   otherwise.

So yeah, this bug class is clearly not one to panic about. As we move
files from / to /usr, I expect this bug class to gain more occurrences. I
am not aware of a generic solution and it seems diversions won't cut it.
If you can propose any generic workaround or recipe for this situation,
I'm all ears. The placeholder file sounds ugly, but might work.

I still don't have any data on the multiarch variant of this problem. My
local representation of the archive is unsuitable for analysing this and
I have to perform a complete reimport first. Also placeholder files
won't cut it here.

Helmut



Re: another problem class from /usr-merge [Re: Bug#1036920: systemd: please ship a placeholder in /usr/lib/modules-load.d/]

2023-05-30 Thread Helmut Grohne
Hi Luca,

On Tue, May 30, 2023 at 11:23:07AM +0100, Luca Boccassi wrote:
> > > - unmerged-usr paths are no longer supported
> >
> > Then you argue that this bug would affect only unmerged systems, while
> > it actually is in reverse. Unmerged systems are unaffected by this bug
> > class. The deletion that Andreas describes can only happen due to the
> > aliasing introduced by merging. This bug class only affects merged
> > systems.
> 
> No, this bug report only affects unmerged systems and has no effect on
> merged ones, as the actual bug after the analysis and discussion is
> that some packages since Bullseye install modules-load.d/ files in the
> wrong directory, that nothing actually reads (since Bullseye!),
> effectively making them useless, but nobody ever noticed, and I can
> only speculate that this could be due to the fact that the vast
> majority of systems have been merged and thus there's no difference
> (alternatively it could be that such packages have extremely low
> popcon, I have not checked). If these packages were used on unmerged
> system, these bugs would be very real - the functionality they provide
> would be broken.

Given that we are saying exactly the opposite of each other, it seems
likely that we are talking about different things (thanks to that kind
soul pointing it out to me).

As I read your reply, it seems to me that you see the bug in
multipath-tools and other packages that ship files in
/lib/modules-load.d as opposed to /usr/lib/modules-load.d. Assuming
that's your view, what you write very much makes sense - including the
assertion that it only affects unmerged systems. Do you confirm? If you
confirm, I'd see what you see as the bug we are talking about as not an
issue in systemd at all, but as multiple issues in other packages (such
as multipath-tools) that fail at integrating properly with systemd (when
unmerged, which is unsupported, so not worth fixing in bookworm). Given
that the bug at hand is filed against systemd (rather than
multipath-tools), it did not occur to me earlier that you were having
this problem in mind.

As I understand what Andreas wrote (maybe he can confirm), the problem
he sees is that /usr/lib/modules-load.d (the directory) disappears when
removing other packages such as multipath-tools. So it's very much not
about whether systemd deals with the dropins placed by multipath-tools.
It's about removal of a package having unintended side-effects (removing
a directory still owned by systemd). And this very problem, can only be
experienced on merged /usr. The absence of a directory may not seem like
a big deal to you and none of us seems convinced that it has a
practically relevant impact on using systemd, but it very much has an
impact on piuparts and testing migration and that - to me - is what this
bug report has been about.

Does that make more sense to you now?

Helmut



Re: another problem class from /usr-merge [Re: Bug#1036920: systemd: please ship a placeholder in /usr/lib/modules-load.d/]

2023-05-30 Thread Helmut Grohne
On Tue, May 30, 2023 at 11:53:01AM +0200, Helmut Grohne wrote:
> Are there other kinds of resources in dpkg that can be shared like
> directories? Thinking... Yes, regular files. How can files be shared?
> Via Multi-Arch: same.  Can that happen for real? Yes. I've attached an
> artificial reproducer.  Does it happen in the archive? I really cannot
> tell yet. In effect, this is yet another bug class derived from Andreas'
> directory-loss bug class. This new file-loss bug class is distinct from
> the file-loss bug class that resulted in the moratorium.

Sorry for the double mail. That attachment thing is so easy to forget.

Also dropped Luca as his email bounces.

Helmut


multiarchbug.sh
Description: Bourne shell script


another problem class from /usr-merge [Re: Bug#1036920: systemd: please ship a placeholder in /usr/lib/modules-load.d/]

2023-05-30 Thread Helmut Grohne
Context for d-devel:

Andreas Beckmann noticed that systemd ships an empty directory
/usr/lib/modules-load.d. When removing a package that ships a file in
/lib/modules-load.d (such as multipath-tools), dpkg may in some
circumstances delete the empty directory owned by systemd.

On Mon, May 29, 2023 at 07:24:09PM +0100, Luca Boccassi wrote:
> Given what was discussed:

I think the conclusion is drawn too quickly here.

> - bookworm is in hard freeze
> - there is no functional impact

In effect, this bug report is an instance of a bug class. I am in the
process of quantifying its effects, but I do not have useful numbers at
this time. As an initial gauge, I think it is about 2000 binary packages
that ship empty directories (which does not imply them to be affected,
rather this is to be seen as a grossly imprecise upper bound).

> - unmerged-usr paths are no longer supported

Then you argue that this bug would affect only unmerged systems, while
it actually is in reverse. Unmerged systems are unaffected by this bug
class. The deletion that Andreas describes can only happen due to the
aliasing introduced by merging. This bug class only affects merged
systems.

In my earlier reply, I also asked Andreas for a practical impact on
systemd users and suggested lowering the severity of this instance.
However, there is more to consider. This poses a problem to piuparts and
thus testing migration. Making piuparts happy is a use case of its own.
When a mitigation for non-essential adduser broke piuparts (again, I'm
sorry about that), the release team decided that piuparts is an
important piece of the release process and therefore the change was to
be reverted. As a result, apt now depends on adduser in bookworm again.
To be clear, I fully support the decision that has been made here and
thank the release team for dealing with resulting issues (e.g. delayed
migration of other packages). Since the problem we are discussing here
is quite similar, I argue that this problem class also should be
considered release critical in general, because it may impact testing
migration. That being said, IANARM and I therefore leave that judgement
to others.

> - as soon as trixie opens for business we might just canonicalize
> everything (assuming all the ducks will be in a row)

You make this look like a simple way forward. For now, I am unconvinced
that canonicalizing paths is the cure to this problem. To dpkg, a
canonicalization looks like removing a file and adding a different file.
Thus the deletion effect that Andreas reported may kick in while
performing that canonicalization. It probably is not that simple though.
As far as I understand it, dpkg first adds new files and then removes
the old ones thus seeing that the directory it tries to delete is not
empty (and we've seen it issue warnings about that case). To me, this
means that we (or rather I) don't understand the problem well enough to
judge it. It might be harmless, but it might be real. We shouldn't be
scared, but "it probably works" may not be the best approach either.

And then Andreas got me thinking. Before delving into that, I'd like to
again express thanks to Andreas. When we see a bug from Andreas, can we
please start with thanking him? Even if the bug ultimately is due to a
limitation in piuparts (as has happened in the adduser case), his work
(and that of other piuparts people such as Nicolas) still adds a lot of
value to Debian. The occasional report that looks harmless initially
tends to point at real problems more often than not.  When he writes a
mail, it is full of detail for looking at the issue. I ask us all to
better appreciate that work. Let me do that now: Thank you Andreas,
Holger and Nicolas!

So let's stack-pop to where he got me thinking. A directory is a
resource that can be shared between packages. Andreas demonstrated that
removing one package may remove such a shared resource still being
needed when another package references it via an aliased path. In
effect, we break dpkg's reference counting of shared resources.

Are there other kinds of resources in dpkg that can be shared like
directories? Thinking... Yes, regular files. How can files be shared?
Via Multi-Arch: same.  Can that happen for real? Yes. I've attached an
artificial reproducer.  Does it happen in the archive? I really cannot
tell yet. In effect, this is yet another bug class derived from Andreas'
directory-loss bug class. This new file-loss bug class is distinct from
the file-loss bug class that resulted in the moratorium.

Etienne Mollier pointed out that "dpkg --verify" helps with diagnosing
whether unexpected file deletion has happened on a particular system. It
also reports other diagnostics, and it does not consider any
--path-exclude patterns configured via /etc/dpkg/dpkg.cfg, so take the
output with a grain of salt. Reinstalling the affected packages
generally resolves the problem on a particular system.

I wish I could give you numbers. I don't have them. I cannot

Re: Future of GNU/kFreeBSD in the debian-ports archive

2023-05-29 Thread Helmut Grohne
On Mon, May 29, 2023 at 06:11:15PM +0200, Aurelien Jarno wrote:
> Over the past year, GNU/kFreeBSD hasn't seen any significant
> development. After reaching out to various individuals involved, it
> seems unlikely that the situation will change in the foreseeable future.
> Here are some statistics that support this observation:
> 
> - The last buildd upload for kfreebsd-amd64 and kfreebsd-i386 was over a
>   year ago.
> - There have been no porter uploads for kfreebsd-i386 in the past year.
> - In the last year, only 11 porter uploads for kfreebsd-amd64 have been
>   recorded, with the most recent one occurring over two months ago.
> - Only approximately 30% of the packages on these architectures are
>   up-to-date.

 - kfreebsd has not been bootstrappable for a while; I removed it from
   rebootstrap QA in June 2020 after a series of unanswered pings. I have
   received no complaints since, nor any interest in bootstrapping it
   again.
 - tar has failed to build from source on kfreebsd-amd64 since 2016 and
   has been bd-uninstallable since 2022.
 - I hope that Matthias Klose replies as he was hinting at the
   maintenance cost in gcc.

> With my ports-master hat, I think it is time to consider the removal of
> both the kfreebsd-amd64 and kfreebsd-i386 architectures from the
> debian-ports archive. I would like to emphasize that packages will still
> be available on snapshot.d.o for anyone interested in reviving the port.

I concur. Like Jessica, I found working with kfreebsd an interesting
adventure. I have good memories of having worked with Steven
Chamberlain before Jessica; thanks.

> In any case, I am waiting for feedback, and I will wait for at least a
> month before taking any action.

I appreciate your careful way of doing this.

Helmut



Re: 64-bit time_t transition for 32-bit archs: a proposal

2023-05-28 Thread Helmut Grohne
Hi Steve,

On Wed, May 17, 2023 at 10:39:21PM -0700, Steve Langasek wrote:
> > I note that this may pose problems with intra-library interaction. Say
> > we need to enable time64 on a higher level library and a lower level
> > library does not use time_t, but uses off_t. As such, you'd opt out of
> > lfs on the lower level library, but the upper one uses it with lfs by
> > having enabled time64. How do you intend to deal with such cases?
> 
> In such a case the lower-level library should opt in to lfs and have a
> package name change as well.  Up to this point I've casually assumed there
> weren't any such packages, but this can also be detected via static analysis
> of the archive.

Did you encounter such cases in your updated analysis?

> > Something that would help with this transition would be a
> > checker-as-a-service kind of thing that indicates:
> >  * Is my package affected by time64?
> >  * Does my package enable time64?
> >+ On i386?
> >  * Do time64 changes affect downstream packages?
> >+ Which?
> 
> > I understand that answering these questions on a per-package basis is
> > far from trivial. That much is evident from your analysis. I think this
> > is ok. Even if such a service says "unknown" 10% of the time, that'd
> > still be very useful. Do you think you could turn your analysis into a
> > continuous checking service?
> 
> This sounds like a substantial amount of work (and computing resources, to
> enable this to "continuously" check) and I don't think I understand how it
> would help the transition, if all of the library transitions are being
> coordinated centrally.  Could you elaborate?

I'm not sure exactly what this would look like. I see some options:
 + The lists that already exist describe the second level of
   affectedness (changes ABI) already.
 + A double-rebuild varying time64 could employ reproducible builds to
   judge affectedness (generally) and we would get a list of unaffected
   packages in particular.
 + If we enable this via dpkg, .buildinfo files can be used to track how
   many packages have been rebuilt with time64 (given that ben will not
   always see a difference).
 + Finally, the benfile you proposed may also help to judge this.

I don't yet quite see how this transition can accurately be tracked
using ben, but I'm hoping for a positive surprise here. Failing that,
fusing some of these information sources into a continuously updated
view would improve our understanding of how far this went.

I agree that the amount of work to spend on such tracking needs to be
balanced. In effect, we practically do want to rebuild much of the
archive here, and while some of that will cause dependency changes,
there are enough examples invisible to ben to justify a bit more
tracking effort in my view. I hope you can agree.

Helmut



bootstrapping /usr-merged systems (was: Re: DEP 17: Improve support for directory aliasing in dpkg)

2023-05-17 Thread Helmut Grohne
Hi,

This bootstrap aspect got me and I discussed this with a number of
people and did some research.

On Sun, May 07, 2023 at 12:51:21PM +0100, Luca Boccassi wrote:
> I don't think this is true? At least not in the broader sense: if you
> compile something on Debian, it will obviously get linked against
> libraries and dependencies as they are in Debian.
> Perhaps what you mean is that, given an entire separate sysroot-like
> tree, passing the appropriate compiler and linker flags and
> environment variables, you can use the local compiler we ship to build
> 'foreign' programs. That is true, but again it requires to set up the
> environment appropriately, including linker flags. And the caller
> needs to ensure the environment, including linker flags, is
> appropriate for the target environment (I guess 'host' environment, in
> GNU parlance). Therefore, I don't think it would be unreasonable to
> require that if the target environment is split-usr, then the caller
> also needs to specify an appropriate
> '-Wl,--dynamic-linker=/lib/ld-whatever' option.

Given the feedback, I am convinced that changing PT_INTERP is a stupid
idea regardless of whether it is technically feasible. There must be a
better way. Let's step back a bit.

The underlying problem here is performing the initial filesystem
bootstrap. The semantics of this are a bit vague as they are not spelled
out in policy, so we will have to derive them from implementations.

I think the major players are (in descending popularity):
 * debootstrap
 * mmdebstrap
 * cdebootstrap
 * multistrap

multistrap predates mmdebstrap, and before mmdebstrap existed I used it
a lot. When attempting to test it, I could not convince it to bootstrap
from an unsigned or locally signed repository at all. The patch in
#908451 didn't cut it. I also note that it creates a /lib64 -> /lib
symbolic link which feels quite incompatible with merged-/usr.  For
these reasons, I am dropping multistrap from the tools under
consideration and recommend removing it from the archive. If you happen
to use multistrap, now would be a good moment to tell me. Personally,
all of my use cases of multistrap have been converted to mmdebstrap and
that made a lot of things simpler.

cdebootstrap vaguely works, though unsigned operation seems
dysfunctional: it runs apt-get update during
cdebootstrap-helper-apt.postinst, and that fails. I have not figured out
why and treat this failure as a success for present purposes.

So the most popular implementations quite evidently are debootstrap and
mmdebstrap and both "just work". I note though that they work quite
differently:
 * debootstrap (depending on flags including --variant) pre-merges its
   chroot while mmdebstrap relies on packages doing it.

   I think that the question whether a distribution is merged is a
   property of the distribution and not the bootstrap tool, so I
   strongly recommend following mmdebstrap's view on this. The
   debootstrap way means that we have to include patches for every
   derivative, which is a process that does not scale well.

 * mmdebstrap operates in two phases. It first unpacks and configures a
   rather minimal set of packages and then proceeds to adding packages
   passed to --include in a second phase once essential is fully
   configured while debootstrap immediately unpacks everything.

   I think the debootstrap approach is slightly worse here, because it
   means that preinst scripts of non-essential packages cannot rely on
   essential packages having been configured.

In any case, we have to deal with both behaviours.

After this little excursion into bootstrap technology, let's go back to
the /usr-merge and its effects.

I think at this point, we have quite universal consensus about the goal
of moving files to their canonical location (i.e. from / to /usr) as a
solution to the aliasing problems while we do not have consensus on
precisely how to do this (i.e. with changing dpkg or without). If you
believe that this is not consensus, please speak up.

So in a distant future our packages will not contain any files in /bin
or /lib. In particular, this affects /bin/sh and the dynamic loader,
both of which are required to run maintainer scripts, which are
currently required for creating the symbolic links. Boom.

Solutions have been proposed to this and I think they all fall into one
of the following four categories.

 1. Don't move. We just keep those files that require a particular
    location (such as /bin/sh or the dynamic loader) in their
    non-canonical location. As such, maintainer scripts will be able to
    run and perform the conversion to symbolic links afterwards.

 2. Move and ship links. Since we unpack all essential data.tar before
    running the first maintainer script, having one package contain the
    compatibility symlinks is enough to fix the problem.

 3. Move and avoid using non-canonical locations. This is the approach
    where we write maintainer scripts as #!/usr/bin/sh and consider

Re: 64-bit time_t transition for 32-bit archs: a proposal

2023-05-17 Thread Helmut Grohne
Hi Steve,

On Tue, May 16, 2023 at 09:04:10PM -0700, Steve Langasek wrote:
> Over on debian-arm@lists, there has been discussion on and off for several
> months now about the impending 32-bit timepocalypse.  As many of you are
> aware, 32-bit time_t runs out of space in 2038; the exact date is now less
> than 15 years away.  It is not too early to start addressing the question of
> 32-bit architecture compatibility, as there are already reports in the wild
> of calendar failures for future events on 32-bit archs.

Thanks for not having dropped the ball on this and for the detailed
analysis.

> Based on the analysis to date, we can say there is a lower bound of ~4900
> source packages which will need to be rebuilt for the transition, and an
> upper bound of ~6200.  I believe this is a manageable transition, and
> propose that we proceed with it at the start of the trixie release cycle.

Can we try to distinguish affected packages into those that change their
downstream interface (i.e. mostly shared libraries changing ABI) and
those that do not (e.g. application packages)? Do I understand correctly
that those 500 mentioned earlier are the former category and these
4900 to 6200 are the latter category?

Please bear in mind that packages are already starting to enable time64
support. coreutils has been built with time64 for a while already, and
tar was recently switched to time64 (see #1026204).

As such, I think it would be good to treat these categories as more
distinct. When in doubt, we should treat a package as breaking its
interface though.

> === Technical details ===
> 
> The proposed implementation of this transition is as follows:
> 
> * Update dpkg-buildflags to emit -D_FILE_OFFSET_BITS=64 and -D_TIME_BITS=64
>   by default on 32-bit archs.  (Note that this enables LFS support, because
>   glibc’s 64-bit time_t implementation is only available with LFS also
>   turned on, to avoid a combinatorial explosion of entry points.)

There already is a pending patch for feature+time64; see #1030159. You
are effectively asking to enable this by default.

> * … but NOT on i386.  Because i386 as an architecture is primarily of
>   interest for running legacy binaries which cannot be rebuilt against a new
>   ABI, changing the ABI on i386 would be counterproductive, as mentioned in
>   https://wiki.debian.org/ReleaseGoals/64bit-time.

I think this needs a second thought. coreutils and tar already enable
time64 on i386 and I expect that more things will. From my point of
view, such updates are a good thing and we should not skip them on i386,
because they do not affect ABI.

It is unfortunately not as simple as this. While coreutils and tar have
been simple, other packages will likely depend on their libraries being
time64. In such cases, we will be unable to enable time64 unless the
underlying libraries do so as well. Then we have maintainers (such as
Russ, but I also vaguely remember other maintainers) who want to bump
the soname to unconditionally enable time64.

As a result, I expect that i386 would become a wild mixture of time64
and time32. Do you have thoughts on how to deal with the resulting mess?

I would also like to point out that there is a third option on the table
that nobody seems to be talking about. Instead of changing (breaking)
the ABI of libraries, we may also consider adding a time64 ABI to
existing libraries. A header can trivially detect whether time64 is
being requested by checking the relevant macros and in such cases divert
functions affected by time64 to 64bit-aware variants. Thus, a library
may become time64-compatible without breaking ABI with non-time64 users.
The obvious downside of this is that it is quite a lot of effort and is
probably infeasible unless upstream cooperates, but I think this should
be considered as an option for difficult cases where we have both
non-lfs and time64 downstream users in large numbers. Do you agree?

> * For a small number of packages (~80) whose ABI is not sensitive to time_t
>   but IS sensitive to LFS, along with their reverse-dependencies, filter out
>   the above buildflags with DEB_BUILD_MAINT_OPTIONS=future=-lfs[1]. 
>   Maintainers may choose to introduce an ABI transition for LFS, but as this
>   is not required for time_t, we should not force it as part of *this*
>   transition.  If there is a package that depends on both a time_t-sensitive
>   library and an LFS-sensitive but time_t-insensitive library, however, then
>   the LFS library will need to transition.  

I note that this may pose problems with inter-library interaction. Say
we need to enable time64 on a higher-level library while a lower-level
library does not use time_t, but uses off_t. As such, you'd opt out of
lfs on the lower-level library, but the upper one uses it with lfs by
having enabled time64. How do you intend to deal with such cases?

> * In order to not unnecessarily break compatibility with third-party (or
>   obsolete) packages on architectures where the ABI is not act

Re: DEP 17: Improve support for directory aliasing in dpkg

2023-05-08 Thread Helmut Grohne
Hi Luca,

On Tue, May 09, 2023 at 01:56:53AM +0100, Luca Boccassi wrote:
> On Mon, 8 May 2023 at 19:06, Sean Whitton  wrote:
> > It's designed to stop as-yet-unknown problems happening, too.
> 
> Well, sure, but we've been at this for years, any such problems should
> really be known by now. This is with Bookworm as it stands of course,
> when we go in and make more changes then we obviously have to be
> careful, but that's the entire reason this thread exists and is still
> going on.

This actually feels rather worrying to me. On one hand, you say that
problems should be known. On the other hand, you proposed a simple
transition with quite a number of problems that you apparently didn't
see coming. Even relatively simple mechanisms, such as just repacking
all the .debs to ship files in their canonical location and then trying
to install them, revealed a dpkg unpack error in zutils. This
combination of claiming that problems should be known while at the same
time apparently not knowing them makes me uneasy to move forward here.

So while I want to see the moratorium lifted, it all makes a lot more
sense to me given what we've seen in this thread. The worst of outcomes
I see here is the one where we cause problems that don't have a good
solution as any way forward would break someone's use case (with
someone's use case often being smooth upgrades in one way or another).
It's those where we cannot move forward nor revert.

Helmut



Re: DEP 17: Improve support for directory aliasing in dpkg

2023-05-08 Thread Helmut Grohne
On Mon, May 08, 2023 at 02:07:08AM +0100, Luca Boccassi wrote:
> I can see we don't agree on this matter, of course, that is clear. And
> I hope we can find common ground. But let me provocatively ask this
> first: is the same rule going to be enforced for all other changes
> that happen in the project that might affect external packages? If
> anybody points out past changes, recent or less recent, that caused
> issues for third party packages, will the TC ask for those changes to
> be reverted or otherwise modified accordingly? Will a change to Policy
> be proposed that spells out that third party packages cannot ever be
> broken, no matter what they do, and must always work?

I'm not sure about the TC's role in this. For the record, I am doing all
of the analysis (and design work) in this thread without a TC hat. I
also cannot comment on how the TC is going to rule on this matter. Can we
leave that aside or formally file it there if you see a need?

I agree that what we support is vague at best and we can readily see
from earlier conflicts that this is a recurring matter. We still
disagree over how much maintainers should support sysvinit. I've also
quite recently failed at properly preparing a transition (non-essential
adduser) and while we could write about it in release-notes, what is
going to happen is that we'll revert it for bookworm and then I can
retry properly.

You may also have noticed that my analysis of possible problems in this
thread very much reasons about packages shipped in Debian releases. I
would actually like to call external packages and local diversions
unsupported, but I was rightfully criticised that this is falling short.

So no, I cannot tell you where the boundary of our support is. I
initially assumed it to be closer to where you paint it and am now
trying to adapt to meet the expectations of others.

For instance, I've also reached out to DSA and inquired on their use.
While I haven't found local diversions or local statoverrides in
dsa-puppet.git, it seems that a number of external packages ship files
in /sbin or /lib (including udev rules and systemd units).

> The more pre-depends, the more constraints we put on apt. I do not
> have a specific scenario in mind as we don't even have a full set of
> changes to look at, but it seems clear to me it will have _some_
> effect, no?

We've been there with multiarch-support and my experience with that
suggests that the primary effect is increasing the size of Packages
files. Though given that you are obviously worried here, I suppose more
research is warranted.

> Sure that's a legitimate concern, however, wouldn't it fall into the
> "needs special handling" bucket? It is a case where the file is moving
> both in location and package, so it is covered by the blank statement
> "either don't do that or implement the required workaround via
> diversion/conflict/etc". What am I missing?

You are missing the distribution of responsibility. Quite commonly,
backports are performed by someone else than the package maintainer.
Yet, an uncoordinated backport can now render the package in unstable
rc-buggy.

> But the more I think about it, the more I am convinced that the
> default option working best for Debian is the one that matches the
> project's choice of a filesystem layout. After all, this is
> configurable in the toolchain for a reason.
> And the vast majority of the rest of the world has long since finished
> this transition, so I struggle to think where software built with this
> default wouldn't work. Bullseye will be oldoldstable at that point,
> and even that was default merged for new installations, and really old
> ones (oldoldoldoldstable at that point? I lost count) will be long
> EOL. I suppose they could still be around unmaintained, but who uses a
> toolchain from 8 years in the future to build software for an EOL
> distribution 8 years in the past? Normally it's the other way around,
> as even glibc adds new symbols and is not forward compatible.

This seems somewhat convincing to me. Would you reach out to toolchain
maintainers to discuss this as an early change after the release of
bookworm?

> On the ELF interpreter, as long as we can reasonably ensure it works,
> I do believe we should switch it, regardless of what we do with the
> symlinks, how we ship/add/build/package/create/manage them, as a
> desired final state. Again, we should make the default in Debian work
> for Debian. And given the default for Debian from Bookworm onward is
> that the loader is in /usr/lib/, it seems perfectly reasonable to me
> that it software built for Debian and shipped in Debian should look
> there for it.

I suppose that we've been confusing the different approaches here. The
question of what links base-files should contain mostly arises if you
start from the assumption that we do not modify the ELF interpreter
location. Once changing its (and /bin/sh's) location, the question of
how to install those symlinks can indeed be done in b

Re: DEP 17: Improve support for directory aliasing in dpkg

2023-05-07 Thread Helmut Grohne
Hi Luca,

On Sun, May 07, 2023 at 12:51:21PM +0100, Luca Boccassi wrote:
> The local/external aspect is already covered in Ansgar's reply and subthread.

I hope that we can at least agree that we don't have consensus on this
view. And the more I think about it, the more it becomes clear to me
that this non-consensus is part of the larger disagreement we have about
this whole transition. Do you see any way towards getting to common
ground here?

> Sure, but adding changes that are (seemingly) unnecessary for a large
> percentage of affected packages also brings uncertainty. Every
> software has bugs, thus it follows that injecting more software in the
> way of a package being installed will likely also inject bugs. Which
> doesn't mean we shouldn't consider it, however, it should be weighted
> appropriately.

Let me put this into perspective. In this scenario, we will have a few
packages with versioned Pre-Depends on usr-is-merged. The seemingly
unnecessary change here is adding more Pre-Depends of the same kind to
many more packages. It seems very likely to me that one of the few
Pre-Depends will cause usr-is-merged to be upgraded early and thus those
possibly unnecessary Pre-Dependencies will be harmless. Do you actually
have some scenario in mind that would warrant judging this as risky
beyond suspicion? (Which is not to say that there is no risk as the
whole affair bears quite some risk.)

> Packages that need special handling will need special handling for
> backporting too. This is nothing new, there was never a project-wide
> guarantee that a package uploaded to testing can apply 1:1 to
> backports, it is common enough to require changes/reverts/adjustments,
> and if it's fine to require that in other cases, it's fine for this
> case too.

It seems that you missed my argument and it likely wasn't spelled out
explicitly enough, so let me retry. Yes, you may need to adapt packages
that are being backported. We don't disagree about that (and hope people
get it right, which they won't, but so be it). The really bad thing here
is that a backports upload may require changes to the package in
unstable!

Say we packaged foo version 1 in stable and it puts everything in /bin.
Then we update foo to version 2 in unstable and foo gains a new
/bin/bar. Due to the debhelper addon, this is actually shipped as
/usr/bin/bar. Great. Then we backport foo version 2 to stable. Given
that debhelper no longer moves, it'll be /bin/bar. Then we notice that
foo is not laid out nicely and we split a bar package from it in version
3 and move /usr/bin/bar into bar. Now a user may install stable, install
foo version 1, install the foo version 2 backport and then update to
nextstable. In that stable upgrade, bar version 3 may be unpacked before
foo version 3 and as a result /usr/bin/bar goes missing when the
backported foo version 2 gets upgraded to the regular foo version 3 as
this deletes /bin/bar.

So when we backport a package, the unstable package may need to be
modified to avoid such unpack file loss scenarios. In a simple case, we
may be able to just add Conflicts, but the takeaway is that backporting
a package may now break upgrades to nextstable in a way that requires
fixes in nextstable to accommodate for such upgrades.

> If the majority of packages are simply converted, with no manual
> handling and no diversion, then it should be simple to handle: the
> debhelper in stable will not perform the conversion by definition as
> the logic won't be present, and any dh upload to backports will have
> such logic disabled, so that other packages that get uploaded to
> backports and built with either the stable or the backports debhelper
> won't have any change performed on them.

As much as I'd like to trust you on things actually being simple, we've
seen over and over again that the simple approaches have non-trivial
flaws. If you were to highlight resulting problems (and propose
solutions), that would be more convincing to me than continuously
labeling it simple.

> Or to put it in another way: I think our defaults should prioritize
> the Debian native use case. Given we ship our loader in /usr/lib/ld*
> now, it makes sense to me that the default in GCC is to point to
> /usr/lib/ld*. Callers can override that as needed for
> third-party/external/foreign use cases.

I guess you'll be having a hard time convincing the toolchain
maintainers of this change, but my other point was that this is
unnecessary when we can use patchelf after the fact.

> > How about the long-term vision of this? Elsewhere you indicated that
> > you'd like the aliasing symlinks to not be shipped by any data.tar. Does
> > that imply that we'd keep patching the interpreter and using /usr/bin/sh
> > forever in the essential set? If adding the links to base-files, it
> > would be of temporary nature only.
> >
> > If adding the symlinks to base-files, how about /lib64? Would we ship it
> > for all architectures or just for those that need it (e.g. amd64,
> > l

Re: DEP 17: Improve support for directory aliasing in dpkg

2023-05-06 Thread Helmut Grohne
Hi Luca,

On Sat, May 06, 2023 at 09:47:15PM +0100, Luca Boccassi wrote:
> Sure, there are some things that need special handling, as you have
> pointed out. What I meant is that I don't think we need special
> handling for _all_ affected packages. AFAIK nothing is using
> diversions for unit files or udev rules, for example (I mean if any
> package is, please point it out, because I would like a word...). I

I've posted a list in
https://lists.debian.org/20230428080516.ga203...@subdivi.de and indeed,
udev rules are being diverted in one case.

But then, you only capture diversions inside Debian's ecosystem and miss
out on other kinds of diversions such as local diversions. We currently
support imposing local diversions on pretty much arbitrary files
including unit files. And while we've occasionally moved files between /
and /usr before the transition, doing it for 2000 packages significantly
grows the risk of it affecting someone. So really, we want all such
diversions duplicated before unpacking a package that has moved its
files. The way to achieve that is Pre-Depending on usr-is-merged. To me,
this sounds like we really want some special handling for all affected
packages.

I also caution that we've started from a very simple approach and tried
fixing it up to address the problems that we recognized in analyzing it.
My impression is that we are not finished with our discovery here and
won't be for quite some time. This uncertainty of what else we might
break is the most significant downside I see with your approach.

> very strongly suspect this will be a small minority out of the total
> of ~2k would need this treatment, and the vast majority would not. Do
> you disagree with this gut feeling?

I do disagree indeed. While the special handling may be mostly
mechanical for the majority of packages, I still see it as required.

Worse, we also need to discuss how this affects backporting of packages.
Any package enabling the addon needs to have the addon removed for a
backport to undo the move. Worse, when backporting debhelper, any
package that uses the new compat level must explicitly disable the
addon. And then we may need to fix upgrade paths from backports to
stable.

> Of course, it goes without saying, we should check this before going
> forward in any direction.

The more I try, the more I have the impression that we enumerate the
ways this can go wrong and the more we poke, the more we find.

> Of course the release team needs to be on board, no questions about
> that. But given the idea is to maintain their decision exactly as it
> stands I wouldn't imagine it would be an issue? Once again, the
> moratorium is explicitly about moving between locations _and_
> packages, in combination, not either/or. From that same email you
> linked:

This is evidently ambiguous, as the RT also references the CTTE moratorium,
which includes

"Files that were in /usr in the Debian 11 release should remain in /usr,
while files that were in /bin, /lib* or /sbin in the Debian 11 release
should remain in those directories."

Quite evidently, clarification is needed.

> There are already distro-wide upgrade piuparts checks run occasionally
> IIRC, at least I've seen a bug from one being reported this week, so
> we should be most of the way there already?

I examined the piuparts check in
https://lists.debian.org/20230425190728.ga1471...@subdivi.de already, so
no, not at all.

> To be clear, this would be very nice and welcome to have obviously,
> but I don't think it needs to be a blocker. We don't have such checks

Actually, getting the service seems to be the least of our problems.
It's fairly simple to implement and I have written a PoC-style
implementation for parts of it already as part of my analysis.

> for vast parts of Policy, including moving files without
> Breaks+Replaces as evidenced by the recent MBF, and yet we managed to
> survive thus far. I don't think it's fair that this workstream should
> be held to higher standards than the rest of the project.

Given the breakage we expect and the latency with which it would
surface, I think minimizing risk is prudent.

> > 4. Add canonicalization support to debhelper as an (internal) addon.
> >Enable this addon in the next compat level. This will again populate
> >${misc:Pre-Depends} with "usr-is-merged (>= fixed version)" as needed.
> >Note that usrmerge's provides is unversioned and doesn't satisfy such
> >a dependency.
> 
> As already mentioned, I do not believe this is necessary for _all_
> cases. It is necessary for a certain number (that we should ascertain
> beforehand!) of cases, and we need the machinery implemented for them,
> but I don't think we should impose this workflow with pre-depends and
> diversions for all affected packages. I think it should be mandatory
> for problematic packages such as those you already pointed out, _and_
> for cases where the maintainer wants to move files also between binary
> packages.

Given local diversions, I am now convinced 

Re: DEP 17: Improve support for directory aliasing in dpkg

2023-05-06 Thread Helmut Grohne
Hi Luca,

On Sat, May 06, 2023 at 04:52:30PM +0100, Luca Boccassi wrote:
> To have a working system you need several more steps that are
> performed by the instantiator/image builder, such as providing working
> and populated proc/sys/dev, writable tmp/var, possibly etc. And it
> needs to be instantiated with user/password/ssh certs/locale/timezone.
> And if it needs to be bootable on baremetal/vm, it needs an ESP. And
> then if you have an ESP and want to run in a VM with SB, you'll need
> self-enrolling certs on first use or ensuring the 3rd party CA is
> provisioned. And then...

You paint it this way, but it really used to just work until we got the
/usr-merge. Indeed, debvm creates virtual machine images effectively by
bootstrapping a filesystem from packages and turning the resulting tree
into a file system image.

 * /proc, /sys, /dev are mounted by systemd. All you need to do here is
   create the directories and base-files does so.
 * /tmp is shipped by base-files.
 * user and password creation is not handled yet, but can be handled by
   something similar to systemd-firstboot.
 * Not sure what you mean with certs, locale and timezone. You can just
   install ca-certificates, locales and tzdata as part of the bootstrap.
 * The bootloader part for baremetal is kinda out of scope for
   bootstrap, which is why debvm side-steps this. You can also skip it
   for containers and build chroots. So it is one out of multiple use
   cases that needs extra work here.

In a good chunk of situations, you can just get by without messing
around. Well, that is, until we broke it via usr-is-merged. I concur with
Simon Richter that restoring this property is a primary concern.

> You get the point. Going from a bunch of packages to a running system
> necessarily has many steps in between, some that are already done and
> taken for granted, for example when you say "works as a container" I'm
> pretty sure the "container" engine is taking care of at the very least
> proc/dev/sys for you, and it's just expected to work. bin -> usr/bin,
> sbin -> usr/sbin and lib -> usr/lib should get the same treatment: if
> they are not there, the invoked engine should prepare them. systemd
> and nspawn have been able to do this for a while now.

No, this misses the point. You can configure the essential packages in a
very limited environment. However, you cannot do so without the lib or lib64 symlink
(depending on the architecture) and the bin symlink. This is so
critical, that it cannot be deferred to some external entity. It must be
part of the bootstrap protocol. There are some suggested ways to fix
this (such as adding separate bootstrap scripts next to maintainer
scripts), but nothing implemented.

> Not having those hard coded means that the use case of / on a tmpfs
> with the rest instantiated on the fly, assembled with the vendor's
> /usr and /etc trees, becomes possible, which is neat. And said trees
> can pass the checksum/full integrity muster.

It's neat that you can solve your use case by breaking other people's
use cases. This is not constructive interaction however. This kind of
behaviour is precisely what caused so much conflict around the
/usr-merge. What if I gave a shit for your use case? Denying the
/usr-merge and just continuing unmerged as long as possible (as merging
would break my use case) would be my strategy of choice. You can make a
difference here by starting to recognize other people's use cases and
proposing solutions in that merged world. And no, it's not "add duct
tape to every bootstrap tool".

So I really want to see a solution for the bootstrap protocol before
moving the dynamic linker and /bin/sh to its canonical location. The
current bootstrap protocol is kept on life-support by installing the
usrmerge package by default. Dropping usrmerge as the first alternative
of the init-system-helpers dependency, or moving the dynamic linker,
would break it. If I had a solution in mind, I'd
definitely post it right here, but unfortunately I have not.

Helmut



Re: DEP 17: Improve support for directory aliasing in dpkg

2023-05-04 Thread Helmut Grohne
Hi Simon,

On Thu, May 04, 2023 at 03:37:49AM +0900, Simon Richter wrote:
> For aliasing support in dpkg, that means we need a safe policy of dealing
> with diversions that conflict through aliasing that isn't "reject with
> error", because the magic dpkg-divert would always generate conflicts.

I think we still have that misunderstanding I mentioned earlier, so let
me try to resolve that again.

From my point of view, the ultimate goal here should be moving all files
to their canonical location and thereby make aliasing effects
irrelevant. Do you confirm?

As such, I do not see aliasing support in dpkg as complementing the
forced file move approach with lots of workarounds such as diverting
dpkg-divert. Rather, I see them as exclusive strategies. Each of these
strategies has significant downsides. In combining the different
strategies, we combine their downsides, but since their benefit is
shared, we do not gain anything in return but increase the price to pay.
Why should we do that?

So when we discuss diverting dpkg-divert, I assume that we do not change
the implementation of dpkg wrt. aliasing. So this branch of discussion
that you raise here seems irrelevant to me.

On the flip side, if dpkg (and thus dpkg-divert) is to gain aliasing
support, I see no reason for (nor benefit in) diverting dpkg-divert.

Can you explain why you see combining these strategies as something
worth exploring?

> then a package containing /bin/foo and a package containing /usr/bin/foo now
> have a file conflict in dpkg. Not sure if that is a problem, or exactly the

This case already is prohibited by policy section 10.1. It can only
happen as a temporary state during a file move (from / to /usr and from
one package to another).

> behaviour we want. Probably the latter, which would allow us to define a
> policy "if aliased paths are diverted, the diversion needs to match", which
> in turn would allow the conflict checker during alias registration to verify
> that the aliased diversions are not in conflict.

If we do not modify dpkg to improve aliasing support, then yes, such a
scenario will require a Conflicts declaration or a different measure
averting this problem.

> The diverted dpkg-divert would probably generate extra register/unregister
> calls as soon as dpkg-divert itself is aliasing aware, but all that does is
> generate warning messages about existing diversions being added again, or
> nonexistent diversions being deleted -- these are harmless anyway, because
> maintainer scripts are supposed to be idempotent, and dpkg-divert supports
> that by not requiring scripts to check before they register/unregister.

Again, the premise seems unreasonable to me. Also note that such a
diversion of dpkg-divert certainly is meant as a temporary measure
facilitating the transition. I'd hope we could delete it in forky
already and, failing that, thereafter.

> We get to draw this card exactly once, and any package that would need the
> same diversion would need to conflict with usr-is-merged, which would make
> it basically uninstallable.

I don't think the case of packages wanting to divert update-alternatives
is all that common. Please elaborate on the use case. Also note that
this suggestion already is to be considered a plan B. My current
understanding is that as long as we do not canonicalize alternatives at
all, we don't run into problems with them. This is kinda ugly, but the
number of affected packages is small.

Helmut



Re: DEP 17: Improve support for directory aliasing in dpkg

2023-05-03 Thread Helmut Grohne
Hi Raphaël,

On Wed, May 03, 2023 at 10:31:14AM +0200, Raphael Hertzog wrote:
> I don't know APT well enough to answer that question but from my point of
> view it's perfectly acceptable to document in the release notes that you
> need to upgrade dpkg first.

Yes, this issue seems vaguely solvable one way or another. It also
affects other approaches modifying dpkg in the very same way.

> Are you sure that we need anything for diversions except some documented
> policy on how to deal with it?

Yes! There is a hard ordering constraint involved here. Failing to
observe it results in unpack errors and/or file loss in much the same way.

> AFAIK the following sequence performs no filesystem changes and should
> be sufficient to move a diversion to its new location (I only consider the
> case of an upgrade, not of a new installation that should just work
> "normally" on the new location):
> 
> dpkg-divert --package $package --remove /bin/foo --no-rename
> dpkg-divert --package $package --add /usr/bin/foo --divert 
> /usr/bin/foo.diverted --no-rename

This is insufficient. Either we modify dpkg to consider aliasing when
managing diversions (i.e. Simon Richter's branch or DEP17) or there is a
more complex ordering requirement involved:

 * We must not remove the aliased diversion (/bin/foo) before the
   diverted package has moved its files to the canonical location
   (/bin/foo -> /usr/bin/foo).
 * We must add the canonical diversion (/usr/bin/foo) before the
   diverted package update that moves its files to canonical locations
   can be unpacked.

Say we currently have

Package: diverter
Version: 1
Files: /bin/foo
preinst: diverts /bin/foo

Package: diverted
Version: 1
Files: /bin/foo

We must first update the diverter.

Package: diverter
Version: 2
Files: /usr/bin/foo
preinst: diverts both /bin/foo and /usr/bin/foo

Since we divert both locations, diverter can now deal with an old
diverted and a canonicalized diverted.

Package: diverted
Version: 2
Conflicts: diverter (<< 2~)
Files: /usr/bin/foo

At the time of unpacking the updated diverted, we must ensure that no
diverter versioned 1 is unpacked. Breaks does not help here as it allows
concurrent unpacks. Neither does Replaces since dpkg thinks that
/bin/foo is different from /usr/bin/foo and thus no replacing happens.

Package: diverter
Version: 3
Conflicts: diverted (<< 2~)
Files: /usr/bin/foo
preinst: diverts /usr/bin/foo

When unpacking the updated diverter, we must ensure that no diverted
version 1 is unpacked. Again, Breaks and Replaces do not suffice.
Therefore an upgrade from stable to nextstable containing both diverter
and diverted must temporarily remove one of the two packages, which is
known to annoy apt.

What still applies here is that we can have usr-is-merged divert
/usr/bin/dpkg-divert and have it automatically duplicate any possibly
aliased diversion; the diverter may then Pre-Depends: usr-is-merged
(>=...) to have its diversions duplicated. Of course, doing so will make
usr-is-merged very hard to remove, but we have experience here from
multiarch-support.

Hope this clarifies.

> The case of update-alternatives is likely more tricky. You already looked
> into it. That's a place where it will be harder to get things right
> without some changes.

As detailed in
https://lists.debian.org/debian-devel/2023/04/msg00169.html I believe
that update-alternatives really is not tricky at all as long as we do
not attempt to migrate them to canonical paths in any way. For instance,
elvis-tiny needs to continue to name the editor alternative
/bin/elvis-tiny even when it actually moves that file to /usr/bin. The
reason that this does not hurt is that we never attempted to move
alternatives (unlike regular files in packages).

If we really want to migrate alternatives to canonical paths, we do get
into the tricky area of preserving the user configuration and we also
break custom scripts, ansible's community.general.alternatives, uses of
puppet's alternatives modules and probably a lot more.

And of course, we can always draw the diversion card and have
usr-is-merged divert /usr/bin/update-alternatives to have it
canonicalize paths as required to be able to migrate alternatives in a
sane way (from a consumer point of view).

Helmut



Re: DEP 17: Improve support for directory aliasing in dpkg

2023-05-02 Thread Helmut Grohne
On Tue, May 02, 2023 at 02:09:32PM +0200, Helmut Grohne wrote:
> This is problems we know about now, but it likely is not an exhaustive
> list. This list was mostly guided by Guillem's intuition of what could
> break at https://wiki.debian.org/Teams/Dpkg/MergedUsr and I have to say
> that his intuition was quite precise thus far. Notably missing in the
> investigation are statoverrides. However, we should also look for a more
> generic approach that tries capturing unexpected breakage.

I mentioned statoverrides as missing. I think we can split statoverrides
into the two classes "package changes" and "admin changes". Quite
obviously, moving files will break admin changes. I see few ways around
this; we can partially mitigate it by detecting common statoverrides and
migrating them automatically, but in the end, we'll probably have to
explain issues with admin-initiated statoverrides in the release notes.

For package changes, the good thing is that statoverrides usually change
stats of files owned by the package initiating them. Thus a package
moving files can also move statoverrides (though this again means that
automatic moves e.g. by debhelper must be opt-in in order to avoid
breaking stuff). For getting an idea of the scope, we can use
https://binarycontrol.debian.net/?q=dpkg-statoverride.*+%2F%28bin%7Csbin%7Clib%7Clib32%7Clib64%7Clibo32%7Clibx32%29&path=%2Funstable%2F

* fuse and fuse3 adapt to an admin initiated statoverride of
  /bin/fusermount.
* nfs-common cleans an obsolete dpkg-statoverride of /sbin/mount.nfs
* systemd-cron adds a statoverride for /lib/systemd-cron/crontab_setgid
  and needs to migrate it with its files.
* yp-tools adds a statoverride for /sbin/unix_chkpwd and needs to
  migrate it with its files.
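A package moving such a file could migrate its own statoverride along
with it from a maintainer script. A hedged sketch (paths illustrative,
not taken from any actual package; it is a no-op when no override is
registered or dpkg-statoverride is unavailable):

```shell
#!/bin/sh
# Hypothetical postinst fragment: migrate a package-initiated
# statoverride from an aliased path to the canonical /usr path when the
# package moves the file there.
migrate_statoverride() {
    old="$1" new="$2"
    if entry=$(dpkg-statoverride --list "$old" 2>/dev/null) && [ -n "$entry" ]
    then
        # entry is "user group mode path"; re-register it at the new path
        set -- $entry
        dpkg-statoverride --remove "$old"
        dpkg-statoverride --add "$1" "$2" "$3" "$new"
    fi
}

# e.g. (illustrative, mirroring the systemd-cron case above):
# migrate_statoverride /lib/systemd-cron/crontab_setgid \
#                      /usr/lib/systemd-cron/crontab_setgid
```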

I also tried installing all packages that contain dpkg-statoverride in
any of their maintainer scripts and capturing the resulting statoverride
file. That doesn't yield anything unexpected thus far, but it also
hasn't completed yet. I'll reply to this message with findings if
there are any beyond the ones above.

So statoverrides seem quite similar to the diversions induced by dash:
Mostly harmless if handled correctly while moving the files, but we
cannot just move the files in an opt-out fashion. Beyond that we need to
augment release notes to ask admins to carefully update their local
statoverrides (and local diversions).

Helmut



Re: DEP 17: Improve support for directory aliasing in dpkg

2023-05-02 Thread Helmut Grohne
On Tue, May 02, 2023 at 02:09:32PM +0200, Helmut Grohne wrote:
> I noticed that the number of packages shipping non-canonical files is
> relatively small. It's fewer than 2000 binary packages in unstable and
> their total size is about 2GB. So I looked into binary-patching them and
> attach the resulting scripts.
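For readers without the attachments at hand, the gist of such a
binary-patching script can be sketched as follows (a hypothetical
reconstruction; the actual repackdeb.sh may differ in detail):

```shell
#!/bin/sh
# Hypothetical sketch of a repackdeb-style script: unpack a .deb, move
# the contents of aliased top-level directories under /usr, and pack it
# up again. The merged-/usr symlinks on the target system make the old
# paths keep working.

# Move bin, sbin and lib* trees under usr/ inside an unpacked package root.
canonicalize_tree() {
    root="$1"
    for d in bin sbin lib lib32 lib64 libo32 libx32; do
        [ -d "$root/$d" ] || continue
        mkdir -p "$root/usr/$d"
        cp -a "$root/$d/." "$root/usr/$d/"
        rm -rf "$root/$d"
    done
}

# repackdeb in.deb out.deb -- requires dpkg-deb
repackdeb() {
    tmp=$(mktemp -d)
    dpkg-deb -R "$1" "$tmp"
    canonicalize_tree "$tmp"
    dpkg-deb -b "$tmp" "$2"
    rm -rf "$tmp"
}
```

Note that a real script would also need to patch maintainer scripts and
metadata that mention the old paths; this only covers the file moves.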

Sorry for missing the attachments.

Helmut


repackdeb.sh
Description: Bourne shell script


autorepack.sh
Description: Bourne shell script


createchroot.sh
Description: Bourne shell script

