On Mon, 2022-04-11 at 10:13:52 +0200, Ansgar wrote:
> Package: dpkg
> Version: 1.21.7
> Severity: wishlist

> Someone wondered on IRC why we ship symbols files in shared library
> packages instead of the associated -dev packages

In theory there is no technical limitation that would prevent shipping
these in -dev packages, as the tools only care about explicitly linked
shared libraries (so a transitive shared library whose shlibs/symbols
files are not present on the system should not be a problem). The
problem is that the tooling needs to be able to find these files: it
goes from the object's SONAME declaration to the shared object on disk,
then looks for the package owning that file, and then looks for the
shlibs or symbols files present in that package (either from the system
or from the package build directories).
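
As a rough illustration of that lookup chain, here is a minimal shell
sketch. It is not how dpkg-shlibdeps is implemented: the real tool does
its own library path search and also looks into the package build
directories, while this sketch only consults the ldconfig cache and the
installed dpkg database. The helper names (parse_needed, parse_ldcache,
parse_owner, symbols_for_binary) are made up for this example, and it
assumes objdump, ldconfig, dpkg and dpkg-query are available in PATH.

```shell
# Extract the SONAMEs from the DT_NEEDED entries in "objdump -p" output
# (read from stdin). These are the explicitly linked shared libraries.
parse_needed() {
    awk '$1 == "NEEDED" { print $2 }'
}

# Given "ldconfig -p" output on stdin, print the first pathname listed
# for the SONAME passed as $1. Assumption: the first match is the right
# one, which might not hold on multi-arch systems.
parse_ldcache() {
    awk -v so="$1" '$1 == so { print $NF; exit }'
}

# Reduce "dpkg -S" output lines ("pkg[:arch]: /path") to the package
# name. Assumption: one owning package per line, no diversion lines.
parse_owner() {
    sed 's/: .*$//'
}

# Walk the chain for one binary: SONAME -> file on disk -> owning
# package -> symbols/shlibs control files of that package.
symbols_for_binary() {
    objdump -p "$1" | parse_needed | while read -r soname; do
        path=$(ldconfig -p | parse_ldcache "$soname")
        [ -n "$path" ] || continue
        pkg=$(dpkg -S "$path" 2>/dev/null | parse_owner | head -n 1)
        [ -n "$pkg" ] || continue
        for kind in symbols shlibs; do
            file=$(dpkg-query --control-path "$pkg" "$kind" 2>/dev/null)
            if [ -n "$file" ]; then
                echo "$soname $pkg $file"
            fi
        done
    done
}
```

Calling f.ex. "symbols_for_binary /usr/bin/dpkg" would then print one
line per control file found. If the files lived in <lib>-dev instead,
the "owning package" step would no longer lead to them directly.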

(This should probably be documented somewhere explicitly, as I did not
see anything obvious in either the dpkg docs or the Debian policy manual.)

> and noted that they
> take 1% of disk space for a fresh debian:sid docker container (and
> probably more on the -slim variant of the container).

I just checked (OOC) and f.ex. on the current sid-slim variant it
seems to be around 1.77%. The actual size of these files there is
1.6 MiB (according to du -sch). After a quick look I see trivial
targets that would more than offset that, f.ex.:

  $ dpkg -P e2fsprogs libext2fs2 mount gcc-9-base
  $ rm -f /var/cache/debconf/*.dat-old
  $ rm -f /var/cache/debconf/templates.dat

Or other more localized/focused things like there being both libpcre2-8-0
and libpcre3, or (libcrypto + libssl) + (libhogweed + libnettle +
libp11-kit + libtasn + libidn2 + libunistring + libgnutls), that would
give way more significant gains (f.ex. getting rid of the GnuTLS stack
would amount to something like an additional 8 MiB, including
reduction from the then no longer present symbols files).

Matthias Klose seems to have implied (in a bug report) that he does not
find symbols files for C++ libraries very helpful, so if he decided to
stop shipping them for libstdc++6 (the biggest there), that would be
an additional 400 KiB reduction.

Otherwise, making dpkg transparently compress such files in the db
would reduce the size by 1.1 MiB (with just gzip), which is something
that I had already considered for the old changelog in the dpkg db
proposal.
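
To get comparable numbers on another system, something like the
following sketch could be used. The function names are invented for
this example, and it sums byte sizes, so the results will differ a bit
from du, which reports disk usage. Compressing each file on its own is
only an assumption about how such a scheme could work; dpkg does not
read compressed symbols files today.

```shell
# Sum the sizes in bytes of the files given as arguments.
total_bytes() {
    cat -- "$@" | wc -c | awk '{ print $1 }'
}

# Sum the sizes in bytes the same files would have if each one were
# compressed individually with gzip at its default level.
total_gzip_bytes() {
    for f in "$@"; do
        gzip -c -- "$f" | wc -c
    done | awk '{ sum += $1 } END { print sum + 0 }'
}
```

F.ex. running "total_bytes /var/lib/dpkg/info/*.symbols" and
"total_gzip_bytes /var/lib/dpkg/info/*.symbols" and subtracting the
results gives an estimate of the potential saving.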

> It would be nice if it was possible to exclude symbol files from such
> environments. This could mean:
> 
>  - Ship them in the -dev package instead.

While this could potentially be done, it seems to me that the global
effort required and the resulting properties might not be a very good
trade-off for the current gains of less than 2 MiB (or potentially
around 500 KiB) of space there.

Conceptually, storing them in either <lib> or <lib>-dev packages can
be argued to make sense, and each option has good and bad properties.

Shipping them in <lib>:

  - They are guaranteed to be kept in sync with the shared object they
    describe (no requirement for guaranteeing exact version dependencies
    between <lib> and <lib>-dev, even though this tends to be current
    practice).
  - They do not require adding some way to back-reference the <lib>-dev
    package corresponding to its <lib> (a new control field f.ex.).
  - They do not depend on the <lib>-dev package being arch:any (which
    Multi-Arch would require, but that's an optional feature from dpkg
    PoV).
  - (There could be external functional reliance on these files being
    shipped in <lib> packages to extract specific symbol version
    information, as this can be considered part of the interface.)

Shipping them in <lib>-dev:

  - They are shipped in the package whose presence denotes that the
    files might get used, and don't "waste" space in case no building
    is going to be happening.
  - The Build-Depends-Package field in symbols files could be somehow
    simplified into some boolean variant (but not its
    Build-Depends-Package_s_ counterpart, although both of these are
    optional, unlike the required new back-reference field in the control
    file).

So it seems to me that doing this change would imply that:

 - Maintainers (not just debhelper) would need to modify the packaging
   to move those files to the new package (which has global impact), for
   a potentially very long-winded transition.
 - Switching all libraries seems like a rather large undertaking for
   a potentially ~500 KiB gain TBH. Switching only packages in the minbase
   set would create a packaging oddity and non-uniformity. :/
 - Regardless of a full or a partial transition, both locations would
   need to be supported anyway, which would also make packaging a bit
   more complicated/confusing.

This seems in contrast to other proposals to reduce the essential set,
which also imply global efforts, but in exchange bring a reduction in
complexity by making f.ex. the bootstrapping requirements smaller, or by
making dependencies explicit to get rid of implied assumptions. This
case seems to instead end up adding new complexity.

>  - Ship them in a well-known location in /usr (they are not variable
>    state data after all); this would allow the regular exclusion
>    mechanism already used by -slim images to be used here as well.

They are varying packaging state metadata, like everything else stored
in the dpkg database. The excludes currently in use all seem to be for
files that do not alter functionality anyway, whereas excluding these
files would render these images unusable as bases for build containers.

(Not to mention the additional complication of having to encode these
pathnames in a way compatible with the dpkg db so that f.ex.
Multi-Arch can be handled correctly, or whatever new requirements
might be coming along, w/o needing to encode the location format
somewhere else.)

Guillem
