Bug#650077: dpkg: The Installed-Size estimate can be wrong by a factor of 8 or a difference of 100MB

2011-11-26 Thread Helmut Grohne
Package: dpkg
Version: 1.16.1.2
Severity: wishlist

Symptom
~~~~~~~
I just installed libjs-mathjax. According to its Installed-Size this
would just consume 16512KB. Now according to policy this is just an
estimate of course. But how accurate is it actually? So I installed said
package on ext3. Turns out /usr/share/javascript/mathjax takes up
127296KB and /usr/share/doc/mathjax takes another 1200KB. So our
estimate is wrong by a factor of 8 or a difference of 100MB. This
estimate is also used to determine whether the disk has enough space, so
if my disk just had 50MB left, aptitude would have tried to install this
package and failed.

The actual problem
~~~~~~~~~~~~~~~~~~
Problems with Installed-Size are not exactly new as discussion in
http://bugs.debian.org/534408 (unit for Installed-Size) and
http://bugs.debian.org/630533 (usage of du --apparent-size) have shown.
So what is different this time? Installing the very same package on a
btrfs yields a size that is much closer to the listed Installed-Size. (I
don't have any numbers on this.) So whatever dpkg puts into this field,
it *will* be wrong somewhere. The policy already mentions that this
estimate cannot be accurate everywhere, but in fact it will be wrong by
a factor of at least sqrt(8) (about 2.8) or a difference of at least 50MB
(=100MB/2) somewhere. Any attempt to change the computation of this
value thus cannot fix this bug.
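
The sqrt(8) bound can be checked with a few lines of arithmetic. This is a minimal sketch: the function name is made up, and using the listed Installed-Size as a stand-in for the btrfs usage is an assumption, since no btrfs numbers were measured.

```python
import math


def best_single_estimate(size_a_kb, size_b_kb):
    """Return (estimate, worst_factor) minimising the worst-case ratio error.

    The geometric mean is off by the same factor on both filesystems;
    any other single value is off by more on one of them.
    """
    estimate = math.sqrt(size_a_kb * size_b_kb)
    worst_factor = max(size_a_kb, size_b_kb) / estimate
    return estimate, worst_factor


# Assumed inputs: listed Installed-Size vs. measured ext3 usage (KB).
listed_kb = 16512
ext3_kb = 127296 + 1200
estimate, factor = best_single_estimate(listed_kb, ext3_kb)
print(f"best single estimate: {estimate:.0f} KB, still off by x{factor:.2f}")
```

Whatever single value is chosen, one of the two filesystems sees an error of at least this factor.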

Discussion
~~~~~~~~~~
In the example of libjs-mathjax the reason for the huge difference is
the inclusion of a large number of very small files. Some filesystems
allocate a block for each of these files and others are able to store
multiple files in a block. A simple approach could be to include an
additional field (Installed-Files?) that returns the number of files
in the package. A second estimate for the Installed-Size would then be
given by the number of files times the block size. The maximum of both
estimates could be used. It would solve the immediate symptoms with
libjs-mathjax. It is not without problems though. For instance I
did not explain what block size to use. An administrator may have
different file systems set up for / and /usr. Also the question remains
whether this feature is worth the associated effort.
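
The max-of-both-estimates idea could look roughly like this. Installed-Files is only the field proposed above, not an existing one, and the 4096-byte default block size and the file count in the example are assumptions for illustration.

```python
def combined_estimate_kb(installed_size_kb, installed_files, block_size=4096):
    """Return the larger of the declared size and files * block size (in KB)."""
    # Second estimate: every file occupies at least one block.
    per_file_bytes = installed_files * block_size
    per_file_kb = (per_file_bytes + 1023) // 1024  # round up to whole KB
    return max(installed_size_kb, per_file_kb)


# Hypothetical package: 16512 KB declared, 30000 small files.
print(combined_estimate_kb(16512, 30000))
```

The open question from the text remains: the block size is a property of the target filesystem, so it has to be supplied at installation time rather than at build time.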

To get discussion going I pull in debian-policy@l.d.o.

Helmut



-- 
To UNSUBSCRIBE, email to debian-policy-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/2026110639.ga30...@alf.mars



Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-02-21 Thread Helmut Grohne
Package: debian-policy
Severity: wishlist

Apparently the debian-policy currently says nothing about the characters
used in filenames contained in binary packages. Most packages use common
sense and only use a small subset of US-ASCII. In Debian sid main most
filenames can be represented using the following subset of US-ASCII
characters (written as a regular expression):

[][a-zA-Z0-9{}() ^/,=:!*%#$~@+._-]

The number of exceptions is about 200 contained in about 50 binary
packages. In those packages some filenames are not representable as
UTF-8 (for example aspell-is) and others don't make any sense in
ISO-8859-15 (for example ca-certificates).
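
A check against this subset can be sketched as follows. The character class is the one given above with the brackets escaped for Python's re module; treating filenames as bytes and the function name are assumptions of this sketch.

```python
import re

# Same character class as in the mail, with the brackets escaped for
# Python's re module; fullmatch() anchors it to the whole name.
COMMON_SUBSET = re.compile(rb"[\]\[a-zA-Z0-9{}() ^/,=:!*%#$~@+._-]+")


def uses_common_subset(name: bytes) -> bool:
    """True if every byte of the (non-empty) name is in the common subset."""
    return COMMON_SUBSET.fullmatch(name) is not None


print(uses_common_subset(b"usr/share/doc/dpkg/README.Debian"))
print(uses_common_subset("usr/share/caf\u00e9".encode("utf-8")))
```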

It would be nice if some common ground concerning filename encoding
could be reached. The options range from a rather restrictive definition
of acceptable characters via requiring filenames to be representable in
US-ASCII to mandating a particular encoding (such as UTF-8). This could
be first introduced as a SHOULD and later turned into a MUST.

Personally I do not really care about what the precise restriction is as
long as it permits a mechanical transformation to unicode.

Helmut





Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-02-22 Thread Helmut Grohne
Thanks for your comments.

On Sat, Feb 23, 2013 at 01:31:32PM +0900, Charles Plessy wrote:
> - There are here and there discussions raising possible corner cases
>   where distributing files with a name not representable in UTF-8 might
>   be justified, for instance in test suites.

Even though the general argument is correct, the particular example
probably applies to source packages in most cases. We don't control
source packages (unless we repack them), so I think they should not be
covered by a filename encoding policy.

> - Similar discussion also took place in #99933.  I wonder about merging this
>   bug (#701081) and #99933.

I stumbled upon this bug before reporting this one and decided that the
issues were sufficiently separate from each other to warrant a new bug
number. I did not read the full bug log and therefore did not discover
that its scope widened to filenames as well. The discussion found
therein clearly is valuable. I still think that separating bugs for
filename encoding and file content encoding is a good idea, because
those issues can be solved independently. That said, merging also makes
sense to point to the rest of the discussion. In the latter case, please
select a better summary message.

I have to admit that I am slightly in favour of just copying Fedora's
approach. Making distributions more compatible with each other seems
like a worthwhile thing to do.

Helmut





Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-04-01 Thread Helmut Grohne
On Sun, Mar 24, 2013 at 08:01:03PM +0900, Charles Plessy wrote:
> after more than one month of discussion, we have not reached a conclusion.

Thanks for the ping.

> In the current situation there is no policy, which means that everything is
> allowed.  Indeed, there is at least one package with filenames using more than
> one set of non-ASCII characters, so no user can see correctly the names of
> every file in this package at the same time.

Some more data here. I checked sid main amd64 binary packages. The only
ones containing invalid UTF-8 sequences (and thus violating the current
proposal) would be aspell-is and jpilot. This suggests that UTF-8 is a
de facto standard already. Fixing two packages shouldn't be that hard. I
have filed a wishlist bug #704446 against lintian to check for this
regardless of the outcome of this bug.
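
The check itself is simple. A minimal sketch, assuming the filenames of a package are already available as byte strings (the function name is made up):

```python
def invalid_utf8_names(names):
    """Return the byte strings in names that do not decode as UTF-8."""
    bad = []
    for name in names:
        try:
            name.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(name)
    return bad


# The second name is ISO-8859-1 encoded and therefore not valid UTF-8.
sample = [b"usr/share/doc/foo", b"usr/share/\xe9t\xe9", "usr/share/été".encode("utf-8")]
print(invalid_utf8_names(sample))
```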

> On my side, I made a proposal with actionable items: fix the few packages that
> are not using UTF-8, and modify the Policy to reflect the current practice
> of using ASCII most of the time and other UTF-8 characters parsimoniously.

I am in favour of this solution.

 * Requiring any subset of UTF-8 has the direct benefit of being able to
   interpret all filenames used without guesswork.
 * This is in line with Fedora's policy.
 * I saw very little disagreement about whether to permit non-UTF-8
   sequences. Discussion seemed mostly to be around which subset to
   require.

> I understand very well the arguments against having any UTF-8 character at all,
> but we currently have such packages in our archive, so if there is no plan to
> modify these packages, then we can not plan to solve this bug.

I see little benefit with restricting to ASCII compared to the benefit
with restricting to UTF-8. Remember that the goal of this bug was to
make filenames machine readable. I think that further restrictions
should happen in the context of #99933. I asked for not merging these
issues, because I would like to keep the scope of this issue limited and
thus implementable.

> Can others comment how they would like to see this bug solved ?

Any proposal that limits to a subset of UTF-8 and a superset of
printable ASCII is fine with me. My preferred choice would be just
UTF-8. I have no objections to recommending the use of a subset of
printable ASCII either.

To me it appears to be a matter of wording right now. Consensus is
basically there. Implementing it would cause two policy violations
(aspell-is and jpilot), which imo is little impact.

Helmut





Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-04-08 Thread Helmut Grohne
On Sat, Apr 06, 2013 at 08:20:15PM +0900, Charles Plessy wrote:
>   <sec id=filenames>
>     <heading>File names</heading>
>
>     <p>
>       The name of the files installed by binary packages in the system PATH
>       (namely <tt>/bin</tt>, <tt>/sbin</tt>, <tt>/usr/bin</tt>,
>       <tt>/usr/sbin</tt> and <tt>/usr/games/</tt>) must be encoded in
>       ASCII.
>     </p>
>
>     <p>
>       The name of the files and directories installed by binary packages
>       outside the system PATH must be encoded in UTF-8 and should be
>       restricted to ASCII when they can be represented in that character
>       set.
>     </p>
>   </sec>
>
> What do you think ?

Thanks to all involved parties for your work on this issue. I am very
much satisfied with the result and happy that it is met with consensus.
The suggestions of Julian Gilbey appear sensible, but do not touch the
general direction.
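
The two "must" rules of the proposed wording are mechanically checkable. The sketch below is one possible reading, not part of the proposal: the function name and return values are invented, and paths are assumed to be absolute byte strings. The "should be restricted to ASCII" clause is not covered, since whether a name can be represented in ASCII is a human judgement.

```python
# Directories named in the proposed wording as the system PATH.
PATH_DIRS = (b"/bin/", b"/sbin/", b"/usr/bin/", b"/usr/sbin/", b"/usr/games/")


def check_filename(path: bytes) -> str:
    """Return 'ok' or 'violation' for an absolute path shipped by a package."""
    is_ascii = all(byte < 0x80 for byte in path)
    if path.startswith(PATH_DIRS):
        # Files in the system PATH must be ASCII.
        return "ok" if is_ascii else "violation"
    try:
        # Everything else must at least be valid UTF-8.
        path.decode("utf-8")
    except UnicodeDecodeError:
        return "violation"
    return "ok"


print(check_filename(b"/usr/bin/ls"))
print(check_filename("/usr/share/doc/caf\u00e9".encode("utf-8")))
```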

Helmut





Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-04-14 Thread Helmut Grohne
On Sun, Apr 14, 2013 at 11:58:03AM +0200, Bill Allombert wrote:
> I think configuration files should also be included in the first list, because the
> user is supposed to be able to interact directly with them.

I object to this extension of the proposal, because use of UTF-8
characters in conffile names is a current use case of ca-certificates.
If anything it could be treated as a should and turned into must
after working with the ca-certificates maintainers on a solution.

Helmut





Bug#701081: debian-policy: mandate an encoding for filenames in binary packages

2013-04-14 Thread Helmut Grohne
On Sun, Apr 14, 2013 at 02:22:47PM +0200, Bill Allombert wrote:
> Why files in ca-certificates are configuration files in the first place ?
> I doubt users are expected to edit PEM certificate.

Correction of what I said before: ca-certificates does not ship them as
conffiles, but as configuration files.

Actually they are symbolic links to the actual certificates shipped
within /usr/share. The purpose of the links is to allow the user to
remove particular certificates that she does not trust. As such those
symbolic links express configuration choices.

As it stands I see ca-certificates as a valid use case of UTF-8
characters in configuration file names. I strongly suggest talking to
the ca-certificates maintainers before changing the policy in this
way.

The reason for reporting this bug was to get a way to interpret
filenames *now*. The proposed wording (by Charles Plessy) enables us to
do so. I would like to see further restrictions on filenames deferred to
another issue, because it has less of a perceived benefit and there is
not the broad consensus and support for further restrictions. Clearly
further discussion is required for these.

Helmut





Bug#443902: debian-policy: (C.3) subdirectories in debian/ not allowed?

2007-09-24 Thread Helmut Grohne
Package: debian-policy
Version: 3.7.2.2
Severity: wishlist
Tags: patch

Appendix C.3 says:
All the directories in the diff must exist, except the debian
subdirectory of the top of the source tree, which will be created by
dpkg-source if necessary when unpacking.

This grants exactly one exception, namely `debian/'. Read strictly,
creating directories like `debian/patches' (dpatch) would violate the
policy. I therefore suggest that `subdirectory' be replaced by
`subtree', `subdirectory and subdirectories thereof' or something
similar.
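
The two readings differ only for paths below debian/. A small sketch to contrast them; the function and its `strict` flag are hypothetical and say nothing about how dpkg-source behaves:

```python
def may_be_missing(directory: str, strict: bool) -> bool:
    """May a directory named in the diff be absent from the unpacked source?

    strict=True  : only debian/ itself is excepted (literal reading).
    strict=False : the whole debian/ subtree is excepted (suggested wording).
    """
    parts = directory.strip("/").split("/")
    if parts[0] != "debian":
        return False
    return len(parts) == 1 if strict else True


print(may_be_missing("debian/patches", strict=True))
print(may_be_missing("debian/patches", strict=False))
```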

Helmut

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.20.1 (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=de_DE (charmap=ISO-8859-1)
Shell: /bin/sh linked to /bin/dash

-- no debconf information






Re: Bug#553135: sendmail-base: maintainer-script-calls-init-script-directly prerm:67 than using invoke-rc.d. The use of invoke-rc.d to invoke the /etc/init.d/* initscripts instead of calling them dire

2010-01-22 Thread Helmut Grohne
Hi,

thanks to Manoj for pointing this out and Richard for explaining it.
Unfortunately this rc bug is still open after two months.

Short summary:

sendmail-base.prerm invokes an init script without invoke-rc.d which
technically is forbidden by the Debian policy. (report from Manoj)

The part that is invoked is not a standard command (clean) and would
that way produce a warning. (pointed out by Richard)

Let me outline possible solutions:
1) Tag it as wontfix and decrease severity.
   The reason for using invoke-rc.d is that it can prevent starting and
   stopping daemons when this is not desired. Cleaning the queue does
   not interfere with this.

2) Use invoke-rc.d --force. (suggested by Richard)

3) Move the queue cleaning script somewhere else and call it from the
   init script.

Please decide about a solution and solve this issue.

Helmut





Re: Bug#553135: sendmail-base: maintainer-script-calls-init-script-directly prerm:67 than using invoke-rc.d. The use of invoke-rc.d to invoke the /etc/init.d/* initscripts instead of calling them dire

2010-01-22 Thread Helmut Grohne
severity 553135 normal
thanks

On Fri, Jan 22, 2010 at 01:50:40PM -0800, Russ Allbery wrote:
> That being said, this is clearly not the problem that either Policy or the
> Lintian tag were designed to catch, and you should feel free to decrease
> the severity and add an override.  Also, please feel free to report a bug

Thanks for your input. I am just downgrading the severity for now, so others
don't try to fix it as an rc bug.

Helmut





Re: Bug#650077: dpkg: The Installed-Size estimate can be wrong by a factor of 8 or a difference of 100MB

2015-01-07 Thread Helmut Grohne
On Wed, Jan 07, 2015 at 12:22:47PM +0100, Johannes Schauer wrote:
> It is also worth asking what functionality the Installed-Size field is supposed
> to have when looking for a solution. Its primary purpose is probably to give
> apt a clue of whether or not there is enough free space to install a certain
> package.

This was/is a recurring question. The policy expends 4 on the field in
section 5.6.20. It fails however to clarify the purpose and thus the
preferred way of computing or using it.

> I think that an over approximation would be the right way to go because it is
> better to wrongly warn the user that a binary package might not be installable
> due to insufficient remaining disk space, than to install a package without
> sufficient remaining disk space and only fail once there actually is no more
> space.

Consider Alice. She wants to install foo, which has a good approximation
for her filesystem. Unfortunately, it is too big to be installed. Thus
she looks at other packages and determines that she no longer needs bar.
Duly she issues apt-get install foo bar-. Unfortunately, this command
fails unpacking foo as bar's approximation was bad and thus it does not
free the space advertised in Installed-Size.

>   ( find mathjax-2.4 -type f -print0 \
>   | du --files0-from=- -b; \
>   find mathjax-2.4 \! -type f -printf '1\n' ) \
>   | awk '{total = total + int($1/4096) + 4096}END{print total}'

Slight improvement:

find ... \( -type f -printf '%s\n' \) -o \
 \( ! -type f -printf '1\n' \) | ...
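
For comparison, a block-rounded over-approximation can also be sketched in Python. This is not a translation of the pipeline above: the rounding rule (each regular file rounded up to whole blocks, one block for every other entry) and the 4096-byte block size are assumptions of this sketch, and both function names are made up.

```python
import os
import stat


def block_rounded_total(file_sizes, other_entries, block=4096):
    """Sum file sizes rounded up to whole blocks, plus one block per other entry."""
    total = 0
    for size in file_sizes:
        total += -(-size // block) * block  # ceiling to a block multiple
    return total + other_entries * block


def scan_tree(root):
    """Return (sizes of regular files, count of all other entries) below root."""
    sizes, others = [], 1  # start at 1 to count the root directory itself
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(st.st_mode):
                sizes.append(st.st_size)
            else:
                others += 1  # directory, symlink, device, ...
    return sizes, others
```

Calling `block_rounded_total(*scan_tree("mathjax-2.4"))` on an unpacked tree yields a byte count that can be compared against the Installed-Size field.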

Helmut





Bug#757760: debian-policy: please document build profiles

2017-07-21 Thread Helmut Grohne
On Tue, Jul 18, 2017 at 10:33:06AM +0100, Simon McVittie wrote:
> I suspect stage1 might also still be useful for (possibly pre-emptively)
> breaking cycles involving build-time vs. runtime dependencies, like the one
> that historically existed between glib2.0 and dbus: it seems more
> straightforward to have one profile name than to invent a series of
> nodbus, noglib, noqt, etc. if each one will only be used in practice in
> a small number of places to break cycles.

That may all be true, but simply using "stage1" is harmful in the long
run, because it renders "stage1" meaningless. We have a pile of "stage1"
profiles in the archive already and we essentially lost structure on
them. Some modify binary packages. Others drop binary packages. You
cannot tell what "stage1" does without looking at a particular source
package. The only meaning left is that it is often used with
bootstrapping in unspecified ways, but it doesn't even tell whether
that's native or cross and whether it actually is legacy.

> I'm not sure I agree with this. A maintainer can't reliably know all the
> cycles their package might be involved in over time, sure, but they *can*
> know what is the bare minimum of functionality that their package can
> have and still be useful for build-dependencies, and that's how I used
> stage1 in src:dbus.

It seems that the essence of dbus' stage1 is to reduce as much
functionality as possible. Why not call it pkg.dbus.minimal instead?
That name would directly carry the intention. (See Josch's mail for
rationale.)

We should actually go one step further: We should require that any
profile being used is documented in a "canonical" place. For "standard"
profiles (nocheck, nopython, ...) that'll be some central document (e.g.
policy). For extension profiles (pkg.$sourcepackage.$anything) that'll
be debian/README.source or something similar. Without such documentation
we'll only enlarge the mess we have with "stage1" now.

Let me give a recent example for comparison. I recently had to add a
profile to unbound to break a cycle. unbound builds itself four times
with varying ./configure flags. Reducing it to the one relevant build
pass building the shared library was the obvious thing to do. I could
have called the profile "stage1", but I think "pkg.unbound.libonly" much
better tells what it does. The very same logic applies to dbus, no?

While at it, I'd like to emphasize that it is not forbidden or even
discouraged to use pkg.foo.someprofile in source package bar. The only
requirement is that foo's maintainer agrees with that use (e.g. by
documenting how it is supposed to be used).

I see no urgency in removing "stage1" profiles now. They're a mess, but
a working mess. What I'd like to avoid is furthering the mess. So let's
not add more "stage1" than we have now please. Use descriptive profiles
for new stuff.
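
The naming scheme lends itself to a simple check. A sketch under assumptions: the set of standard profiles below contains only the two names mentioned in this mail and is not a complete list, and the category labels are invented.

```python
# Only the standard profiles named in the mail; deliberately incomplete.
STANDARD_PROFILES = {"nocheck", "nopython"}


def classify_profile(name: str) -> str:
    """Classify a build profile name by where it would have to be documented."""
    if name == "stage1":
        return "discouraged"  # meaning differs from package to package
    if name in STANDARD_PROFILES:
        return "standard"  # documented centrally, e.g. in policy
    parts = name.split(".")
    # Extension profiles look like pkg.$sourcepackage.$anything
    if len(parts) >= 3 and parts[0] == "pkg" and all(parts[1:]):
        return "extension"  # documented by the named source package
    return "unknown"


for profile in ("nocheck", "stage1", "pkg.unbound.libonly", "pkg.dbus.minimal"):
    print(profile, "->", classify_profile(profile))
```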

Helmut



Bug#749826: Documenting `Multi-Arch: foreign`

2017-08-20 Thread Helmut Grohne
Hi Sean,

Thanks for picking up multiarch!

On Sat, Aug 19, 2017 at 09:50:21PM -0700, Sean Whitton wrote:
> I spoke to Russ and we're both of the view that we should document
> multiarch piecemeal.  Let's begin by getting a definition of the
> Multi-Arch: field into ch. 5 of policy.

I'm glad you agree to my proposal.

> I have pushed a new branch to the Debian policy repo named
> bug749826-spwhitton.  On that branch I've committed a slightly reworked
> form of your draft text.[1]  Please review the diff.  Here are some
> comments/issues:

Very welcome.

> - I substantially shortened your text.  Let me know if you think I went
>   too far.

I fear that some important aspects got lost indeed. More on that later.

> - Previously I was worried about defining 'interface' but I've found
>   another place where policy uses this word without defining it, and I
>   don't think it needs to be changed in either place.

I'm not a friend of vagueness, but I do recognize the difficulty in
expressing the requirements precisely.

> - I couldn't figure out how to include this text, because I didn't
>   understand it:
> 
> For instance, using dpkg --print-architecture can be used to emit the
> native architecture even though dpkg is marked Multi-Arch:
> foreign. Similarly, calling pkg-config (without a prefix) will behave
> differently on different architectures as its search path is
> architecture-dependent even though pkg-config is marked Multi-Arch:
> foreign.
> 
>   Are you saying that packages that depend or implicitly depend on dpkg
>   or pkg-config cannot be Multi-arch: foreign, although dpkg and
>   pkg-config themselves are Multi-arch: foreign?  Why are dpkg and
>   pkg-config Multi-arch: foreign, if they provide these
>   architecture-dependent interfaces?

Those are very good questions and clarifying them will lead to a better
understanding of what we have to put into policy. You do understand that
"dpkg --print-architecture" is part of dpkg's interface. Yet its output
varies with its architecture. Taking this strictly would indeed imply
that dpkg is wrongly marked. Similarly, running pkg-config may result in
architecture-dependent paths and thus our strict interpretation would
result in rejecting the foreign marking.

A common theme with such cases is to resort to `Multi-Arch: allowed`
(e.g. make), but that has the downside of requiring most consumers to
attach the :any annotation and that it can never be switched back
(because :any dependencies on packages not marked M-A:allowed are
unsatisfiable).

This is where I thought about README.multiarch:

> - I didn't include your TODO about README.multiarch; let me know whether
>   you have a more concrete idea about the purpose of that file

It can document assumptions one makes about users of a package. For
instance, we expect dpkg users to use `dpkg --print-architecture`
diagnostically only. Similarly, we expect that package builds call
pkg-config if they mean the build architecture and they need to call
$(DEB_HOST_GNU_TYPE)-pkg-config if they mean the host architecture.
Indeed that happens automatically for autotools projects that happen to
use PKG_CHECK_MODULES or PKG_PROG_PKG_CONFIG (i.e. most). It also
happens for cmake when built with dh_auto_build.
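
The pkg-config convention can be written down as a tiny helper. This only illustrates the naming rule described above; the function is hypothetical and the triplet in the example is merely a sample value of DEB_HOST_GNU_TYPE.

```python
def pkg_config_command(target: str, host_gnu_type: str) -> str:
    """Pick the pkg-config name for the 'build' or 'host' architecture."""
    if target == "build":
        return "pkg-config"  # unprefixed: build architecture
    if target == "host":
        return f"{host_gnu_type}-pkg-config"  # triplet-prefixed: host architecture
    raise ValueError(f"unknown target: {target!r}")


print(pkg_config_command("host", "aarch64-linux-gnu"))
```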

Let me give a counter example to illustrate more of the point.
haskell-devscripts-minimal is an `Architecture: all` package with some
shell scripts. Sounds like a good candidate for `Multi-Arch: foreign`.
When you look at /usr/share/haskell-devscripts/Dh_Haskell.sh though, you
see that functions such as cpu(), os(), etc. specifically introspect the
build architecture by using the build architecture ghc. Such usage is
not ok for `Multi-Arch: foreign` (#769377).

I believe that policy should encourage some uniform way to document the
intended interface as we have several cases where this is not obvious.
README.multiarch may be that way. In particular, using a package in a
way not permitted by such README.multiarch would need to be a policy
violation on its own. For instance, one could depend on a shared library
and declare it an implementation detail. Relying on the transitive
dependency would then be considered a policy violation.

> - after we've got text documenting the other possible values of the
>   Multi-Arch: field, we might want to promote the list of things to
>   consider out of the Multi-Arch: foreign subsubsection.  It should
>   become clear once we've got that other text together.

Indeed, documenting `Multi-Arch: same` may be easier (or not). For the
purpose of defining it, we shall call Debian binary packages for
different architectures with equal binary package name and version
"instances" of a package. I currently see the following requirements:

 * It must not be used on `Architecture: all` packages (though I wish
   you could ;).

 * Given any two instances of a package and any filename, that filename
   must be non-existent in at least one package or the type (directory /

Bug#515856: [debhelper-devel] Bug#515856: debhelper: please implement dh get-orig-source

2017-09-18 Thread Helmut Grohne
On Mon, Sep 18, 2017 at 11:28:42AM +0200, Bill Allombert wrote:
> get-orig-source and watch files serve a different purpose.
> 
> get-orig-source is used to build the .orig. tarball from the true
> upstream one. Most package do not need that.  Watch files could not do
> that until recently.
> 
> So the comparison is unfair.
> 
> What needs to be checked is how many get-orig-source rules have been
> reimplemented in terms of watch files.

Challenge accepted. ticharich.d.o has an unpack of debian/rules
files. Most of them are world-readable. A small number (~30) are
inaccessible, so my analysis will have an error of around 0.2%.

A simple method is to just look at which of them contain the string
"get-orig-source" and which of them contain the string "uscan" assuming
that when both show up, get-orig-source is implemented using uscan.
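
The heuristic amounts to two substring tests per rules file. A minimal sketch, with invented category labels; how the files are located and read on the host is left out:

```python
def classify_rules(text: str) -> str:
    """Classify the contents of one debian/rules file by the heuristic above."""
    if "get-orig-source" not in text:
        return "no get-orig-source"
    # Assumption from the mail: if both strings show up, the target uses uscan.
    return "uses uscan" if "uscan" in text else "custom"


example = "get-orig-source:\n\tuscan --download-current-version\n"
print(classify_rules(example))
```

Being a plain substring match, it also counts mentions in comments, so the result is an approximation.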

The following packages do not implement get-orig-source with uscan:

biojava4-live
boinc-app-seti
cjk
edk2
fasttree
freeorion
freerdp
gr-air-modes
gr-fcdproplus
gr-iqbal
gr-osmosdr
htmlunit
ioquake3
iortcw
josm
libb64
libreoffice
libtgvoip
neobio
nvidia-graphics-drivers
nvidia-graphics-drivers-legacy-304xx
pencil2d
pixelmed
qemu
r-cran-rniftilib
sagemath
west-chamber
zsh

So we have around 22500 source packages with watch files, we have 3000
packages with get-orig-source, of those 28 don't use uscan. The fair
comparison is 22500 vs. 28. That's almost 3 orders of magnitude. If anything,
policy should document debian/watch, not get-orig-source. The perl
policy, python policy, elpa policy, ... each affect more packages than
get-orig-source. Keeping it is uneconomic.

Helmut



Bug#872808: [debian-policy] nocheck DEB_BUILD_OPTIONS DEB_BUILD_PROFILES

2017-08-24 Thread Helmut Grohne
On Wed, Aug 23, 2017 at 07:23:14PM +0100, Ghislain Vaillant wrote:
> I also suspect that given DEB_BUILD_PROFILES=nocheck implies
> DEB_BUILD_OPTIONS=nocheck, the same should be true for nodoc?

Just as DEB_BUILD_PROFILES=nocheck does *not* imply
DEB_BUILD_OPTIONS=nocheck (you must set the latter explicitly),
DEB_BUILD_PROFILES=nodoc does *not* imply DEB_BUILD_OPTIONS=nodoc.
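
In code terms, the two variables are parsed independently. A sketch with made-up helper names, treating both variables as whitespace-separated lists:

```python
def option_set(env: dict, flag: str) -> bool:
    """True if flag appears in DEB_BUILD_OPTIONS."""
    return flag in env.get("DEB_BUILD_OPTIONS", "").split()


def profile_active(env: dict, profile: str) -> bool:
    """True if profile appears in DEB_BUILD_PROFILES."""
    return profile in env.get("DEB_BUILD_PROFILES", "").split()


# Setting only the profile leaves the option unset.
env = {"DEB_BUILD_PROFILES": "nocheck"}
print(profile_active(env, "nocheck"), option_set(env, "nocheck"))
```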

In general, I think that this historic split into options and profiles
is unfortunate. If we were to restart now, we'd likely remove nocheck,
nodoc and maybe also nostrip from DEB_BUILD_OPTIONS and use
DEB_BUILD_PROFILES exclusively. That's not where we are unfortunately.

Arguably, the same responsibility we require for nocheck should be
applied to nodoc. Given that the nodoc option has a much lower adoption,
I am in favour of simply deprecating it. We should also remove the
"nodocs" option from the archive while at it.

Furthermore, I question the usefulness of nodoc. Since -doc packages are
generally arch:all, most often you can skip them by doing an arch-only
build. In the cases where documentation is stuffed into arch:any
packages, the option modifies package contents. As such, you can no
longer tell whether your modified package correctly satisfies its
reverse dependencies (that may use parts of the documentation other than
/usr/share/doc/). As such the nodoc option/profile is generally
considered "unsafe". Given that you cannot simply rebuild the world with
nodoc active, I have yet to encounter a practical use of nodoc.  It
seems to be a futile exercise in increasing complexity at present.

Whatever the outcome to the relevant questions is, consensus is not what
we have now.

Helmut



Bug#749826: Documenting `Multi-Arch: foreign`

2017-09-04 Thread Helmut Grohne
Hi Simon,

On Sat, Sep 02, 2017 at 05:26:57PM +0100, Simon McVittie wrote:
> That seems like it might be a bug (or design flaw if you prefer). If a
> package (build-)depends on foo:any, it is saying "I am only using the
> arch-indep parts of foo's interface", whatever those are.

You may call it a feature. The idea here was that :any should not be used
mindlessly. Thus it is only allowed on packages properly marked for that
use with ``Multi-Arch: allowed``. In Build-Depends, you can mostly
achieve the same effect with :native (which essentially is :any on any
package (but Architecture: all packages (though our dependency resolvers
don't agree here))).

> Perhaps a dependency on foo:any by (for example) bar:mips should
> always be satisfiable by foo:mips (as though the :any had been omitted),
> regardless of foo's multi-arch status? This would bring it back to the
> same meaning as omitting the :any, in the trivial case where only one
> architecture is enabled.

That proposal may ease meta data changes indeed. I suspect that it would
also cause a lot of useless :any annotations. It's a double-edged sword.

> Perhaps a dependency on foo:any should be satisfiable by any instance
> of foo that is Multi-Arch: foreign? (In this case the :any is completely
> redundant, because foreign sets up a similar situation from the other end)

After studying Multi-Arch for many years now, I recognize that a core
idea is to almost always flag the architecture constraint on the target
of an edge. To understand this wicked sentence, consider a dependency
graph and label each node (package) with an architecture. Now Multi-Arch
says that by default every edge (dependency) must enforce equal
architecture on both ends. Most of the header's job is relaxing this
restriction. The designers of Multi-Arch decided that this relaxing
should not be a property of the edges (e.g. :any), but a property of the
dependee.
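
The edge rule can be modelled in a few lines. This is a simplified sketch and not a resolver: it ignores `Architecture: all`, :native and explicit architecture qualifiers, and the function name and the value "no" for an unmarked package are assumptions.

```python
def dependency_satisfied(depender_arch, qualifier, provider_arch, provider_multiarch):
    """Can a provider instance satisfy a dependency edge?

    qualifier          : None for a plain dependency, "any" for foo:any
    provider_multiarch : "no", "same", "foreign" or "allowed"
    """
    if qualifier == "any":
        # :any is only honoured when the dependee is marked allowed.
        return provider_multiarch == "allowed"
    if provider_multiarch == "foreign":
        # The dependee relaxes the constraint for every plain edge.
        return True
    # Default: both ends of the edge must have the same architecture.
    return depender_arch == provider_arch


print(dependency_satisfied("mips", None, "amd64", "foreign"))
print(dependency_satisfied("mips", "any", "amd64", "foreign"))
```

The second call shows the point discussed above: a :any edge to a package that is not marked allowed is unsatisfiable, even if that package is foreign.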

Thus the current implementation ensures that :any cannot be used in
situations where it is inappropriate. As you point out, that design is
annoying for meta data transitions.

> > > I think "the files installed by ``Architecture: all`` packages always
> > > provide architecture-independent interfaces." is too broad. The counter
> > > example is haskell-devscripts-minimal. This needs to be weakened
> > > somehow.
> 
> I would argue that these interfaces are architecture-independent from
> the perspective of the package's (lack of) architecture. What they
> are not independent of is the *build machine* architecture, just like
> running uname -m or inspecting /proc/cpuinfo aren't independent of the
> build machine architecture. This is certainly a problem for
> cross-compilation, but it isn't the same issue as in dpkg or pkg-config,
> where the architecture for which dpkg or pkg-config was built gets
> hard-coded into its installed files (as the output of --print-architecture
> or part of the default search path, respectively).

That's a nice view, but it is not the view expressed by Multi-Arch. The
meaning of the header considers the whole installation set as a unit.
Whether you view this in a package building context or runtime context
does not matter. What matters is whether the tools behave differently
when you swap the architecture of underlying parts.

As a side note, we marked pkg-config Multi-Arch: foreign, but that is
technically wrong on another level. The marking would imply that it
doesn't matter which architecture you use to supply the package. A
prospective README.multiarch would need to say that you must not use
plain pkg-config (without a triplet prefix). Yet that is what most
packages do. If you perform an archive rebuild of pkg-config build-rdeps
on amd64 in a chroot with preinstalled pkg-config:i386, the majority of
builds will fail even though their Build-Depends are installable.
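
A minimal sketch of what "use a triplet prefix" means for a build helper.
It assumes the build environment exports `DEB_HOST_GNU_TYPE` (as
dpkg-architecture does) and that a prefixed pkg-config executable is
installed; the function name `pkg_config_command` is hypothetical.

```python
import os


def pkg_config_command(env=None) -> str:
    """Return the triplet-prefixed pkg-config name for the host architecture."""
    env = os.environ if env is None else env
    triplet = env.get("DEB_HOST_GNU_TYPE")
    if not triplet:
        # Falling back to plain "pkg-config" would reintroduce the problem.
        raise KeyError("DEB_HOST_GNU_TYPE is not set")
    return f"{triplet}-pkg-config"
```

Raising instead of falling back to the unprefixed name is deliberate here,
since the unprefixed tool answers for whatever architecture it was built for.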

This is another place where we bend the rules just to make it barely
useful. For performing useful cross builds, one needs to discard host
architecture instances of ``Multi-Arch: foreign`` packages.

> > > For instance, the policy should make it
> > > clear that marking libmdds-dev `Multi-Arch: foreign` (fictional, see
> > > #843023) would be a policy violation.
> 
> It is not clear to me that doing so *should* be a policy violation. If
> libmdds-dev contains only headers (no shared or static library), and it
> exposes architecture-independent libboost-dev headers (but no Boost
> shared or static library), is there really anything wrong with having
> libboost-dev from "the wrong architecture"?

As long as everything is header-only, you can use ``Multi-Arch:
foreign``.  The thing is, even if libboost-dev was
architecture-independent, it would expose libstdc++-7-dev. Since
exposure is transitive, that carries over to libmdds-dev.

Boost's dependency on libstdc++-4.8-dev | libstdc++-dev looks a bit
strange though. Since libc++-dev provides libstdc++-dev (and no compiler
will just use libc++-dev when it is installed without further 

Bug#749826: Documenting `Multi-Arch: foreign`

2017-09-04 Thread Helmut Grohne
On Sat, Sep 02, 2017 at 08:44:14AM -0700, Sean Whitton wrote:
> Rather than introduce the new terminology 'intended interface', which we
> would definitely have to define, how about something like this:
> 
> If all a package's architecture-dependent interfaces are listed in
> README.multiarch, the package is not considered to have any
> architecture-dependent interfaces for the purposes of determining
> whether it may be labelled Multi-Arch: foreign.

This is not how it works. It's not like you can just mark any package
Multi-Arch: foreign after saying that it is architecture-dependent. That
documentation must come with a contract saying that reverse dependencies
must not use those architecture-dependent interfaces.

> If libc6's use is legitimate then it seems we'd need to include this as
> an exception.

Well, it's not exactly legitimate. It's more like unavoidable as Simon
pointed out in his reply. Technically, libc6's behaviour is wrong and
causes unpack errors. The reasonable solution would be prohibiting
coinstallation of libc6:mips and libc6:mipsel, but package metadata does
not allow us to do that currently (#747261 -> self-conflicts are always
ignored). The other option of removing Multi-Arch: same from libc6 would
essentially render Multi-Arch useless. So all we can do now is pretend
the issue wasn't there.

> > * If you rebuild the source package with a very different
> > installation set (i.e. much newer Build-Depends), does it still
> > have to match with older instances? Example: #825146. What
> > divergence in installation sets is ok?
> 
> We could just say that it must match the instances in the target suite.

We could. That would render libgiac0 rc buggy for instance, because it
was built on mips64el three weeks later than on other architectures and
thus uses an incompatible gettext.

That definition is pretty annoying for bootstraps though, as replicating
ancient toolchains is kinda the opposite of what bootstrappers do.

> >(A simple way to satisfy this requirement is to use
> >architecture-dependent paths exclusively. That works except for
> >/usr/share/doc/$pkg.)
> >
> >  * The maintainer scripts must handle multiple configuration and
> >multiple deconfiguration correctly. In particular, a package can be
> >purged for one architecture while being installed for another.
> >Example: #682420.
> >
> >(A simple way to satisfy this requirement is to not ship maintainer
> >scripts.)
> >
> >  * Source packages carrying any binary package marked `Multi-Arch: same`
> >must always be binNMUed in lock-step. (Presently violated e.g. by
> >libselinux1)
> 
> Could you turn this into some commits against my branch, please?

I tried and ran into a new problem: I am now convinced that we cannot
just describe one Multi-Arch value after another as they do share some
common values. That "interface" aspect and architecture-constraints on
dependencies is a common theme and likely deserves an introductory text.

Yet, I am attaching what I have.

> It sounds like we need to just drop the whole bullet point.
> Architecture: all packages need to be checked carefully, just like
> Architecture: any packages.

Reworded.

> To my mind, the most important ways to achieve readability in this case
> are
> 
> - avoid repetition
> - avoid "probably", "likely" sentences.

The latter is particularly hard, because we violate the strict
definitions more often than is immediately apparent.

As Simon's mail demonstrates, we likely need more answers/consensus
before continuing. I'll reply in a separate mail.

Helmut
diff --git a/policy/ch-controlfields.rst b/policy/ch-controlfields.rst
index 509a96e..e6451d5 100644
--- a/policy/ch-controlfields.rst
+++ b/policy/ch-controlfields.rst
@@ -1028,6 +1028,18 @@ control file.
 We consider the meaning of each possible value of this field
 separately.
 
+``Multi-Arch: no``
+++
+
+This value is the default. When satisfying a dependency on a package
+(implicitly) marked ``Multi-Arch: no``, the depender and the dependee
+must have the same architecture. For the purpose of this matching,
+``Architecture: all`` packages are treated as if they had the
+architecture value of ``dpkg``.
+
+The value ``no`` cannot currently be used in binary packages due to
+limitations of the archive processing.
+
 ``Multi-Arch: foreign``
 +++
 
@@ -1037,12 +1049,15 @@ architecture.
 In order to determine whether this holds, you should consider
 
 the files installed by the package
-``Architecture: all`` packages always provide
-architecture-independent interfaces.  Shared and static libraries
-provide architecture-dependent ABIs.  Binary executables may
-provide architecture-independent interfaces: could software
-interacting with the executable determine the architecture for
-which it was built without reading the executable file?
+``Architecture: all`` packages tend to provide

Bug#924401: base-files fails postinst when base-passwd is unpacked

2019-03-15 Thread Helmut Grohne
Hi Santiago,

On Fri, Mar 15, 2019 at 11:58:12AM +0100, Santiago Vila wrote:
> blame for such bug, is annoying me. (So, Helmut, please file a bug
> in the bootstrapping tool which does not work for you, and do not
> try to fix it here).

I reject the view that multistrap is buggy. You cite undocumented
behaviour as a reason to mark it buggy. However, multistrap relies on
semantics presently assured by policy. Given that policy talks about
unpacked packages, applying it to bootstrap (in its present wording) is
reasonable. There is a bug somewhere between policy, base-files and
base-passwd, which is exactly how I filed it.  Once this bug is fixed
(in any one of these components), additional bugs can result from that.

I think at least Guillem and Santiago were arguing that policy should
not be applied to bootstrap. While I don't like that view, I do find it
reasonable. It can be made explicit in section 3.8 quite easily:

 Since dpkg will not prevent upgrading of other packages while an
 ``essential`` package is in an unconfigured state, all ``essential``
 packages must supply all of their core functionality even when
-unconfigured. If the package cannot satisfy this requirement it must not
+unconfigured after being configured at least once.
+If the package cannot satisfy this requirement it must not
 be tagged as essential, and any packages depending on this package must
 instead have explicit dependency fields as appropriate.

After doing so, we'll likely need to do something about mmdebstrap and
multistrap as well as furthering our utopia about declarative
replacements for maintainer scripts.

Helmut



Bug#924401: base-files fails postinst when base-passwd is unpacked

2019-03-14 Thread Helmut Grohne
On Thu, Mar 14, 2019 at 07:50:27AM +0100, Johannes Schauer wrote:
> > I would certainly consider a lot cleaner to add a new field to base-files in
> > the form "Bootstrap-Depends: base-passwd" than converting all chowns in
> > postinst to use integer numbers.
> 
> I agree that we should not expect maintainers to write numeric user and group
> ids into their maintainer scripts. This is not only hard to write but also hard
> to read and maintain. In my opinion, using numeric ids should only be a
> temporary measure until we have a declarative method or other helper that does
> the correct translation instead. But since no such helper exists right now,
> numeric ids are probably the best way to fix this bug for buster.

I object to this view. It was never suggested to have anyone write
numeric ids. What Simon suggested was writing symbolic names and have
the package build use static allocations to translate these symbolic
names to numeric ids at build time. This is a whole different story than
having to write them.

If we agree that this would be the best fix for buster, I volunteer to
write a patch for base-files to implement that. Doing so would be easier
if using a more featureful template interpolation language than sed. Do
you (Santiago) have any preference here? I could think of
m4/sh/perl/python. sed will work, but might be ugly.
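
For illustration, a rough sketch of such a build-time translation in
Python. The `@UID:name@` / `@GID:name@` placeholder syntax, the function
name and the example path are all invented here; the tables list only
root, whose ids are 0, and a real patch would cover the other static
allocations as well.

```python
import re

# Hypothetical excerpt of the static allocations; only root is listed.
STATIC_UIDS = {"root": 0}
STATIC_GIDS = {"root": 0}

# Made-up placeholder syntax: @UID:name@ and @GID:name@.
_PLACEHOLDER = re.compile(r"@(UID|GID):([a-z_][a-z0-9_-]*)@")


def interpolate_ids(template: str) -> str:
    """Replace symbolic placeholders with numeric ids at build time."""
    def replace(match):
        table = STATIC_UIDS if match.group(1) == "UID" else STATIC_GIDS
        # An unknown name raises KeyError, so the build fails loudly.
        return str(table[match.group(2)])
    return _PLACEHOLDER.sub(replace, template)
```

A template line such as `chown @UID:root@:@GID:root@ /var/lib/example`
would then end up as `chown 0:0 /var/lib/example` in the shipped postinst,
while the maintainer keeps writing symbolic names.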

On Thu, 14 Mar 2019 10:21:30 +0100, Santiago Vila wrote:
> The way I see it, if base-files fails during bootstrapping it's not
> because it does not "help" the bootstrapping tool, but because the
> bootstrapping tool didn't bootstrap base-passwd in the first place.

I think this view is difficult. How is a bootstrap tool supposed to know
that it must configure base-passwd before base-files? Where should we
document that? Basically everyone in this thread except you argued that
requiring such out-of-band knowledge is bad. And if that really is
required, I think policy should be a little more explicit about that. Even
though Guillem found a paragraph that supports this view, it is quite
implicit at present.

> Now the question would be if we really need to add a paragraph to
> Debian Policy, "Recommendations/guidelines for bootstrapping tools",
> clearly stating that bootstrapping tools should bootstrap base-passwd
> before trying to configure base-files. I think that would be quite
> clear by now, but I could be wrong.

I actually don't think that policy should document this dependency,
because it really should be an implementation detail. From my
perspective, making it explicit that policy only applies post-bootstrap
is sufficient (e.g. copying the "configured at least once" language from
section 6.5).

If on the other hand, we require the literal interpretation that
base-files must be able to configure while other essential packages are
only unpacked and never configured, it all becomes a lot easier to
reason about. Simon's proposal implements that easily and is a
maintainable solution.

Helmut



Bug#924401: base-files fails postinst when base-passwd is unpacked

2019-03-12 Thread Helmut Grohne
Package: base-passwd,base-files,debian-policy

Debian policy section 3.8 says:

| Essential is defined as the minimal set of functionality that must be
| available and usable on the system at all times, even when packages
| are in the “Unpacked” state.

When unpacking (but not configuring) a buster or unstable essential
package set, nothing creates /etc/passwd. Creation of that file is
performed by base-passwd.postinst. base-files.postinst relies on a
working /etc/passwd by using e.g. "chown root:root".
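
To make the failure mode concrete, here is a small model of how a
"user:group" argument gets resolved. It is only a sketch: the function
`resolve_owner` is hypothetical and parses file contents passed in as
strings rather than calling the C library, but the field layout is that of
passwd(5) and group(5).

```python
def resolve_owner(spec: str, passwd_text: str, group_text: str) -> tuple[int, int]:
    """Resolve "user:group" to (uid, gid) from passwd/group file contents."""
    def lookup(name: str, text: str) -> int:
        if name.isdigit():
            # Numeric ids need no database lookup at all.
            return int(name)
        for line in text.splitlines():
            fields = line.split(":")
            # In both files, field 0 is the name and field 2 the numeric id.
            if fields[0] == name:
                return int(fields[2])
        raise KeyError(name)

    user, group = spec.split(":")
    return lookup(user, passwd_text), lookup(group, group_text)
```

With empty file contents, "root:root" cannot be resolved while "0:0" still
works, which mirrors the situation before base-passwd.postinst has run.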

Now we can make a choice:
A. /etc/passwd is part of base-passwd's interface and base-files is
   right in relying on it working at all times. Then base-passwd is rc
   buggy for violating a policy must. Fixing this violation is
   technically impossible.
B. /etc/passwd is not part of base-passwd's interface and base-files
   wrongly relies on its presence rendering base-files rc buggy.
C. Guillem Jover hinted that policy expects every essential package to
   be configured at least once. The current text does not make this
   assumption clear. If it holds, policy would simply say nothing about
   how to bootstrap an essential system, which may be fine. Given that
   we have debootstrap, cdebootstrap, multistrap, and mmdebstrap, it
   seems like specifying the bootstrap interface would be a good idea.
   Unfortunately, I don't exactly understand the bootstrap interface at
   present. In practice, you cannot run postinsts of essential packages
   in arbitrary order.

I argue that something is buggy. I'm not sure what. I gave three
options. Can we gather consensus on one of these?

Helmut



Bug#924401: base-files fails postinst when base-passwd is unpacked

2019-03-12 Thread Helmut Grohne
Hi Santiago,

On Tue, Mar 12, 2019 at 06:17:50PM +0100, Santiago Vila wrote:
> To be precise: Who is unpacking (but not configuring) a buster or
> unstable essential package set, if not a bootstrapping tool?

multistrap is doing just that.

https://manpages.debian.org/testing/multistrap/multistrap.1.en.html
| Once installed, the packages themselves need to be configured using
| the package maintainer scripts and "dpkg --configure -a", unless this
| is a native multistrap.

> Do any of them still don't know that base-passwd should be configured
> first because otherwise any other package using root (be it base-files
> or any other) will fail? I think this was already settled in the last
> discussion we had about this several years ago.

multistrap doesn't take care of this and you can provoke a
base-files.postinst failure this way.

Then there is mmdebstrap. I looked into it and couldn't find any code
that orders base-passwd or base-files or creates an /etc/passwd. It
might not fail now.

> Can you provide at least a bug number for the bootstrapping tool that
> apparently still tries to configure all packages at once, or
> base-passwd and base-files in the same row?

#924401, but I'm not yet sure which part we need to fix.

I really like Simon's (thank you for that enlightening reply) view of
interpolating the uids. It removes a bunch of problems from the equation
and works well when bootstrapping from non-Debian or from ancient Debian
releases even in chrootless mode. At the same time, it is quite safe
(due to the static allocation) and easy to implement. I fail to see
downsides.

Just because debootstrap encodes a ton of hacks to make things barely
work (and break every so often) doesn't mean we have to maintain them
until eternity.

> In other words: Is the present bug report to be considered in a
> theoretical way, or it is the result of some problem that you actually
> found recently with a bootstrapping tool?

I don't have a minimal test case at hand, but I can reproduce it with
multistrap at least.

Helmut



Bug#970234: consider dropping "No hard links in source packages"

2020-09-13 Thread Helmut Grohne
Package: debian-policy
Version: 4.5.0.3
Severity: wishlist

Jakub stumbled into the "No hard links in source packages" requirement
added around 1996 and couldn't make sense of it. Neither could Christoph
nor myself. tar does support hard links just fine. lintian does not
check this property. sugar-log-activity/38 is an example package
violating the property. It is shipped in buster and technically
rc-buggy though no bug is filed about it.

I believe that the requirement needs a rationale. Failing that, it
should be dropped.
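
Should anyone want to check the property, a sketch using Python's
`tarfile` module could look like this. It is not what lintian or
dpkg-source do; the function name `hard_links` is made up, and it takes the
tarball as bytes to keep the example self-contained.

```python
import io
import tarfile


def hard_links(tar_bytes: bytes) -> list[tuple[str, str]]:
    """Return (member, target) pairs for every hard link in a tar archive."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as archive:
        # TarInfo.islnk() is true for hard links, issym() for symlinks.
        return [(member.name, member.linkname)
                for member in archive.getmembers()
                if member.islnk()]
```

An empty result means the archive contains no hard link members.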

Helmut



Bug#970234: consider dropping "No hard links in source packages"

2020-10-12 Thread Helmut Grohne
Hi cate,

On Mon, Oct 12, 2020 at 04:10:00PM +0200, Giacomo Catenazzi wrote:
> The rationale was probably similar to symlinks: they may fail across
> different filesystems, and we supported to have e.g. / /usr /usr/share
> /usr/local /var (and various /var/*) /home /tmp /boot etc on different file
> systems. Now we are more strict on where we can split filesystems (and disk
> are larger, and LVM simplified much of filesystem handling).

You appear to be talking about binary packages. This bug is about source
packages. When you unpack a source package, you are creating a directory
hierarchy rooted at the point where you start unpacking. There cannot
possibly be any reasonable way to split your source package into multiple
file systems. This is very different from binary packages where the
underlying hierarchy is shared with other packages and directories
frequently already exist.

> I think a hardlink on same directory should be fine, or within directories
> which must be on the same filesystem.

I argue that all files within a source package are always located on the
same filesystem, because the unpack step creates the source package root
directory on one file system and everything else resides on that very
filesystem.

For binary packages, restricting the use of symlinks makes a lot more
sense to me.

Helmut



Bug#983657: debian-policy: weaken manual page requirement

2021-02-28 Thread Helmut Grohne
On Sun, Feb 28, 2021 at 10:53:20AM -0700, Sean Whitton wrote:
> Can you post a patch just doing the moving manpages to dependencies part
> and indicate that you are seeking seconds?  Then we can get that
> applied.

I call for seconds on:

--- a/policy/ch-docs.rst
+++ b/policy/ch-docs.rst
@@ -12,9 +12,9 @@
 "cat page".
 
 Each program, utility, and function should have an associated manual
-page included in the same package. It is suggested that all
-configuration files also have a manual page included as well. Manual
-pages for protocols and other auxiliary things are optional.
+page included in the same package or a dependency. It is suggested that
+all configuration files also have a manual page included as well.
+Manual pages for protocols and other auxiliary things are optional.
 
 If no manual page is available, this is considered as a bug and should
 be reported to the Debian Bug Tracking System (the maintainer of the

Helmut



Bug#983657: debian-policy: weaken manual page requirement

2021-02-27 Thread Helmut Grohne
Package: debian-policy
Version: 4.5.1.0
Severity: wishlist

I think that the Debian policy is unreasonably strict in its manual page
requirement. While the common case is that manual pages are small and
should be included in the same package, occasionally they are numerous
and moving them to a separate package makes sense. Other times, there
already is a -common or -doc package and including them there would be
possible without increasing the package count. Doing so often allows
demoting dependencies to Build-Depends-Indep and thus reducing bootstrap
problems.

I therefore think that the policy should explicitly allow manual pages
to be shipped in a dependency. We can see that this already is
established practice from this non-exhaustive list:
 * aptitude -> aptitude-common
 * assaultcube -> assaultcube-data
 * aumix -> aumix-common
 * auto-multiple-choice -> auto-multiple-choice-common
 * binutils -> binutils-common
 * bitlbee -> bitlbee-common
 * bup -> bup-doc (recommends)
 * cpp-10 -> cpp-10-doc (no relation, license re
 * critterding -> crittering-common
 * grass-core -> grass-doc
 * x3270 -> 3270-common

Beyond this, I think that a manual page does not warrant a strong
dependency given that man-db is not essential. Rather a recommendation
should be strong enough. I'm not sure whether this view is universal
though.

So this is actually asking for two distinct things:
 * Allow moving manual pages to dependencies
 * Allow demoting such dependencies to recommends

A possible wording in ch-docs.rst could be:
 Each program, utility, and function should have an associated manual
-page included in the same package. It is suggested that all
+page included in the same package or one of its dependencies or
+recommended packages. It is suggested that all
 configuration files also have a manual page included as well. Manual
 pages for protocols and other auxiliary things are optional.

What do you think?

Helmut



Bug#983657: debian-policy: weaken manual page requirement

2021-02-28 Thread Helmut Grohne
On Sun, Feb 28, 2021 at 11:58:08AM +0100, Bill Allombert wrote:
> On Sun, Feb 28, 2021 at 08:29:21AM +0100, Helmut Grohne wrote:
> > So this is actually asking for two distinct things:
> >  * Allow moving manual pages to dependencies
> >  * Allow demoting such dependencies to recommends
> > 
> > A possible wording in ch-docs.rst could be:
> >  Each program, utility, and function should have an associated manual
> > -page included in the same package. It is suggested that all
> > +page included in the same package or one of its dependencies or
> > +recommended packages. It is suggested that all
> >  configuration files also have a manual page included as well. Manual
> >  pages for protocols and other auxiliary things are optional.
> > 
> > What do you think?
> 
> The goal is to avoid program to be installed but not their manpages,
> so generally I do not find Recommends to be enough.

If we cannot build consensus around that second part, so be it. But
maybe the other part (moving manual pages to dependencies) can reach
consensus?

Helmut



Bug#924401: #924401 base-files fails postinst when base-passwd is unpacked

2021-02-22 Thread Helmut Grohne
On Mon, Feb 22, 2021 at 07:33:10AM +, Tim Woodall wrote:
> A. /etc/passwd is part of base-passwd's interface and base-files is
>    right in relying on it working at all times. Then base-passwd is rc
>    buggy for violating a policy must. Fixing this violation is
>    technically impossible.
> 
> 
> I seem to have hit this same issue independently.
> 
> Could you explain why "Fixing this violation is technically impossible"

The requirement here is that base-passwd needs to work when unpacked.
The only way to make that work is making /etc/passwd a conffile. That
would technically be possible, but it would be very annoying, because
this file is different on virtually any Debian installation. So we
cannot make it a conffile in practice. The next bet would be ensuring
that base-passwd.postinst is run before other packages' postinst somehow.
Such an ordering mechanism does not exist at present and it would be
prone to dependency loops.

> As far as I can see, making base-passwd not essential, only required,
> and then making passwd and base-files pre-depend on base-passwd the
> system seems to bootstrap /etc/passwd and /etc/group OK.

What you write is almost certainly self-contradictory. base-files is
essential. Anything it depends on (including base-passwd in your
scenario) is pseudo-essential and thus inherits all the same
requirements except for actually being essential. You gained nothing.
And you didn't explain how you'd make base-passwd non-essential.

> That also seems to conform to the debian policy. The oddity is that
> base-files and passwd only actually need to depend on base-passwd, not
> pre-depend on it as they only use /etc/passwd and /etc/group in the
> postinst scripts but the debian policy doesn't seem to consider this
> case.

They don't have to depend on base-passwd at all, because dependencies on
essential packages should be omitted.

I suggest that you detail the practical issue you have been hitting.
Doing so allows evaluating prospective solutions against all relevant
use cases.

Helmut



Bug#1051801: document DEB_BUILD_OPTIONS value nopgo

2023-09-12 Thread Helmut Grohne
Package: debian-policy
Version: 4.6.2.0
Severity: wishlist
X-Debbugs-Cc: debian-cr...@lists.debian.org,rb-gene...@lists.reproducible-builds.org

Hi,

more and more packages implement a technique called profile guided
optimization. The general idea is that it performs a build that is
instrumented for profiling first. It then runs a reasonable workload to
collect profiling data, which in turn is used to guide the optimizer of
a second build which is not thus instrumented. The idea is that this
second build probably is faster than a regular build.

Quite obviously this approach completely breaks cross building. It also
is unclear how it affects reproducible builds since such builds depend
on the performance characteristics of the system performing the build.
This makes it very obvious that the pgo technique has downsides that
warrant disabling it in some situations.

A number of packages have agreed on disabling such optimization when
DEB_BUILD_OPTIONS contains nopgo. I'm aware of the following packages:
 * binutils
 * cross-toolchain-base
 * gcc-VER
 * halide
 * pythonVER

I'll also be filing a patch for foot to support this option.

Is this sufficient coverage to document the option already? If not, this
bug report can serve as a central point for discussing it and its
adoption.

Proposed wording:

This tag requests that any optimization performed during the build
should not rely on performance characteristics captured during the
build. Such optimization is usually called profile guided
optimization.

The proposed tag intentionally is fairly narrow. It does not cover link
time optimization. It also does not cover the case where profiling
information is recorded ahead of upload and included in the source
package[1]. In both cases, neither cross building nor reproducibility is
impacted.

As for cross builds, it is not clear to me where we want to put the
responsibility to disable pgo. At the time of this writing, most
packages automatically disable pgo when performing a cross build. On the
flip side, we could have any cross builder set nopgo like they set
nocheck already. Doing so would allow performing a pgo-enabled cross
build to i386 on amd64 while still benefiting from the larger address
space for instance. I consider this aspect to be a separate matter
though.
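
As a sketch of how a build could honour the tag: the helper below treats
DEB_BUILD_OPTIONS as a whitespace-separated list and additionally disables
pgo for cross builds, mirroring what most packages are described as doing
today. The function name `pgo_enabled` is hypothetical, and the cross-build
rule is exactly the part that is still open for discussion.

```python
import os


def pgo_enabled(env=None) -> bool:
    """Decide whether a build should use profile guided optimization."""
    env = os.environ if env is None else env
    # DEB_BUILD_OPTIONS is a whitespace-separated list of tags.
    options = env.get("DEB_BUILD_OPTIONS", "").split()
    if "nopgo" in options:
        return False
    # Assumption: treat differing build and host architectures as a cross
    # build and skip pgo there; unset variables count as a native build.
    return env.get("DEB_BUILD_ARCH") == env.get("DEB_HOST_ARCH")
```

If cross builders set nopgo themselves, the last line could simply become
`return True`, leaving the decision to whoever drives the build.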

Helmut

[1] https://lists.reproducible-builds.org/pipermail/rb-general/2022-June/002638.html



Bug#1051371: debian-policy: stop referring to legacy filesystem paths for script interpreters

2023-09-07 Thread Helmut Grohne
Hi Luca,

On Wed, Sep 06, 2023 at 10:50:14PM +0100, Luca Boccassi wrote:
> Package: debian-policy
> X-Debbugs-Cc: j...@debian.org hel...@subdivi.de
> 
> Debian only supports merged-usr since Bookworm. We should update policy
> to reference /usr/bin/sh and similar paths to describe recommended
> shebangs for scripts.

I disagree. The promise of merged-/usr has always been that both paths
are valid. /bin/sh remains the location recommended by external
standards and (like the dynamic loader path) should remain the way it
is.

> I heard many times the policy maintainers mention something along the
> lines of 'policy should not be a hammer to beat other maintainers
> with'. Today I saw policy being used to force a maintainer to re-add
> support for the deprecated and unsupported split-usr filesystem layout,
> as 'policy only mentions /bin/sh, not /usr/bin/sh'.

This can also be addressed by adding a note to policy that allows
maintainers to rely on the aliasing. If there was a need to refer to the
shell via /usr/bin/sh, we would aim for eventually removing the aliasing
symlinks. That's not what we're up to.

> So let's update the policy to refer to modern and supported filesystem
> paths as adopted by Debian de-facto and de-jure, and stop other
> maintainers from getting beaten with it.

I don't think this is right. We intend to finalize the /usr-merge
transition by moving files from / to /usr. This is an implementation
strategy that arises from the constraints set by the current
implementation of dpkg and other components. It is not a new filesystem
layout that we expect upstreams to support. Rather, we promised to
upstreams that both ways will work. The aspect that in a data.tar we'll
have to install files to /usr is a technical one and can be supported by
debhelper. Still, packages may assume that referencing files they
installed to /usr via aliased paths in / will continue to work.
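
The aliasing can be modelled as a pure path mapping. The sketch below is
illustrative only: `ALIASED_DIRS` is an assumed subset (the real set of
aliased directories depends on the architecture) and `canonical_path` is a
made-up helper, not something dpkg or debhelper provide.

```python
import posixpath

# Assumed subset of top-level directories aliased to /usr counterparts.
ALIASED_DIRS = ("bin", "sbin", "lib", "lib64")


def canonical_path(path: str) -> str:
    """Map an aliased path such as /bin/sh to its location below /usr."""
    normalized = posixpath.normpath(path)
    if not normalized.startswith("/"):
        return normalized  # relative paths are left alone
    top_level = normalized.split("/")[1]
    if top_level in ALIASED_DIRS:
        return "/usr" + normalized
    return normalized
```

Both `/bin/sh` and `/usr/bin/sh` map to the same result, which is the sense
in which either spelling is fine for read-only consumption.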

> Patch attached and also pushed to
> https://salsa.debian.org/bluca/policy/-/tree/bin_sh

Nack to this particular change, but I agree that it is worth considering
two changes to policy sooner or later:
 * Making it explicit that referring to files via either path for
   read-only consumption is ok.
 * DEP17 aims for not installing any files in aliased locations and we
   should encode that in policy once there is wide adoption of this rule
   in binary packages.

Would you agree to repurpose this bug to propose the former change?
While my variant is weaker, it still prevents people from using policy
to require supporting split-/usr.

Helmut



Bug#970234: consider dropping "No hard links in source packages"

2022-09-22 Thread Helmut Grohne
Hi Russ,

On Thu, Sep 22, 2022 at 07:20:00PM -0700, Russ Allbery wrote:
> From 12b014c4b930577a728dfb1254b64aac6a5eb1e0 Mon Sep 17 00:00:00 2001
> From: Russ Allbery 
> Date: Thu, 22 Sep 2022 19:15:52 -0700
> Subject: [PATCH] Allow hard links in source packages
> 
> It's not clear why this restriction was in place, and Debian
> included a package containing hard links without anyone noticing
> in the last release.
> ---
>  policy/ch-source.rst | 11 ++-
>  1 file changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/policy/ch-source.rst b/policy/ch-source.rst
> index c7415fc..a7df539 100644
> --- a/policy/ch-source.rst
> +++ b/policy/ch-source.rst
> @@ -282,8 +282,8 @@ source files in a package, as far as is reasonably possible.  [#]_
>  Restrictions on objects in source packages
>  --
>  
> -The source package must not contain any hard links,  [#]_ device special
> -files, sockets or setuid or setgid files.. [#]_
> +The source package must not contain device special files, sockets, or
> +setuid or setgid files. [#]_
>  
>  .. _s-debianrules:
>  
> @@ -918,13 +918,6 @@ must not exist a file ``debian/patches/foo.series`` for any ``foo``.
> would be nice if the modification time of the upstream source would
> be preserved.
>  
> -.. [#]
> -   This is not currently detected when building source packages, but
> -   only when extracting them.
> -
> -   Hard links may be permitted at some point in the future, but would
> -   require a fair amount of work.
> -
>  .. [#]
> Setgid directories are allowed.
>  

Seconded.

Helmut



Bug#1020323: debian-policy: document DPKG_ROOT

2022-10-05 Thread Helmut Grohne
Hi Johannes,

On Wed, Oct 05, 2022 at 02:35:30PM +0200, Johannes Schauer Marin Rodrigues wrote:
>To enable creating a foreign architecture Debian chroot during the early
>bootstrap of a new Debian architecture, maintainer scripts and utilities
>called by maintainer scripts of packages in the essential and
>build-essential set, should support operating on a custom chroot directory.
>This is to avoid running any of the foreign architecture utilities from the
>chroot, because those cannot be executed during the early bootstrapping
>phase of a new architecture.  Instead, by avoiding the chroot() call,
>utilities from the outside should operate on the chroot path given via the
>`DPKG_ROOT` environment variable.  This environment variable is set but
>empty during normal package installations.  If the `DPKG_ROOT` environment
>variable is not empty, then this indicates to the maintainer scripts and the
>tools it executes, that a chroot is being built as part of an early
>architecture bootstrap and all operations should be performed in the chroot
>path given by the contents of the `DPKG_ROOT` environment variable. In that
>case, the maintainer script should not modify anything outside the chroot
>directory.

Thank you for writing this.

> I refrained from using "must" because we promised maintainers that they would
> not need to do the work themselves but will get patches sent from us. We do not
> want to force work on maintainers by making it an RC bug if they do not support
> DPKG_ROOT.
> 
> Helmut, what do you think?

I think this text is already quite good. I am still wondering about the
scope of support that we mention here.

1. You write that we want essential + build-essential. In practice, we
   also want things such as apt or systemd. I am wondering whether we
   should rephrase this in a less specific way that leaves open some
   packages beyond the mentioned set. Vagueness can be avoided by
   explaining the purpose: We target packages that are relevant to
   setting up an initial build daemon.

2. We should likely mention that package upgrade and removal paths can
   freely ignore DPKG_ROOT. Maintainer scripts can assume that when
   DPKG_ROOT is in effect, it will be an initial installation. Something
   along these lines would have helped Michael in determining whether his
   recent changes to init script handling would affect DPKG_ROOT.

Let me try to extend your text:

To enable creating a foreign architecture Debian chroot during the early
bootstrap of a new Debian architecture, maintainer scripts and utilities
called by maintainer scripts of packages relevant to setting up a
build daemon, should support operating on a custom chroot directory.
[... keep rest of the text unchanged ...]
Support for `DPKG_ROOT` in code that handles package upgrades or
package removal is not needed.
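
As an illustration only (this is not part of the proposed policy text), a
maintainer script could honour `DPKG_ROOT` roughly as sketched below. The
package name, the directory and the helper function are hypothetical. The
sketch relies on the variable being set but empty during normal
installations, so that prefixing paths with it changes nothing there.

```shell
#!/bin/sh
# Hypothetical postinst fragment; "example-pkg" and create_state_dir are
# made-up names used for illustration.
set -e

# Create a directory inside the target system. With DPKG_ROOT empty (the
# normal case) this is the plain path; otherwise it lands in the chroot.
create_state_dir() {
    # $1: absolute path as seen from inside the target system
    mkdir -p "${DPKG_ROOT:-}$1"
}

case "${1:-}" in
    configure)
        create_state_dir /var/lib/example-pkg
        ;;
esac
# prerm and postrm would not need any DPKG_ROOT handling under the
# proposed wording.
```

The point of the prefix approach is that no chroot() call and no foreign
architecture binary is needed; everything runs with tools from the outside.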

Helmut



Bug#945269: debian-policy: packages should use tmpfiles.d(5) to create directories below /var

2023-06-05 Thread Helmut Grohne
On Sun, Jun 04, 2023 at 02:56:59PM +0100, Simon McVittie wrote:
> I think one way or another, if anyone is going to set a package-level
> dependency on systemd-tmpfiles, the first (preferred) dependency needs to
> be on either a concrete provider (systemd or systemd-tmpfiles-standalone
> in this case), or a default-systemd-tmpfiles virtual package
> that only has one provider per architecture (which is the way
> {default-,}dbus-{system,session}-bus are handled). Otherwise, you
> can get a non-deterministic choice of default implementation, which
> seems strictly worse than either depending on systemd or depending on
> systemd-tmpfiles-standalone - if you're unlucky, it can have all the
> disadvantages of either one of those.

Thank you for the elaborate writeup. There is little to add to what you
write except for one minor aspect.

>   - actual result: apt's heuristic might have difficulty realising that
> it needs to do that

I think we should be able to guide apt here. I recently had to look into
Replaces and in that process I also had to re-read policy section 7.6.2.
It details the "other" use of Replaces to guide a package manager (e.g.
apt) for changing implementations of an interface - which is exactly
what we are talking about here. In essence, it says that we should do:

    Provides: systemd-tmpfiles
    Conflicts: systemd-tmpfiles
    Replaces: systemd-tmpfiles

And systemd-standalone-tmpfiles does that. :) But systemd does not. :(
systemd misses out on Conflicts and Replaces. I guess (but have not
verified) that once these are added, apt would be happier to "upgrade"
systemd-standalone-tmpfiles to systemd when needed.
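
To check whether a given package follows that pattern, one could inspect
its control stanza. The following is a minimal sketch, assuming
single-line fields (folded continuation lines are not handled). The
function name is made up for illustration.

```shell
#!/bin/sh
# Read a control stanza on stdin and succeed only if the virtual package
# given as $1 is named in Provides, Conflicts and Replaces alike.
declares_interface() {
    awk -v pkg="$1" '
        /^(Provides|Conflicts|Replaces):/ {
            field = $1
            sub(/^[^:]*:[ \t]*/, "")                 # strip the field name
            n = split($0, items, /[ \t]*,[ \t]*/)
            for (i = 1; i <= n; i++) {
                sub(/[ \t]*\(.*$/, "", items[i])     # drop version restriction
                gsub(/^[ \t]+|[ \t]+$/, "", items[i])
                if (items[i] == pkg) seen[field] = 1
            }
        }
        END { exit !(seen["Provides:"] && seen["Conflicts:"] && seen["Replaces:"]) }
    '
}

# Possible use on an installed package:
#   dpkg-query -s systemd | declares_interface systemd-tmpfiles
```

A non-zero exit status means at least one of the three fields does not
mention the virtual package.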

I've also experimented with a minimal chroot, installed the standalone
tools and then asked apt to install libbiometric0 (which happens to have
a dependency on systemd) and apt was quite happy with removing the
standalone variants. This is still missing the consumers of the provided
facilities though, so it might not be representative.

Is there any concrete evidence of apt having difficulties in a real
situation? Or maybe a constructed example demonstrating this? Thanks for
being cautious, but I'd also like to understand whether this is
hypothetical or real.

Helmut



Bug#1057199: debian-policy: express more clearly that Conflicts do not reliably prevent concurrent unpacks

2023-12-01 Thread Helmut Grohne
Package: debian-policy
Version: 4.6.2.0
X-Debbugs-Cc: debian-d...@lists.debian.org, de...@lists.debian.org

Hi,

first of all huge thanks to David, Guillem and Julian for all of their
explanations. In large parts, this bug report is yours and I'm just the
one writing it down.

§7.4 currently starts with:

    When one binary package declares a conflict with another using a
    Conflicts field, dpkg will refuse to allow them to be unpacked on
    the system at the same time.

I believe this is technically wrong. There are situations where dpkg
will allow such unpacks to temporarily co-exist. §6.6 goes into further
detail and is accurate.

Suppose we have two arch:all packages a version 1 and b version 1 both
of which are installed. Now we attempt to install a version 2, which
happens to declare "Conflicts: b (<< 2)". We may therefore mark b for
removal:

    echo "b:all deinstall" | dpkg --set-selections

and proceed to installing a:

    dpkg --auto-deconfigure --unpack a_2.deb

When we do this, dpkg unpacks a version 2 before removing the files
of b version 1. I argue that this very briefly allows both packages to
be unpacked at the same time, since the next thing dpkg does is remove
b's files.

This situation can be forced if we add package b version 2, which
declares "Breaks: a (<< 2)" and attempt to install both. apt figures
that it has to temporarily remove b and hence issues the selection
above. Then it proceeds to unpacking both packages.
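
The package metadata for this scenario can be sketched as follows. This is
not the attached script, only an illustration under assumptions: the
helper names, the maintainer address and the description text are made
up, and the packages built this way ship no files of their own.

```shell
#!/bin/sh
# Illustrative helpers for building the four demo packages.
set -e

# Print a minimal binary control stanza.
# $1: package name, $2: version, $3: optional relationship field line
control_stanza() {
    printf 'Package: %s\nVersion: %s\nArchitecture: all\n' "$1" "$2"
    printf 'Maintainer: Nobody <nobody@example.org>\n'
    if [ -n "${3:-}" ]; then printf '%s\n' "$3"; fi
    printf 'Description: demo package %s\n' "$1"
}

# Build <name>_<version>.deb in the current directory (needs dpkg-deb).
build_deb() {
    dir=$(mktemp -d)
    chmod 0755 "$dir"
    mkdir "$dir/DEBIAN"
    control_stanza "$@" > "$dir/DEBIAN/control"
    dpkg-deb --root-owner-group --build "$dir" "$1_$2.deb"
    rm -rf "$dir"
}

# The four packages described in the text:
#   build_deb a 1
#   build_deb b 1
#   build_deb a 2 'Conflicts: b (<< 2)'
#   build_deb b 2 'Breaks: a (<< 2)'
```

Installing a_1.deb and b_1.deb first and then asking apt for both version
2 packages should lead to the sequence described above.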

The difference actually is rather subtle. As dpkg is tracking ownership
of files, one should not be observing a difference. What one can see is
that a.preinst version 2 is run at a time where b version 1 is still
unpacked (and that's fine as the statement only talks about unpack). The
effects of concurrent unpack are theoretically not observable, because
dpkg tracks files. However, once you add aliasing to the mix, dpkg can
delete files that are still needed, due to differences in aliasing. That
way - and I am fully aware that this violates fundamental assumptions of
dpkg - we can make the order of unpacks visible and demonstrate that
indeed a version 2 is unpacked before b version 1 has its files removed.
All of this is fully in line with the long description in §6.6. What I
take issue with is the executive summary at the start of §7.4.

In case you would like a test case to tinker with, I'm attaching a
script that demonstrates the situation.

Helmut


conflict-demo.sh
Description: Bourne shell script