Bug#749826: Documenting `Multi-Arch: foreign`
Hello Helmut, On Mon, Sep 04 2017, Helmut Grohne wrote: > On Sat, Sep 02, 2017 at 08:44:14AM -0700, Sean Whitton wrote: >> Rather than introduce the new terminology 'intended interface', which >> we would definitely have to define, how about something like this: >> >> If all a package's architecture-dependent interfaces are listed >> in README.multiarch, the package is not considered to have any >> architecture-dependent interfaces for the purposes of determining >> whether it may be labelled Multi-Arch: foreign. > > This is not how it works. It's not like you can just mark any package > Multi-Arch: foreign after saying that it is > architecture-dependent. That documentation must come with a contract > saying that reverse dependencies must not use those > architecture-dependent interfaces. I see what you mean. I find the terminology "intended interface" confusing: if something is not intended as an interface, then it is simply not an interface. The interfaces of a package are up to its maintainer. I just pushed a commit using this text, which gets around this issue; let me know what you think: The interfaces of a package are determined by its maintainer. However, some packages might expose architecture dependencies when other packages use them in a manner not intended by the maintainer. This can happen when it is not clear which parts of the package are its interfaces. In such cases, where the package satisfies the criterion for ``Multi-Arch: foreign`` but might expose architecture dependency because it is not clear which parts of the package are its interfaces, the interfaces of the package should be described in the file ``debian/README.multiarch``. >> If libc6's use is legitimate then it seems we'd need to include this >> as an exception. > > Well, it's not exactly legitimate. It's more like unavoidable as Simon > pointed out in his reply. Technically, libc6's behaviour is wrong and > causes unpack errors. 
The reasonable solution would be prohibiting > coinstallation of libc6:mips and libc6:mipsel, but package metadata > does not allow us to do that currently (#747261 -> self-conflicts are > always ignored). The other option of removing Multi-Arch: same from > libc6 would essentially render Multi-Arch useless. So all we can do > now is pretend the issue wasn't there. I don't think Policy should be silent about this. I have pushed a commit adding a footnote which notes that libc6 is not expected to be in compliance. >> > * If you rebuild the source package with a very different >> > installation set (i.e. much newer Build-Depends), does it still >> > have to match with older instances? Example: #825146. What >> > divergence in installation sets is ok? >> >> We could just say that it must match the instances in the target >> suite. > > We could. That would render libgiac0 rc buggy for instance, because it > was built on mips64el three weeks later than on other architectures > and thus uses an incompatible gettext. Is this a problem? Don't we usually binNMU in such a situation? > That definition is pretty annoying for bootstraps though as > replicating ancient toolchain is kinda the opposite of what > bootstrappers do. I'm not sure what you mean. My suggestion is that we say it should match what's in the suite right now -- /not/ any ancient toolchain. >> Could you turn this into some commits against my branch, please? > > I tried and ran into a new problem: I am now convinced that we cannot > just describe one Multi-Arch value after another as they do share some > common values. That "interface" aspect and architecture-constraints on > dependencies is a common theme and likely deserves an introductory > text. > > Yet, I am attaching what I have. Thanks! Applied, and then followed up with a commit tweaking wording. > As Simon's mail demonstrates, we likely need more answers/consensus > before continuing. I'll reply in a separate mail. 
Do these remaining issues affect our definitions of the `foreign` and `no` values, or only the definitions for the other possible values? If we are finished with our definitions for `no` and `foreign`, I think it would be worth releasing them in Policy, of course with a caveat saying "there are other possible values". I say this as a package maintainer who has benefitted from reading your descriptions of `foreign` and `no`. These are useful and should be out there. -- Sean Whitton
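The unpack conflict mentioned above for libc6:mips and libc6:mipsel can be illustrated with a small sketch. It assumes the file-level requirement for `Multi-Arch: same` proposed elsewhere in this thread: a path shipped by two instances must have the same type in both and, for regular files, bitwise equal contents. The `Entry` type, the `conflicting_paths` helper and the sample contents are hypothetical and made up for illustration; this is not how dpkg implements the check, and /lib/ld.so.1 is modelled as a regular file purely for simplicity.

```python
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    """One filesystem entry shipped by a package instance (hypothetical model)."""
    kind: str             # "file", "dir", "symlink", ...
    content: bytes = b""  # only compared for regular files


def conflicting_paths(a: Dict[str, Entry], b: Dict[str, Entry]) -> List[str]:
    """Return the paths that prevent two instances from being co-installed.

    A path shipped by only one instance never conflicts. A shared path
    conflicts if the entry types differ, or if both are regular files
    whose contents are not bitwise equal.
    """
    conflicts = []
    for path in sorted(a.keys() & b.keys()):
        entry_a, entry_b = a[path], b[path]
        if entry_a.kind != entry_b.kind:
            conflicts.append(path)
        elif entry_a.kind == "file" and entry_a.content != entry_b.content:
            conflicts.append(path)
    return conflicts


# Made-up file lists; the byte strings merely stand in for differing binaries.
libc6_mips = {
    "/lib/ld.so.1": Entry("file", b"big-endian loader"),
    "/usr/share/doc/libc6": Entry("dir"),
}
libc6_mipsel = {
    "/lib/ld.so.1": Entry("file", b"little-endian loader"),
    "/usr/share/doc/libc6": Entry("dir"),
}

print(conflicting_paths(libc6_mips, libc6_mipsel))
```

Under this model the shared directory is harmless, while the shared loader path is reported as a conflict, which corresponds to the unpack error described in the quoted text.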
Bug#749826: Documenting `Multi-Arch: foreign`
On Sat, Sep 02, 2017 at 08:44:14AM -0700, Sean Whitton wrote: > Rather than introduce the new terminology 'intended interface', which we > would definitely have to define, how about something like this: > > If all a package's architecture-dependent interfaces are listed in > README.multiarch, the package is not considered to have any > architecture-dependent interfaces for the purposes of determining > whether it may be labelled Multi-Arch: foreign. This is not how it works. It's not like you can just mark any package Multi-Arch: foreign after saying that it is architecture-dependent. That documentation must come with a contract saying that reverse dependencies must not use those architecture-dependent interfaces. > If libc6's use is legitimate then it seems we'd need to include this as > an exception. Well, it's not exactly legitimate. It's more like unavoidable as Simon pointed out in his reply. Technically, libc6's behaviour is wrong and causes unpack errors. The reasonable solution would be prohibiting coinstallation of libc6:mips and libc6:mipsel, but package metadata does not allow us to do that currently (#747261 -> self-conflicts are always ignored). The other option of removing Multi-Arch: same from libc6 would essentially render Multi-Arch useless. So all we can do now is pretend the issue wasn't there. > > * If you rebuild the source package with a very different > > installation set (i.e. much newer Build-Depends), does it still > > have to match with older instances? Example: #825146. What > > divergence in installation sets is ok? > > We could just say that it must match the instances in the target suite. We could. That would render libgiac0 rc buggy for instance, because it was built on mips64el three weeks later than on other architectures and thus uses an incompatible gettext. That definition is pretty annoying for bootstraps though as replicating ancient toolchain is kinda the opposite of what bootstrappers do. 
> >(A simple way to satisfy this requirement is to use > >architecture-dependent paths exclusively. That works except for > >/usr/share/doc/$pkg.) > > > > * The maintainer scripts must handle multiple configuration and > >multiple deconfiguration correctly. In particular, a package can be > >purged for one architecture while being installed for another. > >Example: #682420. > > > >(A simple way to satisfy this requirement is to not ship maintainer > >scripts.) > > > > * Source packages carrying any binary package marked `Multi-Arch: same` > >must always be binNMUed in lock-step. (Presently violated e.g. by > >libselinux1) > > Could you turn this into some commits against my branch, please? I tried and ran into a new problem: I am now convinced that we cannot just describe one Multi-Arch value after another as they do share some common values. That "interface" aspect and architecture-constraints on dependencies is a common theme and likely deserves an introductory text. Yet, I am attaching what I have. > It sounds like we need to just drop the whole bullet point. > Architecture: all packages need to be checked carefully, just like > Architecture: any packages. Reworded. > To my mind, the most important ways to achieve readability in this case > are > > - avoid repetition > - avoid "probably", "likely" sentences. The latter is particularly hard, because we violate the strict definitions more often than is immediately apparent. As Simon's mail demonstrates, we likely need more answers/consensus before continuing. I'll reply in a separate mail. Helmut diff --git a/policy/ch-controlfields.rst b/policy/ch-controlfields.rst index 509a96e..e6451d5 100644 --- a/policy/ch-controlfields.rst +++ b/policy/ch-controlfields.rst @@ -1028,6 +1028,18 @@ control file. We consider the meaning of each possible value of this field separately. +``Multi-Arch: no`` +++ + +This value is the default. 
When satisfying a dependency on a package +(implicitly) marked ``Multi-Arch: no``, the depender and the dependee +must have the same architecture. For the purpose of this matching, +``Architecture: all`` packages are treated as if they had the +architecture value of ``dpkg``. + +The value ``no`` cannot currently be used in binary packages due to +limitations of the archive processing. + ``Multi-Arch: foreign`` +++ @@ -1037,12 +1049,15 @@ architecture. In order to determine whether this holds, you should consider the files installed by the package -``Architecture: all`` packages always provide -architecture-independent interfaces. Shared and static libraries -provide architecture-dependent ABIs. Binary executables may -provide architecture-independent interfaces: could software -interacting with the executable determine the architecture for -which it was built without reading the executable file? +``Architecture: all`` packages tend to provide
Bug#749826: Documenting `Multi-Arch: foreign`
Hi Simon, On Sat, Sep 02, 2017 at 05:26:57PM +0100, Simon McVittie wrote: > That seems like it might be a bug (or design flaw if you prefer). If a > package (build-)depends on foo:any, it is saying "I am only using the > arch-indep parts of foo's interface", whatever those are. You may call it feature. The idea here was that :any should not be used mindlessly. Thus it is only allowed on packages properly marked for that used with ``Multi-Arch: allowed``. In Build-Depends, you can mostly achieve the same effect with :native (which essentially is :any on any package (but Architecture: all packages (though our dependency resolvers don't agree here))). > Perhaps a dependency on foo:any by (for example) bar:mips should > always be satisfiable by foo:mips (as though the :any had been omitted), > regardless of foo's multi-arch status? This would bring it back to the > same meaning as omitting the :any, in the trivial case where only one > architecture is enabled. That proposal may ease meta data changes indeed. I suspect that it would also cause a lot of useless :any annotations. It's a two-sided sword. > Perhaps a dependency on foo:any should be satisfiable by any instance > of foo that is Multi-Arch: foreign? (In this case the :any is completely > redundant, because foreign sets up a similar situation from the other end) After studying Multi-Arch for many years now, I recognize that a core idea is to almost always flag the architecture constraint on the target of an edge. To understand this wicked sentence, consider a dependency graph and label each node (package) with an architecture. Now Multi-Arch says that by default every edge (dependency) must enforce equal architecture on both ends. Most of the header's job is relaxing this restriction. The designers of Multi-Arch decided that this relaxing should not be a property of the edges (e.g. :any), but a property of the dependee. 
Thus the current implementation ensures that :any cannot be used in situations where it is inappropriate. As you point out, that design is annoying for meta data transitions. > > > I think "the files installed by ``Architecture: all`` packages always > > > provide architecture-independent interfaces." is too broad. The counter > > > example is haskell-devscripts-minimal. This needs to be weakened > > > somehow. > > I would argue that these interfaces are architecture-independent from > the perspective of the package's (lack of) architecture. What they > are not independent of is the *build machine* architecture, just like > running uname -m or inspecting /proc/cpuinfo aren't independent of the > build machine architecture. This is certainly a problem for > cross-compilation, but it isn't the same issue as in dpkg or pkg-config, > where the architecture for which dpkg or pkg-config was built gets > hard-coded into its installed files (as the output of --print-architecture > or part of the default search path, respectively). That's a nice view, but it is not the view expressed by Multi-Arch. The meaning of the header considers the whole installation set as a unit. Whether you view this in a package building context or runtime context does not matter, what matters is whether the tools behave differently when you swap the architecture of underlying parts. As a side note, we marked pkg-config Multi-Arch: foreign, but that is technically wrong on another level. The marking would imply that it doesn't matter which architecture you use to supply the package. A prospective README.multiarch would need to say that you must not use plain pkg-config (without a triplet prefix). Yet that is what most packages do. If you perform an archive rebuild of pkg-config build-rdeps on amd64 in a chroot with preinstalled pkg-config:i386, the majority of builds will fail even though their Build-Depends are installable. 
This is another place where we bend the rules just to make it barely useful. For performing useful cross builds, one needs to discard host architecture instances of ``Multi-Arch: foreign`` packages. > > > For instance, the policy should make it > > > clear that marking libmdds-dev `Multi-Arch: foreign` (fictional, see > > > #843023) would be a policy violation. > > It is not clear to me that doing so *should* be a policy violation. If > libmdds-dev contains only headers (no shared or static library), and it > exposes architecture-independent libboost-dev headers (but no Boost > shared or static library), is there really anything wrong with having > libboost-dev from "the wrong architecture"? As long as everything is header-only, you can use ``Multi-Arch: foreign``. The thing is, even if libboost-dev was architecture-independent, it would expose libstdc++-7-dev. Since exposure is transitive, that carries over to libmdds-dev. Boost's dependency on libstdc++-4.8-dev | libstdc++-dev looks a bit strange though. Since libc++-dev provides libstdc++-dev (and no compiler will just use libc++-dev when it is installed without further
Bug#749826: Documenting `Multi-Arch: foreign`
On Sat, 02 Sep 2017 at 08:44:14 -0700, Sean Whitton wrote: > On Sun, Aug 20 2017, Helmut Grohne wrote: > > A common theme with such cases is to resort to `Multi-Arch: allowed` > > (e.g. make), but that has the downside of requiring most consumers to > > attach the :any annotation and that it can never be switched back > > (because :any dependencies on packages not marked M-A:allowed are > > unsatisfiable). That seems like it might be a bug (or design flaw if you prefer). If a package (build-)depends on foo:any, it is saying "I am only using the arch-indep parts of foo's interface", whatever those are. Perhaps a dependency on foo:any by (for example) bar:mips should always be satisfiable by foo:mips (as though the :any had been omitted), regardless of foo's multi-arch status? This would bring it back to the same meaning as omitting the :any, in the trivial case where only one architecture is enabled. Perhaps a dependency on foo:any should always be satisfiable by foo:all, regardless of foo's multi-arch status? (which must be either no or foreign in this case I think) Perhaps a dependency on foo:any should be satisfiable by any instance of foo that is Multi-Arch: foreign? (In this case the :any is completely redundant, because foreign sets up a similar situation from the other end) > > * If there is such a conflict, but the relevant packages are not > > coinstallable due to package relations, is that ok? Example: libc6 > > has such a conflict on /lib/ld.so.1 for mips and mipsel. > > (Presently, you get an unpack error here.) > > If libc6's use is legitimate then it seems we'd need to include this as > an exception. libc6's use is mandated by the distro-independent mips* ABIs, so we can't avoid it (unless we are willing to make binaries built on Debian mips unusable on other distros, including older Debian, by defining a Debian-specific ELF interpreter like /lib/mips-linux-gnu/ld.so.1). 
> > I think "the files installed by ``Architecture: all`` packages always > > provide architecture-independent interfaces." is too broad. The counter > > example is haskell-devscripts-minimal. This needs to be weakened > > somehow. I would argue that these interfaces are architecture-independent from the perspective of the package's (lack of) architecture. What they are not independent of is the *build machine* architecture, just like running uname -m or inspecting /proc/cpuinfo aren't independent of the build machine architecture. This is certainly a problem for cross-compilation, but it isn't the same issue as in dpkg or pkg-config, where the architecture for which dpkg or pkg-config was built gets hard-coded into its installed files (as the output of --print-architecture or part of the default search path, respectively). > > For instance, the policy should make it > > clear that marking libmdds-dev `Multi-Arch: foreign` (fictional, see > > #843023) would be a policy violation. It is not clear to me that doing so *should* be a policy violation. If libmdds-dev contains only headers (no shared or static library), and it exposes architecture-independent libboost-dev headers (but no Boost shared or static library), is there really anything wrong with having libboost-dev from "the wrong architecture"? S
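For reference, the behaviour that the proposals above would change can be sketched as follows. This is a simplified model based only on statements made in this thread: by default both ends of a dependency must have the same architecture, `Architecture: all` is matched as if it had the architecture of dpkg, `Multi-Arch: foreign` lifts the constraint from the dependee's side, and a `:any` dependency is unsatisfiable unless the dependee is marked `Multi-Arch: allowed`. The `Package` class, the `satisfies` function and the `native_arch` parameter are hypothetical names, and real resolvers (dpkg, apt) handle many more details, including package names and versions, which are ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """A binary package instance (hypothetical model, not a dpkg structure)."""
    name: str
    arch: str               # e.g. "amd64", "mips", or "all"
    multi_arch: str = "no"  # "no" (default), "same", "foreign" or "allowed"


def effective_arch(arch: str, native_arch: str) -> str:
    """Architecture: all is matched as if it had the architecture of dpkg."""
    return native_arch if arch == "all" else arch


def satisfies(depender: Package, dependee: Package,
              any_qualifier: bool = False, native_arch: str = "amd64") -> bool:
    """Return True if `dependee` may satisfy a dependency held by `depender`."""
    if any_qualifier:
        # Per the thread: foo:any is unsatisfiable unless foo is M-A: allowed.
        return dependee.multi_arch == "allowed"
    if dependee.multi_arch == "foreign":
        # The dependee lifts the architecture constraint for every depender.
        return True
    # "no", "same", and "allowed" without :any: both ends must match.
    return (effective_arch(depender.arch, native_arch)
            == effective_arch(dependee.arch, native_arch))


bar_mips = Package("bar", "mips")

# Plain dependency on foo, satisfied only by the same architecture.
print(satisfies(bar_mips, Package("foo", "mips")))
print(satisfies(bar_mips, Package("foo", "amd64")))
# A foreign dependee satisfies the dependency from any architecture.
print(satisfies(bar_mips, Package("foo", "amd64", "foreign")))
# foo:any against an unmarked foo:mips, the case questioned above.
print(satisfies(bar_mips, Package("foo", "mips"), any_qualifier=True))
# foo:any against foo marked allowed.
print(satisfies(bar_mips, Package("foo", "amd64", "allowed"), any_qualifier=True))
```

The fourth call is the one the first proposal targets: in this model bar:mips depending on foo:any is not satisfied by an unmarked foo:mips, even though the same dependency without `:any` would be.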
Bug#749826: Documenting `Multi-Arch: foreign`
Hello Helmut, On Sun, Aug 20 2017, Helmut Grohne wrote: >> - I couldn't figure out how to include this text, because I didn't >> understand it: >> >> For instance, using dpkg --print-architecture can be used to emit the >> native architecture even though dpkg is marked Multi-Arch: >> foreign. Similarly, calling pkg-config (without a prefix) will behave >> differently on different architectures as its search path is >> architecture-dependent even though pkg-config is marked Multi-Arch: >> foreign. >> >> Are you saying that packages that depend or implicitly depend on dpkg >> or pkg-config cannot be Multi-arch: foreign, although dpkg and >> pkg-config themselves are Multi-arch: foreign? Why are dpkg and >> pkg-config Multi-arch: foreign, if they provide these >> architecture-dependent interfaces? > > Those are very good questions and clarifying them will lead to a better > understanding of what we have to put into policy. You do understand that > "dpkg --print-architecture" is part of dpkg's interface. Yet its output > varies with its architecture. Taking this strictly would indeed imply > that dpkg is wrongly marked. Similarly, running pkg-config may result in > architecture-dependent paths and thus our strict interpretation would > result in rejecting the foreign marking. > > A common theme with such cases is to resort to `Multi-Arch: allowed` > (e.g. make), but that has the downside of requiring most consumers to > attach the :any annotation and that it can never be switched back > (because :any dependencies on packages not marked M-A:allowed are > unsatisfiable). > > This is where I thought about README.multiarch: > >> - I didn't include your TODO about README.multiarch; let me know whether >> you have a more concrete idea about the purpose of that file > > It can document assumptions one makes about users of a package. For > instance, we expect dpkg users to use `dpkg --print-architecture` > diagnostically only. 
Similarly, we expect that package builds call > pkg-config if they mean the build architecture and they need to call > $(DEB_HOST_GNU_TYPE)-pkg-config if they mean the host architecture. > Indeed that happens automatically for autotools projects that happen to > use PKG_CHECK_MODULES or PKG_PROG_PKG_CONFIG (i.e. most). It also > happens for cmake when built with dh_auto_build. > > Let me give a counter example to illustrate more of the point. > haskell-devscripts-minimal is an `Architecture: all` package with some > shell scripts. Sounds like a good candidate for `Multi-Arch: foreign`. > When you look at /usr/share/haskell-devscripts/Dh_Haskell.sh though, you > see that functions such as cpu(), os(), etc. specifically introspect the > build architecture by using the build architecture ghc. Such usage is > not ok for `Multi-Arch: foreign` (#769377). > > I believe that policy should encourage some uniform way to document the > intended interface as we have several cases where this is not obvious. > README.multiarch may be that way. In particular, using a package in a > way not permitted by such README.multiarch would need to be a policy > violation on its own. For instance, one could depend on a shared library > and declare it an implementation detail. Relying on the transitive > dependency would then be considered a policy violation. Rather than introduce the new terminology 'intended interface', which we would definitely have to define, how about something like this: If all a package's architecture-dependent interfaces are listed in README.multiarch, the package is not considered to have any architecture-dependent interfaces for the purposes of determining whether it may be labelled Multi-Arch: foreign. Then we have a separate section explaining what to put in README.multiarch, and explaining how depending package maintainers must respond to the information in that file. 
>> - after we've got text documenting the other possible values of the >> Multi-Arch: field, we might want to promote the list of things to >> consider out of the Multi-Arch: foreign subsubsection. It should >> become clear once we've got that other text together. > > Indeed, documenting `Multi-Arch: same` may be easier (or not). For the > purpose of defining it, we shall call Debian binary packages for > different architectures with equal binary package name and version > "instances" of a package. I currently see the following requirements: > > * It must not be used on `Architecture: all` packages (though I wish >you could ;). > > * Given any two instances of a package and any filename, that filename >must be non-existent in at least one package or the type (directory / >regular file / etc.) must match and when the filename refers to a >regular file, the contents must be bitwise equal in both instances. > >This point deserves some more thought as it has some pitfalls: > > * If there is such a conflict, but the relevant packages are not >
Bug#749826: Documenting `Multi-Arch: foreign`
Hi Sean,

Thanks for picking up multiarch!

On Sat, Aug 19, 2017 at 09:50:21PM -0700, Sean Whitton wrote:
> I spoke to Russ and we're both of the view that we should document
> multiarch piecemeal. Let's begin by getting a definition of the
> Multi-Arch: field into ch. 5 of policy.

I'm glad you agree to my proposal.

> I have pushed a new branch to the Debian policy repo named
> bug749826-spwhitton. On that branch I've committed a slightly reworked
> form of your draft text.[1] Please review the diff. Here are some
> comments/issues:

Very welcome.

> - I substantially shortened your text. Let me know if you think I went
> too far.

I fear that some important aspects got lost indeed. More on that later.

> - Previously I was worried about defining 'interface' but I've found
> another place where policy uses this word without defining it, and I
> don't think it needs to be changed in either place.

I'm not a friend of vagueness, but I do recognize the difficulty in expressing the requirements precisely.

> - I couldn't figure out how to include this text, because I didn't
> understand it:
>
> For instance, using dpkg --print-architecture can be used to emit the
> native architecture even though dpkg is marked Multi-Arch:
> foreign. Similarly, calling pkg-config (without a prefix) will behave
> differently on different architectures as its search path is
> architecture-dependent even though pkg-config is marked Multi-Arch:
> foreign.
>
> Are you saying that packages that depend or implicitly depend on dpkg
> or pkg-config cannot be Multi-arch: foreign, although dpkg and
> pkg-config themselves are Multi-arch: foreign? Why are dpkg and
> pkg-config Multi-arch: foreign, if they provide these
> architecture-dependent interfaces?

Those are very good questions and clarifying them will lead to a better understanding of what we have to put into policy. You do understand that "dpkg --print-architecture" is part of dpkg's interface. Yet its output varies with its architecture.
Taking this strictly would indeed imply that dpkg is wrongly marked. Similarly, running pkg-config may result in architecture-dependent paths and thus our strict interpretation would result in rejecting the foreign marking. A common theme with such cases is to resort to `Multi-Arch: allowed` (e.g. make), but that has the downside of requiring most consumers to attach the :any annotation and that it can never be switched back (because :any dependencies on packages not marked M-A:allowed are unsatisfiable). This is where I thought about README.multiarch: > - I didn't include your TODO about README.multiarch; let me know whether > you have a more concrete idea about the purpose of that file It can document assumptions one makes about users of a package. For instance, we expect dpkg users to use `dpkg --print-architecture` diagnostically only. Similarly, we expect that package builds call pkg-config if they mean the build architecture and they need to call $(DEB_HOST_GNU_TYPE)-pkg-config if they mean the host architecture. Indeed that happens automatically for autotools projects that happen to use PKG_CHECK_MODULES or PKG_PROG_PKG_CONFIG (i.e. most). It also happens for cmake when built with dh_auto_build. Let me give a counter example to illustrate more of the point. haskell-devscripts-minimal is an `Architecture: all` package with some shell scripts. Sounds like a good candidate for `Multi-Arch: foreign`. When you look at /usr/share/haskell-devscripts/Dh_Haskell.sh though, you see that functions such as cpu(), os(), etc. specifically introspect the build architecture by using the build architecture ghc. Such usage is not ok for `Multi-Arch: foreign` (#769377). I believe that policy should encourage some uniform way to document the intended interface as we have several cases where this is not obvious. README.multiarch may be that way. In particular, using a package in a way not permitted by such README.multiarch would need to be a policy violation on its own. 
For instance, one could depend on a shared library and declare it an implementation detail. Relying on the transitive dependency would then be considered a policy violation. > - after we've got text documenting the other possible values of the > Multi-Arch: field, we might want to promote the list of things to > consider out of the Multi-Arch: foreign subsubsection. It should > become clear once we've got that other text together. Indeed, documenting `Multi-Arch: same` may be easier (or not). For the purpose of defining it, we shall call Debian binary packages for different architectures with equal binary package name and version "instances" of a package. I currently see the following requirements: * It must not be used on `Architecture: all` packages (though I wish you could ;). * Given any two instances of a package and any filename, that filename must be non-existent in at least one package or the type (directory /
Bug#749826: Documenting `Multi-Arch: foreign`
Hello Helmut,

I spoke to Russ and we're both of the view that we should document multiarch piecemeal. Let's begin by getting a definition of the Multi-Arch: field into ch. 5 of policy.

I have pushed a new branch to the Debian policy repo named bug749826-spwhitton. On that branch I've committed a slightly reworked form of your draft text.[1] Please review the diff. Here are some comments/issues:

- I substantially shortened your text. Let me know if you think I went too far.

- Previously I was worried about defining 'interface' but I've found another place where policy uses this word without defining it, and I don't think it needs to be changed in either place.

- I couldn't figure out how to include this text, because I didn't understand it:

    For instance, using dpkg --print-architecture can be used to emit the native architecture even though dpkg is marked Multi-Arch: foreign. Similarly, calling pkg-config (without a prefix) will behave differently on different architectures as its search path is architecture-dependent even though pkg-config is marked Multi-Arch: foreign.

  Are you saying that packages that depend or implicitly depend on dpkg or pkg-config cannot be Multi-arch: foreign, although dpkg and pkg-config themselves are Multi-arch: foreign? Why are dpkg and pkg-config Multi-arch: foreign, if they provide these architecture-dependent interfaces?

- I didn't include your TODO about README.multiarch; let me know whether you have a more concrete idea about the purpose of that file.

- After we've got text documenting the other possible values of the Multi-Arch: field, we might want to promote the list of things to consider out of the Multi-Arch: foreign subsubsection. It should become clear once we've got that other text together.

Thank you again for your work so far.

[1] https://wiki.debian.org/DependencyHell#Multi-Arch:_foreign

-- 
Sean Whitton
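The pkg-config point in the quoted draft text is easier to follow with a concrete naming rule. The replies in this thread state the expectation that a package build calls plain pkg-config when it means the build architecture and $(DEB_HOST_GNU_TYPE)-pkg-config when it means the host architecture. The sketch below only encodes that naming convention; the function name `pkg_config_command` is hypothetical, and the GNU type is passed in as a plain string rather than queried from any tool.

```python
from typing import Optional


def pkg_config_command(host_gnu_type: Optional[str] = None) -> str:
    """Pick the pkg-config executable name for a build.

    With no GNU type, the unprefixed tool is returned, which searches the
    paths of whatever architecture pkg-config itself was built for. With
    a GNU type, the prefixed name for that host architecture is returned.
    """
    if host_gnu_type is None:
        return "pkg-config"
    return f"{host_gnu_type}-pkg-config"


# Build-architecture lookup (for tools that run during the build).
print(pkg_config_command())
# Host-architecture lookup in a cross build targeting arm64.
print(pkg_config_command("aarch64-linux-gnu"))
```

A package that always uses the second form when it means the host architecture does not depend on which architecture supplies the pkg-config package, which is the property the `Multi-Arch: foreign` marking is meant to express.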