Re: Bug#1069841: python-icalendar: FTBFS with tzdata 2024a: UnknownTimeZoneError: 'America/Godthab'

2024-04-25 Thread Simon McVittie
Control: retitle -1 python-icalendar: FTBFS with tzdata 2024a: UnknownTimeZoneError: 'America/Godthab'

On Thu, 25 Apr 2024 at 18:27:15 +0200, Santiago Vila wrote:
> E   pytz.exceptions.UnknownTimeZoneError: 'America/Godthab'

This was presumably triggered by this change in tzdata 2024a-2:

tzdata (2024a-2) unstable; urgency=medium
...
  * Replace America/Godthab by America/Nuuk

which appears to have been an intentional compatibility break?
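
For illustration, here is a minimal sketch of how a caller could tolerate a
rename like this one. It assumes Python's standard zoneinfo module rather than
pytz. The RENAMED_ZONES table and both helper names are hypothetical, and this
is not necessarily how python-icalendar dealt with the problem.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

# Hypothetical alias table: it only contains the rename mentioned above.
RENAMED_ZONES = {"America/Godthab": "America/Nuuk"}


def canonical_zone_name(name):
    """Map an obsolete zone name to its replacement, if we know one."""
    return RENAMED_ZONES.get(name, name)


def load_zone(name):
    """Try the name as given, then fall back to its modern replacement."""
    try:
        return ZoneInfo(name)
    except ZoneInfoNotFoundError:
        replacement = canonical_zone_name(name)
        if replacement == name:
            raise  # no known replacement: report the original failure
        return ZoneInfo(replacement)
```

With this, load_zone("America/Godthab") would only consult the alias table
when the installed time zone database no longer ships the old name.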

smcv



Re: Bug#1063648: krb5: FTBFS on arm64, armel and ppc64el with "Can't resolve hostname" in dh_auto_test

2024-02-12 Thread Simon McVittie
Control: retitle -1 krb5: FTBFS on IPv6-only buildds: "Can't resolve hostname" in dh_auto_test
Control: tags -1 + ipv6

On Sun, 11 Feb 2024 at 23:40:34 +0000, Simon McVittie wrote:
> It might be relevant that according to #972151, arm-conova-03 (and
> perhaps other *-conova-* buildds?) is IPv6-only, with no IPv4 addresses
> or routes other than loopback (not even via NAT).

I gave back the failed builds and they succeeded on a different buildd.

I also notice that the original Architecture: all build of 1.20.1-5 failed
on x86-conova-02, and succeeded when retried on x86-grnet-02. I think
this supports the theory that this is really "FTBFS on IPv6-only buildds".

smcv



Re: Bug#1063648: krb5: FTBFS on arm64, armel and ppc64el with "Can't resolve hostname" in dh_auto_test

2024-02-11 Thread Simon McVittie
On Sun, 11 Feb 2024 at 13:53:56 -0800, Benjamin Kaduk wrote:
> On Sat, Feb 10, 2024 at 01:33:15PM +0100, Johannes Schauer Marin Rodrigues wrote:
> > there was a binNMU "Rebuild to sync binNMU versions" for krb5 and that
> > failed for arm64, armel and ppc64el:
> > 
> > https://buildd.debian.org/status/package.php?p=krb5
> > 
> > The error logs look very similar:
> > *** Output of last command:
> > Can't resolve hostname arm-conova-03
> 
> This is due more to the build environment than the test suite per se.
...
> In short, the test suite, as for the protocol itself, assumes that it can
> resolve the server's hostname to an IP address

It might be relevant that according to #972151, arm-conova-03 (and
perhaps other *-conova-* buildds?) is IPv6-only, with no IPv4 addresses
or routes other than loopback (not even via NAT).

I believe there is consensus that we consider "localhost resolves to
127.0.0.1" to be a reasonable thing to demand from all buildds as part
of the build-essential interface.

I am unsure whether there is consensus that "the result of gethostname()
resolves to some address of the local machine" is also a reasonable
thing to demand from all buildds as part of build-essential: /etc/hosts
typically makes this true, but is not *guaranteed* to do so. On Linux,
packages can ensure that it happens by build-depending on
libnss-myhostname [linux-any], if necessary.

However, even with both of those, if the krb5 test suite (or protocol)
is resolving the local hostname with AF_INET (IPv4-only), and with either
AI_ADDRCONFIG or NULL hints, then that will not yield any results on
an IPv6-only system for the reasons discussed in #952740 and related
bug reports.

A workaround is to resolve with AF_UNSPEC, which currently disregards
AI_ADDRCONFIG, but that is, itself, arguably a bug (#854301).
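
To make the difference concrete, here is a small sketch using Python's socket
module, which wraps getaddrinfo(). The helper name resolve() is made up for
this example, and it says nothing about what the krb5 test suite actually
calls.

```python
import socket


def resolve(host, family=socket.AF_UNSPEC, flags=0):
    """Return the sorted unique addresses getaddrinfo() yields, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, None, family, socket.SOCK_STREAM, 0, flags)
    except socket.gaierror:
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

On a buildd, comparing resolve(socket.gethostname(), socket.AF_INET,
socket.AI_ADDRCONFIG) with resolve(socket.gethostname(), socket.AF_UNSPEC)
should show whether the IPv4-only lookup is the part that comes back empty.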

If I'm understanding the krb5 issue correctly, the version of this in krb5
is more troublesome than the related issues seen in the GLib test suite,
because the GLib test suite would be happy with localhost always being
resolvable to 127.0.0.1 (as requested in #801362), but the krb5 test suite
wants to be able to resolve the local host name as well (so
resolving #801362 would not be enough).

smcv



Re: Bug#1060735: glib2.0/experimental: FTBFS on s390x and other 64-bit BE: gdatetime test fails or crashes

2024-01-15 Thread Simon McVittie
Control: severity -1 important

On Sat, 13 Jan 2024 at 19:32:58 +0000, Simon McVittie wrote:
> On Sat, 13 Jan 2024 at 16:21:46 +0000, Simon McVittie wrote:
> > On Sat, 13 Jan 2024 at 15:21:02 +0000, Simon McVittie wrote:
> > > I recently uploaded a snapshot of GLib 2.79.x to experimental (in
> > > preparation for NEW processing) and it failed tests on s390x and on
> > > the 64-bit, big-endian ports ppc64 and sparc64. I suspect this means
> > > it's a general problem with 64-bit BE, rather than specifically s390x.
> > 
> > git bisect says commit df4aea76 "gdatetime: Add support for %E modifier
> > to g_date_time_format()" is the first bad commit, which would be consistent
> > with it being...
> > 
> > > instead
> > > of segfaulting, the test failed with an assertion error involving dates
> > > with a Japanese era marker:
> > 
> > ... something to do with Japanese and Thai eras, and the %E modifier.
> 
> I can't see anything in the relevant commit[1] that looks like it would be
> affected by endianness. Could there be an endianness problem in one of the
> glibc APIs that it's calling into, or something like that?

I have successfully worked around this by disabling support for era-based
dates with the %E modifier (used in Japan and Thailand) on big-endian
64-bit, which reduces the severity of this bug to non-RC.

It looks as though:

- glibc documents nl_langinfo(ERA) as returning a semicolon-delimited list
  of eras

- but in fact it returns a NUL-delimited, double-NUL-terminated list of
  eras, such that parsing the list cannot be done without risking a read
  overrun, unless you either assume that the undocumented
  double-NUL-termination will be present or use the undocumented
  nl_langinfo(_NL_TIME_ERA_NUM_ENTRIES). GLib currently does the latter.

- GLib has, at least for now, prioritized its own usability for Japanese
  and Thai users higher than the design principle that it should not rely
  on undocumented APIs

- this is OK on 32-bit and on little-endian, but glibc's
  nl_langinfo(_NL_TIME_ERA_NUM_ENTRIES) returns what appears to be a
  wrong result on 64-bit big-endian architectures
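
As a sketch of the parsing problem described in the list above: given a buffer
in the NUL-delimited, double-NUL-terminated layout, a bounded parser can stop
either at the empty string that forms the terminator or after a known number
of entries. This is an illustration in Python over a bytes object, not GLib's
code; the function name and the sample data in the note below are made up.

```python
def split_era_entries(buf, num_entries=None):
    """Split a NUL-delimited, double-NUL-terminated list without reading past buf.

    If num_entries is given, stop after that many entries instead of
    relying on the terminator being present.
    """
    entries = []
    pos = 0
    while pos < len(buf):
        if num_entries is not None and len(entries) == num_entries:
            break
        end = buf.find(b"\0", pos)
        if end == -1:
            raise ValueError("era list is not NUL-terminated")
        if end == pos:
            break  # empty string: this is the second NUL of the terminator
        entries.append(buf[pos:end])
        pos = end + 1
    return entries
```

For example, split_era_entries(b"era-one\0era-two\0\0") yields two entries. In
C the same loop has no len(buf) to fall back on, which is why a wrong entry
count from glibc is dangerous.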

Discussion in GLib: https://gitlab.gnome.org/GNOME/glib/-/issues/3225

Workaround in GLib: https://gitlab.gnome.org/GNOME/glib/-/merge_requests/3820

Related glibc bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31030

If there is a safe way to get this information from glibc, then GLib should
use that, but I don't know what that safe way would be.

smcv



Re: Bug#1054552: glibc: stat fails when access time is bogus

2023-10-25 Thread Simon McVittie
Control: reassign -1 libc6

On Wed, 25 Oct 2023 at 20:53:57 +0200, Jarek Czekalski wrote:
> I tried to upgrade system (apt-get upgrade), but it failed in dpkg:
> 
> Unpacking initscripts (3.06-4) over (2.96-7+deb11u1) ...
> dpkg: error processing archive
> /var/cache/apt/archives/initscripts_3.06-4_all.deb (--unpack):
>  unable to stat './var/log' (which was about to be installed): Value too
> large for defined data type

This is nothing to do with GLib (libglib2.0-0), but I assume you meant
glibc (libc6)? Quoting the rest of the bug report below for glibc
maintainers:

> stat /var/log
> 
>   File: /var/log
>   Size: 4096    Blocks: 8  IO Block: 4096 directory
> Device: 8,1 Inode: 2752691 Links: 12
> Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/ root)
> Access: 2040-05-10 23:31:40.285032309 +0200
> Modify: 2023-10-25 16:03:42.313742411 +0200
> Change: 2023-10-25 16:03:42.313742411 +0200
>  Birth: 2017-02-27 09:46:56.739719147 +0100
> 
> This date (2040) causes dpkg to fail. The workaround is correcting it by
> touch /var/log.
> 
> Running system with bogus date (2040).
>    * What exactly did you do (or not do) that was effective (or
>  ineffective)?
> touch /var/log
>    * What was the outcome of this action?
> dpkg started working
>    * What outcome did you expect instead?
> dpkg should work with strange date or give a better message. Maybe just
> documentation (for stat) should be fixed and suggest problems with dates.
> 
> Current outcome is as follows: apt-get suddenly fails with a cryptic message
> (initially it was "unable to stat '.'" instead of /var/log). It may be
> extremely difficult to diagnose the issue.

It is not possible for 32-bit stat() to work correctly on 32-bit systems
with dates beyond 2038, because the timestamp will not fit in the data
type used. The only solution would be for the program in question (in
this case, dpkg) to be compiled with support for 64-bit timestamps.
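
The arithmetic behind this can be checked with a short sketch (the helper name
is made up; the access time is the one from the quoted stat output):

```python
import datetime

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def fits_in_time32(timestamp):
    """True if a Unix timestamp fits in a signed 32-bit time_t."""
    return INT32_MIN <= timestamp <= INT32_MAX


# Access time reported for /var/log: 2040-05-10 23:31:40 +0200.
tz = datetime.timezone(datetime.timedelta(hours=2))
atime = int(datetime.datetime(2040, 5, 10, 23, 31, 40, tzinfo=tz).timestamp())

# Last moment a signed 32-bit time_t can represent.
last_time32 = datetime.datetime.fromtimestamp(INT32_MAX, datetime.timezone.utc)
```

Here fits_in_time32(atime) is false, and last_time32 is 2038-01-19 03:14:07
UTC, so the 2040 access time cannot be stored in the 32-bit field.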

Your bug report seems to be from an upgrade from Debian 11 to Debian 12,
and Debian 11's glibc version did not support APIs that provide 64-bit
timestamps on 32-bit systems, so Debian 11's dpkg cannot support that
either.

Debian 12's glibc does, but that will only help you after fully upgrading
to Debian 12, at which point you will have Debian 12 versions of glibc
and dpkg.

Unfortunately, I don't think there's necessarily anything that can be done
here, beyond the general move towards supporting 64-bit timestamps
distribution-wide that is already in progress.

smcv



Re: Bug#1040297: gnome-shell: fails to start on login: failed to allocate 51540049304 bytes

2023-07-08 Thread Simon McVittie
Control: reassign -1 libc6 2.37-3
Control: fixed -1 2.37-5
Control: tags -1 - moreinfo

On Sat, 08 Jul 2023 at 16:08:14 +0200, Bastian Venthur wrote:
> I've just updated all packages from unstable, including glibc and the
> problem is solved.

Let's assume this was the glibc bug with corrupted locale archives unless
someone finds evidence to the contrary, then. I'll close the bug after the
reassign command is processed.

smcv



Re: Bug#1040297: gnome: Gnome fails to start on login and falls back to GDM3

2023-07-08 Thread Simon McVittie
On Thu, 06 Jul 2023 at 09:50:57 +0100, Simon McVittie wrote:
> On Wed, 05 Jul 2023 at 22:44:40 +0200, Bastian Venthur wrote:
> > #5  0x7f134cbe97ce g_utf8_collate_key (libglib-2.0.so.0 + 0x8a7ce)
> > #6  0x7f134ccee180 e_source_set_display_name (libedataserver-1.2.so.27 + 0x57180)
> 
> Well, this is messed up - something is setting the display name of an
> ESource (a calendar or address book or something similar) to a value that
> is, apparently, so long that allocating memory for its collation key (a
> version that has been modified to sort in the correct locale-sensitive
> order) will fail.

I wonder whether this is the same root cause as #1040452: glibc (>= 2.37-2)
sometimes generating corrupted locale archives? That might explain why
g_utf8_collate_key() would get nonsense results.

Please try with glibc (>= 2.37-5) which fixes that bug, and maybe this
one too.

smcv



Bug#1022787: libc6-dev: Lintian warns that all mips*el executables have executable stack

2022-10-25 Thread Simon McVittie
Package: libc6-dev
Version: 2.35-4
Severity: normal
X-Debbugs-Cc: debian-m...@lists.debian.org, lint...@packages.debian.org, jrt...@debian.org
User: debian-m...@lists.debian.org
Usertags: mips mipsel

All mips*el executables and libraries appear to have an executable stack,
resulting in very large numbers of Lintian warnings, particularly for
packages with many small ELF objects like
.

Jessica Clarke looked into this and found that this is intentionally done
by glibc when targeting minimum kernel 4.8.0 or older with mips hardfloat:
https://github.com/bminor/glibc/blob/595c22ecd8e87a27fd19270ed30fdbae9ad25426/sysdeps/unix/sysv/linux/mips/configure.ac#L138-L143

Debian 9 had a kernel newer than 4.8.0, so I think Debian 12 probably
doesn't need to go that far into backwards compatibility? If the mips
porters agree, then glibc on mips*el should stop forcing an executable
stack, either by increasing the minimal kernel version or by patching
this out. That will provide some security hardening on mips*el.

Or, if the mips porters consider this backwards compatibility to be
more important than the security hardening of a non-executable stack,
then Lintian should stop issuing warnings about the executable stack on
mips*el to improve its signal/noise ratio.

Thanks,
smcv



Bug#1003213: locales-all: introduce locales-utf8 package?

2022-01-09 Thread Simon McVittie
On Sun, 09 Jan 2022 at 13:48:06 +0100, Aurelien Jarno wrote:
> On 2022-01-06 11:21, Simon McVittie wrote:
> > * install locales-all (this costs > 200M but ensures that all locales are
> >   available)
> > 
> > For "reasonably large" desktop and server systems, I wonder whether it
> > might be better to generate a subset of locales-all with just the UTF-8
> > locales that we recommend for general use, and install that by default?
> 
> Defining general use is something quite difficult. All languages and
> countries should be considered equally, so we could differentiate
> UTF-8 from non UTF-8 locales, but we should not make further selection.

Right, what I meant was: AIUI we recommend that all speakers of xx_YY
use the xx_YY.utf8 locale, as opposed to a legacy national encoding, so
we could (make it straightforward to) install all the UTF-8 locales
like en_GB.utf8 and none of the legacy national encodings like
en_GB.ISO-8859-15.

> That way of doing it would be fine from the desktop point of view (100M
> is not that much compared to a desktop environment). However we can't
> force the installation of locales-all-utf8 in d-i

I thought task-*-desktop could maybe pull it in?

> From the various discussion on IRC, we more or less concluded that the
> way to go is to have one locale package per language, like it's done in
> most other distributions. From there we could have task-$language
> depends on locales-$language, also simplifying the d-i side.
> 
> Would that work for your use case?

That would mean that UIs like gnome-control-center would still not be able
to offer to add (for example) a French locale on a system that had been
installed in German, unless either the user knows that they need to install
the French language pack first, or the UI grows distro-specific code to:

- know which languages would be candidates for being enabled if the
  appropriate language pack was installed
- ask PackageKit to install the necessary language pack when one of those
  locales was chosen

However, it's consistent with how e.g. Flatpak handles locales (there's one
locale extension per language code, so for example fr_FR and fr_CH go
together).

This would also allow avoiding a long-standing issue with Steam: some
Steam games assume that en_US.UTF-8 is always available (they're wrong,
and should be using C.UTF-8, but that's not portable), so the steam package
could gain a Recommends: locales-en to work around that.

> > locales-utf8 would probably also be enough for many locale-sensitive
> > packages' test suites.
> 
> Not sure about that. Test suites are the main reason why we had to
> revert the removal of non UTF-8 locales.

I suspect this might be a bit circular: the reason that upstreams want
to test support for legacy encodings, and the reason that we want to run
those tests instead of skipping them, is because distros like us still
(claim to) support those encodings, even though we no longer recommend
them.

smcv



Bug#1003213: locales-all: introduce locales-utf8 package?

2022-01-06 Thread Simon McVittie
Package: locales-all
Version: 2.33-1
Severity: wishlist

As discussed recently on -devel and previously in #701585, at the moment
Debian users have a choice between two non-ideal locale setups:

* install locales and generate a subset of locale files with locale-gen
  (this is optimal for small systems, but it's difficult for high-level
  UIs like GNOME Settings to present this to users, particularly in a
  non-distro-specific way)

* install locales-all (this costs > 200M but ensures that all locales are
  available)

For "reasonably large" desktop and server systems, I wonder whether it
might be better to generate a subset of locales-all with just the UTF-8
locales that we recommend for general use, and install that by default?

If I'm counting correctly, that would be about 100M, which is perhaps an
acceptable price to pay for language settings being straightforward -
a reasonably complete set of Noto fonts (without CJK) is already more
than half of that.

locales-all could have a Depends on locales-utf8 and contain the remaining
(legacy national character set) locales, if anyone still needs that.

locales-utf8 would probably also be enough for many locale-sensitive
packages' test suites.

smcv



Re: /usr/bin/ld.so as a symbolic link for the dynamic loader

2021-12-03 Thread Simon McVittie
On Fri, 03 Dec 2021 at 18:29:33 +0100, Florian Weimer wrote:
> > On Thu, 02 Dec 2021 at 19:51:16 +0100, Florian Weimer wrote:
> >> If someone wants to upstream the multi-arch patches, that would be
> >> great.
> >
> > I think multiarch is mostly build-time configuration rather than patches.
> 
> We would have to take the table out of dpkg-architecture and put it into
> upstream glibc (or gcc or binutils), otherwise you can't build a
> multi-arch glibc on a non-Debian system.

Sorry, you asked about patches, so I thought you were under the
impression that Debian was patching glibc to have it use multiarch
library directories. I believe it's mainly done with build-time
configuration rather than by patching, so there isn't necessarily
anything to upstream, because most of what's necessary to enable/allow
that build-time configuration is already upstream - unless you want
glibc to be generically aware of multiarch paths even when built on
non-Debian? Is that your goal here?

As I said, a 99% implementation of multiarch tuples is to take the GNU
tuple that any Autotools-based build system already relies on, discard the
vendor part, and normalize a finite number of special cases (i[3456]86
and arm* are the only ones I'm aware of). I believe the people who did
the early design of multiarch were hoping to standardize it via something
like LSB, but that effort seems approximately as dead as LSB itself.

systemd has an independent implementation of the list of known multiarch
tuples:
https://github.com/systemd/systemd/blob/main/src/basic/architecture.h
https://github.com/systemd/systemd/blob/main/src/basic/architecture.c

Some sort of change to the expansion of $LIB is maybe the only thing
needed in addition to build-time configuration, because the upstream
implementation of $LIB makes the assumption that only the last path
segment of the ${libdir} is desired in $LIB, which is usually the case
but happens to be untrue for multiarch.

> In addition to get the right value for $LIB, it's also desirable to get
> the default search paths right.

This is done by runtime configuration rather than patching, at the moment,
for example this file installed as part of libc6:amd64:

$ cat /etc/ld.so.conf.d/x86_64-linux-gnu.conf
# Multiarch support
/usr/local/lib/x86_64-linux-gnu
/lib/x86_64-linux-gnu
/usr/lib/x86_64-linux-gnu

If you would prefer this to be hard-coded into ldconfig, I suspect there's
no implementation right now that could be upstreamed.

> And there's also /usr/libexec/getconf to worry about.

At the moment, /usr/bin/getconf is only installed for the "main"
architecture (more precisely, for whichever architecture of libc-bin is
installed). If there's meant to be one getconf per architecture, then it
wouldn't be able to appear in /usr/bin or /usr/libexec with that name.
As with ld.so, this is not unique to multiarch: multilib would have the
same problem.

Is that /usr/libexec/getconf upstream, or is /usr/libexec/getconf
something else?

smcv



Re: /usr/bin/ld.so as a symbolic link for the dynamic loader

2021-12-03 Thread Simon McVittie
On Thu, 02 Dec 2021 at 19:51:16 +0100, Florian Weimer wrote:
> Having ld.so as a real command makes the name architecture-agnostic.
> This discourages from hard-coding non-portable paths such as
> /lib64/ld-linux-x86-64.so.2 or even (the non-ABI-compliant)
> /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 in scripts that require
> specific functionality offered by such an explicit loader invocation.

This works up to a point, but because there is only one /usr/bin/ld.so,
it can only work for one architecture per machine, so saying it's
architecture-agnostic is still a bit of a stretch.

In multiarch, in principle there is no such thing as "the" primary
architecture (it is valid to combine sed:amd64 and coreutils:i386 on an
amd64 kernel), but in practice it's usually the case that "most"
executables come from the same architecture as dpkg.

So if we only have one ld.so, then on typical Debian x86_64 machines
it will only work for x86_64 executables, and not for i386 executables
(or cross-executables via qemu-aarch64-static or whatever). Similarly,
on Red Hat-style multilib, it will only work for x86_64 and not for
i386. Does that give you the functionality you are expecting?

One way to make it closer to architecture-agnostic would be to name it
${tuple}-ld.so, similar to how gcc (cross-)compilers are named. From
Debian's point of view, ideally the tuple would be a multiarch tuple,
which is a GNU tuple normalized to eliminate differences within an
ABI-compatible family of architectures:

- start from the GNU tuple, e.g. i686-pc-linux-gnu
- discard the vendor part, e.g. i686-linux-gnu
  - this version is the tools prefix used in cross-compilation
- replace i?86 with i386 and arm* with arm, e.g. i386-linux-gnu
  - this version is the Debian multiarch tuple

(Or perhaps better to have symlinks with both the cross-tools prefix
and the multiarch tuple, where they differ - which I believe is only
i386 and 32-bit ARM, because most/all other architectures sensibly change
the GNU tuple if and only if the ABI is different.)
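
The normalization steps above can be sketched as follows. This follows only
the rules stated in this mail, the function names are made up, and it assumes
a three-part input already has its vendor part removed. It is not a
replacement for the table in dpkg-architecture.

```python
import re


def cross_prefix(gnu_tuple):
    """Drop the vendor part of a cpu-vendor-kernel-os GNU tuple."""
    parts = gnu_tuple.split("-")
    if len(parts) == 4:
        del parts[1]  # e.g. "pc" or "unknown"
    return "-".join(parts)


def multiarch_tuple(gnu_tuple):
    """Normalize the CPU part as described: i?86 -> i386, arm* -> arm."""
    cpu, _, rest = cross_prefix(gnu_tuple).partition("-")
    if re.fullmatch(r"i[3456]86", cpu):
        cpu = "i386"
    elif cpu.startswith("arm"):
        cpu = "arm"
    return f"{cpu}-{rest}"
```

For i686-pc-linux-gnu this gives the cross-tools prefix i686-linux-gnu and the
multiarch tuple i386-linux-gnu, matching the worked example in the list.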

> The initial implementation will be just a symbolic link.  This means
> that multi-arch support will be missing: the amd64 loader will not be
> able to redirect execution to the s390x loader.

... or to the i386 loader, which is probably a concern for more people
(that affects Red Hat-style multilib, which is present in some form on
most distros, and not just Debian-style multiarch, which is only seen in
Debian derivatives and the freedesktop.org SDK).

> In principle, it should
> be possible to find PT_INTERP with a generic ELF parser and redirect to
> that, but that's vaporware at present.  I don't know yet if it will be
> possible to implement this without some knowledge of Debian's multi-arch
> support in the loader.

I believe Debian uses the interoperable (ABI-compliant) ELF interpreter
as listed on https://sourceware.org/glibc/wiki/ABIList for all
architectures - it certainly does for all *common* architectures (for
example our x86_64 executables use /lib64/ld-linux-x86-64.so.2, which is
a special exception to the rule that we don't usually use lib64).

I had naively believed that all distros do the same, but unfortunately
my work on the Steam Runtime has taught me otherwise: for example, Arch
Linux has a non-standard ELF interpreter /usr/lib/ld-linux-x86-64.so for
executables that are built from the glibc source package (but uses the
interoperable ELF interpreter for everything else), and Exherbo
consistently puts their dynamic linkers in /usr/x86_64-pc-linux-gnu/lib.

Does glibc automatically set up the interoperable ELF interpreter, or is
it something that distros' glibc maintainers have to "just know" if they
are using a non-default ${libdir}?

> If someone wants to upstream the multi-arch patches, that would be
> great.  glibc now accepts submissions under DCO, so copyright assignment
> should no longer be an obstacle.

(Please note that I am not a glibc maintainer and cannot speak for them.)

I think multiarch is mostly build-time configuration rather than patches.
The main thing needing patching is that we want ${LIB} to expand to
lib/x86_64-linux-gnu instead of just x86_64-linux-gnu, so that the
"/usr/${LIB}/libfoo.so.0" idiom works, but glibc would normally only take
the last component of the ${libdir}:

https://salsa.debian.org/glibc-team/glibc/-/blob/sid/debian/patches/any/local-ld-multiarch.diff
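
A simplified model of the difference, not glibc's actual implementation: the
two function names and the "/usr" prefix default are assumptions made for this
sketch.

```python
import posixpath


def lib_last_component(libdir):
    """Model of the upstream behaviour: keep only the last path segment."""
    return posixpath.basename(libdir)


def lib_relative_to_prefix(libdir, prefix="/usr"):
    """Model of what multiarch wants: the libdir relative to the prefix."""
    return posixpath.relpath(libdir, prefix)
```

For /usr/lib64 both give lib64, but for /usr/lib/x86_64-linux-gnu the first
gives x86_64-linux-gnu while the second gives lib/x86_64-linux-gnu, which is
the value that makes "/usr/${LIB}/libfoo.so.0" resolve correctly.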

The freedesktop.org SDK used for Flatpak also uses Debian-style multiarch
(but is not otherwise Debian-derived), and addresses that differently, in a
way that might be more upstream-suitable:

https://gitlab.com/freedesktop-sdk/freedesktop-sdk/-/blob/master/patches/glibc/fix-dl-dst-lib.patch

smcv



Bug#994233: libc6: breaks python3-iptables (<< 1.0.0-2~)

2021-09-14 Thread Simon McVittie
Package: libc6
Version: 2.32-2
Severity: normal

libc6 (>= 2.32) appears to have triggered a regression in
python3-iptables, fixed in python3-iptables (= 1.0.0-2). Please consider
adding a versioned Breaks to prevent broken partial upgrades.

smcv



Bug#994232: libc6: generates unnecessarily tight dependencies on mips, mipsel

2021-09-14 Thread Simon McVittie
Package: libc6
Version: 2.32-2
Severity: normal
Tags: patch
X-Debbugs-Cc: debian-rele...@lists.debian.org

Most architectures' libc6.symbols.$arch have this pattern:

> $ cat debian/libc6.symbols.i386
> #include "libc6.symbols.common"
> ld-linux.so.2 #PACKAGE# #MINVER#
> #include "symbols.wildcards"
> libc.so.6 #PACKAGE# #MINVER#
> #include "symbols.wildcards"

This results in any use of symbols versioned foo@GLIBC_2.23 generating a
dependency on libc6 (>= 2.23), and so on.
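
As a rough model of that mapping (not dpkg-shlibdeps itself; the function
names and the symbol names in the note below are placeholders), the generated
dependency tracks the newest GLIBC_* version among the symbols a binary uses:

```python
def glibc_version(symbol):
    """Extract (2, 23) from a versioned symbol such as 'foo@GLIBC_2.23'."""
    _name, sep, version = symbol.partition("@GLIBC_")
    if not sep:
        raise ValueError(f"not a GLIBC-versioned symbol: {symbol}")
    return tuple(int(part) for part in version.split("."))


def minimum_libc6_dependency(symbols):
    """Dependency implied by the newest symbol version in use."""
    newest = max(glibc_version(symbol) for symbol in symbols)
    return "libc6 (>= {})".format(".".join(str(part) for part in newest))
```

A binary using foo@GLIBC_2.23 and bar@GLIBC_2.2.5 would get libc6 (>= 2.23)
under this model, rather than a dependency on the libc6 version it happened to
be built against.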

However, libc6:mips and libc6:mipsel are missing the final
'#include "symbols.wildcards"', which results in those architectures
generating an unnecessarily strict dependency on libc6 (>= 2.32-1) and
entangling other transitions with glibc's transition. As far as I can see,
this was accidental?

libc6-i386:x32 has the same issue.

Please consider the attached patch.

smcv
>From bffb450291513a1486dae6e04b1d126994ba22fe Mon Sep 17 00:00:00 2001
From: Simon McVittie 
Date: Tue, 14 Sep 2021 09:25:36 +0100
Subject: [PATCH] Restore versioned symbol tracking for mips, mipsel,
 libc6-i386:x32

Commit d3f9fade "debian/libc6.symbols.mips, debian/libc6.symbols.mipsel:
drop symbol overrides for TLS support" accidentally removed the inclusion
of symbols.wildcards, resulting in all symbols in libc.so.6 on the
affected architectures generating an unnecessarily tight dependency on
libc6 (>= 2.32-1).

Commit 5f5dfcb0 caused an equivalent issue for libc6-i386:x32.

Fixes: d3f9fade "debian/libc6.symbols.mips, debian/libc6.symbols.mipsel: drop symbol overrides for TLS support."
Fixes: 5f5dfcb0 "debian/libc6-i386.symbols.x32, debian/libc6.symbols.i386: drop symbol overrides for TLS support."
---
 debian/libc6-i386.symbols.x32 | 1 +
 debian/libc6.symbols.mips | 1 +
 debian/libc6.symbols.mipsel   | 1 +
 3 files changed, 3 insertions(+)

diff --git a/debian/libc6-i386.symbols.x32 b/debian/libc6-i386.symbols.x32
index 3bd6f8f5..357d2f26 100644
--- a/debian/libc6-i386.symbols.x32
+++ b/debian/libc6-i386.symbols.x32
@@ -2,3 +2,4 @@
 ld-linux.so.2 #PACKAGE# #MINVER#
 #include "symbols.wildcards"
 libc.so.6 #PACKAGE# #MINVER#
+#include "symbols.wildcards"
diff --git a/debian/libc6.symbols.mips b/debian/libc6.symbols.mips
index 580f72c8..59332301 100644
--- a/debian/libc6.symbols.mips
+++ b/debian/libc6.symbols.mips
@@ -2,3 +2,4 @@
 ld.so.1 #PACKAGE# #MINVER#
 #include "symbols.wildcards"
 libc.so.6 #PACKAGE# #MINVER#
+#include "symbols.wildcards"
diff --git a/debian/libc6.symbols.mipsel b/debian/libc6.symbols.mipsel
index 580f72c8..59332301 100644
--- a/debian/libc6.symbols.mipsel
+++ b/debian/libc6.symbols.mipsel
@@ -2,3 +2,4 @@
 ld.so.1 #PACKAGE# #MINVER#
 #include "symbols.wildcards"
 libc.so.6 #PACKAGE# #MINVER#
+#include "symbols.wildcards"
-- 
2.33.0



Bug#994006: libc6: NSS modules changes require a restart of systemd-logind, which is not possible

2021-09-14 Thread Simon McVittie
On Mon, 13 Sep 2021 at 22:59:32 +0200, Aurelien Jarno wrote:
> - running the operation on a non-existing user, but as loginctl does a
>   check that the user exists, it has to be done directly with the dbus
>   API, for instance "gdbus call --system --dest org.freedesktop.login1
>   --object-path /org/freedesktop/login1 --method
>   org.freedesktop.login1.Manager.SetUserLinger 12345678 true true"
> 
> The latter is a bit more complex to do (especially as
> libglib2.0-bin is not necessarily installed on the system), but has the
> advantage of exercising all configured NSS modules.

systemd happens to have its own D-Bus implementation sd-bus (a competitor
to libdbus and GLib's GDBus) for which it provides busctl(1), an
equivalent of gdbus(1) and dbus-send(1). So this could be written as:

busctl call --system org.freedesktop.login1 /org/freedesktop/login1 \
org.freedesktop.login1.Manager SetUserLinger ubb $uid true true

which does not have dependencies outside systemd.deb.

The nonexistent uid should probably be in one of the ranges reserved by
Policy §9.2.2: perhaps 4294967294 or (uint32_t) -2, which is reserved
as a representation of the anonymous NFS user?
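
The two spellings are the same number, which a quick check confirms (the
helper and constant names are made up for this sketch):

```python
import ctypes


def as_uint32(value):
    """Reinterpret an integer as an unsigned 32-bit value, like a C cast."""
    return ctypes.c_uint32(value).value


NFS_ANONYMOUS_UID = as_uint32(-2)
```

NFS_ANONYMOUS_UID evaluates to 4294967294, one below the all-ones value that
(uint32_t) -1 would give.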

smcv



Bug#983910: rpcsvc-proto: uninstallable due to Conflicts: libc6

2021-08-18 Thread Simon McVittie
Control: found -1 1.4.2-3

Sorry, this is still failing, dependent on unpack order:

> Preparing to unpack .../rpcsvc-proto_1.4.2-3_amd64.deb ...
> Unpacking rpcsvc-proto (1.4.2-3) ...
> dpkg: error processing archive /var/cache/apt/archives/rpcsvc-proto_1.4.2-3_amd64.deb (--unpack):
>  trying to overwrite '/usr/bin/rpcgen', which is also in package libc-dev-bin 2.31-13

Doesn't the Breaks/Replaces need to be with versions (<< 2.31-14)
instead of (<< 2.31-13)? 2.31-13 is the version in bullseye, and 2.31-14
is the one with these changes:

>   * debian/rules.d/build.mk: stop passing --enable-obsolete-rpc.
>   * debian/debhelper.in/libc-dev.install{,.hurd-i386}: do not install
> librpcsvc.a.
>   * debian/debhelper.in/libc-dev-bin.manpage, debian/local/manpages/rpcgen.1:
> do not install rpcgen (1) manpage.
>   * debian/rules.d/build.mk: stop deleting  and
> .
>   * debian/control.in/libc, debian/rules.d/debhelper.mk: make libc6-dev to
> depend on rpcsvc-proto, except for stage1 and stage2.
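
To spell out why the bound matters, here is a deliberately simplified version
comparison. It only handles versions of the form X.Y-N as seen in this report
and ignores epochs, tildes and letters, which real dpkg comparison handles;
both function names are made up.

```python
def parse_version(version):
    """Parse 'X.Y-N' into ((X, Y), N). Not a full Debian version parser."""
    upstream, _, revision = version.rpartition("-")
    return tuple(int(part) for part in upstream.split(".")), int(revision)


def strictly_earlier(installed, bound):
    """True if a relation '(<< bound)' matches the installed version."""
    return parse_version(installed) < parse_version(bound)
```

strictly_earlier("2.31-13", "2.31-13") is false, so a Breaks/Replaces on
(<< 2.31-13) does not cover the libc-dev-bin 2.31-13 that still ships
/usr/bin/rpcgen, whereas (<< 2.31-14) does. On a real system,
dpkg --compare-versions 2.31-13 lt 2.31-14 is the authoritative check.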

smcv



Bug#985617: glibc: flaky autopkgtest on most architectures

2021-04-25 Thread Simon McVittie
On Sun, 25 Apr 2021 at 10:14:51 +0100, Simon McVittie wrote:
> On Sun, 25 Apr 2021 at 08:11:48 +0200, Paul Gevers wrote:
> > On 25-04-2021 01:55, Aurelien Jarno wrote:
> > > It appears that all the failures are related to containers. I have been
> > > able to reproduce the issue with a bullseye kernel, which defaults to
> > > kernel.unprivileged_userns_clone=1. It seems the autopkgtest runners
> > > still use a buster kernel (at least in the case of this build log).

Looking at support/test-container.c, it seems that these tests will
automatically be skipped (FAIL_UNSUPPORTED) on a kernel that restricts
userns creation (like buster), and will be run (and perhaps fail)
on a kernel that does not (like bullseye). So it is not necessarily
a *regression* that they fail - they might just never have been tried
before we started using bullseye kernels.

The brute-force approach to making the autopkgtest not be flaky would be
to make these tests FAIL_UNSUPPORTED unconditionally, which will result
in the same coverage we would have had on buster kernels. Obviously it
would be better if they could be made to pass, but some reliable testing
is better than none.

These tests seem to be failing here (support/test-container.c:1095):

  execvp (new_child_proc[0], new_child_proc);

  /* Or don't run the child?  */
  FAIL_EXIT1 ("Unable to exec %s\n", new_child_proc[0]);

It would be useful if this printed strerror(errno) at least, so that we
can see whether it's ENOENT or EACCES or something else.
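
The kind of message I have in mind looks like this, sketched in Python rather
than as a patch to the C test support code (the function name is made up):

```python
import errno
import os


def exec_or_explain(argv):
    """Try to exec argv; if that fails, return a message naming the errno.

    On success this never returns, because the process image is replaced.
    """
    try:
        os.execvp(argv[0], argv)
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return f"Unable to exec {argv[0]}: {name} ({os.strerror(exc.errno)})"
```

The C equivalent would be to include strerror(errno), or the %m format
extension that glibc supports, in the FAIL_EXIT1 message.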

Perhaps the test support code is not copying/mounting everything that needs
to be copied/mounted into the container's filesystem? More debug logging in
support/test-container.c would probably be helpful here - perhaps even
running 'find . -ls' in the new_root_path before chrooting into it?

smcv



Bug#985617: glibc: flaky autopkgtest on most architectures

2021-04-25 Thread Simon McVittie
On Sun, 25 Apr 2021 at 08:11:48 +0200, Paul Gevers wrote:
> On 25-04-2021 01:55, Aurelien Jarno wrote:
> > It appears that all the failures are related to containers. I have been
> > able to reproduce the issue with a bullseye kernel, which defaults to
> > kernel.unprivileged_userns_clone=1. It seems the autopkgtest runners
> > still use a buster kernel (at least in the case of this build log).
> 
> That's correct, all workers run stable except s390x.
> 
> > Could it be that kernel.unprivileged_userns_clone is enabled on some of
> > the runners?
>
> If I want to make our workers equal, I guess
> changing them all to the default sounds sane, right? Do you know if the
> default is different for buster and bullseye?

The default was kernel.unprivileged_userns_clone=0 in buster kernels and
was switched to kernel.unprivileged_userns_clone=1 in bullseye kernels.

References:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=898446
https://salsa.debian.org/kernel-team/linux/-/commit/a381917851e762684ebe28e04c5ae0d8be7f42c7

If you want a quick way to get consistent behaviour, installing the
bubblewrap package from bullseye (but not buster-backports!) installs
a sysctl.d fragment to set kernel.unprivileged_userns_clone=1 even on
older kernels.
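
To check which setting a given worker actually has, the sysctl can be read
back from /proc. The following is only a sketch: read_int_sysctl is a
hypothetical helper name, and the -1 return for "missing or unparseable"
is a convention chosen for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Path of the Debian-specific sysctl discussed above. */
#define USERNS_CLONE_SYSCTL "/proc/sys/kernel/unprivileged_userns_clone"

/* Hypothetical helper: read a single integer from a sysctl-style file.
   Returns the value, or -1 if the file is missing or cannot be parsed
   (for example on kernels that do not have this knob). */
int
read_int_sysctl (const char *path)
{
  FILE *f;
  int value;

  assert (path != NULL);

  f = fopen (path, "r");
  if (f == NULL)
    return -1;

  if (fscanf (f, "%d", &value) != 1)
    value = -1;

  fclose (f);
  return value;
}
```

Calling read_int_sysctl (USERNS_CLONE_SYSCTL) on each worker and comparing
the results would show whether they are configured consistently.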

smcv



Re: Bug#968342: libgegl-sc-0.4.so: undefined symbol: __exp_finite

2020-08-13 Thread Simon McVittie
Control: retitle -1 libgegl-sc-0.4.so: undefined symbol: __exp_finite
Control: reassign -1 libgegl-0.4-0 0.4.12-2
Control: affects -1 + gimp
Control: tags -1 + bullseye sid
Control: clone -1 -2
Control: severity -2 minor
Control: retitle -2 libc6: please consider adding Breaks on libgegl-0.4-0 (<< 0.4.18)
Control: reassign -2 libc6 2.31.2

On Thu, 13 Aug 2020 at 12:38:00 +0200, W Forum W wrote:
> Gimp does not start anymore (even after new install)
> 
> $ gimp
> GEGL-Message: Module '/usr/lib/x86_64-linux-gnu/gegl-0.4/seamless-clone.so'
> load error: /lib/x86_64-linux-gnu/libgegl-sc-0.4.so: undefined symbol:
> __exp_finite

This is fallout from upgrading glibc to 2.31. It appears to have
been fixed as a side-effect of build system changes in gegl 0.4.18, so
upgrading to the version of gegl from testing/unstable should fix this.

This can only affect partial upgrades from Debian 10 'buster' to
testing/unstable or the future Debian 11 'bullseye'. Pure Debian 10 systems
are unaffected, and so are pure testing/unstable systems.

> -- System Information:
> Debian Release: 10.5
> ii  libc6          2.31-2
> ii  libgegl-0.4-0  0.4.12-2

You appear to have a mixture of packages from the Debian 10 stable release,
and the Debian testing/unstable rolling release that will eventually get
released as Debian 11.

This is referred to as a "Frankendebian" system, and is not something that
can really be fully supported - we try to make it work, but realistically
the bugs that can happen during a partial upgrade will never all be found
or fixed. To make it work, you will have to upgrade some other packages
(in this case gegl, but probably a lot more) to their testing/unstable
versions.

smcv



Bug#966173: libc6: __atan2_finite reference in dlopened module no longer found in executable linked to libm

2020-07-24 Thread Simon McVittie
On Fri, 24 Jul 2020 at 14:36:54 +0200, Bastian Blank wrote:
> On Fri, Jul 24, 2020 at 10:11:04AM +0100, Simon McVittie wrote:
> > The bug (#966150) is that a version of uix86_64.so compiled with a slightly
> > older (2020-02-18) toolchain fails to load on an up-to-date sid system, 
> > with:
> > undefined symbol: __atan2_finite
> 
> Because the binary was not linked with -lm, the linker never saw the
> real symbol __atan2_finite@GLIBC2_16, so the linker only emitted a reference
> to __atan2_finite.

Right. However, note that there's no mention of __atan2_finite() in the
source code - it's only used because older glibc would replace atan2()
with a reference to __atan2_finite() when building with -ffast-math.

> At least dpkg-shlibdeps or so should warn about that.

For at least openarena, it doesn't seem to. I'm not sure why not.

For the next update to openarena I'm going to build it with -Wl,-z,defs
so that missing dependencies are always fatal. However, that isn't
always applicable: some plugin architectures (like Python extensions)
rely on being able to pick up symbols exported by the executable, which
are not necessarily programmatically distinguishable from symbols that
are defined by libraries used by the executable.

> > I've been trying to put together a standalone reproducer that only uses
> > libdl and libm, but so far I have not been successful.
> 
> Something like that?
> 
> | % cat test.c
> | void __atan2_finite(void);
> | void test(void) {
> |   __atan2_finite();
> | }

I was aiming for something a bit closer to openarena's situation,
where there is no explicit reference to __atan2_finite() in the source
code: it calls atan2(), and cc -ffast-math rewrites that into a call
to __atan2_finite(). I've now managed to make this work: see attached.

Compile them and run ./prog in a buster environment (or an outdated
bullseye/sid environment with glibc < 2.31), then run ./prog in an
up-to-date bullseye/sid environment without recompiling.

libmymodule.so will get a dynamic reference to __atan2_finite.

The historical result is that prog outputs 0.463648, twice.

The result in up-to-date bullseye/sid is that prog outputs 0.463648,
once, and then fails with "undefined symbol: __atan2_finite".

Using __FINITE_MATH_ONLY__ (which is defined by -ffast-math) is necessary
to be able to reproduce the bug this way.

If you consider this sort of thing to be too niche to be supportable,
please feel free to close the bug.

smcv

# Makefile
all = prog libmymodule.so

CFLAGS = -ffast-math

check: $(all)
	objdump -Tx libmymodule.so
	./prog

all: $(all)

prog: prog.c Makefile
	$(CC) $(CFLAGS) -Wl,--no-as-needed -o $@ $< -ldl -lm

# Note that this cannot be compiled with -Wl,-z,defs: it deliberately has
# undefined references to symbols from libm
libmymodule.so: module.c Makefile
	$(CC) $(CFLAGS) -shared -o $@ $<

clean:
	rm -f $(all)

/* prog.c */
#include <dlfcn.h>
#include <err.h>
#include <math.h>
#include <stdio.h>

#if !defined(__FINITE_MATH_ONLY__) || !__FINITE_MATH_ONLY__
#warning Not using finite-only mathematics
#endif

int main (void)
{
  void *module;
  double (*my_atan2) (double, double);

  printf ("%f\n", atan2 (1, 2));

  module = dlopen ("${ORIGIN}/libmymodule.so", RTLD_NOW);
  if (module == NULL)
    errx(1, "%s", dlerror ());

  my_atan2 = (double (*) (double, double)) dlsym (module, "my_atan2");
  if (my_atan2 == NULL)
    errx(1, "%s", dlerror ());

  printf ("%f\n", my_atan2 (1, 2));

  return 0;
}

/* module.c */
#if !defined(__FINITE_MATH_ONLY__) || !__FINITE_MATH_ONLY__
#warning Not using finite-only mathematics
#endif

#include <math.h>

double my_atan2 (double x, double y)
{
  return atan2 (x, y);
}


Bug#966173: libc6: __atan2_finite reference in dlopened module no longer found in executable linked to libm

2020-07-24 Thread Simon McVittie
Package: libc6
Version: 2.31-1
Severity: normal

I've encountered an odd bug in openarena (#966150) which I'm concerned
might be a glibc regression affecting other packages.

Some background: openarena is a game running on the ioquake3 engine
(main executable: /usr/lib/ioquake3/ioquake3). During startup, the engine
dlopens some modules, which implement the actual openarena game and UI.
One of those modules is uix86_64.so.

uix86_64.so uses mathematical functions from libm, but is not itself
linked to libm. At runtime (at least on older systems) it works as
intended, because the ioquake3 executable *is* linked to libm. I'm aware
that this is not the most robust setup, and uix86_64.so would ideally be
linked with -lm to make it self-contained; but it's documented as being
expected to work, and has always worked in the past:

Symbol references in the shared object are resolved using (in order):
symbols in the link map of objects loaded for the main program and its
dependencies; [... and some more places ...]
— dlopen(3)

The bug (#966150) is that a version of uix86_64.so compiled with a slightly
older (2020-02-18) toolchain fails to load on an up-to-date sid system, with:

undefined symbol: __atan2_finite

If I recompile openarena in a sid chroot, *with no source code changes*
(in particular uix86_64.so is still not linked to -lm!), then it starts
to work again. The recompiled uix86_64.so has an undefined reference
to atan2, but no reference to __atan2_finite any more.

I'm going to address this in bullseye by making openarena more robust
(explicitly linking to -lm). After I've done that, the updated version of
openarena will not be suitable as a reproducer for this bug report, but
the buster version of openarena will still be suitable.

If you believe this is not a significant regression in glibc and should
only be fixed by changes in openarena, I have no problem with doing that
and just closing this bug report. However, I wanted to raise this in
case it affects other previously-built binaries.

This can be reproduced somewhat conveniently as follows:

* Have a buster virtual machine
* Install openarena and enough of a desktop to get a terminal in an X11
  environment
* Run openarena
    * It succeeds
    * To exit quickly: Shift+Escape, type "/quit", Enter
* Add a bullseye apt source and "apt update", but do not upgrade everything
* Upgrade libc6 from 2.28-10 to 2.31-1, while upgrading as few other
  packages as possible
    * I used aptitude, which made me also upgrade gcc-9 and related
      packages, removing gcc-8
* Run openarena
    * It fails as described in #966150
* Downgrade libc6 and closely-related packages from 2.31-1 to 2.28-10
    * In my case this meant downgrading libc-dev-bin, libc6-dev, libc6
      and libc-bin, and removing libcrypt-dev and libcrypt1
* Run openarena
    * It succeeds again, confirming that this was a glibc behaviour change

I've been trying to put together a standalone reproducer that only uses
libdl and libm, but so far I have not been successful.

I believe this is related to a change in the representation of
the __atan2_finite symbol, which is used (at least by versions of
openarena compiled against older glibc) because openarena is compiled
with -ffast-math. In 2.28-10, that symbol was not hidden:

$ objdump -Tx /lib/x86_64-linux-gnu/libm.so.6
...
0000000000028280 g   iD  .text  0000000000000046  GLIBC_2.15  __atan2_finite

In 2.31-1, it is hidden, and there is no non-hidden definition (default
symbol-version):

$ objdump -Tx /lib/x86_64-linux-gnu/libm.so.6
...
000000000002a1e0 g   iD  .text  0000000000000049 (GLIBC_2.15) __atan2_finite

Because uix86_64.so is not directly linked to -lm, it has an undefined
reference to __atan2_finite with no particular version:

$ objdump -Tx /usr/lib/openarena/baseoa/pak6-patch088/uix86_64.so
...
0000000000000000      D  *UND*  0000000000000000              __atan2_finite

As far as I can work out, this unversioned undefined reference can be
satisfied by __atan2_finite@@GLIBC_2.15 in the global namespace from
the old libm, but not by the hidden version __atan2_finite@GLIBC_2.15
in the new libm.

Thanks,
smcv



Bug#954915: marked as pending in glibc

2020-03-25 Thread Simon McVittie
On Wed, 25 Mar 2020 at 13:15:03 +0000, Aurelien Jarno wrote:
> debian/debhelper.in/libc.preinst, debian/rules.d/debhelper.mk: there is no 
> easy way to check if a file belongs to a package with usrmerge. Just drop all 
> safety checks...  Closes: #954915.

The /usr merge merges /foo with /usr/foo (for some values of foo) and
nothing else, so if you would prefer to keep those checks, you could do
something like this:

if dpkg-query -S "${lib#/usr}" >/dev/null 2>&1 ; then
    continue
fi

if dpkg-query -S "/usr${lib#/usr}" >/dev/null 2>&1 ; then
    continue
fi

which will in particular treat /usr/lib/whatever as equivalent to
/lib/whatever.
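
The path arithmetic behind those two checks can be sketched in C as
follows. usrmerge_aliases is a hypothetical name used only for this
illustration. Unlike the shell expansion ${lib#/usr}, this version strips
"/usr" only when it is a whole path component, which is an extra
assumption on my part.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper illustrating the path arithmetic only: given an
   absolute path, write the form without a leading /usr into `plain`
   and the form with a leading /usr into `merged`.
   Returns 0 on success, -1 if a buffer is too small. */
int
usrmerge_aliases (const char *path, char *plain, char *merged, size_t len)
{
  const char *rest = path;
  int n;

  assert (path != NULL && path[0] == '/');

  if (strncmp (path, "/usr/", 5) == 0)
    rest = path + 4;   /* keep the slash that follows "/usr" */

  n = snprintf (plain, len, "%s", rest);
  if (n < 0 || (size_t) n >= len)
    return -1;

  n = snprintf (merged, len, "/usr%s", rest);
  if (n < 0 || (size_t) n >= len)
    return -1;

  return 0;
}
```

Both "/lib/whatever" and "/usr/lib/whatever" produce the same pair of
names, and each name in the pair is what would be passed to
dpkg-query -S.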

(And the same for the quicker check involving libcfiles - but I don't
understand how or whether that one works, because it seems to be telling
grep to look for ^ and $ as literals rather than as anchors, which seems
wrong to me. But perhaps I'm missing something there.)

smcv



Bug#952740: glibc: getaddrinfo with AI_ADDRCONFIG and AF_INET[6] can't resolve localhost in an isolated network namespace

2020-02-28 Thread Simon McVittie
Package: libc6
Version: 2.29-10
Severity: normal

Summarizing the glibc bits of #948834, which I'm about to close because
newer versions of glib2.0 bypass it:

Steps to reproduce
==================

- Be in an environment with only loopback addresses. Using bubblewrap and
  `bwrap --unshare-net --dev-bind / / ./getaddrinfo` is one way to make
  this happen; building in pbuilder is another.

- Have nsswitch configured in such a way that localhost "should" resolve
  to 127.0.0.1 and/or ::1, for example by installing netbase to get
  the typical /etc/hosts, or by installing libnss-myhostname.

- Have a program that resolves names with AI_ADDRCONFIG, which is
  often considered to be a good idea. For example,
  glib2.0's GResolver or dbus' support for TCP addresses.
  https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=948834;filename=getaddrinfo.c;msg=30
  is a convenient way to try this out.

- Attempt to resolve the reserved name "localhost" with the AI_ADDRCONFIG
  flag and the AF_INET or AF_INET6 family. (The family not being AF_UNSPEC
  is significant, due to #854301.)

Expected result
===============

localhost resolves to 127.0.0.1 or to ::1, as appropriate for the family.

Actual result
=============

-2 "Name or service not known"

Impact
======

Packages whose regression tests connect to localhost by name FTBFS when
built in pbuilder (but not sbuild or a typical development environment).
This currently affects glib2.0 2.62.x, but not 2.64.x (#948834) and in
the past it affected dbus (#897662).

Developers who expect resolving localhost to succeed become confused
about the layer in which it fails (e.g. expecting the problem to be with
/etc/hosts).

Workarounds that other packages can use
=======================================

- Implement
  <https://tools.ietf.org/html/draft-ietf-dnsop-let-localhost-be-localhost-02>
  in higher-level resolver APIs (for example glib2.0 >= 2.63.1, recent
  Firefox and recent Chromium do this)

- Implement a special case where AI_ADDRCONFIG is not used for "localhost"
  (for example Firefox has done this for a long time)

- Use AF_UNSPEC (or hints == NULL), and hope #854301 doesn't ever get fixed

- Don't use AI_ADDRCONFIG (generates unnecessary DNS traffic, and
  potentially causes connection delays depending on application behaviour,
  for single-stack hosts)

- On name resolution failure with AI_ADDRCONFIG, retry without AI_ADDRCONFIG
  before giving up

- Configure a useless non-127.0.0.1 IP address inside network namespaces
  (perhaps 127.0.0.2?) to trick AI_ADDRCONFIG into thinking we have IPv4
  connectivity

- Make tests that require resolving localhost skip themselves if it
  doesn't resolve; or don't run such tests at build-time at all, only in
  autopkgtest
  (IMO undesirable because it significantly reduces test coverage on
  non-amd64, non-arm64 architectures, where the buildds are our only
  opportunity to check that the built package is functional)
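
The "retry without AI_ADDRCONFIG" workaround from the list above could
look roughly like this. It is a sketch rather than code taken from any of
the projects mentioned; resolve_with_fallback is a hypothetical name, and
the choice of SOCK_STREAM is an assumption.

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical wrapper: try getaddrinfo() with AI_ADDRCONFIG first,
   and if that fails, retry once without it.  Returns the getaddrinfo()
   result code of the last attempt; on success the caller frees *res
   with freeaddrinfo(). */
int
resolve_with_fallback (const char *name, int family, struct addrinfo **res)
{
  struct addrinfo hints;
  int rc;

  assert (name != NULL && res != NULL);

  memset (&hints, 0, sizeof hints);
  hints.ai_family = family;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags = AI_ADDRCONFIG;

  rc = getaddrinfo (name, NULL, &hints, res);
  if (rc == 0)
    return 0;

  /* Second attempt: same query, but do not filter the results by
     which address families are configured locally. */
  hints.ai_flags &= ~AI_ADDRCONFIG;
  return getaddrinfo (name, NULL, &hints, res);
}
```

The cost is one extra lookup on the failure path only, so hosts where
AI_ADDRCONFIG already works keep its benefits.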

Possible solutions in glibc
===========================

- Implement a localhost special-case resembling
  <https://tools.ietf.org/html/draft-ietf-dnsop-let-localhost-be-localhost-02>
  at the getaddrinfo() level, before checking AI_ADDRCONFIG

- Implement a localhost special-case that ignores AI_ADDRCONFIG, but then
  delegates to NSS modules as usual

- Instead of implementing AI_ADDRCONFIG at the getaddrinfo() level, pass
  it down to NSS modules, make the dns module respect it (only send DNS
  A/AAAA requests if we have at least one non-local IPv4/IPv6 address),
  and make non-DNS modules like files and myhostname ignore it

- Reject this as not a glibc bug, and recommend that libraries and
  applications use one of the above workarounds
  (suggestions on which ones are the best workarounds welcome!)

Related glibc bug reports
=========================

- https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=854301 points out that
  the getaddrinfo() specification implies that AF_UNSPEC should arguably
  also fail in the same way

- https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=854302 points out that
  AI_ADDRCONFIG is not necessarily great to have as a default, for this
  reason

- https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=780294 requests that
  IPv6 link-local addresses should be ignored for the purposes of
  AI_ADDRCONFIG

- https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=801362 requests that
  foo.localhost (for all values of foo) should always resolve to 127.0.0.1
  and/or ::1 at the getaddrinfo() level, which would avoid this

References outside Debian
=========================

- https://fedoraproject.org/wiki/QA/Networking/NameResolution/ADDRCONFIG
- https://sourceware.org/bugzilla/show_bug.cgi?id=12377
- https://github.com/zeromq/libzmq/issues/42



Re: Bug#948834: glib2.0: FTBFS: gio/tests/gsocketclient-slow.c: Error resolving “localhost”: Name or service not known

2020-02-26 Thread Simon McVittie
On Sun, 09 Feb 2020 at 19:19:24 +0000, Simon McVittie wrote:
> On Sun, 09 Feb 2020 at 16:45:05 +0100, Mattia Rizzolo wrote:
> > I see glib2.0 is also failing in the r-b infra:
> > https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/glib2.0.html
>
> We could probably work around this in glib2.0 with a Build-Depends on
> libnss-myhostname | netbase, or the other way round.

I tried this, and no, that doesn't work; the situation is more subtle
than I thought, and not the fault of pbuilder's /etc/hosts.

localhost *does* resolve in the container. However, it only resolves
with certain options, and those options don't match all the options GIO
is going to use.

Specifically, GResolver is normally implemented by GThreadedResolver,
which uses getaddrinfo with socktype SOCK_STREAM, protocol IPPROTO_TCP,
flags AI_ADDRCONFIG, and a varying family: either AF_UNSPEC, AF_INET or
AF_INET6 depending on options. The "Happy Eyeballs" algorithm exercised
in this test carries out separate AF_INET and AF_INET6 name resolution,
so that it can make HTTP connections via IPv4 and IPv6 in parallel,
and take whichever works first.

Unfortunately, AI_ADDRCONFIG is documented like this (my emphasis):

 If  hints.ai_flags includes the AI_ADDRCONFIG flag, then IPv4 addresses
 are returned in the list pointed to by res only if the local system has
 at  least  one IPv4 address configured, and IPv6 addresses are returned
 only if the local system has at least one IPv6 address configured. **The
 loopback  address is not considered for this case as valid as a
 configured address.**

and pbuilder's network namespace only has loopback addresses. So we
would expect resolving "localhost" to always fail in that namespace with
AI_ADDRCONFIG, which I would have expected to affect more packages than
just GLib - but that doesn't happen, due to #854301.

To debug this I hacked the attached program into a package built in
pbuilder (GLib is inconveniently large, so I added the program to procenv
instead). You can get similar (but not identical!) results without pbuilder
by compiling the program, installing bwrap and using:

bwrap --unshare-net --dev-bind / / ./getaddrinfo

By experiment, what actually happens is:

no hints (which in glibc means AF_UNSPEC and AI_ADDRCONFIG|AI_V4MAPPED):
    success, return 127.0.0.1 (only, I don't get ::1 for some reason)
AF_INET:
    if AI_ADDRCONFIG: fails with -2 "Name or service not known"
    else: success, return 127.0.0.1
AF_INET6:
    if AI_ADDRCONFIG: fails with -2 "Name or service not known"
    else (pbuilder): fails with -3 "Temporary failure in name resolution"
    else (bwrap): success, return ::1
AF_UNSPEC:
    success, return 127.0.0.1 (even if AI_ADDRCONFIG is set)
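
For anyone who wants to repeat this kind of experiment without the
attachment, a single probe can be sketched as below. This is not the
attached program: describe_lookup is a hypothetical name, and the output
format (error code plus quoted message, or the first address) is chosen
only to resemble the listing above.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical probe: resolve `name` with the given family and flags,
   using the same socktype and protocol as described for GResolver.
   Writes either the first numeric address or `CODE "message"` into
   out, and returns the result code (0 on success). */
int
describe_lookup (const char *name, int family, int flags,
                 char *out, size_t out_len)
{
  struct addrinfo hints;
  struct addrinfo *res = NULL;
  char host[NI_MAXHOST];
  int rc;

  assert (name != NULL && out != NULL && out_len > 0);

  memset (&hints, 0, sizeof hints);
  hints.ai_family = family;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_protocol = IPPROTO_TCP;
  hints.ai_flags = flags;

  rc = getaddrinfo (name, NULL, &hints, &res);
  if (rc != 0)
    {
      snprintf (out, out_len, "%d \"%s\"", rc, gai_strerror (rc));
      return rc;
    }

  /* Report only the first address returned. */
  rc = getnameinfo (res->ai_addr, res->ai_addrlen, host, sizeof host,
                    NULL, 0, NI_NUMERICHOST);
  if (rc == 0)
    snprintf (out, out_len, "%s", host);
  else
    snprintf (out, out_len, "%d \"%s\"", rc, gai_strerror (rc));

  freeaddrinfo (res);
  return rc;
}
```

Calling it for "localhost" with each of AF_UNSPEC, AF_INET and AF_INET6,
with and without AI_ADDRCONFIG in flags, covers the combinations listed
above.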

Things I don't understand here:

- Why does (AF_UNSPEC, AI_ADDRCONFIG) succeed? Its documentation suggests
  that it would fail the same way as AF_INET and AF_INET6.
  (This has been reported as a bug before, in #854301.)
- Why does (AF_INET6, not AI_ADDRCONFIG) fail in pbuilder? /etc/hosts lists
  both 127.0.0.1 and ::1 as addresses of localhost, so I would expect
  that to work.

The good news is that GLib 2.63.x should fix this, because GLib 2.63.x
implements
<https://tools.ietf.org/html/draft-ietf-dnsop-let-localhost-be-localhost-02>
and hard-codes "localhost" to resolve to 127.0.0.1 and/or ::1 (depending
on the requested address family).
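
The name test at the heart of such a special case might be sketched like
this. It is not GLib's implementation: is_localhost_name is a
hypothetical helper, and treating names under ".localhost" and a single
trailing dot as matches is my reading of the draft rather than something
stated above.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>

/* Sketch of the name test only: returns 1 if `name` is "localhost" or
   ends in ".localhost" (optionally followed by one trailing dot),
   compared case-insensitively; 0 otherwise. */
int
is_localhost_name (const char *name)
{
  static const char suffix[] = "localhost";
  const size_t suffix_len = sizeof suffix - 1;
  size_t len;

  assert (name != NULL);

  len = strlen (name);
  if (len > 0 && name[len - 1] == '.')
    len--;   /* ignore one trailing dot */

  if (len < suffix_len)
    return 0;

  if (strncasecmp (name + len - suffix_len, suffix, suffix_len) != 0)
    return 0;

  /* Either the whole name is "localhost", or the character before it
     must be a label-separating dot. */
  return len == suffix_len || name[len - suffix_len - 1] == '.';
}
```

A resolver that applies this check before calling getaddrinfo() can
return 127.0.0.1 and/or ::1 directly, so AI_ADDRCONFIG never gets a
chance to filter the answer.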

However, I think it's likely to be a recurring problem that unit tests
for network software try to connect to "localhost", use AI_ADDRCONFIG
because it is usually the right thing to do for Internet names, and find
that they cannot resolve that name - particularly if glibc changes its
behaviour to match its documentation (fixing #854301).

Possible solutions:

- In pbuilder's network namespace, assign a useless non-127.0.0.1
  address (perhaps 127.0.0.2) so that AI_ADDRCONFIG thinks we have
  basic IPv4 connectivity and will resolve localhost to 127.0.0.1
- Implement "let localhost be localhost" in either glibc, or everything
  that does name resolution, or both
  (e.g.
  <https://gitlab.gnome.org/GNOME/glib/-/merge_requests/616> in GIO,
  also implemented in Firefox and Chromium)
- Implement a special case that disables AI_ADDRCONFIG when looking up
  localhost in either glibc, or everything that does name resolution,
  or both
  (Mozilla did this, and Firefox still does:
  <https://hg.mozilla.org/releases/mozilla-1.9.2/rev/c5d74bcd7421>
  <https://sources.debian.org/src/firefox-esr/68.5.0esr-1/nsprpub/pr/src/misc/prnetdb.c/?hl=2037#L2037>)
- Make tests that require resolving localhost skip themselves if it
  doesn't resolve. I think this is potentially undesirable because if
  sbuild starts to do the same no-network trick as pbuilder, it would
  effectively reduce our test coverage from every architecture down to
  the 2 architect

Re: Options for 64-bit time_t support on 32-bit architectures

2019-07-19 Thread Simon McVittie
On Fri, 19 Jul 2019 at 15:13:00 +0300, Adrian Bunk wrote:
> Remaining usecases of i386 will be old binaries, some old Linux binaries 
> but especially old software (including many games) running in Wine.
> Old Linux binaries will still need the old 32bit time_t.

Based on background from my contributions to the Steam Runtime:

I don't have numbers, but you might be surprised how many Linux-supporting
games are 32-bit. The Steam client itself is currently also 32-bit
(with some 64-bit subprocesses); this is somewhat deliberate, to act as
a canary for whether 32-bit code works at all, particularly when combined
with graphics.

The Steam Runtime (a LD_LIBRARY_PATH library bundle used to run Steam and
Steam games) is built on an increasingly ancient version of Ubuntu, but
it tries to use newer libraries of the same SONAME from the host system
where available, which they often will be, because people who install
Steam probably also install Wine, which has 32-bit dependencies. If those
libraries have an incompatible ABI involving 64-bit time_t, and it is used
at the ABI "surface" between a host-system library and a Steam Runtime
library or the game, then 32-bit games, and the Steam client itself,
will crash.
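
To make the ABI "surface" problem concrete, here is a small sketch using
fixed-width integers as stand-ins for the two time_t sizes. The struct
names and fields are invented for illustration and do not correspond to
any real library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct as seen by code built with a 32-bit time_t. */
struct event_time32
{
  int32_t when;     /* stands in for a 32-bit time_t */
  int32_t flags;
};

/* The same struct as seen by code built with a 64-bit time_t. */
struct event_time64
{
  int64_t when;     /* stands in for a 64-bit time_t */
  int32_t flags;
};

/* The two layouts have different sizes, so an array or allocation made
   by one side cannot safely be used by the other. */
static_assert (sizeof (struct event_time32) != sizeof (struct event_time64),
               "layouts differ");

/* Returns 1 if both builds agree on where `flags` lives, else 0. */
int
event_layouts_agree (void)
{
  return offsetof (struct event_time32, flags)
         == offsetof (struct event_time64, flags);
}
```

If a host-system library fills in the 64-bit layout and an old game reads
it using the 32-bit layout, the game sees half of the timestamp where it
expects flags, which is the kind of mismatch that leads to crashes.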

The Steam Runtime also relies on the host system for the OpenGL stack
(in practice Mesa or proprietary NVIDIA drivers), and for glibc itself.

In practice, many of the 32-bit games are not ever going to be recompiled
against a new ABI; the games are no longer developed or actively
supported, and their developers might no longer even be still in business.

Outside the Steam ecosystem, 32-bit games typically rely on host-system
libraries for things like SDL, X11 libraries, audio libraries and graphics
format libraries. It's unfortunate that GTK is one of the libraries
with time_t in its ABI, because GTK 2 is a fairly common choice for
game launcher/frontend programs.

smcv



Bug#877900: How to get 24-hour time on en_US.UTF-8 locale now?

2019-02-07 Thread Simon McVittie
On Thu, 07 Feb 2019 at 14:05:33 +0100, Adam Borowski wrote:
> a locale for a silly country with weird customs

Please don't take this tone. Insulting people who disagree with you[1]
is rarely an effective way to persuade them that you're right and
they're wrong.

> • promoting C.UTF-8 in our user interfaces (allowing to select it in d-i,
>   making dpkg-reconfigure locales DTRT, making it the d-i default)

I think this is exactly the "international/culture-neutral English"
locale that you're looking for. (Well, the C/POSIX locale is the formally
standardized form of that, but breaks text outside the ASCII range;
C.UTF-8 is the C locale with Unicode support added.)

> • inventing a new locale "en" without a country bias
>   -- good in the long term but problematic a month before freeze

I assume this would be a UTF-8 locale like en_US.utf8 and en_GB.utf8,
so probably en.utf8, possibly with a simple "en" alias?

As you say, I don't think a country-neutral specifically-English locale
is going to happen before buster.

How would this locale differ from C.UTF-8? Is the only difference
that C.UTF-8 has strict lexicographical sorting, whereas "en" would have
case-insensitive sorting like en_GB.utf8 does? (If that's the only
difference, then perhaps something like "LANG=C.utf8 LC_COLLATE=en_US.utf8"
is enough.)

smcv

[1] As it happens, I do agree with you that AM/PM time and middle-endian
dates are not a good default; but I'm from a different English-speaking
country with its own weird customs.



Bug#915621: glibc-source: needs versioned Breaks on some binary from cross-toolchain-base (<< 29~)

2018-12-05 Thread Simon McVittie
Package: glibc-source
Version: 2.28-1
Severity: important

autopkgtest fails while trying to test whether glibc in unstable could
migrate to testing:
https://ci.debian.net/data/autopkgtest/testing/amd64/c/cross-toolchain-base/1438735/log.gz

> tar -x -f  /usr/src/glibc/glibc-2.28.tar.xz
> cp -a /usr/src/glibc/debian/ glibc-2.28
> cd glibc-2.28 ; \
> QUILT_PATCHES=/tmp/autopkgtest-lxc.ufw2cd2w/downtmp/build.UwU/src/debian/patches/glibc/debian
>  quilt --quiltrc /dev/null push -a && \
> rm -rf .pc/
> Applying patch dpkg-shlibs.patch
> patching file debian/rules.d/debhelper.mk
> 
> Applying patch local-kill-locales.patch
> patching file debian/rules
> patching file localedata/SUPPORTED
> Hunk #1 FAILED at 2.
> 1 out of 1 hunk FAILED -- rejects in file localedata/SUPPORTED

I think this means glibc-source should have a Breaks: on some binary
package that gets installed by cross-toolchain-base's autopkgtest, so
that the testing migration infrastructure knows that cross-toolchain-base
28 and glibc-source 2.28 is not a valid combination.

Unfortunately, cross-toolchain-base's d/tests/control doesn't include
any binary packages built by cross-toolchain-base. Perhaps one needs to
be added, as a way to tell the testing migration infrastructure what is
going on?

Presumably this also means that cross-toolchain-base's autopkgtest is
not actually testing anything about the previously built .debs, which
seems contrary to the design of autopkgtest. I would have expected an
autopkgtest for cross-toolchain-base to be something more like this:

- install linux-libc-dev-arm64-cross, libc6-arm64-cross, etc.
- use them to compile and link a "hello, world" arm64 executable
- (optionally) use qemu or something to run it

where the choice of arm64 is arbitrary, but should ideally not match
the architecture of the CI infrastructure.

smcv



Re: RFC: use of shlib bump for libc dependency on new multiarch directories?

2011-02-23 Thread Simon McVittie
On Wed, 23 Feb 2011 at 13:52:35 -0800, Steve Langasek wrote:
> we almost certainly will not be using the path which has been enabled
> in glibc up to now, namely /lib/i486-linux-gnu.

I'd heard that, and was somewhat concerned about whether that'd block
multiarch for yet another release cycle; I'm glad to hear it isn't.

One possibility that occurred to me is adding a Pre-Depends on a new package
(multiarch-enabler, perhaps) which is arch:any and just contains this
file:

# /etc/ld.so.conf.d/x86-linux-glibc.conf
/lib/x86-linux-glibc
/usr/lib/x86-linux-glibc

Am I right in thinking that (probably only needed for the native architecture,
even) would be enough to bootstrap support for the multiarch paths in the
native architecture's linker far enough to perform the upgrade? A future
libc6 could even Replace it or something.

(It'd be a bit subtle by being transitively Essential, though.)

S

