On Sun, 09 Feb 2020 at 19:19:24 +0000, Simon McVittie wrote:
> On Sun, 09 Feb 2020 at 16:45:05 +0100, Mattia Rizzolo wrote:
> > I see glib2.0 is also failing in the r-b infra:
> > https://tests.reproducible-builds.org/debian/rb-pkg/unstable/amd64/glib2.0.html
> We could probably work around this in glib2.0 with a Build-Depends on
> libnss-myhostname | netbase, or the other way round.

I tried this, and no, that doesn't work; the situation is more subtle
than I thought, and not the fault of pbuilder's /etc/hosts.

localhost *does* resolve in the container. However, it only resolves
with certain options, and those options don't match all the options GIO
is going to use.

Specifically, GResolver is normally implemented by GThreadedResolver,
which uses getaddrinfo with socktype SOCK_STREAM, protocol IPPROTO_TCP,
flags AI_ADDRCONFIG, and a varying family: either AF_UNSPEC, AF_INET or
AF_INET6 depending on options. The "Happy Eyeballs" algorithm exercised
in this test carries out separate AF_INET and AF_INET6 name resolution,
so that it can make HTTP connections via IPv4 and IPv6 in parallel,
and take whichever works first.
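
As a rough illustration of the lookups described above, here is a minimal
sketch. It is not GLib's actual code: the helper names fill_stream_hints
and resolve_family are made up for this example, and the only thing taken
from the description is the combination of hints.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: fill in hints resembling the ones described above
 * (SOCK_STREAM, IPPROTO_TCP, AI_ADDRCONFIG) for one address family. */
static void
fill_stream_hints (struct addrinfo *hints, int family)
{
  assert (hints != NULL);
  memset (hints, 0, sizeof (*hints));
  hints->ai_family = family;
  hints->ai_socktype = SOCK_STREAM;
  hints->ai_protocol = IPPROTO_TCP;
  hints->ai_flags = AI_ADDRCONFIG;
}

/* Hypothetical helper: one per-family lookup, as a Happy Eyeballs style
 * caller would issue separately for AF_INET and AF_INET6.
 * Returns getaddrinfo()'s result code; on success the caller owns *out
 * and must release it with freeaddrinfo(). */
static int
resolve_family (const char *name, int family, struct addrinfo **out)
{
  struct addrinfo hints;

  fill_stream_hints (&hints, family);
  return getaddrinfo (name, NULL, &hints, out);
}
```

A caller would invoke resolve_family once with AF_INET and once with
AF_INET6, and each call can fail independently of the other.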

Unfortunately, AI_ADDRCONFIG is documented like this (my emphasis):

     If  hints.ai_flags includes the AI_ADDRCONFIG flag, then IPv4 addresses
     are returned in the list pointed to by res only if the local system has
     at  least  one IPv4 address configured, and IPv6 addresses are returned
     only if the local system has at least one IPv6 address configured. **The
     loopback  address is not considered for this case as valid as a
     configured address.**

and pbuilder's network namespace only has loopback addresses. So we
would expect resolving "localhost" to always fail in that namespace with
AI_ADDRCONFIG, which I would have expected to affect more packages than
just GLib - but that doesn't happen, due to #854301.
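
To check whether a given namespace is in the "loopback only" situation
that the documentation describes, something like the following sketch
could be used. It is only an approximation of the documented rule, not a
copy of glibc's logic: the helper names are invented for this example, and
treating all of 127.0.0.0/8 plus ::1 as "loopback" is my assumption.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: does this address count as "configured" for the
 * given family, assuming 127.0.0.0/8 and ::1 are the loopback addresses
 * to be ignored? Returns 1 if it counts, 0 otherwise. */
static int
counts_as_configured (const struct sockaddr *sa, int family)
{
  assert (family == AF_INET || family == AF_INET6);

  if (sa == NULL || sa->sa_family != family)
    return 0;

  if (family == AF_INET)
    {
      const struct sockaddr_in *sin = (const struct sockaddr_in *) sa;

      return (ntohl (sin->sin_addr.s_addr) >> 24) != 127;
    }
  else
    {
      const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *) sa;

      return !IN6_IS_ADDR_LOOPBACK (&sin6->sin6_addr);
    }
}

/* Returns 1 if any interface has a non-loopback address of the given
 * family, 0 if none does, -1 if the interface list cannot be read. */
static int
has_configured_address (int family)
{
  struct ifaddrs *list = NULL;
  const struct ifaddrs *i;
  int found = 0;

  if (getifaddrs (&list) != 0)
    return -1;

  for (i = list; i != NULL && !found; i = i->ifa_next)
    found = counts_as_configured (i->ifa_addr, family);

  freeifaddrs (list);
  return found;
}
```

In a namespace that only has loopback addresses I would expect
has_configured_address to return 0 for both AF_INET and AF_INET6.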

To debug this I hacked the attached program into a package built in
pbuilder (GLib is inconveniently large, so I added the program to procenv
instead). You can get similar (but not identical!) results without pbuilder
by compiling the program, installing bwrap and using:

    bwrap --unshare-net --dev-bind / / ./getaddrinfo

By experiment, what actually happens is:

no hints (which in glibc means AF_UNSPEC and AI_ADDRCONFIG|AI_V4MAPPED):
    success, return 127.0.0.1 (only, I don't get ::1 for some reason)
AF_INET:
    if AI_ADDRCONFIG: fails with -2 "Name or service not known"
    else: success, return 127.0.0.1
AF_INET6:
    if AI_ADDRCONFIG: fails with -2 "Name or service not known"
    else (pbuilder): fails with -3 "Temporary failure in name resolution"
    else (bwrap): success, return ::1
AF_UNSPEC:
    success, return 127.0.0.1 (even if AI_ADDRCONFIG is set)

Things I don't understand here:

- Why does (AF_UNSPEC, AI_ADDRCONFIG) succeed? Its documentation suggests
  that it would fail the same way as AF_INET and AF_INET6.
  (This has been reported as a bug before, in #854301.)
- Why does (AF_INET6, not AI_ADDRCONFIG) fail in pbuilder? /etc/hosts lists
  both 127.0.0.1 and ::1 as addresses of localhost, so I would expect
  that to work.

The good news is that GLib 2.63.x should fix this, because GLib 2.63.x
hard-codes "localhost" to resolve to 127.0.0.1 and/or ::1 (depending
on the requested address family).

However, I think it's likely to be a recurring problem that unit tests
for network software try to connect to "localhost", use AI_ADDRCONFIG
because it is usually the right thing to do for Internet names, and find
that they cannot resolve that name - particularly if glibc changes its
behaviour to match its documentation (fixing #854301).

Possible solutions:

- In pbuilder's network namespace, assign a useless non-loopback
  address, so that AI_ADDRCONFIG thinks we have
  basic IPv4 connectivity and will resolve localhost to 127.0.0.1
- Implement "let localhost be localhost" in either glibc, or everything
  that does name resolution, or both
  (<https://gitlab.gnome.org/GNOME/glib/-/merge_requests/616> in GIO,
  also implemented in Firefox and Chromium)
- Implement a special case that disables AI_ADDRCONFIG when looking up
  localhost in either glibc, or everything that does name resolution,
  or both
  (Mozilla does this, and Firefox still does)
- Make tests that require resolving localhost skip themselves if it
  doesn't resolve. I think this is potentially undesirable because if
  sbuild starts to do the same no-network trick as pbuilder, it would
  effectively reduce our test coverage from every architecture down to
  the 2 architectures where we have autopkgtest (amd64 and arm64).
- Don't test anything involving name resolution (even of localhost) at
  build-time, only in autopkgtest. I think this is undesirable because,
  again, it would reduce our test coverage from every architecture down
  to 2 architectures (amd64 and arm64).

See also:
- https://sourceware.org/bugzilla/show_bug.cgi?id=12377
- https://github.com/zeromq/libzmq/issues/42
- https://fedoraproject.org/wiki/QA/Networking/NameResolution/ADDRCONFIG

/* getaddrinfo.c: print what getaddrinfo() returns for a name (default
 * "localhost") under various combinations of hints. */
#define _GNU_SOURCE
#include <errno.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

typedef struct
{
  int value;
  const char *name;
} NamedInt;
#define ITEM(x) { x, #x }

/* The entries of the four tables below were missing from this copy of
 * the program; they are filled in with the usual constants, which is an
 * assumption about what the original listed. */
static const NamedInt ai_flags[] = {
  ITEM (AI_ADDRCONFIG),
  ITEM (AI_ALL),
  ITEM (AI_CANONNAME),
  ITEM (AI_NUMERICHOST),
  ITEM (AI_NUMERICSERV),
  ITEM (AI_PASSIVE),
  ITEM (AI_V4MAPPED),
  { 0, NULL }
};

static const NamedInt families[] = {
  ITEM (AF_INET),
  ITEM (AF_INET6),
  ITEM (AF_UNSPEC),
  { 0, NULL }
};

static const NamedInt protocols[] = {
  ITEM (IPPROTO_TCP),
  ITEM (IPPROTO_UDP),
  { 0, NULL }
};

static const NamedInt socktypes[] = {
  ITEM (SOCK_STREAM),
  ITEM (SOCK_DGRAM),
  ITEM (SOCK_RAW),
  { 0, NULL }
};

/* Print the name of each known flag that is set in flags, one per line */
static void
print_flags (const char *indent,
             int flags)
{
  int i;

  for (i = 0; ai_flags[i].name != NULL; i++)
    {
      if (flags & ai_flags[i].value)
        printf ("%s%s (0x%x)\n", indent, ai_flags[i].name, ai_flags[i].value);
    }
}

/* Look up the symbolic name for value in a NULL-terminated table */
static const char *
describe (int value,
          const NamedInt *names)
{
  int i;

  for (i = 0; names[i].name != NULL; i++)
    {
      if (value == names[i].value)
        return names[i].name;
    }

  return "(unknown)";
}

/* Resolve name with the given hints (may be NULL) and print the result */
static void
try_getaddrinfo (const char *name,
                 const struct addrinfo *hints)
{
  struct addrinfo *addrs = NULL;
  const struct addrinfo *a;
  int res;
  int saved_errno;

  printf ("==== trying getaddrinfo %s ====\n", name);

  if (hints == NULL)
    {
      printf ("\tno hints\n");
    }
  else
    {
      printf ("hints:\n");
      printf ("\tai_flags: 0x%x\n", hints->ai_flags);
      print_flags ("\t\t", hints->ai_flags);
      printf ("\tai_family: %d %s\n", hints->ai_family, describe (hints->ai_family, families));
      printf ("\tai_socktype: %d %s\n", hints->ai_socktype, describe (hints->ai_socktype, socktypes));
      printf ("\tai_protocol: %d %s\n", hints->ai_protocol, describe (hints->ai_protocol, protocols));
    }

  errno = 0;
  res = getaddrinfo (name, NULL, hints, &addrs);
  saved_errno = errno;

  if (res != 0)
    {
      printf ("result %d: %s (errno %d: %s)\n",
              res, gai_strerror (res),
              saved_errno, strerror (saved_errno));
      printf ("\n");
      /* Nothing to print or free on failure */
      return;
    }

  printf ("results:\n");

  for (a = addrs; a != NULL; a = a->ai_next)
    {
      char host[1024];

      res = getnameinfo (a->ai_addr, a->ai_addrlen, host, sizeof (host),
                         NULL, 0, NI_NUMERICHOST);
      if (res != 0)
        printf ("\tai_addr: (getnameinfo failed: %d %s)\n",
                res, gai_strerror (res));
      else
        printf ("\tai_addr: %s\n", host);

      printf ("\tai_flags: 0x%x\n", a->ai_flags);
      print_flags ("\t\t", a->ai_flags);
      printf ("\tai_family: %d %s\n", a->ai_family, describe (a->ai_family, families));
      printf ("\tai_socktype: %d %s\n", a->ai_socktype, describe (a->ai_socktype, socktypes));
      printf ("\tai_protocol: %d %s\n", a->ai_protocol, describe (a->ai_protocol, protocols));
      printf ("\tai_addrlen: %u\n", (unsigned int) a->ai_addrlen);
      /* ai_canonname is NULL unless AI_CANONNAME was requested */
      printf ("\tai_canonname: %s\n",
              a->ai_canonname != NULL ? a->ai_canonname : "(null)");

      printf ("\n");
    }

  freeaddrinfo (addrs);
}

#define N_ELEMENTS(arr) (sizeof (arr) / sizeof ((arr)[0]))

int
main (int argc, char *argv[])
{
  struct addrinfo hints = { 0 };
  /* Renamed so that it does not shadow the NamedInt table above */
  int families_to_try[] = { AF_INET, AF_INET6, AF_UNSPEC };
  size_t i;
  int j, k;
  const char *name;

  if (argc > 1)
    name = argv[1];
  else
    name = "localhost";

  try_getaddrinfo (name, NULL);

  for (i = 0; i < N_ELEMENTS (families_to_try); i++)
    {
      hints.ai_family = families_to_try[i];

      /* j: without, then with AI_ADDRCONFIG */
      for (j = 0; j < 2; j++)
        {
          if (j)
            hints.ai_flags |= AI_ADDRCONFIG;
          else
            hints.ai_flags &= ~AI_ADDRCONFIG;

          /* k: unspecified socket type, then TCP stream */
          for (k = 0; k < 2; k++)
            {
              if (k)
                {
                  hints.ai_protocol = IPPROTO_TCP;
                  hints.ai_socktype = SOCK_STREAM;
                }
              else
                {
                  hints.ai_protocol = 0;
                  hints.ai_socktype = 0;
                }

              if (j && k)
                printf ("(this next one is what GIO actually does)\n");

              try_getaddrinfo (name, &hints);
            }
        }
    }

  return 0;
}
