Well, this one required GDB to tell what was going on, but I managed to find a workaround.

> Short version: For recent libc releases, and at least on Gentoo,
> getgrent() and endgrent() seem to magically set errno to EINVAL (I
> think),
> [...]
> s6-envuidgid [...] fails with a strange "invalid argument" error
> whenever it tries to set GIDLIST. s6-setuidgid invokes s6-envuidgid,
> so it also fails.

For those who don't know, in GNU libc, getgrent(3) and endgrent(3) are implemented by __nss_getent_r() and __nss_endent(), respectively, which are part of the Name Service Switch (NSS) mechanism. You know, the one that Laurent's nsss project is a replacement of. My /etc/nsswitch.conf, which I don't recall having ever modified, says:

    group: db files

With this configuration, the first time __nss_getent_r() is called, it tries to call the implementation of setgrent(3) from each of these services, "db", implemented by module /lib64/libnss_db.so.2, and "files", implemented by module /lib64/libnss_files.so.2. The first one is _nss_db_setgrent(), which tries to open /var/db/group.db, fails because on my machine that file does not exist, and returns an 'unavailable' status (NSS_STATUS_UNAVAIL). The second one is _nss_files_setgrent(), which tries to open /etc/group, succeeds, and returns a 'successful' status (NSS_STATUS_SUCCESS). From then on, __nss_getent_r() always calls the implementation of getgrent() from libnss_files.so, named _nss_files_getgrent_r().

Relevant output of an strace of the test program in my OP:

    openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
    openat(AT_FDCWD, "/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
    openat(AT_FDCWD, "/lib64/libnss_db.so.2", O_RDONLY|O_CLOEXEC) = 3
    openat(AT_FDCWD, "/lib64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 3
    openat(AT_FDCWD, "/var/db/group.db", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    openat(AT_FDCWD, "/etc/group", O_RDONLY|O_CLOEXEC) = 3

But it turns out that, with this configuration, __nss_endent() *also* wants to call the implementation of endgrent(3) from each of these services.
And the one from libnss_db.so, named _nss_db_endgrent(), is just a wrapper around a munmap(2) system call, via an intermediate internal_endent() function:

* https://sourceware.org/git/?p=glibc.git;a=blob;f=nss/nss_db/db-open.c;h=8a83d6b9302b39a071d0ddca5ab686e6ecfd6178;hb=56c86f5dd516284558e106d04b92875d5b623b7a

In my case, this results in a 'munmap(NULL, 0)' call that… you guessed it, fails with EINVAL (remember that _nss_db_setgrent() said the service was unavailable?). And strace happens to see it:

    write(1, "End of file or error (errno = Su"..., 39) = 39
    munmap(NULL, 0) = -1 EINVAL (Invalid argument)
    close(3) = 0
    write(1, "errno = Invalid argument\n", 25) = 25

The implementation from libnss_files.so, _nss_files_endgrent(), is also called, and succeeds, but errno is already set.

So, the workaround? I removed the "db" service for the group database in /etc/nsswitch.conf:

    group: files

With this change, the output of the test program looks exactly like Casper's and Brett's, strace no longer shows openat() calls for /lib64/libnss_db.so.2 and /var/db/group.db, and both s6-setuidgid and s6-envuidgid work again.

I have no idea what changed, why this used to work before my upgrade of the libc, or why it apparently never failed for anyone else not on Gentoo.

G.
