Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-07-07 Thread Guillermo
On Sun, 2 Jun 2019 at 5:53, Laurent Bercot wrote:
>
> So I'd advise reporting the bug to the glibc maintainers.

By the way, Gentoo's toolchain maintainers agreed, so this became
upstream bug #24696. A fix, currently in its 8th iteration after
several rounds of review, is in preparation, so it should eventually be
committed to the repository. Hopefully :)

https://sourceware.org/ml/libc-alpha/2019-06/msg00957.html

Meanwhile, I also discovered why I didn't have any problems with
Gentoo's packaging of version 2.27, and did with version 2.29. It
turns out Gentoo did supply an nsswitch.conf file, but it was buried
inside the archive that contained Gentoo's patchset, so one had to
extract all files to discover it. This file said:

group: compat files

So there was no 'db' service configured for the group database, and
the endgrent() bug, which Gentoo maintainers traced back to at least
GNU libc 2.26, was not exposed. As of version 2.28, this file was no
longer supplied with the patchset, and upstream's example one was
installed instead.
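
For anyone who wants to check their own configuration programmatically, here is a minimal sketch in C that tests whether a single nsswitch.conf line lists a given service for a given database. The function name nss_line_has_service() is made up for this example; it is not part of glibc or s6, and it treats bracketed action items such as [NOTFOUND=return] as ordinary tokens.

```c
#include <string.h>

/* Return 1 if `line` is an nsswitch.conf entry for `database` that lists
 * `service`, 0 otherwise. Hypothetical helper, not part of glibc or s6. */
int nss_line_has_service(const char *line, const char *database,
                         const char *service)
{
    size_t dblen = strlen(database);
    size_t svclen = strlen(service);

    /* Skip leading whitespace, then expect "<database>:". */
    line += strspn(line, " \t");
    if (strncmp(line, database, dblen) != 0 || line[dblen] != ':')
        return 0;
    line += dblen + 1;

    /* Walk the whitespace-separated service names after the colon. */
    for (;;) {
        size_t toklen;

        line += strspn(line, " \t");
        toklen = strcspn(line, " \t\n#");   /* stop at comment or line end */
        if (toklen == 0)
            return 0;
        if (toklen == svclen && strncmp(line, service, svclen) == 0)
            return 1;
        line += toklen;
    }
}
```

With the two lines mentioned in this thread, "group: db files" reports the 'db' service as configured and "group: compat files" does not.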

G.


Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-11 Thread Brett Neumeier
On Mon, Jun 10, 2019 at 9:10 PM Guillermo wrote:

> On Mon, 10 Jun 2019 at 4:13, Casper Ti. Vector wrote:
> > > /etc/nsswitch.conf, which I don't recall having ever modified, says:
> > >   group: db files
> /etc/nsswitch.conf is 'owned' by sys-libs/glibc, and Gentoo's default
> comes directly from the libc's source package:
>

For what it's worth, on my system -- which was built using glibc 2.29 with
branch updates through 2019-05-03 and otherwise unmodified -- the
/etc/nsswitch.conf file does not contain any "db" references at all. Full
text is:

passwd: files
group: files
shadow: files
hosts: files dns
networks: files
protocols: files
services: files
ethers: files
rpc: files

Presumably, it was installed this way because at the time glibc was
compiled there was a minimal set of packages available. I haven't looked at
the source carefully enough to determine whether that's actually the case,
but it seems a reasonable conjecture.

Anyway, I bet that's why I don't see the same error.

-- 
Brett Neumeier (bneume...@gmail.com)


Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-11 Thread Peter Pentchev
On Tue, Jun 11, 2019 at 04:49:04PM +0800, Casper Ti. Vector wrote:
> On Tue, Jun 11, 2019 at 11:20:13AM +0300, Peter Pentchev wrote:
> > Neither the glibc manual page nor the POSIX text say anything about
> > endgrent() setting errno or leaving it alone,
> 
> Well, it seems that the current POSIX text says otherwise:
> 

Okay; that's the POSIX text that I referred to, but I only looked
at the "return value" section and I didn't remember that it was
already referenced in this thread. Sorry for the confusion!

G'luck,
Peter

-- 
Peter Pentchev  roam@{ringlet.net,debian.org,FreeBSD.org} p...@storpool.com
PGP key:http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint 2EE7 A7A5 17FC 124C F115  C354 651E EFB0 2527 DF13



Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-10 Thread Casper Ti. Vector
On Mon, Jun 10, 2019 at 11:10:13PM -0300, Guillermo wrote:
> /etc/nsswitch.conf is 'owned' by sys-libs/glibc, and Gentoo's default
> comes directly from the libc's source package:

So Gentoo follows the upstream more closely, and the upstream default
would now result in the problem.  I think this is definitely worth a
glibc bug report.

> This is interesting. It hints at the problem really being in the
> upstream package. And you said that you added the 'db' service, so I
> take it that it wasn't there by default. Is this Void's current
> default /etc/nsswitch.conf?

Yes, precisely.

-- 
My current OpenPGP key:
RSA4096/0x227E8CAAB7AA186C (expires: 2020.10.19)
7077 7781 B859 5166 AE07 0286 227E 8CAA B7AA 186C



Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-10 Thread Guillermo
On Mon, 10 Jun 2019 at 4:13, Casper Ti. Vector wrote:
>
> > /etc/nsswitch.conf, which I don't recall having ever modified, says:
> >   group: db files
>
> Try using `qfile -o' to find the owner, and subsequently how it should
> originally have been?  (I used Gentoo for several years before migrating
> to Alpine/Void two or three years ago, which is why I still lurk on its
> forums.)

/etc/nsswitch.conf is 'owned' by sys-libs/glibc, and Gentoo's default
comes directly from the libc's source package:

* https://gitweb.gentoo.org/repo/gentoo.git/tree/sys-libs/glibc/glibc-2.29-r2.ebuild#n1281
* https://sourceware.org/git/?p=glibc.git;a=blob;f=nss/nsswitch.conf;h=39ca88bf5198df2bfa8f4a2e4bf631f3baee16c0;hb=56c86f5dd516284558e106d04b92875d5b623b7a

> > I have no idea what changed, why this used to work before my upgrade
> > of the libc, or why it apparently never failed for anyone else not on
> > Gentoo.
>
> You are correct: the issue can be reproduced on my void/glibc system if
> `db' is added (whether prepended or appended) to the `group:' line in
> /etc/nsswitch.conf.  (The /etc/nsswitch.conf is the distro-default for
> glibc/x86_64 systems, unchanged on my system.)

This is interesting. It hints at the problem really being in the
upstream package. And you said that you added the 'db' service, so I
take it that it wasn't there by default. Is this Void's current
default /etc/nsswitch.conf?

* https://github.com/void-linux/void-packages/blob/6c9706db3f2034677057ab1e70ce59fd06134ea3/srcpkgs/base-files/files/nsswitch.conf

If yes, it means that the 'db' service isn't configured for any
database at all, and would explain Void's 'immunity' to this problem.

G.


Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-10 Thread Casper Ti. Vector
On Sun, Jun 09, 2019 at 05:28:49PM -0300, Guillermo wrote:
> /etc/nsswitch.conf, which I don't recall having ever modified, says:
>   group: db files

Try using `qfile -o' to find the owner, and subsequently how it should
originally have been?  (I used Gentoo for several years before migrating
to Alpine/Void two or three years ago, which is why I still lurk on its
forums.)

> I have no idea what changed, why this used to work before my upgrade
> of the libc, or why it apparently never failed for anyone else not on
> Gentoo.

You are correct: the issue can be reproduced on my void/glibc system if
`db' is added (whether prepended or appended) to the `group:' line in
/etc/nsswitch.conf.  (The /etc/nsswitch.conf is the distro-default for
glibc/x86_64 systems, unchanged on my system.)

-- 
My current OpenPGP key:
RSA4096/0x227E8CAAB7AA186C (expires: 2020.10.19)
7077 7781 B859 5166 AE07 0286 227E 8CAA B7AA 186C



Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-09 Thread Laurent Bercot

> But it turns out that, with this configuration, __nss_endent() *also*
> wants to call the implementation of endgrent(3) from each of these
> services. And the one from libnss_db.so, named _nss_db_endgrent(), is
> just a wrapper around a munmap(2) system call, via an intermediate
> internal_endent() function:
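
To illustrate how a cleanup routine of that shape could clobber errno, here is a minimal sketch in C. It is not the glibc source: struct sketch_mapping and both function names are invented for the example, and it assumes the failure happens because munmap(2) is called although nothing was ever mapped. Whether that is the exact code path in libnss_db is not confirmed here.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-in for a service's mapping state. */
struct sketch_mapping {
    void  *addr;   /* NULL when the database file was never mapped */
    size_t len;    /* 0 when the database file was never mapped */
};

/* Unguarded teardown: always calls munmap(2), even with nothing mapped.
 * With len == 0 the call fails and leaves EINVAL in errno. */
void sketch_endent_unguarded(struct sketch_mapping *m)
{
    munmap(m->addr, m->len);
    m->addr = NULL;
    m->len = 0;
}

/* Guarded teardown: skips the system call when nothing is mapped,
 * so errno is left alone in that case. */
void sketch_endent_guarded(struct sketch_mapping *m)
{
    if (m->addr != NULL) {
        munmap(m->addr, m->len);
        m->addr = NULL;
        m->len = 0;
    }
}
```

Calling the unguarded variant on an empty mapping with errno set to 0 beforehand leaves EINVAL behind, which is the kind of side effect a caller of endgrent(3) would then observe; the guarded variant leaves errno at 0.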


Impressive investigative work, congratulations on having found
the root of the problem! :)



> I have no idea what changed, why this used to work before my upgrade
> of the libc, or why it apparently never failed for anyone else not on
> Gentoo.


Let this be a reminder for everyone that complexity always, always
has unintended consequences...

--
Laurent


Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-03 Thread Laurent Bercot




> Adelie (musl-posix) must be getting really close to replacing
> sysv/openrc with full s6.


 sysv is done: on Adélie, you can now choose between sysvinit and
s6-linux-init as your init system. Both are supported.
 Providing a packaged alternative to OpenRC, however, will be much
more difficult and time-consuming. It will have to wait until I can
free a year, which should be slowly approaching.

--
 Laurent



Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-03 Thread Casper Ti. Vector
On Mon, Jun 03, 2019 at 08:36:00AM +, fungal-net wrote:
> Void (both glibc and musl) has had the whole s6 suite and skarnet libs
> in its repositories for all (xx) architectures.  s6-rc doesn't work out
> of the box though (tried and tried, I don't know enough to get it
> working).  I can install an arch kernel and make a bootable image by
> installing Obarun's s6-rc into void (glibc) and then 66.  Obarun's 66 is
> also in Void repositories, current and identical, I believe.

I use slew (developed by myself)
on both Void (both glibc and musl) and Alpine (quite a few machines),
and all run smoothly.  What is your specific issue with s6-rc?  (Perhaps
this is more suitable for the supervision mailing list.)

-- 
My current OpenPGP key:
RSA4096/0x227E8CAAB7AA186C (expires: 2020.10.19)
7077 7781 B859 5166 AE07 0286 227E 8CAA B7AA 186C



Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-03 Thread fungal-net
Brett Neumeier:
> On Sun, Jun 2, 2019 at 10:02 AM Guillermo wrote:
> 
>> On Sun, 2 Jun 2019 at 3:27, Casper Ti. Vector wrote:
>>>
>>> On my machine using Void with glibc 2.29 since 20190305, I never
>>> encountered this issue.
>>
>> Yay! I thought chances of hearing from someone who uses a GNU
>> libc-based distribution that is not Gentoo, with a sufficiently recent
>> version (which usually means it's rolling release), and who is also a
>> subscriber of this list, were rather slim :)
> 
> I am another -- using a completely built-from-source libc-based system.
> With glibc 2.29 and branch updates through 2019-05-03, and s6 2.8.0.0 with
> branch updates through 2019-03-05, I also have no issues.
>>
>> Do you happen to build skarnet.org packages statically linked to musl
>> on those Void machines, or do you let them link to the distribution's
>> libc?

Void (both glibc and musl) has had the whole s6 suite and skarnet libs
in its repositories for all (xx) architectures.  s6-rc doesn't work out
of the box though (tried and tried, I don't know enough to get it
working).  I can install an arch kernel and make a bootable image by
installing Obarun's s6-rc into void (glibc) and then 66.  Obarun's 66 is
also in Void repositories, current and identical, I believe.

Adelie (musl-posix) must be getting really close to replacing
sysv/openrc with full s6.



Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-02 Thread Guillermo
On Sun, 2 Jun 2019 at 12:21, Brett Neumeier wrote:
>
> FWIW, I compiled and ran your test program; the program output concludes with:
>
> name: ec2user members: (errno = Invalid argument)
> End of file or error (errno = Success)
> errno = Success

Huh. So it looks like I've got a combination of upstream and
Gentoo-specific problems. Lucky me :(

> With glibc 2.29 and branch updates through 2019-05-03,

On the off chance that there was a regression that has since been
fixed, at least for endgrent(3), in some of those commits, I'm going to
have a look, and also check what Void does. If I find nothing,
then the next stop for me is Gentoo's bug tracker, I guess.

Many thanks to you and Casper for the testing.
G.


Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-02 Thread Casper Ti. Vector
On Sun, Jun 02, 2019 at 12:02:04PM -0300, Guillermo wrote:
> Do you happen to build skarnet.org packages statically linked to musl
> on those Void machines, or do you let them link to the distribution's
> libc?

On both Alpine and Void, I use the stock packages for skaware from the
distros, and both use dynamically linked binaries.

> Yeah, what triggers s6-envuidgid's failure here is that endgrent() is
> setting errno to the weird EINVAL value, the program checks errno
> *after* the call, and thinks it was caused by a failing getgrent()
> call. Would it be too much to ask you to also check whether
> endgrent(3) flips errno from 0 to EINVAL?

Actually it is trivial :)  Just compiled your test code and the runtime
result is:
> name: root members: (errno = Invalid argument)
> name: kmem members: (errno = Invalid argument)
> [...]
> End of file or error (errno = Success)
> errno = Success
So on my system endgrent(3) does not change errno from 0 to EINVAL.
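
For readers who want to repeat the check, here is a minimal sketch in C of such a probe. It is not the test program referred to in this thread (that one is not reproduced here); probe_endgrent_errno() is a name invented for the example. It walks the whole group database, sets errno to 0, calls endgrent(3), and returns whatever errno holds afterwards.

```c
#include <errno.h>
#include <grp.h>
#include <stddef.h>

/* Enumerate the whole group database, then report what endgrent(3)
 * leaves in errno when it is called with errno == 0.
 * Returns 0 if errno was left untouched, otherwise the errno value found.
 * If groups_seen is not NULL, it receives the number of entries read. */
int probe_endgrent_errno(size_t *groups_seen)
{
    size_t n = 0;

    setgrent();
    while (getgrent() != NULL)
        n++;
    if (groups_seen != NULL)
        *groups_seen = n;

    errno = 0;      /* start from a known state */
    endgrent();
    return errno;   /* stays 0 if endgrent() did not touch errno */
}
```

A caller can pass the result to strerror(3) to get the same kind of "errno = Success" or "errno = Invalid argument" line quoted above.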

-- 
My current OpenPGP key:
RSA4096/0x227E8CAAB7AA186C (expires: 2020.10.19)
7077 7781 B859 5166 AE07 0286 227E 8CAA B7AA 186C



Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-02 Thread Laurent Bercot

> Short version: For recent libc releases, and at least on Gentoo,
> getgrent() and endgrent() seem to magically set errno to EINVAL (I
> think), except when errno's value is actually meaningful.
>
> (...)
>
> End of file or error (errno = Success)
> errno = Invalid argument

POSIX says:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/endgrent.html

"The setgrent() and endgrent() functions shall not change the setting of
errno if successful."

So I'd advise reporting the bug to the glibc maintainers.
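
Until a fixed libc is installed, a caller that needs errno to survive the cleanup can save and restore it itself. The following is only a sketch of that idea in C; endgrent_keep_errno() is a hypothetical name, and nothing here claims that s6 does or should do this.

```c
#include <errno.h>
#include <grp.h>

/* Call endgrent(3) but leave errno exactly as the caller had it,
 * regardless of what the libc implementation does to it. */
void endgrent_keep_errno(void)
{
    int saved = errno;

    endgrent();
    errno = saved;
}
```

On a conforming libc this wrapper changes nothing; on an affected one it hides the stray EINVAL from code that inspects errno after the call.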

--
Laurent


Re: s6-envuidgid: Weird errors with GNU libc's getgrent() and endgrent()

2019-06-02 Thread Casper Ti. Vector
On Sat, Jun 01, 2019 at 11:55:57PM -0300, Guillermo wrote:
> Sooo... thoughts? Does anyone else use a sufficiently recent version
> of GNU libc and experience the same?

On my machine using Void with glibc 2.29 since 20190305, I never
encountered this issue.  I can confirm the behaviour you described of
getgrent(3), but it seems that prot_readgroups() (the only place where
getgrent(3) is called in s6-envuidgid.c) always calls endgrent(3) only
after getgrent(3) returns NULL, whether on error or upon exhaustion of
the entries; the `if (n >= max) break;' line would not be triggered
under normal circumstances because `max' is set to `NGROUPS_MAX' by
main().  Therefore, at least on my system, endgrent(3) is always called
with `errno' set to zero.
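
To make the ordering concrete, here is a minimal sketch in C of an enumeration loop that reads errno right after the enumerator returns NULL and before the cleanup call, so it does not depend on the cleanup preserving errno. This is not the code of prot_readgroups(); scan_groups() and the two fake_* helpers are invented for the example, and the function pointers exist only so the behaviour can be exercised without touching the real group database.

```c
#include <errno.h>
#include <grp.h>
#include <stddef.h>

typedef struct group *(*next_group_fn)(void);
typedef void (*end_group_fn)(void);

/* Count entries. Returns 0 on clean exhaustion, or the errno value the
 * enumerator left behind when it returned NULL. */
int scan_groups(next_group_fn next, end_group_fn end, size_t *count)
{
    int err;
    size_t n = 0;

    for (;;) {
        errno = 0;      /* so that NULL with errno == 0 means "no more" */
        if (next() == NULL)
            break;
        n++;
    }
    err = errno;        /* capture BEFORE the cleanup call */
    end();              /* may clobber errno; the result no longer depends on it */
    *count = n;
    return err;
}

/* Test doubles: an empty database, and a cleanup that sets EINVAL. */
struct group *fake_next_empty(void) { return NULL; }
void fake_end_clobbers_errno(void) { errno = EINVAL; }
```

With the real functions the call would be scan_groups(getgrent, endgrent, &n). With the two fakes, the function still reports a clean end of enumeration even though the cleanup leaves EINVAL in errno.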

BTW, gcc with `-Wsign-compare' complains about the operand(s) of `?' on
the last line of prot_readgroups(); I do not know what Laurent thinks
about this.

-- 
My current OpenPGP key:
RSA4096/0x227E8CAAB7AA186C (expires: 2020.10.19)
7077 7781 B859 5166 AE07 0286 227E 8CAA B7AA 186C