time64 ABI fix coming to upstream glibc

2024-05-02 Thread Florian Weimer
The  and  headers had a bug that the on-disk structures
defined there could change size on some targets when _TIME_BITS was set
to 64.  This is obviously wrong because the files are not going to
magically change their layout because the application accessing them was
built in a specific way.  We're going to fix this in glibc upstream on
the stable release branches, going all the way back to glibc 2.34 (the
first release with this kind of time64 support).  After the fix, the
_TIME_BITS definition will no longer impact struct layout.  Usually,
that means epoch fields are 32 bits wide, to match co-installable
architectures.

To extend the usable life-time of these interfaces somewhat, glibc 2.40
changes epoch fields to unsigned types in these structures.  This change
is specific to the upcoming glibc 2.40 release; I do not plan to
backport it.
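
As a rough illustration of what the unsigned change buys, here is a
minimal Python sketch.  It is not glibc code: it only models a 4-byte
little-endian epoch field (the "<i" and "<I" struct formats are stand-ins
chosen for this example) and shows that the field width stays the same
while the last representable time moves from 2038 to 2106.

```python
import struct
from datetime import datetime, timezone


def last_representable(fmt):
    """Return the latest UTC time that fits in an epoch field with this struct format."""
    bits = struct.calcsize(fmt) * 8
    signed = fmt[-1].islower()  # "i" is signed, "I" is unsigned
    max_value = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    # Round-trip through the packed form to confirm the value really fits.
    (value,) = struct.unpack(fmt, struct.pack(fmt, max_value))
    return datetime.fromtimestamp(value, tz=timezone.utc)


SIGNED_LIMIT = last_representable("<i")
UNSIGNED_LIMIT = last_representable("<I")

print("field size:", struct.calcsize("<i"), "bytes in both cases")
print("signed 32-bit epoch field ends at:  ", SIGNED_LIMIT.isoformat())
print("unsigned 32-bit epoch field ends at:", UNSIGNED_LIMIT.isoformat())
```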

Thanks,
Florian



Re: static pie: confusion between _DYNAMIC, crt1.o, Scrt1.o

2022-10-24 Thread Florian Weimer
* Mike Frysinger via Libc-alpha:

> On 24 Oct 2022 13:12, Florian Weimer via Libc-alpha wrote:
>> * Samuel Thibault:
>> > On Mon, 24 Oct 2022 12:11:03 +0200, Florian Weimer wrote:
>> >> * Samuel Thibault:
>> >> 
>> >> > Is it not possible to make -static -pie get the same behavior? That'd be
>> >> > way more orthogonal for people to understand.
>> >> 
>> >> I think you want -static to mean -static-pie if GCC defaults to PIE,
>> >> right?
>> >
>> > That would actually provide the pie benefit automatically for all
>> > static executables, yes. Otherwise static pie will be a nice thing, but
>> > not actually largely used in practice. And most people won't actually
>> > realize it.
>> 
>> That's true.
>> 
>> Fedora uses a specs file fragment that turns -static into -static-pie
>> under certain conditions.
>> 
>> >> That will break a few things that use gcc -static to build binaries for
>> >> quasi-bare-metal targets using the GNU ELF toolchain (where glibc's
>> >> startup code is not used).
>> >
>> > But then the piece which is saying that glibc's startup code is not in
>> > use can be fixed into not using static-pie, can't it?
>> 
>> In theory, yes.  How hard it will be depends on the specs file change
>> for --enable-default-pie.
>
> i don't see a problem with -static DTRT.  people abusing a compiler for a
> target it wasn't designed for means they get the pieces.  it's not like
> they're using -static in the first place to pull in the C library & gcc
> internal libs (which also depend/assume the corresponding OS & C lib).
>
> plus, -static -no-pie would get you back to a non-PIE static binary.

The last part depends on the specs file; it has to be put there
explicitly, I think.  And perhaps -Wl,-no-pie as well?

Maybe also do -no-pie implicitly with -static -nostartfiles?

Thanks,
Florian



Re: static pie: confusion between _DYNAMIC, crt1.o, Scrt1.o

2022-10-24 Thread Florian Weimer
* Samuel Thibault:

> On Mon, 24 Oct 2022 12:11:03 +0200, Florian Weimer wrote:
>> * Samuel Thibault:
>> 
>> > Is it not possible to make -static -pie get the same behavior? That'd be
>> > way more orthogonal for people to understand.
>> 
>> I think you want -static to mean -static-pie if GCC defaults to PIE,
>> right?
>
> That would actually provide the pie benefit automatically for all
> static executables, yes. Otherwise static pie will be a nice thing, but
> not actually largely used in practice. And most people won't actually
> realize it.

That's true.

Fedora uses a specs file fragment that turns -static into -static-pie
under certain conditions.

>> That will break a few things that use gcc -static to build binaries for
>> quasi-bare-metal targets using the GNU ELF toolchain (where glibc's
>> startup code is not used).
>
> But then the piece which is saying that glibc's startup code is not in
> use can be fixed into not using static-pie, can't it?

In theory, yes.  How hard it will be depends on the specs file change
for --enable-default-pie.

Thanks,
Florian



Re: static pie: confusion between _DYNAMIC, crt1.o, Scrt1.o

2022-10-24 Thread Florian Weimer
* Samuel Thibault:

> Is it not possible to make -static -pie get the same behavior? That'd be
> way more orthogonal for people to understand.

I think you want -static to mean -static-pie if GCC defaults to PIE,
right?

That will break a few things that use gcc -static to build binaries for
quasi-bare-metal targets using the GNU ELF toolchain (where glibc's
startup code is not used).  Overall it might still be the better
trade-off.

Thanks,
Florian



Bug#1015719: libc6-dev: Build glibc with latest packaged kernel version

2022-07-25 Thread Florian Weimer
* Alejandro Colomar:

> Hi Florian!
>
> On 7/25/22 12:38, Florian Weimer wrote:
>> * Alejandro Colomar via Libc-alpha:
>> 
>>> Is there an easy way to regenerate that header to get the latest
>>> syscalls?  Maybe a command could be supplied so that users (or at
>>> least distributors) have it easy to regenerate them?  Maybe it already
>>> exists but it's not widely known?
>> I have recently backported the syscall-names.list updates to glibc 2.34,
>> but not glibc 2.33.  It's a simple backport.
>>
>> We could perhaps enhance the glibc build process so that it uses the union
>> of the known system call names and what's found in the kernel headers.
>
> I guess it's a simple backport, since it's just adding the macros (I
> guess 0 side effects).
>
> But maybe providing a script, e.g., update-libc-syscalls(1), that
> distributions and users can call when updating a kernel to immediately 
> backport syscalls to their system, would make it even simpler.
>
> E.g., when one runs `apt-get upgrade`, if the kernel is upgraded,
> update-libc-syscalls(1) would be called by apt-get as a post install 
> script, and libc macros would have the new syscall numbers provided by
> the new kernel.  No need to wait for glibc programmers to provide the
> backport.
>
> Makes sense?

Sure, that's a possibility.  We don't do this in Fedora because RPM does
not have delayed script execution, so it's hard to make sure everything
is installed properly when the processing script runs.

Thanks,
Florian



Bug#1015719: libc6-dev: Build glibc with latest packaged kernel version

2022-07-25 Thread Florian Weimer
* Alejandro Colomar via Libc-alpha:

> Is there an easy way to regenerate that header to get the latest
> syscalls?  Maybe a command could be supplied so that users (or at
> least distributors) have it easy to regenerate them?  Maybe it already
> exists but it's not widely known?

I have recently backported the syscall-names.list updates to glibc 2.34,
but not glibc 2.33.  It's a simple backport.

We could perhaps enhance the glibc build process so that it uses the union
of the known system call names and what's found in the kernel headers.
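
The "union" idea can be sketched in a few lines of Python.  This is only
an illustration, not the glibc build machinery: the function names, the
sample header text, and the short list of known names are made up for the
example, and the parser only recognizes plain "#define __NR_name value"
lines.

```python
import re

# Matches lines such as "#define __NR_read 0" and captures the name "read".
_NR_DEFINE = re.compile(r"^[ \t]*#[ \t]*define[ \t]+__NR_(\w+)[ \t]+\S", re.MULTILINE)


def syscall_names_from_header(header_text):
    """Collect system call names from __NR_* macro definitions in header text."""
    return set(_NR_DEFINE.findall(header_text))


def merged_syscall_names(known_names, header_text):
    """Return the sorted union of the built-in list and the names found in the header."""
    return sorted(set(known_names) | syscall_names_from_header(header_text))


# Hypothetical inputs: a built-in list that lacks futex_waitv, and a
# kernel header excerpt that has it.
KNOWN = ["openat", "read", "write"]
SAMPLE_HEADER = """\
#define __NR_read 0
#define __NR_write 1
#define __NR_futex_waitv 449
"""

print(merged_syscall_names(KNOWN, SAMPLE_HEADER))
```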

Thanks,
Florian



Bug#1004577: ldconfig -p coredumps

2022-02-01 Thread Florian Weimer
* Christoph Berg:

> Package: libc-bin
> Version: 2.33-3
> Severity: important
>
> In 
> https://salsa.debian.org/python-team/packages/python-telethon/-/jobs/2413916
> there is a diff generated between the two builds because a core file
> from `ldconfig -p` appears as /usr/lib/python3.10/dist-packages/core.

Is this the “FATAL: kernel too old” error?

We could remove this check from upstream, and just try to run code and
see how far we get.  I assume that these days, the check does more harm
than good.  People with pre-3.2 kernels (glibc's built-in baseline) will
likely run a heavily patched 2.6.32 kernel, and that should be *almost*
there.

Thanks,
Florian



Re: /usr/bin/ld.so as a symbolic link for the dynamic loader

2021-12-04 Thread Florian Weimer
* Aurelien Jarno:

> Hi,
>
> On 2021-12-02 19:51, Florian Weimer wrote:
>> I'd like to provide an ld.so command as part of glibc.  Today, ld.so can
>> be used to activate preloading, for example.  Compared to LD_PRELOAD,
>> the difference is that it's specific to one process, and won't be
>> inherited by subprocesses, which is sometimes exactly what is needed.
>> There is also some useful diagnostic output in --help,
>> --list-diagnostics.
>
> This sounds like a good idea.  I guess, for instance, that in the future
> the ldd feature could be implemented as an option.
>
> However, before exposing it directly to users, some work might be needed
> to improve error reporting. For instance, running the dynamic linker
> against a static binary like ldconfig just causes a segmentation fault
> (at least in 2.34; I haven't tried HEAD). I haven't tried binaries from
> other libcs either, but we should make sure that an error is
> correctly reported if users try that. At least a binary from a different
> architecture correctly reports the error.

I filed <https://sourceware.org/bugzilla/show_bug.cgi?id=28648> for that.

I have a patch that uses execve in those cases.  So “ld.so
/sbin/ldconfig” will work in the future.  This has some backwards
compatibility impact: In theory there might be objects which do not have
dynamic dependencies, but use a program interpreter, using custom
startup code.  Running those objects would lose dynamic loader
customization.

“ld.so /path/to/lib.so” crashes for completely different reasons: due to
a recently fixed binutils bug, shared objects have a fictitious entry
point set by BFD ld.  Due to the bug, the address is in a loadable,
executable segment (it's the start of the .text section), so we can't
detect this situation in the dynamic loader.  This will only go away
once shared objects are linked with a fixed ld.

>> Having ld.so as a real command makes the name architecture-agnostic.
>> This discourages from hard-coding non-portable paths such as
>> /lib64/ld-linux-x86-64.so.2 or even (the non-ABI-compliant)
>> /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 in scripts that require
>> specific functionality offered by such an explicit loader invocation.
>
> Do you actually have example of usage of the non-ABI-compliant dynamic
> loader in Debian? Independently of the current discussion, this should
> probably be fixed.

There are a few suspicious examples in source code:

  <http://codesearch.debian.net/search?q=%2Flib%2F.*-gnu%2Fld-linux=0>

I have not scanned ELF binaries in built packages.

>> I thought that commands with file extensions might be a Policy violation.
>> Policy actually talks about file extensions for programs installed in
>> /usr/bin—but only for scripts.  So it's technically okay.  And today,
>> there's already an ld.so manual page, although it's in section 8, not 1.
>> (I think /usr/bin is still appropriate because running ld.so does not
>> require special privileges.)
>> 
>> The initial implementation will be just a symbolic link.  This means
>> that multi-arch support will be missing: the amd64 loader will not be
>> able to redirect execution to the s390x loader.  In principle, it should
>> be possible to find PT_INTERP with a generic ELF parser and redirect to
>> that, but that's vaporware at present.  I don't know yet if it will be
>> possible to implement this without some knowledge of Debian's multi-arch
>> support in the loader.  Upstream doesn't have those features (we only
>> support /usr/lib vs /usr/lib64 and some minor variants of that), so
>> integration might be lacking.
>
> A simple symlink would work as a start, however it means creating a
> new non-multiarch package. If we want that feature to be available to
> all systems, we need a way to ensure this package is automatically
> installed.
>
> What is the plan about supporting /usr/lib vs /usr/lib64? If upstream goes
> the wrapper way, there might not be a lot of differences with what is
> needed to support multiarch. From what I understand a wrapper would need
> to have a basic understanding of the arguments passed to ld.so to find
> the binary that needs to be loaded. Looking at PT_INTERP is one way to
> go; however, we should define what the behaviour would be for non-GNU
> libc dynamic loaders.

There is not going to be a wrapper.  We will integrate the logic into
ld.so itself, teaching it to execve a different loader.  All ld.so's
will have this capability, so it won't matter which one ends up as
/usr/bin/ld.so, except for performance reasons.

Basically, if we detect an incompatible architecture, we will fetch
PT_INTERP from the executable and execve that, using mostly the original
dynamic linker command line.  At least that's my plan.
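
To make the "fetch PT_INTERP" step concrete, here is a minimal Python
sketch of a generic ELF parser that works across ELF classes and byte
orders.  It is an illustration only, not the planned ld.so code: the
function names are invented for this example, error handling is minimal,
and the helper that builds a tiny ELF64 image exists purely so that the
parser can be exercised without a real binary.

```python
import struct

PT_INTERP = 3  # program header type for the interpreter path


def read_pt_interp(image):
    """Return the PT_INTERP path stored in an ELF image, or None if there is none."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    is_64 = {1: False, 2: True}[image[4]]   # EI_CLASS: ELFCLASS32 or ELFCLASS64
    endian = {1: "<", 2: ">"}[image[5]]     # EI_DATA: little or big endian
    if is_64:
        (e_phoff,) = struct.unpack_from(endian + "Q", image, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", image, 0x36)
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", image, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", image, 0x2A)
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        (p_type,) = struct.unpack_from(endian + "I", image, base)
        if p_type != PT_INTERP:
            continue
        if is_64:
            (p_offset,) = struct.unpack_from(endian + "Q", image, base + 8)
            (p_filesz,) = struct.unpack_from(endian + "Q", image, base + 32)
        else:
            (p_offset,) = struct.unpack_from(endian + "I", image, base + 4)
            (p_filesz,) = struct.unpack_from(endian + "I", image, base + 16)
        return image[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None


def make_minimal_elf64(interp):
    """Build a tiny little-endian ELF64 image with one PT_INTERP header (demo only)."""
    path = interp.encode() + b"\0"
    ehdr = bytearray(64)
    ehdr[:6] = b"\x7fELF\x02\x01"
    struct.pack_into("<Q", ehdr, 0x20, 64)        # e_phoff: right after the ELF header
    struct.pack_into("<HH", ehdr, 0x36, 56, 1)    # e_phentsize, e_phnum
    phdr = bytearray(56)
    struct.pack_into("<I", phdr, 0, PT_INTERP)
    struct.pack_into("<Q", phdr, 8, 64 + 56)      # p_offset: right after the program header
    struct.pack_into("<Q", phdr, 32, len(path))   # p_filesz
    return bytes(ehdr + phdr + path)


print(read_pt_interp(make_minimal_elf64("/lib64/ld-linux-x86-64.so.2")))
```

A loader built for one architecture could use a lookup like this to find
the path to hand to execve, since the lookup itself does not depend on
the target's machine type.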


Re: /usr/bin/ld.so as a symbolic link for the dynamic loader

2021-12-04 Thread Florian Weimer
* Helmut Grohne:

> Hi Florian,
>
> On Fri, Dec 03, 2021 at 06:29:33PM +0100, Florian Weimer wrote:
>> We can add a generic ELF parser to that ld.so and use PT_INTERP, as I
>> mentioned below.  I think this is the way to go.  Some care will be
>> needed to avoid endless loops, but that should be it.
>
> Can I ask you to go into a bit more technical detail as to how this is
> supposed to work?

Sure!

> From what was said, I expect that /usr/bin/ld.so is an ELF executable.
> It will likely be part of libc-bin. Do you confirm?

Yes, that's what I expect as well.

> Since libc-bin is Multi-Arch: foreign, the new ld.so really must have an
> architecture-independent API. If it does not, it must not go there.

It is as architecture-independent as ldconfig or getconf.  Perhaps a bit
more so than getconf.

> As far as I understand things, the typical use will be "ld.so
> --preload somelib someprogram". Now consider an i386 ld.so, an amd64
> somelib and an amd64 someprogram. Will that work with the generic ELF
> parser?
>
> At present, it does not seem to work:
>
> $ /lib/ld-linux.so.2 --preload /usr/lib/x86_64-linux-gnu/libeatmydata.so /bin/true
> /bin/true: error while loading shared libraries: /bin/true: wrong ELF class: ELFCLASS64
> $

As has been explained elsewhere, you need to use $LIB or just the soname (so
that ld.so searches the right paths).  I expect this to work eventually:

  ld.so --preload libeatmydata.so /bin/true

Even if /bin/true is an i386 program, assuming that libeatmydata1:i386
is installed.  Whether ld.so is built for i386 or amd64 will not matter.

> If that is what you will get from /usr/bin/ld.so, then it must not be
> part of libc-bin or Multi-Arch: foreign must be dropped. The latter
> likely is a non-option due to the amount of resulting breakage.

With the patch I've posted, you'll get the ELFCLASS64 error.  But I have
some ideas how to fix that eventually.  Is this sufficient for now?

Thanks,
Florian



Re: /usr/bin/ld.so as a symbolic link for the dynamic loader

2021-12-03 Thread Florian Weimer
* Simon McVittie:

> On Thu, 02 Dec 2021 at 19:51:16 +0100, Florian Weimer wrote:
>> Having ld.so as a real command makes the name architecture-agnostic.
>> This discourages from hard-coding non-portable paths such as
>> /lib64/ld-linux-x86-64.so.2 or even (the non-ABI-compliant)
>> /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 in scripts that require
>> specific functionality offered by such an explicit loader invocation.
>
> This works up to a point, but because there is only one /usr/bin/ld.so,
> it can only work for one architecture per machine, so saying it's
> architecture-agnostic is still a bit of a stretch.

We can add a generic ELF parser to that ld.so and use PT_INTERP, as I
mentioned below.  I think this is the way to go.  Some care will be
needed to avoid endless loops, but that should be it.

Things will break if people link with --dynamic-linker=/usr/bin/ld.so,
but that's just broken (like using --dynamic-linker=/lib/dl-2.33.so
today).

>> In principle, it should
>> be possible to find PT_INTERP with a generic ELF parser and redirect to
>> that, but that's vaporware at present.  I don't know yet if it will be
>> possible to implement this without some knowledge of Debian's multi-arch
>> support in the loader.
>
> I believe Debian uses the interoperable (ABI-compliant) ELF interpreter
> as listed on https://sourceware.org/glibc/wiki/ABIList for all
> architectures - it certainly does for all *common* architectures (for
> example our x86_64 executables use /lib64/ld-linux-x86-64.so.2, which is
> a special exception to the rule that we don't usually use lib64).

I'm not aware of any Debian divergence yet, either.

If we can just run any specified PT_INTERP and use something else for
loop detection (e.g., an additional argument), then it should probably
work out of the box.  I was just trying to set expectations because I
had not really thought about it in detail, in particular the loop
avoidance scheme and whether it must know about all the known
loaders.

Some distributions also want to avoid code execution from ldd.  Another
thing to consider before lifting paths out of PT_INTERP.

> I had naively believed that all distros do the same, but unfortunately
> my work on the Steam Runtime has taught me otherwise: for example, Arch
> Linux has a non-standard ELF interpreter /usr/lib/ld-linux-x86-64.so for
> executables that are built from the glibc source package (but uses the
> interoperable ELF interpreter for everything else),

/usr/lib/ld-linux-x86-64.so could be a botched attempt at completing
UsrMove.  The upstream makefiles are not really set up for that.

> and Exherbo consistently puts their dynamic linkers in
> /usr/x86_64-pc-linux-gnu/lib.

No idea about that one.

> Does glibc automatically set up the interoperable ELF interpreter, or is
> it something that distros' glibc maintainers have to "just know" if they
> are using a non-default ${libdir}?

With ./configure --prefix=/usr, upstream glibc is expected to use the
official path in the file system (and it should no longer be a symbolic
link, either).  The just-built binaries should use that path, too.

But the dynamic linker pathname is not entirely unique, which creates
problems for Debian-style multi-arch.

>> If someone wants to upstream the multi-arch patches, that would be
>> great.  glibc now accepts submissions under DCO, so copyright assignment
>> should no longer be an obstacle.
>
> (Please note that I am not a glibc maintainer and cannot speak for them.)
>
> I think multiarch is mostly build-time configuration rather than patches.
> The main thing needing patching is that we want ${LIB} to expand to
> lib/x86_64-linux-gnu instead of just x86_64-linux-gnu, so that the
> "/usr/${LIB}/libfoo.so.0" idiom works, but glibc would normally only take
> the last component of the ${libdir}:
>
> https://salsa.debian.org/glibc-team/glibc/-/blob/sid/debian/patches/any/local-ld-multiarch.diff

That must get the data from somewhere else.  Looking at

  <https://salsa.debian.org/glibc-team/glibc/-/blob/sid/debian/rules.d/build.mk>

it seems to come from DEB_HOST_MULTIARCH, and that's:

| DEB_HOST_MULTIARCH?= $(shell dpkg-architecture -qDEB_HOST_MULTIARCH)

We would have to take the table out of dpkg-architecture and put it into
upstream glibc (or gcc or binutils), otherwise you can't build a
multi-arch glibc on a non-Debian system.  Something like a generic
--with-multiarch-tuple= configure option would sidestep that, but risk
that different distributions end up with different multi-arch tuples.

> The freedesktop.org SDK used for Flatpak also uses Debian-style multiarch
> (but is not otherwise Debian-derived), and addresses that differently, in a
> way that might be more upstream-suitable:
>
> https://gitlab.com/freedesk

Re: /usr/bin/ld.so as a symbolic link for the dynamic loader

2021-12-03 Thread Florian Weimer
* Theodore Y. Ts'o:

> * How does ld.so --preload *work*?

The dynamic loader has an array of preloaded sonames, and it processes
them before loading the dependencies of the main program.  This way,
definitions in the preloaded objects preempt definitions in the shared
objects.
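
The effect of that ordering can be modelled with a small Python sketch.
This is a simplified model, not the loader's actual data structures: the
function names and the symbol table are invented for the example, and
real symbol lookup involves more than a flat list of objects.

```python
def build_search_order(main_program, preloads, dependencies):
    """Return the lookup order: main program, then preloads, then dependencies."""
    order = [main_program]
    for name in list(preloads) + list(dependencies):
        if name not in order:
            order.append(name)
    return order


def resolve(symbol, order, definitions):
    """Return the first object in the lookup order that defines the symbol."""
    for obj in order:
        if symbol in definitions.get(obj, ()):
            return obj
    return None


# Hypothetical symbol table: both objects define fsync.
DEFINITIONS = {
    "libeatmydata.so": {"fsync"},
    "libc.so.6": {"fsync", "write"},
}

with_preload = build_search_order("/bin/ls", ["libeatmydata.so"], ["libc.so.6"])
without_preload = build_search_order("/bin/ls", [], ["libc.so.6"])

print("fsync with --preload:   ", resolve("fsync", with_preload, DEFINITIONS))
print("fsync without --preload:", resolve("fsync", without_preload, DEFINITIONS))
```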

> * Does it modify /bin/ls, so that all users running /bin/ls get the
> preloaded library?

No, it's purely a run-time change.

The global setting is in /etc/ld.so.preload.

> * Does it modify something in the user's home directory?

No.  Well, the shell might put that command into .bash_history, or
something like that.

> * How do you undo the effects of ld.so --preload?

You run the program without the --preload option.  Or unset LD_PRELOAD.

Thanks,
Florian



Re: /usr/bin/ld.so as a symbolic link for the dynamic loader

2021-12-03 Thread Florian Weimer
* Bastian Blank:

> On Fri, Dec 03, 2021 at 01:57:08PM +0100, Florian Weimer wrote:
>> Right, thanks for providing a concrete example.  A (somewhat) portable
>> version would look like this:
>>   ld.so --preload '/usr/$LIB/libeatmydata.so.1.3.0' /bin/ls
>
> You mean
>   ld.so --preload libeatmydata.so.1.3.0 /bin/ls
> ?

Right, that is even better.

No objections to /usr/bin/ld.so then?

Thanks,
Florian



Re: /usr/bin/ld.so as a symbolic link for the dynamic loader

2021-12-03 Thread Florian Weimer
* Paul Wise:

> Florian Weimer wrote:
>
>> I'd like to provide an ld.so command as part of glibc.
>
> Will this happen in glibc upstream or just in Debian?

Upstream, and then Debian.  The symbolic link would likely end up in
libc-bin in Debian.

>> Today, ld.so can be used to activate preloading, for example. 
>> Compared to LD_PRELOAD, the difference is that it's specific to one
>> process, and won't be inherited by subprocesses, which is sometimes
>> exactly what is needed.
>
> That appears to be activated like this:
>
> /lib64/ld-linux-x86-64.so.2 --preload /usr/lib/x86_64-linux-gnu/libeatmydata.so.1.3.0 /bin/ls

Right, thanks for providing a concrete example.  A (somewhat) portable
version would look like this:

  ld.so --preload '/usr/$LIB/libeatmydata.so.1.3.0' /bin/ls

This assumes that $LIB expands to the multi-arch subdirectory.
(In upstream, it switches between lib, lib64, libx32 as needed.)

>> Anyway, do you see any problems with providing /usr/bin/ld.so for use
>> by skilled end users?
>
> It means more folks get exposed to ld.so features, which might mean
> more support and feature requests for glibc upstream. For example, the
> set of features provided by environment variables is different to the
> set of features provided by command-line options.

The intent of this change is to expose these loader features to more
users and tools.  This came up for --list-diagnostics, where we'd
otherwise have had to teach the sos tool (and others) how to find the loader
path.

Thanks,
Florian



/usr/bin/ld.so as a symbolic link for the dynamic loader

2021-12-02 Thread Florian Weimer
I'd like to provide an ld.so command as part of glibc.  Today, ld.so can
be used to activate preloading, for example.  Compared to LD_PRELOAD,
the difference is that it's specific to one process, and won't be
inherited by subprocesses, which is sometimes exactly what is needed.
There is also some useful diagnostic output in --help,
--list-diagnostics.

Having ld.so as a real command makes the name architecture-agnostic.
This discourages from hard-coding non-portable paths such as
/lib64/ld-linux-x86-64.so.2 or even (the non-ABI-compliant)
/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 in scripts that require
specific functionality offered by such an explicit loader invocation.

I thought that commands with file extensions might be a Policy violation.
Policy actually talks about file extensions for programs installed in
/usr/bin—but only for scripts.  So it's technically okay.  And today,
there's already an ld.so manual page, although it's in section 8, not 1.
(I think /usr/bin is still appropriate because running ld.so does not
require special privileges.)

The initial implementation will be just a symbolic link.  This means
that multi-arch support will be missing: the amd64 loader will not be
able to redirect execution to the s390x loader.  In principle, it should
be possible to find PT_INTERP with a generic ELF parser and redirect to
that, but that's vaporware at present.  I don't know yet if it will be
possible to implement this without some knowledge of Debian's multi-arch
support in the loader.  Upstream doesn't have those features (we only
support /usr/lib vs /usr/lib64 and some minor variants of that), so
integration might be lacking.

If someone wants to upstream the multi-arch patches, that would be
great.  glibc now accepts submissions under DCO, so copyright assignment
should no longer be an obstacle.

Anyway, do you see any problems with providing /usr/bin/ld.so for use by
skilled end users?

Thanks,
Florian



Re: Preparing for glibc 2.34: library locations

2021-07-26 Thread Florian Weimer
* Michael Hudson-Doyle:

>  (but then, dpkg is not
>  impacted by the symbolic link issue as far as I know).
>
> Is this problem written up somewhere? I only subscribed to libc-alpha
> a few weeks ago.

I've written about it in various places.

As far as I know, it's specific to how RPM performs package updates.
Files removed in an update are only removed towards the end, after all
work on *all* packages has been done.  With the previous approach,
this means that when downgrading from glibc 2.29 to glibc 2.28, during
the update, there are files

  ld-2.28.so
  ld-2.29.so
  libc-2.28.so
  libc-2.29.so

RPM immediately updates the dynamic linker symbolic link target from
ld-2.29.so to ld-2.28.so.  But if ldconfig is invoked, it will prefer
libc-2.29.so over libc-2.28.so as the provider of the soname libc.so.6,
and update the symbolic link and cache to point to libc-2.29.so.  That
is of course no good once the downgrade is finally completed and
libc-2.29.so is removed.  The dynamic linker and the libc.so.6 symbolic
link can also become desynchronized, and then failures happen earlier
during the update procedure.
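
The failure can be reproduced in a small Python model.  This is a
simplified sketch that only assumes what is described above, namely that
among files sharing a soname the one with the higher version in its name
wins; the function names and the dictionaries are invented for the
example and do not mirror ldconfig's implementation.

```python
import re


def version_key(filename):
    """Split a file name into text and integer parts so that 2.29 sorts after 2.28 (and 2.9)."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", filename)]


def pick_providers(files):
    """Map each soname to the preferred file; `files` maps file name -> soname."""
    chosen = {}
    for filename, soname in files.items():
        current = chosen.get(soname)
        if current is None or version_key(filename) > version_key(current):
            chosen[soname] = filename
    return chosen


# Mid-downgrade state: the old 2.29 files have not been removed yet.
during_update = {"libc-2.28.so": "libc.so.6", "libc-2.29.so": "libc.so.6"}
# Final state once the deferred removals have happened.
after_update = {"libc-2.28.so": "libc.so.6"}

link_target = pick_providers(during_update)["libc.so.6"]
print("libc.so.6 ->", link_target)
print("target still present after the downgrade:", link_target in after_update)
```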

I believe dpkg handles file removals during upgrades/downgrades before
running scripts, so it shouldn't suffer from this problem.  But I
believe the simplification is still worth it.

Thanks,
Florian



Re: Preparing for glibc 2.34: library locations

2021-07-26 Thread Florian Weimer
* Michael Hudson-Doyle:

> There is another wrinkle of course in that Debian/Ubuntu install these
> files to /lib/$multiarch/, not /lib or /lib64 as upstream expects.
>
> What I've implemented[0] for Ubuntu (only for testing so far) is to
> install libc to /lib/$multiarch/libc.so.6, the dynamic linker to
> /lib/$multiarch/$dynamic_linker_soname, and then have a symlink from
> the ABI-mandated dynamic linker path to the new path for the dynamic
> linker. This feels like a reasonable compromise between the upstream
> changes and what Debian does to me but I'm certainly interested in
> hearing other opinions (ideally before Ubuntu feature freeze :-p).

I agree that this layout is reasonable.  The target of the remaining
symbolic link is stable, so it does not matter for the issues that we
tried to address with this upstream change (but then, dpkg is not
impacted by the symbolic link issue as far as I know).

Would you please consider contributing your multiarch patches upstream?
Thanks.

Florian



Bug#969926: glibc: Parsing of /etc/gshadow can return bad pointers causing segfaults in applications

2021-06-04 Thread Florian Weimer
* Aurelien Jarno:

>> > Is it possible to commit those patches to the upstream 2.28 branch? If
>> > so, I guess we can simply pull the branch in the Debian package, fixing
>> > many other security bugs at the same time.
>> 
>> I'm concerned about the GLIBC_PRIVATE internal ABI change, it causes
>> issues if the update is applied without a reboot:
>> 
>>   glibc: After upgrade, before reboot, systemd services using USER= do
>>   not start (caused by fix for bug 1871397)
>>   <https://bugzilla.redhat.com/show_bug.cgi?id=1927040>
>
> That issue looks problematic for Debian, we usually do not require a
> (immediate) reboot after applying a security upgrade.

I submitted a merge request that should work around it, using the
patch from CentOS 8 (and eventually Red Hat Enterprise Linux, of
course):

  

Please let me know what you think.  The new glibc seems to work okay
in general.



Bug#969926: glibc: Parsing of /etc/gshadow can return bad pointers causing segfaults in applications

2021-06-04 Thread Florian Weimer
* Aurelien Jarno:

> On 2021-06-04 20:34, Florian Weimer wrote:
>> * Moritz Mühlenhoff:
>> 
>> > On Wed, Sep 09, 2020 at 12:30:44PM +0200, Aurelien Jarno wrote:
>> >> control: forcemerge 967938 969926
>> >> 
>> >> Hi,
>> >> 
>> >> On 2020-09-09 02:58, Bernd Zeimetz wrote:
>> >> > Source: glibc
>> >> > Version: 2.28-10
>> >> > Severity: serious
>> >> > Tags: security upstream patch
>> >> > X-Debbugs-Cc: Debian Security Team 
>> >> > 
>> >> > Hi,
>> >> > 
>> >> > we are running into the bug
>> >> > https://sourceware.org/bugzilla/show_bug.cgi?id=20338
>> >> > causing systemd-sysusers to segfault.
>> >> > 
>> >> > Patch is available in the linked bug report.
>> >> 
>> >> This has already been reported, Florian will work on a backport, as it
>> >> is not straightforward to backport it to buster due to the usage of
>> >> private symbols.
>> >
>> > Florian, did you manage to backport this to 2.31? It would be nice to get this
>> > fixed for a Buster point release still.
>> 
>> Do you mean 2.28?  DJ Delorie did the backport, and Carlos O'Donell
>> implemented the GLIBC_PRIVATE ABI compatibility fix.  I'll see if I
>> can get the patches to apply to Debian's 2.28 tree.
>
> Is it possible to commit those patches to the upstream 2.28 branch? If
> so, I guess we can simply pull the branch in the Debian package, fixing
> many other security bugs at the same time.

I'm concerned about the GLIBC_PRIVATE internal ABI change, it causes
issues if the update is applied without a reboot:

  glibc: After upgrade, before reboot, systemd services using USER= do
  not start (caused by fix for bug 1871397)
  <https://bugzilla.redhat.com/show_bug.cgi?id=1927040>

I guess we can use Carlos' patch for upstream as well.

However, I would also have to backport it to 2.28, 2.29, 2.30, 2.31,
so that we have bug fix monotonicity.  2.31 is probably doable, which
should help bullseye.  It's mostly a psychological thing for me, I'm
very busy with getting patches into glibc 2.34 at work, and downstream
Debian work would be at least slightly different.



Bug#969926: glibc: Parsing of /etc/gshadow can return bad pointers causing segfaults in applications

2021-06-04 Thread Florian Weimer
* Moritz Mühlenhoff:

> On Wed, Sep 09, 2020 at 12:30:44PM +0200, Aurelien Jarno wrote:
>> control: forcemerge 967938 969926
>> 
>> Hi,
>> 
>> On 2020-09-09 02:58, Bernd Zeimetz wrote:
>> > Source: glibc
>> > Version: 2.28-10
>> > Severity: serious
>> > Tags: security upstream patch
>> > X-Debbugs-Cc: Debian Security Team 
>> > 
>> > Hi,
>> > 
>> > we are running into the bug
>> > https://sourceware.org/bugzilla/show_bug.cgi?id=20338
>> > causing systemd-sysusers to segfault.
>> > 
>> > Patch is available in the linked bug report.
>> 
>> This has already been reported, Florian will work on a backport, as it
>> is not straightforward to backport it to buster due to the usage of
>> private symbols.
>
> Florian, did you manage to backport this to 2.31? It would be nice to get this
> fixed for a Buster point release still.

Do you mean 2.28?  DJ Delorie did the backport, and Carlos O'Donell
implemented the GLIBC_PRIVATE ABI compatibility fix.  I'll see if I
can get the patches to apply to Debian's 2.28 tree.



Re: Seeking clarification for locale names

2021-02-15 Thread Florian Weimer
* Marc Haber:

> I would appreciate pointers to documentation, personal opinions, war
> stories, encoding tales, historic lectures, anything that might
> enlighten me and help me build the knowledge and understanding about
> how UNIX locales are supposed to work in Debian GNU/Linux. Thank you in
> advance!

For the charset normalization, it's in the manual:

The only new thing is the @code{normalized codeset} entry.  This is
another goodie which is introduced to help reduce the chaos which
derives from the inability of people to standardize the names of
character sets.  Instead of @w{ISO-8859-1} one can often see @w{8859-1},
@w{88591}, @w{iso8859-1}, or @w{iso_8859-1}.  The @code{normalized
codeset} value is generated from the user-provided character set name by
applying the following rules:

@enumerate
@item
Remove all characters besides numbers and letters.
@item
Fold letters to lowercase.
@item
If the same only contains digits prepend the string @code{"iso"}.
@end enumerate

@noindent
So all of the above names will be normalized to @code{iso88591}.  This
allows the program user much more freedom in choosing the locale name.
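
The three rules quoted above can be sketched in C as follows.  This is
only an illustration of the documented rules, not glibc's internal
implementation; the function name normalize_codeset_sketch is made up
for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a glibc interface): write the normalized
   form of NAME into OUT, which has room for OUTLEN bytes.
   Returns 0 on success, -1 if OUT is too small.  */
int normalize_codeset_sketch(const char *name, char *out, size_t outlen)
{
    int only_digits = 1;
    size_t kept = 0;

    /* First pass: count the characters that survive rule 1 and find
       out whether any of them is a letter (needed for rule 3).  */
    for (const char *p = name; *p != '\0'; p++) {
        unsigned char c = (unsigned char) *p;
        if (isalnum(c)) {
            kept++;
            if (isalpha(c))
                only_digits = 0;
        }
    }

    size_t prefix = only_digits ? 3 : 0;
    if (prefix + kept + 1 > outlen)
        return -1;

    size_t n = 0;
    if (only_digits) {
        /* Rule 3: a name made of digits only gets the "iso" prefix.  */
        memcpy(out, "iso", 3);
        n = 3;
    }
    for (const char *p = name; *p != '\0'; p++) {
        unsigned char c = (unsigned char) *p;
        if (isalnum(c))                    /* Rule 1: drop everything else.  */
            out[n++] = (char) tolower(c);  /* Rule 2: fold to lowercase.  */
    }
    out[n] = '\0';
    return 0;
}
```

With this sketch, ISO-8859-1, 8859-1, 88591 and iso_8859-1 all come out
as iso88591, matching the manual text.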


This code dates back to the mid-90s, I think.

In general, I think it is best to treat locale names as opaque strings.
Parsing them to derive charsets is not going to work (e.g., no charset
can mean ISO-8859-1 or UTF-8, depending on the age of the locale).  To
get the charset of the current locale, you can use “locale -k charmap”,
for example.  It corresponds to the glibc charmap name (of which there
aren't too many).
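
From C, the same information should be available without parsing the
locale name, via nl_langinfo(CODESET).  A minimal sketch, assuming the
program is happy to switch to the locale from the environment; the
function name current_charmap_sketch is made up for this example.

```c
#include <langinfo.h>
#include <locale.h>

/* Hypothetical helper: return the charset name of the locale selected
   by the environment (LANG, LC_ALL, ...).  The returned string belongs
   to the C library and may be overwritten by later locale calls.  */
const char *current_charmap_sketch(void)
{
    /* Adopt the user's locale.  If that fails, the process stays in
       the "C" locale and the charset of that locale is reported.  */
    setlocale(LC_ALL, "");
    return nl_langinfo(CODESET);
}
```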

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill



Bug#980764: libc6-dev: wrong return value for fputs when STDOUT_FILENO was closed()

2021-02-07 Thread Florian Weimer
* Morel Bérenger:

>> * Bérenger:  
> ...
>> Why do you think this is a bug?  
>
> POSIX 1003.1-2017 standard says:

POSIX requires that if you manipulate the underlying file descriptor
of a stream, you first need to call fseek when using the stream again.
Your example code does not do that, so it's not following POSIX
requirements for these interfaces.

But there's another reason why POSIX requirements are met by the glibc
implementation.

> In the error section, we can read that it can return the same errors
> (in errno) as fputc, which itself says, as for errors:
>
>> [EBADF] The file descriptor underlying stream is not a valid file
>> descriptor open for writing.  

The error is conditional:

| The fputc() function shall fail if either the stream is unbuffered
| or the stream's buffer needs to be flushed, and: […]

As I explained, the stream is buffered because it is not connected to
a terminal.



Bug#980764: libc6-dev: wrong return value for fputs when STDOUT_FILENO was closed()

2021-01-22 Thread Florian Weimer
* Bérenger:

> When running following code:
>
> ```C
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
>
> int main()
> {
>   close( STDIN_FILENO );
>   close( STDOUT_FILENO );
>   int fd = dup( STDERR_FILENO );
>   close( STDERR_FILENO );
>   if( -1 == fprintf( stdout, "%d\n", fd ) )
>   {
>   return -1;
>   }
>
>   char s[] = "should fail\n";
>   if( -1 == write( STDOUT_FILENO, s, sizeof( s ) ) )
>   {
>   return -2;
>   }
>   return EXIT_SUCCESS;
> }
> ```
>
> built with glibc, the program returns 254. When built with muslc, it
> returns the expected value of 255.
>
> I believe glibc's behavior here is wrong. From what I could get by using
> strace, it seems that the 1st printf's write() call is run _after_ the
> 2nd one, even when adding a call to fflush( stdout ) right after the
> printf.

The reason for the glibc behavior is that stdout ends up as a buffered
stream because it is not a terminal.  Why do you think this is a bug?



Bug#976865: Fwd: Bug#974900: dash removes trailing slash from script arguments

2020-12-12 Thread Florian Weimer
* Herbert Xu:

> On Thu, Dec 10, 2020 at 08:58:37AM +0100, Aurelien Jarno wrote:
>>
>> That's the dash symptoms. glob(3) takes a pattern and just returns the
>> paths matching the pattern, as they are named on the filesystem. That
>> said, the option GLOB_MARK can return a trailing slash for all matched
>> path that are a directory.
>
> Yes but it's really a bug in glob(3).  It should really return
> a no-match for the case in question, rather than matching and then
> returning a filename without the slash.
>
> IOW the pattern "foo\/" should not match a regular file foo.

I believe this has been reported upstream here:

  

(But I have not reviewed this particular bug thread here, sorry.)
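
For reference, the GLOB_MARK behaviour mentioned in the quoted text
can be tried with a few lines of C.  This sketch does not reproduce the
disputed "foo\/" case; it only shows that GLOB_MARK appends a slash to
matched directories.  The helper name and the use of /tmp in the note
below are assumptions for the example.

```c
#include <glob.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the first match of PATTERN into OUT and
   return the number of matches, or -1 on error or no match.  */
int first_glob_match_sketch(const char *pattern, int flags,
                            char *out, size_t outlen)
{
    glob_t g;
    memset(&g, 0, sizeof g);

    int result = -1;
    if (glob(pattern, flags, NULL, &g) == 0 && g.gl_pathc > 0) {
        snprintf(out, outlen, "%s", g.gl_pathv[0]);
        result = (int) g.gl_pathc;
    }
    globfree(&g);
    return result;
}
```

On a system where /tmp is a directory, the pattern "/tm[p]" should
yield "/tmp" without flags and "/tmp/" with GLOB_MARK.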



Bug#731082: Processed: severity of 731082 is normal

2020-12-03 Thread Florian Weimer
forwarded 731082 https://sourceware.org/bugzilla/show_bug.cgi?id=27008
tags 731082 + upstream
thanks

I happen to have a patch for this:



A bit by accident, it ended up as part of the glibc-hwcaps work.



Bug#975026: Use 公元 not 西元

2020-11-17 Thread Florian Weimer
* 積丹尼 Dan Jacobson:

> I think this,
> $ LC_TIME=zh_TW.UTF-8 date
> 西元2020年11月18日 (週三) 12時39分44秒 CST
> should say 公元 not 西元.

Why do you think so?  These Taiwanese newspapers appear to use 西元
for Gregorian years:

  
  

Is there a difference between calendar dates and historical
references?



Bug#972510: glibc: Please ignore misc/tst-sbrk and/or misc/tst-sbrk-pie on all archs

2020-10-23 Thread Florian Weimer
* Aurelien Jarno:

> brk/sbrk is definitely something deprecated. But it is still part of the
> API (especially for old architectures) and still used by software like
> jemalloc, gcl or libgc. This is therefore important to keep this feature
> in a good shape.
>
> It's also used by many less important packages, often just to print a
> backtrace.
>
> If someone has spoons it might be worth opening bugs against those
> packages, so that they stop using brk/sbrk.

glibc malloc also uses sbrk, and has some glitches in corner cases
when it has to switch from sbrk to mmap for the main arena.

I think it is worth investigating *why* sbrk fails.  Usually that is
due to an obstructing mapping caused by problematic address space
layouts.  With ASLR, such failures can essentially appear to be
random.
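
A small probe can help when investigating such a failure.  The sketch
below (function name made up for this example) asks whether the program
break can currently be moved up by a given amount and then gives the
memory back.  It is meant for single-threaded experiments only, since
the allocator may use the break at the same time.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE          /* for sbrk */
#endif
#include <stdint.h>
#include <unistd.h>

/* Hypothetical probe: returns 1 if the break could be raised by BYTES,
   0 if sbrk refused (errno is then ENOMEM).  */
int break_can_grow_sketch(intptr_t bytes)
{
    void *old_break = sbrk(bytes);
    if (old_break == (void *) -1)
        return 0;        /* Something is in the way, or a limit applies.  */
    sbrk(-bytes);        /* Undo the probe.  */
    return 1;
}
```

If the probe fails, comparing the break reported by sbrk(0) with the
mappings in /proc/self/maps should show what sits directly above the
heap.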



Bug#969645: glibc: deferred error : (error "Deferred process exited abnormally:

2020-09-06 Thread Florian Weimer
* Aurelien DESBRIERES:

> This is emacs and jedi-mode and much more elisp stuff to works with
> emacs as python3 ide.
>
> Please update this glibc it seems to be outdated!

That's simply not going to happen for the buster release.

You will have to change the way you set up your pip environment.



Bug#969645: glibc: deferred error : (error "Deferred process exited abnormally:

2020-09-06 Thread Florian Weimer
* aurelien desbrieres:

> *** Reporter, please consider answering these questions, where appropriate ***
>
>* What led up to the situation?
>try to use jedi-mode on emacs
>* What exactly did you do (or not do) that was effective (or
>  ineffective)?
>M-x jedi:install-server
>* What was the outcome of this action?
>deferred error : (error "Deferred process exited abnormally:
>   command: /home/aurelien/.emacs.d/.python-environments/default/bin/pip
>   exit status: exit 1
>   event: exited abnormally with code 1
>   buffer contents: 
> \"/home/aurelien/.emacs.d/.python-environments/default/bin/python3: 
> /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by 
> /home/aurelien/.emacs.d/.python-e\
> nvironments/default/bin/python3)
> \"")

Have you copied the pip environment from another system?  You need to
regenerate on this host.
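
The error message means that the python3 binary in that environment
needs symbols introduced in glibc 2.29, which the libm on this host
does not provide.  To check which glibc version a process actually runs
against, glibc offers gnu_get_libc_version; a minimal sketch (the
wrapper name is made up):

```c
#include <gnu/libc-version.h>

/* Hypothetical wrapper: version string of the glibc the process is
   running against, for example "2.28".  glibc-specific interface.  */
const char *runtime_glibc_version_sketch(void)
{
    return gnu_get_libc_version();
}
```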



Bug#969618: getopt: optarg is NULL outside of loop

2020-09-06 Thread Florian Weimer
* John Scott:

> #define _POSIX_C_SOURCE 200809L
> #include <assert.h>
> #include <stddef.h>
> #include <unistd.h>
> int main(int argc, char *argv[]) {
>   int opt;
>   while((opt = getopt(argc, argv, "a:")) != -1) {}
>   assert(optarg != NULL);
> }
>
> If this is invoked as './a.out -afoo', the assertion after the loop
> will fail with glibc.

POSIX leaves it unspecified whether optarg is changed when getopt returns -1.
Only optind must be left unchanged.  I do not think this is a glibc
bug (or a musl bug).



Bug#967938: libc6: systemd-sysusers SEGV due to glibc bug in fgetgsent

2020-08-06 Thread Florian Weimer
* Aurelien Jarno:

> On 2020-08-06 06:08, Jinpu Wang wrote:
>> Hi Florian,
>> 
>> On Wed, Aug 5, 2020 at 6:44 PM Florian Weimer wrote:
>> >
>> > * Jinpu Wang:
>> >
>> > > Dear Maintainer:
>> > >
>> > > Sorry, add some missing information below:
>> > >
>> > > After update to Buster, the systemd-sysusers are segfaulting every time.
>> > > After search around, I found following bugreport in glibc
>> > > https://sourceware.org/legacy-ml/libc-alpha/2016-06/msg01015.html
>> > >
>> > > I backported the fix to 2.28-10, it fixed the problem.
>> > >
>> > > glibc upstream have a different fix for it in 2.32, see
>> > >  https://sourceware.org/bugzilla/show_bug.cgi?id=20338
>> > >
>> > > I think it's still easier to backport the fix in msg01015.html to 2.28 
>> > > version,
>> > > patch attached in the initial report.
>> >
>> > The patch from 2016 is incomplete because it does not seek back to the
>> > original file position, so the next call of fgetsgent_r skips over the
>> > entry that could not be fully parsed.
>> Thanks for quick response,  can you provide a minimum bugfix, which
>> can be easily backported to old version like 2.28?
>
> I think we do not want to diverge from the upstream fix, even if it is a
> bit more work to backport. We first need to fix it in bullseye/sid and
> then we can try to get this in the next buster stable release.

I can backport it to upstream release branches, all the way to version
2.28.  Would that help?

I plan to add local copies of the new functions, so that the
GLIBC_PRIVATE ABI remains unchanged.

But I have other commitments, so that may not happen until
September-ish.

>> as you also make the bug 20338 as a security hole.
>
> It is marked as "security-", so it is *not* considered as a security
> issue (as the content of this file is trusted).

That's right.



Bug#967938: libc6: systemd-sysusers SEGV due to glibc bug in fgetgsent

2020-08-05 Thread Florian Weimer
* Jinpu Wang:

> Dear Maintainer:
>
> Sorry, add some missing information below:
>
> After update to Buster, the systemd-sysusers are segfaulting every time.
> After search around, I found following bugreport in glibc
> https://sourceware.org/legacy-ml/libc-alpha/2016-06/msg01015.html
>
> I backported the fix to 2.28-10, it fixed the problem.
>
> glibc upstream have a different fix for it in 2.32, see
>  https://sourceware.org/bugzilla/show_bug.cgi?id=20338
>
> I think it's still easier to backport the fix in msg01015.html to 2.28 
> version,
> patch attached in the initial report.

The patch from 2016 is incomplete because it does not seek back to the
original file position, so the next call of fgetsgent_r skips over the
entry that could not be fully parsed.
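
The seek-back idea can be illustrated with a simplified line reader.
This is not the glibc code; it is a sketch with a made-up function
name that treats one line as one entry and reports ERANGE when the
caller's buffer is too small, after restoring the file position so
that a retry with a larger buffer sees the same entry again.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reader: returns 0 and stores one line (without the
   newline) in BUF, ENOENT at end of file, or ERANGE if the line does
   not fit.  On ERANGE the stream is positioned at the start of the
   line again.  LEN must be at least 2.  */
int read_entry_sketch(FILE *fp, char *buf, size_t len)
{
    long start = ftell(fp);          /* Remember where the entry begins.  */
    if (start < 0)
        return errno;

    if (fgets(buf, (int) len, fp) == NULL)
        return ENOENT;

    size_t n = strlen(buf);
    if (n > 0 && buf[n - 1] == '\n') {
        buf[n - 1] = '\0';           /* Complete line.  */
        return 0;
    }
    if (feof(fp))
        return 0;                    /* Last line without a newline.  */

    /* The line was truncated.  Without this seek, the next call would
       continue in the middle of the entry and skip its remainder.  */
    if (fseek(fp, start, SEEK_SET) != 0)
        return errno;
    return ERANGE;
}
```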



Re: Arch qualification for buster: call for DSA, Security, toolchain concerns

2020-08-04 Thread Florian Weimer
* Florian Weimer:

>>  * Concern for mips, mips64el, mipsel and ppc64el: no upstream support
>>in GCC
>>(Raised by the GCC maintainer; carried over from stretch)
>
> I'm surprised to read this.  ppc64el features prominently in the
> toolchain work I do (though I personally do not work on the GCC side).

The ppc64le situation has been clarified.  It's now listed explicitly
as a primary architecture, as powerpc64le-unknown-linux-gnu:

  <https://gcc.gnu.org/gcc-11/criteria.html>

This has always been the intent, but I can understand that
distributions view powerpc64le-unknown-linux-gnu and
powerpc64-unknown-linux-gnu as quite different things.



Re: Arch qualification for buster: call for DSA, Security, toolchain concerns

2020-07-09 Thread Florian Weimer
* Paul Gevers:

>  * Concern for armel and armhf: only secondary upstream support in GCC
>(Raised by the GCC maintainer; carried over from stretch and buster)

glibc upstream lately has trouble finding qualified persons to
implement security fixes for the 32-bit Arm architecture.

>  * Concern for mips, mips64el, mipsel and ppc64el: no upstream support
>in GCC; Debian carries patches in binutils and GCC that haven't been
>integrated upstream even after a long time.
>(Raised by the GCC maintainer; carried over from stretch and buster)

I think I said this the last time, but the claim that there is no GCC
upstream support for ppc64le in GCC or binutils does not appear to be
grounded in fact. 8-/



Re: [RFC PATCH v4 1/2] configure: Remove --enable-obsolete-nsl

2020-06-30 Thread Florian Weimer
* Petr Vorel:

>> nss_compat no longer depends on libnsl in current glibc.  It can be used
>> without NIS, and some users do that.  I don't think your patch changes
>> this.

> Interesting. I guess adding this would be worth then:
> libnss_compat no longer depends on libnsl and can be used without NIS.

We made this change a while back, in glibc 2.27, when the sources were
moved to nss/nss_compat (from nis/nss_compat).  So this isn't something
new.

Thanks,
Florian



Re: [RFC PATCH v4 1/2] configure: Remove --enable-obsolete-nsl

2020-06-30 Thread Florian Weimer
* Petr Vorel:

> Hi Florian,
>
> thank you for your review. I'll have time to send next version in second
> half of July.

If we merge new ports for glibc 2.32, it would be nice not to include
sunrpc in them.  We'll figure something out.

>> > diff --git a/grp/initgroups.c b/grp/initgroups.c
>> > index f4c4e986e9..0c17141117 100644
>> > --- a/grp/initgroups.c
>> > +++ b/grp/initgroups.c
>> > @@ -31,12 +31,6 @@
>> >  #include "../nscd/nscd-client.h"
>> >  #include "../nscd/nscd_proto.h"
>
>> > -#ifdef LINK_OBSOLETE_NSL
>> > -# define DEFAULT_CONFIG "compat [NOTFOUND=return] files"
>> > -#else
>> > -# define DEFAULT_CONFIG "files"
>> > -#endif
>> > -
>
>> That looks a bit like a pre-existing bug—we do have nss_compat even
>> without libnsl.  But the change itself looks okay.

> Hm, I'll have look into it after this patchset is finished, but not sure
> if I'm able to fix this.

Sorry, no change to the patch is required.  Removing this is fine.  We
shouldn't have had a default that depends on LINK_OBSOLETE_NSL.

> Hm, libnss_compat is not built (now libnsl is only built as shared
> library, for platforms where it was supported), so what exactly would
> you put here?

nss_compat no longer depends on libnsl in current glibc.  It can be used
without NIS, and some users do that.  I don't think your patch changes
this.

Thanks,
Florian



Re: [RFC PATCH v4 1/2] configure: Remove --enable-obsolete-nsl

2020-06-24 Thread Florian Weimer
* Petr Vorel:

> diff --git a/NEWS b/NEWS
> index a660fc59a8..cfaf50c816 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -33,6 +33,14 @@ Major new features:
>  
>  Deprecated and removed features, and other changes affecting compatibility:
>  
> +* Remove configure option --enable-obsolete-nsl. libnsl is only built as shared
> +  library for backward compatibility and the NSS modules libnss_compat,
> +  libnss_nis and libnss_nisplus are not built at all, libnsl's headers aren't
> +  installed. This compatibility is kept only for architectures and ABIs that
> +  have been added in or before version 2.28. Replacement implementations based
> +  on TI-RPC, which additionally support IPv6, are available from
> +  .
> +

Please add two spaces after sentence-ending periods.  And wrap the lines
a bit earlier (column 72 or so).

> diff --git a/grp/initgroups.c b/grp/initgroups.c
> index f4c4e986e9..0c17141117 100644
> --- a/grp/initgroups.c
> +++ b/grp/initgroups.c
> @@ -31,12 +31,6 @@
>  #include "../nscd/nscd-client.h"
>  #include "../nscd/nscd_proto.h"
>  
> -#ifdef LINK_OBSOLETE_NSL
> -# define DEFAULT_CONFIG "compat [NOTFOUND=return] files"
> -#else
> -# define DEFAULT_CONFIG "files"
> -#endif
> -

That looks a bit like a pre-existing bug—we do have nss_compat even
without libnsl.  But the change itself looks okay.
 
> diff --git a/manual/nss.texi b/manual/nss.texi
> index 821469a78a..7cb307246a 100644
> --- a/manual/nss.texi
> +++ b/manual/nss.texi
> @@ -328,17 +328,11 @@ For the @code{hosts} and @code{networks} databases the default value is
>  the DNS service not to be available but if it is available the answer it
>  returns is definitive.
>  
> -The @code{passwd}, @code{group}, and @code{shadow} databases are
> +The @code{passwd}, @code{group}, and @code{shadow} databases was
>  traditionally handled in a special way.  The appropriate files in the
> -@file{/etc} directory are read but if an entry with a name starting
> -with a @code{+} character is found NIS is used.  This kind of lookup
> -remains possible if @theglibc{} was configured with the
> -@code{--enable-obsolete-nsl} option and the special lookup service
> -@code{compat} is used.  If @theglibc{} was configured with the
> -@code{--enable-obsolete-nsl} option the default value for the three
> -databases above is @code{compat [NOTFOUND=return] files}.  If the
> -@code{--enable-obsolete-nsl} option was not used the default value
> -for the services is @code{files}.
> +@file{/etc} directory were read but if an entry with a name starting
> +with a @code{+} character was found NIS was used.  This kind of lookup
> +was removed and now the default value for the services is @code{files}.

I wonder if it makes sense to reference nss_compat here?

Thanks,
Florian



Bug#963508: /lib/ld-linux.so.2: LD_PRELOAD breaks with plain filename [and 1 more messages]

2020-06-24 Thread Florian Weimer
* Aurelien Jarno:

>> This doesn't seem correct to me.  Is there any documentation giving a
>> rationale for this ?  Is there a way to change this locally ?
>
> I do not know enough about apparmor and its threat model to know if it
> should be considered or not. From the glibc point of view, nothing can
> be really done, it just obeys the AT_SECURE flag passed by the kernel.
>
> Now looking at apparmor.d(5), it seems it *might* be controlled by the
> change_profile option with the safe and unsafe mode. But I don't speak
> apparmor fluently enough to actually know how to introduce that option
> in a profile.

I think LSMs can nowadays also express security transitions that trust
the execution environment, that is, that they add more restrictions
instead of increasing privileges.  I believe we use this with SELinux,
so that these transitions do not cause AT_SECURE to be set.  Maybe
this is something that apparmor could do as well?
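
Whether the kernel set AT_SECURE for a given process can be checked
from inside that process with getauxval.  A minimal sketch (the
function name is made up), which may help when testing a profile
change:

```c
#include <elf.h>        /* AT_SECURE */
#include <sys/auxv.h>   /* getauxval */

/* Hypothetical check: returns 1 if the kernel flagged this execution
   as AT_SECURE (in which case the dynamic loader ignores unsafe
   LD_PRELOAD entries, among other things), 0 otherwise.  */
int started_in_secure_mode_sketch(void)
{
    return getauxval(AT_SECURE) != 0;
}
```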

Thanks,
Florian



Re: [RFC PATCH v2 0/2] Remove --enable-obsolete-nsl --enable-obsolete-rpc

2020-06-05 Thread Florian Weimer
* Petr Vorel:

>> I'm still having issues with elf/tst-ldconfig-ld_so_conf-update when
>> running with both commits (it's ok when running only first commit).
>
> OK, I noticed core dump (can be reproduced):
> systemd-coredump[26018]: Process 26016 (ld-linux-x86-64) of user 1000 dumped 
> core.
>
>PID: 26016 (ld-linux-x86-64)
>UID: 1000 (foo)
>GID: 100 (users)
> Signal: 6 (ABRT)
>  Timestamp: Fri 2020-06-05 18:41:54 CEST (16min ago)
>   Command Line: 
> /home/foo/build/glibc/remove-rpc.v2.second-commit.2/elf/ld-linux-x86-64.so.2 
> --library-path 
> /home/foo/build/glibc/remove-rpc.v2.second-commit.2:/home/foo/build/glibc/remove-rpc.v2.second-commit.2/math:/home/foo/build/glibc/remove-rpc.v2.second-commit.2/elf:/home/foo/build/glibc/remove-rpc.v2.second-commit.2/dlfcn:/home/foo/build/glibc/remove-rpc.v2.second-commit.2/nss:/home/foo/build/glibc/remove-rpc.v2.second-commit.2/nis:/home/foo/build/glibc/remove-rpc.v2.second-commit.2/rt:/home/foo/build/glibc/remove-rpc.v2.second-commit.2/resolv:/home/foo/build/glibc/remove-rpc.v2.second-commit.2/mathvec:/home/foo/build/glibc/remove-rpc.v2.second-commit.2/support:/home/foo/build/glibc/remove-rpc.v2.second-commit.2/crypt:/home/foo/build/glibc/remove-rpc.v2.second-commit.2/nptl
>  /home/foo/build/glibc/remove-rpc.v2.second-commit.2/debug/tst-ssp-1
> Executable: /home/foo/build/glibc/remove-rpc.v2.second-commit.2/elf/ld.so
>  Control Group: /user.slice/user-1000.slice/session-1.scope
>   Unit: session-1.scope
>  Slice: user-1000.slice
>Session: 1
>  Owner UID: 1000 (foo)
>Boot ID: bfef12e3ca2046009a97d35fb89674bc
> Machine ID: 66e50c6d8dd0edc674b23b51586326ca
>   Hostname: dell5510
>Storage: none
>Message: Process 26016 (ld-linux-x86-64) of user 1000 dumped core.
> Coredump entry has no core attached (neither internally in the journal nor 
> externally on disk).

This seems unrelated.  I think systemd-coredump ignores ulimit -c 0
(which we perform programmatically in the test skeleton), to cover cases
like this where the process is expected to abort.  So you get a few
spurious reports like this one.
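
For readers unfamiliar with the mechanism: the programmatic equivalent
of "ulimit -c 0" is a setrlimit call on RLIMIT_CORE.  The sketch below
is not the glibc test skeleton code, just an illustration with a
made-up function name.

```c
#include <sys/resource.h>

/* Hypothetical helper: forbid core files for this process and its
   children.  Returns 0 on success, -1 on failure.  */
int disable_core_dumps_sketch(void)
{
    struct rlimit no_core = { 0, 0 };   /* soft limit, hard limit */
    return setrlimit(RLIMIT_CORE, &no_core);
}
```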

Thanks,
Florian



Bug#956418: src:glibc: Please provide optimized builds for ARMv8.1

2020-04-30 Thread Florian Weimer
* Florian Weimer:

> I raised the matter of compiler defaults on the GCC list:
>
>   <https://gcc.gnu.org/pipermail/gcc/2020-April/232261.html>

The link is now: <https://gcc.gnu.org/pipermail/gcc/2020-April/000491.html>



Bug#956418: src:glibc: Please provide optimized builds for ARMv8.1

2020-04-29 Thread Florian Weimer
I raised the matter of compiler defaults on the GCC list:

  

Thanks,
Florian



Bug#956418: src:glibc: Please provide optimized builds for ARMv8.1

2020-04-22 Thread Florian Weimer
* Noah Meyerhans:

> On Sun, Apr 12, 2020 at 12:18:35PM +0200, Aurelien Jarno wrote:
>> > Significant performance impact has also been observed in less contrived
>> > cases (MariaDB and Postgres), but I don't have a repro to share.
>> 
>> But indeed what counts is number on real workloads. It would be nice to
>> get numbers when those software are run against a rebuilt glibc. As
>> those software are using a lot of atomics directly, it would be also
>> interesting to have numbers with those software also rebuilt to use
>> those new instructions.
>
> Agreed.  I don't have specific examples of real world impact at the
> moment.  AIUI, the most significant impact comes in the usage of atomics
> in pthread_mutex_lock().  When there are multiple threads contending for
> a lock, one thread will (approximately) always obtain the lock, while
> the others will starve.  With atomics support in place, the probability
> of obtaining the lock is roughly evenly distributed among all the
> threads.  So any workload in which multiple threads may contend for a
> lock should be a candidate to demonstrate this problem in the real
> world.

Does this behavior affect just one implementation with LSE, or also
implementations without LSE?

If the latter, we might need a different mutex implementation for
AArch64. 8-(



Bug#956418: src:glibc: Please provide optimized builds for ARMv8.1

2020-04-11 Thread Florian Weimer
* Noah Meyerhans:

> On Sat, Apr 11, 2020 at 09:14:11PM +0200, Florian Weimer wrote:
>> > At least if I'm reading the code right (which I may very well not be
>> > doing, being generally unfamiliar with gcc internals), -mtune=generic
>> > enables the equivalent of ARMv8 support:
>> >
>> > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/common/config/aarch64/aarch64-common.c;h=0bddcc8c3e9282a957c5479b4df7f68058093bab;hb=HEAD#l176
>> >
>> > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/aarch64/aarch64-cores.def;h=ea9b98b4b0ad2a578755561bba5b6d5c56115994;hb=HEAD
>> >
>> > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/aarch64/aarch64.h;h=8f08bad3562c4cbe8acdf5891e84f89d23ea6784;hb=HEAD#l226
>> 
>> Hmm.  I don't see anything that sets TARGET_OUTLINE_ATOMICS by
>> default.
>
> Only -moutline-atomics enables that.  Otherwise, unconditional support
> for atomics is enabled by TARGET_LSE, which itself is enabled by a
> number of options, e.g. -marmv8-a+lse, -marmv8.1-a, etc.
>
> See
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/aarch64/aarch64.c;h=4af562a81ea760891fac3cf7101b8bf887fe7a0d;hb=HEAD#l18961

Sorry, I have a feeling that we are discussing different matters.

I believe that ideally, Debian (and Fedora etc.) should follow
upstream GCC defaults.  I don't think we are in this state
(code_for_aarch64_compare_and_swap uses the atomics.md patterns to
call aarch64_split_compare_and_swap, as far as I can see).

Or put differently: If upstream doesn't want to default to
-moutline-atomics, why should Debian?
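
To make the discussion concrete, here is the kind of operation that is
affected.  The sketch is a toy try-lock written with C11 atomics, not
glibc's mutex code, and the names are made up.  On AArch64, a
compare-and-swap like this is compiled to a load-exclusive and
store-exclusive retry loop for plain ARMv8.0, to a single CAS
instruction when LSE is enabled at build time (for example with
-march=armv8.1-a), or to a call to a helper that picks one of the two
at run time when -moutline-atomics is in effect.

```c
#include <stdatomic.h>

/* Toy lock for illustration only: 0 means unlocked, 1 means locked.  */
typedef struct {
    atomic_int state;
} toy_lock_sketch;

/* Returns 1 if the lock was acquired, 0 if it was already held.  */
int toy_trylock_sketch(toy_lock_sketch *lock)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &lock->state, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

void toy_unlock_sketch(toy_lock_sketch *lock)
{
    atomic_store_explicit(&lock->state, 0, memory_order_release);
}
```

The source code is the same in all three cases; only the compiler
options decide which instruction sequence ends up in the binary.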



Bug#956418: src:glibc: Please provide optimized builds for ARMv8.1

2020-04-11 Thread Florian Weimer
* Noah Meyerhans:

> On Sat, Apr 11, 2020 at 08:44:29AM +0200, Florian Weimer wrote:
>> > Gcc provides two ways to enable support for these instructions at build
>> > time.  The simplest, and least disruptive, is to enable -moutline-atomics
>> > globally in the arm64 glibc build.
>> 
>> Shouldn't GCC do this by default, at least for -mtune=generic?
>
> Maybe.  Would you rather pursue that avenue first?

My hope is that GCC upstream defaults reflect current practices for
the architecture.  It doesn't make sense if every distribution ends up
patching in the same GCC defaults which are not upstream.

Sure, there might be bare-metal targets which do not want this, but is
this really the primary audience nowadays?

> At least if I'm reading the code right (which I may very well not be
> doing, being generally unfamiliar with gcc internals), -mtune=generic
> enables the equivalent of ARMv8 support:
>
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/common/config/aarch64/aarch64-common.c;h=0bddcc8c3e9282a957c5479b4df7f68058093bab;hb=HEAD#l176
>
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/aarch64/aarch64-cores.def;h=ea9b98b4b0ad2a578755561bba5b6d5c56115994;hb=HEAD
>
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/aarch64/aarch64.h;h=8f08bad3562c4cbe8acdf5891e84f89d23ea6784;hb=HEAD#l226

Hmm.  I don't see anything that sets TARGET_OUTLINE_ATOMICS by
default.



Bug#956418: src:glibc: Please provide optimized builds for ARMv8.1

2020-04-11 Thread Florian Weimer
* Noah Meyerhans:

> Gcc provides two ways to enable support for these instructions at build
> time.  The simplest, and least disruptive, is to enable -moutline-atomics
> globally in the arm64 glibc build.

Shouldn't GCC do this by default, at least for -mtune=generic?
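
For context, whether a given machine supports the LSE instructions at
all can be queried from user space through the hwcap bits the kernel
exports.  The sketch below is an illustration with a made-up function
name; as far as I understand, the run-time selection done for
-moutline-atomics relies on the same kernel-provided bit, but that is
an assumption and not taken from this thread.

```c
#include <sys/auxv.h>        /* getauxval, AT_HWCAP */
#if defined(__aarch64__)
# include <asm/hwcap.h>      /* HWCAP_ATOMICS (Linux, AArch64 only) */
#endif

/* Hypothetical check: returns 1 if the kernel reports LSE atomics,
   0 otherwise (including on all non-AArch64 builds).  */
int cpu_has_lse_atomics_sketch(void)
{
#if defined(__aarch64__) && defined(HWCAP_ATOMICS)
    return (getauxval(AT_HWCAP) & HWCAP_ATOMICS) != 0;
#else
    return 0;
#endif
}
```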



Bug#954715: glibc: FTBFS: tests failed: signal/tst-minsigstksz-1 signal/tst-minsigstksz-2

2020-03-22 Thread Florian Weimer
* Lucas Nussbaum:

> Source: glibc
> Version: 2.30-2
> Severity: serious
> Justification: FTBFS on amd64
> Tags: bullseye sid ftbfs
> Usertags: ftbfs-20200322 ftbfs-bullseye
>
> Hi,
>
> During a rebuild of all packages in sid, your package failed to build
> on amd64.

>> FAIL: signal/tst-minsigstksz-1
>> FAIL: signal/tst-minsigstksz-2

--
--
FAIL: signal/tst-minsigstksz-1
original exit status 1
Didn't expect signal from child: got `Segmentation fault'
--
--
FAIL: signal/tst-minsigstksz-2
original exit status 1
Incorrect signal from child: got `Segmentation fault', need `Aborted'


The build host has this CPU:

model name  : Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz

This CPU supports AVX-512, and the minimum signal stack size is not
large enough for the amount of data the kernel saves on the stack.
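
Applications that install an alternate signal stack can sidestep the
problem by not sizing it with the minimum constant.  A minimal sketch,
assuming a caller-chosen size such as 64 KiB is acceptable; the
function name and that figure are assumptions for the example, not a
recommendation from this thread.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE          /* for sigaltstack and stack_t */
#endif
#include <signal.h>
#include <stdlib.h>

/* Hypothetical helper: install an alternate signal stack of at least
   SIZE bytes (never less than SIGSTKSZ).  The memory stays allocated
   for the lifetime of the thread.  Returns 0 on success, -1 on error.  */
int install_roomy_sigaltstack_sketch(size_t size)
{
    if (size < (size_t) SIGSTKSZ)
        size = (size_t) SIGSTKSZ;

    void *memory = malloc(size);
    if (memory == NULL)
        return -1;

    stack_t ss;
    ss.ss_sp = memory;
    ss.ss_size = size;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) != 0) {
        free(memory);
        return -1;
    }
    return 0;
}
```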

  



Bug#953083: __glibc_has_include macro needs to be restored until GCC is rebuilt

2020-03-13 Thread Florian Weimer
* Matthias Klose:

> ok, now removing that leads to:
>
> $ cat foo.c
> #include <limits.h>
>
> $ gcc -c foo.c
> In file included from foo.c:1:
> /usr/include/limits.h:124:26: error: no include path in which to search for limits.h
>   124 | # include_next <limits.h>
>       |                          ^
>
> wondering if other distros patch glibc for that ...

Other distributions install limits.h from GCC (in a directory under
/usr/lib/gcc), and that header is picked up first, before
/usr/include/limits.h.



Bug#953083: __glibc_has_include macro needs to be restored until GCC is rebuilt

2020-03-04 Thread Florian Weimer
* Matthias Klose:

> On 3/4/20 9:33 AM, Florian Weimer wrote:
>> * Matthias Klose:
>> 
>>> The __glibc_has_include macro needs to be restored until GCC is rebuilt. At
>>> least on s390x, you get a non-working compiler, which at least cannot build glibc
>>> anymore.  The macro is still referenced in the include-fixed directory.
>>>
>>> Seen with the 2.31 branch, but I see that this is also backported to 2.30.
>> 
>> This is a bug in the gcc package.  It must not run fixincludes, to
>> avoid producing mutually incompatible headers because only a subset of
>> them is rewritten.
>
> Is this something which should be done upstream?  Or just don't include any
> fixed header in the GCC packages?

Distributions should never run fixincludes for this reason.  This is a
hack for installing compilers as non-root on proprietary systems,
where you can't fix the headers.

Other distributions routinely backport compiler compatibility fixes
into glibc (even into stable releases), and I think this is the way it
has to be done.

> Anyway, either glibc or GCC has to be fixed to avoid a non-working compiler.

If I recall correctly, the header is broken anyway because linux is
rewritten into __linux__, due to a fixincludes bug.

It should be possible to hide the header by having a file with an
#include directive with an absolute path in a directory used during
the build.



Bug#767756: glibc: Consider providing a libc build compiled with -fno-omit-frame-pointer to help with profiling

2020-02-16 Thread Florian Weimer
* Aurelien Jarno:

>> I've been running into this myself a lot lately and wonder if
>> anything has happened regarding this since 2014, after all it's
>> been six years.

>> I'm surprised so few people seem to be taking interest in this
>> considering the amount of tools that rely on frame pointers for
>> performant stack traces, which has further increased with the
>> introduction of eBPF.
>
> I understand the need for -fno-omit-frame-pointer, however it has a
> performance impact, so we do not want to do that by default. OTOH
> providing an alternative libc is something tricky if we do not want it
> to do it without breaking systems. Someone has to come with a patch that
> is well tested.

Most unwinders should be able to use asynchronous unwind tables, which
only impact disk size (and the size of VM mappings).
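
As one example of an unwinder that does not depend on frame pointers,
glibc's own backtrace function can be used.  To my understanding it
goes through the unwind tables on most targets, but treat that as an
assumption rather than a statement from this thread.  The function
name in the sketch is made up.

```c
#include <execinfo.h>

/* Hypothetical helper: number of stack frames visible from here, as
   reported by backtrace (at most 64 are counted).  */
int visible_stack_depth_sketch(void)
{
    void *frames[64];
    return backtrace(frames, 64);
}
```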



Bug#951191: Backport /proc-based lchmod/fchmodat emulation

2020-02-12 Thread Florian Weimer
Unfortunately, this change tickles an XFS bug:

  



Bug#951191: Backport /proc-based lchmod/fchmodat emulation

2020-02-12 Thread Florian Weimer
Package: src:glibc
Version: 2.28-10

gnulib has added emulation for lchmod/fchmodat.  Since this is a
run-time test, binaries built against glibc with these patches will
not work correctly on older glibc versions.  (glibc upstream did not
want symbol version markup for this change.)

The backport consists of these patches:

commit 173ec37bb2af6e30892a141d74d42db5957ddd36
Author: Florian Weimer 
Date:   Sun Feb 9 11:50:44 2020 +0100

support: Add the xlstat function

commit f6233ab412c3bebebacf65745e775e01506dd58d
Author: Florian Weimer 
Date:   Sun Feb 9 11:51:08 2020 +0100

Linux: Add io/tst-o_path-locks test

The O_PATH-based fchmodat emulation will rely on the fact that closing
an O_PATH descriptor never releases POSIX advisory locks, so this
commit adds a test case for this behavior.

commit 6b89c385d8bd0700b25bac2c2d0bebe68d5cc05d
Author: Florian Weimer 
Date:   Wed Jan 22 18:56:04 2020 +0100

io: Implement lchmod using fchmodat [BZ #14578]

commit 752dd17443e55a4535cb9e6baa4e550ede383540
Author: Florian Weimer 
Date:   Wed Jan 22 19:01:20 2020 +0100

Linux: Emulate fchmodat with AT_SYMLINK_NOFOLLOW using O_PATH [BZ #14578]

/proc/self/fd files are special and chmod on O_PATH descriptors
in that directory operates on the symbolic link itself (like lchmod).

commit 47136d6cc38c425b150dda83989303ac55f6443c
Author: Florian Weimer 
Date:   Tue Feb 11 16:22:19 2020 +0100

io: Add io/tst-lchmod covering lchmod and fchmodat
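
The approach described in the commit messages above can be sketched as
follows.  This is a simplified illustration, not the upstream code: the
function name is made up, it only handles absolute or
current-directory-relative paths, and it assumes /proc is mounted.  The
symbolic link check reflects my understanding that the emulation
refuses to change the mode of a link itself.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE              /* for O_PATH */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: change the mode of PATH without following a
   final symbolic link.  Returns 0 on success, -1 with errno set on
   failure (EOPNOTSUPP if PATH is a symbolic link).  */
int chmod_nofollow_sketch(const char *path, mode_t mode)
{
    /* O_PATH gives a handle on the file itself; with O_NOFOLLOW a
       symbolic link is not resolved.  */
    int fd = open(path, O_PATH | O_NOFOLLOW | O_CLOEXEC);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    if (S_ISLNK(st.st_mode)) {
        close(fd);
        errno = EOPNOTSUPP;
        return -1;
    }

    /* chmod through the /proc entry acts on the file the descriptor
       refers to, not on whatever PATH names by now.  */
    char proc_path[64];
    snprintf(proc_path, sizeof proc_path, "/proc/self/fd/%d", fd);
    int result = chmod(proc_path, mode);

    int saved = errno;
    close(fd);    /* Closing an O_PATH descriptor does not drop POSIX locks.  */
    errno = saved;
    return result;
}
```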



Bug#948396: New glibc broke existing app due to historic stack alignment

2020-01-14 Thread Florian Weimer
* Petr Vandrovec:

> Florian Weimer wrote on 1/7/2020 9:31 PM:
>> * Petr Vandrovec:
>> 
>>> As far as I can tell, while x86-64 ABI requires stack to be aligned
>>> on entry to the functions, x86 ABI does not have any such
>>> requirement, and so glibc should align stack itself if it wants to
>>> use XMM instructions that require aligned values.
>> 
>> The i386 ABI was changed after its initial release to require
>> additional stack alignment.
>
> That's a problem.
>
>> If you want to build glibc for i386 with SSE2 enabled (for example,
>> with -march=x86-64), you need to build it with -mrealignstack as well.
>> I'm not aware of any remaining issues with this combination.
>
> I do not want to rebuild anything, I just want things to work :-( 

This comment was directed at the glibc maintainers, sorry.  (The
option is actually called -mstackrealign.)

> libc6:i386 is built without SSE2 support, and takes precedence over 
> libc6-i386, so that is how I've "solved" problem.
>
> Should libc6-i386 be named (after obsolete/virtual) libc6-i686 if it 
> requires SSE2 and the new stack alignment?

I think the proper fix would be to build glibc with -mstackrealign.
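
For code that cannot be rebuilt wholesale with -mstackrealign, GCC also
offers a per-function attribute, force_align_arg_pointer, that realigns
the stack in the prologue of a single function.  A minimal sketch,
assuming an x86 target; the macro and function names are made up for
illustration:

```c
#include <stdint.h>

/* The attribute only exists on x86 targets, so hide it elsewhere.  */
#if defined (__i386__) || defined (__x86_64__)
# define REALIGN_STACK __attribute__ ((force_align_arg_pointer))
#else
# define REALIGN_STACK
#endif

/* Hypothetical entry point that legacy callers may enter with a stack
   that is only 4-byte aligned.  The attribute makes the compiler emit
   a prologue that realigns the stack before locals are laid out.  */
REALIGN_STACK int
entry_with_aligned_locals (void)
{
  /* A local that needs 16-byte alignment, as SSE spill slots do.  */
  _Alignas (16) volatile unsigned char scratch[16] = { 0 };

  /* Report whether the local really ended up 16-byte aligned.  */
  return ((uintptr_t) scratch % 16) == 0;
}
```

The attribute only helps for functions that are annotated; building the
whole library with -mstackrealign covers every entry point.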



Bug#948396: New glibc broke existing app due to historic stack alignment

2020-01-07 Thread Florian Weimer
* Petr Vandrovec:

> As far as I can tell, while x86-64 ABI requires stack to be aligned
> on entry to the functions, x86 ABI does not have any such
> requirement, and so glibc should align stack itself if it wants to
> use XMM instructions that require aligned values.

The i386 ABI was changed after its initial release to require
additional stack alignment.

If you want to build glibc for i386 with SSE2 enabled (for example,
with -march=x86-64), you need to build it with -mrealignstack as well.
I'm not aware of any remaining issues with this combination.



Bug#941277: dispatch function missing in header file generated for RPC service

2019-09-27 Thread Florian Weimer
* Simon Richter:

> while implementing an RPC service (in 2019, no less!) I found out that the
> dispatch function generated by rpcgen is not listed in the generated header
> file, so if the service is generated without a main function or inetd
> interface, the code using it needs to create its own declaration.
>
> The signature is easy to guess, but nonetheless I think it should be
> provided by the header.

Ugh, can you describe exactly what is missing?  Then I can file it
here (or just submit a patch):

  

Thanks.

(I'm not sure if we are going to patch glibc's rpcgen for this; nobody
is supposed to use it these days.)



Bug#874160: Fedora has C.UTF-8

2019-09-27 Thread Florian Weimer
* Adam Borowski:

> Looks like Fedora has C.UTF-8 now, and even backported this change to their
> stable releases:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=902094
>
> They're not upstream, but a good part of distros that are not downstream
> from Debian are downstream from Fedora.  This availability makes defaulting
> to C.UTF-8 that more viable.

Please note that Fedora's C.UTF-8 is distinct from Debian's,
unfortunately.  We really need to get this upstream for consolidation.



Bug#934752: libc6: SEGFAULTs caused by tcache after upgrade to Buster

2019-08-27 Thread Florian Weimer
* Pavel Matěja:

> Sorry for late answer.
>
> On 17. 08. 19 22:18, Florian Weimer wrote:
>> * Pavel Matěja:
>>
>>> The strange means they appear only on 2 servers out of 6.
>>> Servers with Xeon E5606 and Pentium G6950 were running fine while Xeon
>>> E3-1220 v6 produced crashes.
>>> It did not matter if the host Debian was Stretch or Buster.
>> Do you see crashes on stretch as well?  What does the backtrace look
>> like there?

> I never saw the SEGFAULT when we had a Stretch-based chroot.
>
> I had just one SEGFAULT on Stretch host but I didn't collect coredumps
> back then.
> Unfortunately the server is already running Buster.
> Since the bug is caused by new libc in chroot I should be able to
> install just kernel from Stretch and wait for the SEGFAULT, right?
> I think the backtrace will be the same anyway.

If I recall correctly, stretch doesn't have the tcache code.  If the
crash happened there as well, it's something else.

>>> SSLv3 and TLS code path looked quite distinct to cause the same problem.
>>> Based on info that SEGFAULTs are related to memory allocation in new
>>> libc and CPU performance I found
>>> http://51.15.138.76/patch/17499/
>>> where Wilco Dijkstra discusses some problems with tcache which "leads to
>>> various crashes in benchtests"
>> I was under the impression that this problem only occurs if one of the
>> tunables has an out-of-bounds value.  Do you set any tunables?

> No, I didn't even know they existed.
> I have not read the libc sources yet, so I don't know what the
> patch actually fixes, nor whether it helps with my problem.

Then the patch will not help to fix the crash.

(By the way, even if the crash goes away if you use a tunable to disable
the thread cache, it could still be timing-related.  It's definitely
possible that the faster malloc/free implementation exposes pre-existing
data races.)

Thanks,
Florian



Bug#924712: crypt() not available _XOPEN_SOURCE is defined

2019-08-25 Thread Florian Weimer
* Francesco Poli:

> Hello everyone,
> I am sorry to ask, but... I cannot understand what's the status of
> [this bug report].
>
> [this bug report]: 
>
> A serious bug for libc6-dev without any apparent activity since last
> March?  Sure there must have been some hidden progress that I cannot
> see.

We provided a solution acceptable to the reporter.  I do not think
further action is needed on the glibc side.  The manual page needs to
be updated to reflect the change, but that's not part of glibc.



Bug#934752: libc6: SEGFAULTs caused by tcache after upgrade to Buster

2019-08-17 Thread Florian Weimer
* Pavel Matěja:

> The strange means they appear only on 2 servers out of 6.
> Servers with Xeon E5606 and Pentium G6950 were running fine while Xeon 
> E3-1220 v6 produced crashes.
> It did not matter if the host Debian was Stretch or Buster.

Do you see crashes on stretch as well?  What does the backtrace look
like there?

> SSLv3 and TLS code path looked quite distinct to cause the same problem.
> Based on info that SEGFAULTs are related to memory allocation in new 
> libc and CPU performance I found
> http://51.15.138.76/patch/17499/
> where Wilco Dijkstra discusses some problems with tcache which "leads to 
> various crashes in benchtests"

I was under the impression that this problem only occurs if one of the
tunables has an out-of-bounds value.  Do you set any tunables?



Bug#934080: [libc6] Significant degradation in the memory effectivity of the memory allocator

2019-08-14 Thread Florian Weimer
* Roman Savochenko:

>> Is there a way to reproduce your results easily?  Upstream, we're
>> looking for workloads which are difficult to handle for glibc's malloc
>> and its default settings, so that we hopefully can improve things
>> eventually.
>
> Providing ready builds of the application and LiveDisks is simpler
> for me than writing a test application that simulates this sort of
> complex load, so you can already install the application, start it,
> and observe.

I meant: Is there a reproduction recipe someone could use, without being
familiar with the application?

Thanks,
Florian



Bug#934080: [libc6] Significant degradation in the memory effectivity of the memory allocator

2019-08-09 Thread Florian Weimer
* Roman Savochenko:

> Thank you, Florian.  With MALLOC_ARENA_MAX=1 set in the environment,
> memory efficiency is now even somewhat better than in Debian 7!

Is there a way to reproduce your results easily?  Upstream, we're
looking for workloads which are difficult to handle for glibc's malloc
and its default settings, so that we hopefully can improve things
eventually.

Thanks,
Florian



Bug#934080: [libc6] Significant degradation in the memory effectivity of the memory allocator

2019-08-07 Thread Florian Weimer
* Roman Savochenko:

> The starting point for demonstrating the problem is a program built
> from a single source tree, on and for Debian 7, 8, 9 and 10, with the
> results available as Live disks.

I think glibc 2.13 as shipped by Debian was not built with
--enable-experimental-malloc, so it doesn't use arenas.  This can
substantially decrease RSS usage compared to later versions.  You can
get similar behavior by setting the MALLOC_ARENA_MAX environment
variable to 1 or 2.
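
The same limit can be requested from inside a program with mallopt and
the M_ARENA_MAX parameter from <malloc.h>.  A minimal sketch; the
wrapper name is made up, and the call is assumed to run early, before
other threads have created arenas of their own:

```c
#include <malloc.h>

/* Hypothetical wrapper: cap the number of malloc arenas, which is what
   the MALLOC_ARENA_MAX environment variable does from the outside.
   Returns 1 on success and 0 on failure, like mallopt itself.  */
int
limit_malloc_arenas (int max_arenas)
{
  /* Reject values that make no sense as an upper bound.  */
  if (max_arenas < 1)
    return 0;
  return mallopt (M_ARENA_MAX, max_arenas);
}
```

Calling limit_malloc_arenas (1) at the start of main approximates
running the program with MALLOC_ARENA_MAX=1.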

Debian 10 also adds a thread cache, which further increases RSS size.
See the manual

  


for details on how to change thread cache behavior.

Thanks,
Florian



Re: Options for 64-bit time_t support on 32-bit architectures

2019-07-21 Thread Florian Weimer
* Simon McVittie:

> On Fri, 19 Jul 2019 at 15:13:00 +0300, Adrian Bunk wrote:
>> Remaining usecases of i386 will be old binaries, some old Linux binaries 
>> but especially old software (including many games) running in Wine.
>> Old Linux binaries will still need the old 32bit time_t.
>
> Based on background from my contributions to the Steam Runtime:
>
> I don't have numbers, but you might be surprised how many Linux-supporting
> games are 32-bit. The Steam client itself is currently also 32-bit
> (with some 64-bit subprocesses); this is somewhat deliberate, to act as
> a canary for whether 32-bit code works at all, particularly when combined
> with graphics.
>
> The Steam Runtime (a LD_LIBRARY_PATH library bundle used to run Steam and
> Steam games) is built on an increasingly ancient version of Ubuntu, but
> it tries to use newer libraries of the same SONAME from the host system
> where available, which they often will be, because people who install
> Steam probably also install Wine, which has 32-bit dependencies. If those
> libraries have an incompatible ABI involving 64-bit time_t, and it is used
> at the ABI "surface" between a host-system library and a Steam Runtime
> library or the game, then 32-bit games, and the Steam client itself,
> will crash.

We could in theory bump soname for these libraries, but that has the
unfortunate side effect that it will likely leak to 64-bit
architectures, creating more work for everyone.

I don't see a good way to maintain those libraries with a single-ABI
approach.  So if that's an important use case, it would be a fairly
strong case against it, I think.



Re: Options for 64-bit time_t support on 32-bit architectures

2019-07-19 Thread Florian Weimer
* Adrian Bunk:

> On Fri, Jul 19, 2019 at 07:13:28PM +0200, Florian Weimer wrote:
>> * Adrian Bunk:
>>...
>> For comparison, the original plan was to provide a macro, perhaps
>> -D_TIME_BITS=32 and -D_TIME_BITS=64, to select at build time which ABI
>> set is used (“dual ABI”).
>
> To me this would sound like more trouble than a clear break,
> similar to the mostly working dual OpenSSL 1.0 and 1.1 support
> in stretch.

Could be.  But it would enable keeping i386 at the old ABI while still
building the distribution with newer glibc versions with current
kernel headers (the libc-alpha discussion is evolving regarding the
precise nature of the enablement approach).

Other 32-bit architectures could opt to do the transition now.

>> Similar to the LFS support, with the
>> additional property that binaries built in either mode should continue
>> to work on kernels which predate support for the *_time64 system
>> calls.
>
> Debian does not support running on kernels older than the one in the
> previous stable release.
>
> E.g. Qt in Debian 9 unconditionally uses the getrandom syscall that is 
> not in kernel 3.16 in Debian 7.

The 64-bit system calls arrived in Linux 5.1, so I think the fallback
will be needed for quite some time.



Re: Options for 64-bit time_t support on 32-bit architectures

2019-07-19 Thread Florian Weimer
* Adrian Bunk:

> [ only speaking for myself ]
>
> On Thu, Jul 18, 2019 at 11:05:53PM +0200, Florian Weimer wrote:
>>...
>> The consequence is that in order to build 32-bit-time_t libraries
>> (Gtk, for example), an old glibc needs to be kept around.  In
>> practice, it would probably mean that it is impossible to maintain a
>> set of 32-bit-time_t libraries in a classic distribution build
>> environment (with a unified buildroot and native builds).
>>...
>> Do you want to build 32-bit libraries (besides glibc) which are
>> compatible with legacy applications, with a 32-bit time_t, in the
>> future?  Or is a world where time_t is pretty much always 64 bit
>> something that would be acceptable?
>
> So this is an ABI-incompatible change that would result in new Debian 
> architectures, similar to arm (OABI), armel (EABI softfp) and armhf 
> (EABI hardfp) being different Debian architectures for 32bit little 
> endian ARM?

Not quite.  glibc would still be able to run binaries from the old ABI
and the new ABI.  But under the proposal, you would have to use an old
glibc (missing new system call wrappers etc.) if you want to build
libraries that provide interfaces involving 32-bit time_t.

So in practice, it would likely mean a new Debian architecture, or a
de-facto ABI bump for i386 and armhfp.

For comparison, the original plan was to provide a macro, perhaps
-D_TIME_BITS=32 and -D_TIME_BITS=64, to select at build time which ABI
set is used (“dual ABI”).  Similar to the LFS support, with the
additional property that binaries built in either mode should continue
to work on kernels which predate support for the *_time64 system
calls.  They should also use the vDSO as before.  All these
requirements make an implementation quite hairy, hence the desire for
simplification.
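
A dual-ABI selection macro of this kind would be used like the LFS one:
defined before any header is included.  The sketch below assumes a
glibc that honors _TIME_BITS (which also requires _FILE_OFFSET_BITS=64
on 32-bit targets); on 64-bit targets the macros change nothing.  The
function names are made up for illustration.

```c
/* Both macros must come before the first #include.  */
#define _TIME_BITS 64
#define _FILE_OFFSET_BITS 64
#include <stdint.h>
#include <time.h>

/* Returns 1 if time_t can hold the first second that no longer fits
   into a signed 32-bit counter, 0 otherwise.  */
int
time_t_survives_2038 (void)
{
  int64_t first_bad = (int64_t) INT32_MAX + 1;
  time_t t = (time_t) first_bad;
  return sizeof (time_t) >= 8 && (int64_t) t == first_bad;
}

/* Returns the UTC year of the last second a signed 32-bit time_t can
   represent, or -1 if the conversion fails.  */
int
last_32bit_year (void)
{
  time_t t = (time_t) INT32_MAX;
  struct tm tm;
  if (gmtime_r (&t, &tm) == NULL)
    return -1;
  return tm.tm_year + 1900;
}
```

A library that exposes time_t in its own interfaces changes its ABI
when built this way, which is the compatibility concern in this thread.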

> There are two current release architectures where it is at least 
> imaginable that they will still be around closer to the year 2038:
> i386 and armhf

Right.

> For i386 the last newly released 32bit-only hardware were some early
> Intel Atoms 10 years ago, and when the AMD Geode goes out of production
> soon there might be no hardware in production left.
> There are still surprisingly many people using Debian on 32bit-only
> hardware, but in 20 years this will have changed.

You have thankfully edited out the Intel Quark. 8-)

> Remaining usecases of i386 will be old binaries, some old Linux binaries 
> but especially old software (including many games) running in Wine.
> Old Linux binaries will still need the old 32bit time_t.
> Which options are viable from a Wine point of view?

I talked to a Wine developer in the office, and Wine doesn't directly
expose the time_t ABI to Windows binaries (which isn't surprising).
It's also already been ported to 32-bit systems with a 64-bit time_t.
I expect that this is not a determining factor.

> For armhf new hardware might be available long enough to come close
> to the year 2038, this might require a new architecture at some point.

The push for a 64-bit time_t definitely comes from the embedded 32-bit
processor direction.

For glibc, providing a dual ABI configurable at build time for one or
all 32-bit architectures probably does not make much of a difference in
terms of overall effort.  This means that if we need to produce the
dual ABI for i386, armhfp will likely get it as well.



Options for 64-bit time_t support on 32-bit architectures

2019-07-18 Thread Florian Weimer
There is an effort under way to enhance glibc so that it can use the
Y2038 support in the kernel.  The result will be that more 32-bit
architectures can use a 64-bit time_t.  (Currently, it's x86-64 x32
only.)

Originally, the plan was to support both ABIs in glibc for building
new applications, similar to what is currently possible with
-D_FILE_OFFSET_BITS=64 for changing the size of off_t.  However, this
turned out to be difficult to implement, and so far, no one has posted
patches which appear to be reasonably correct and complete.

The latest proposal is a single-ABI mode for development:

  

Old binaries with a 32-bit time_t will continue to run, but new
binaries built against a current glibc will always use a 64-bit time_t
under this approach.

The consequence is that in order to build 32-bit-time_t libraries
(Gtk, for example), an old glibc needs to be kept around.  In
practice, it would probably mean that it is impossible to maintain a
set of 32-bit-time_t libraries in a classic distribution build
environment (with a unified buildroot and native builds).

I do not have a strong opinion about this because I personally do not
care about 32-bit architectures at all (sorry).  I would like to
solicit Debian's feedback on this matter.

Do you want to build 32-bit libraries (besides glibc) which are
compatible with legacy applications, with a 32-bit time_t, in the
future?  Or is a world where time_t is pretty much always 64 bit
something that would be acceptable?



Bug#924891: glibc: FTBFS: /<>/build-tree/amd64-libc/conform/UNIX98/ndbm.h/scratch/ndbm.h-test.c:1:10: fatal error: ndbm.h: No such file or directory

2019-03-27 Thread Florian Weimer
retitle 924891 glibc: misc/tst-pkey fails due to cleared PKRU register after signal in amd64 32-bit compat mode
thanks

* Lucas Nussbaum:

> On 27/03/19 at 08:48 +0100, Florian Weimer wrote:
>> > If that's useful, I can easily provide access to an AWS VM to debug this
>> > issue.
>> 
>> Oh, that would be quite helpful indeed.
>
> Can you send your SSH key? (I thought there was a way to get the SSH key
> for a DD, but I cannot find it anymore)
>
> Then you will be able to ssh to root@18.184.55.40.
> There's sbuild and schroot setup on the VM.
>
> When you are done, please 'poweroff' the machine, which will terminate
> it.

The issue reproduces outside the chroot, with the stretch userland.

What happens is that once we get out of the SIGUSR1 signal handler,
the PKRU register has value zero.  This happens around this code in
the test:

  /* Check that in a signal handler, there is no access.  */
  xsignal (SIGUSR1, &sigusr1_handler);
  xraise (SIGUSR1);
  xsignal (SIGUSR1, SIG_DFL);
  TEST_COMPARE (sigusr1_handler_ran, 1);

I checked the following (via a breakpoint in pkey_get; I don't think
GDB can read the PKRU register directly): Inside the SIGUSR1 signal
handler, PKRU has value 0x5554, as expected for this kernel, but
after the return, we get zero.  This is the first time a signal is
delivered on the main thread, so it's consistent with fairly broken
signal handling as far as the PKRU register is concerned.  I guess
clearing PKRU in this way might even constitute a minor security bug
(because the zero value means no restrictions).
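
To read such register values: PKRU holds two bits per protection key,
an access-disable bit in the lower position and a write-disable bit
above it.  The decoder below is a small sketch with made-up names; it
only interprets a value and does not touch the register.

```c
#include <stdint.h>

/* Illustrative names for the two per-key bits in PKRU.  */
enum
{
  PKRU_ACCESS_DISABLE = 1,
  PKRU_WRITE_DISABLE = 2
};

/* Returns the two restriction bits that PKRU holds for KEY (0 to 15).
   Zero means the key imposes no restrictions.  */
unsigned int
pkru_rights_for_key (uint32_t pkru, unsigned int key)
{
  if (key > 15)
    return 0;
  return (pkru >> (2 * key)) & 3u;
}
```

With the value 0x5554 quoted above, key 0 is unrestricted and key 1 has
access disabled; with the value zero, no key is restricted at all.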

This commit looks highly relevant:

commit a4455082dc6f0b5d51a23523f77600e8ede47c79
Author: Dave Hansen 
Date:   Wed Jun 8 10:25:33 2016 -0700

x86/signals: Add missing signal_compat code for x86 features

The 32-bit siginfo is a different binary format than the 64-bit
one.  So, when running 32-bit binaries on 64-bit kernels, we have
to convert the kernel's 64-bit version to a 32-bit version that
userspace can grok.

If the siginfo_t layout is incorrect (with regards to what the
hardware writes), I expect that we might end up copying back the wrong
PKRU value.

I'm not sure what to do here.  This really looks like a kernel bug.
Maybe we should just verify that this is fixed in the buster kernel
and move on?

Lucas, can you run your rebuild tests on newer kernels?



Bug#924891: glibc: FTBFS: /<>/build-tree/amd64-libc/conform/UNIX98/ndbm.h/scratch/ndbm.h-test.c:1:10: fatal error: ndbm.h: No such file or directory

2019-03-27 Thread Florian Weimer
* Lucas Nussbaum:

> On 26/03/19 at 23:10 +0100, Aurelien Jarno wrote:
>> On 2019-03-22 17:30, Florian Weimer wrote:
>> > > About the archive rebuild: The rebuild was done on EC2 VM instances from
>> > > Amazon Web Services, using a clean, minimal and up-to-date chroot. Every
>> > > failed build was retried once to eliminate random failures.
>> > 
>> > I believe the actual test failure is tst-pkey.
>> > 
>> > Presumably, this rebuild was performed on some Xeon SP CPU.  Do you
>> > know which model?  Do you have any information about the kernel and
>> > hypervisor used?
>> > 
>> > 32-bit protection key support has had issues from time to time.
>> 
>> Do you have some more details about the issue? Is it a glibc or a kernel
>> problem?
>> 
>> If we can't fix the issue easily on the libc side, I guess the way to
>> fix that is to XFAIL that test on 32-bit x86. 
>
> If that's useful, I can easily provide access to an AWS VM to debug this
> issue.

Oh, that would be quite helpful indeed.



Re: Glibc 2.28 breaks collation for PostgreSQL (and others?)

2019-03-25 Thread Florian Weimer
* Christoph Berg:

> with the update to glibc 2.28, collation aka sort ordering is
> changing:
>
> $ echo $LANG
> de_DE.utf8
> $ (echo 'a-a'; echo 'a a'; echo 'a+a'; echo 'aa') | sort
>
> stretch:
>   aa
>   a a
>   a-a
>   a+a
>
> buster:
>   a a
>   a+a
>   a-a
>   aa
>
> A vast number of locales is affected, including en_US, possibly all of
> them.
>
> For PostgreSQL, this means that the ordering of indexes on disk is
> becoming corrupt, and all "text" (varchar, char, ...) indexes need to
> be rebuilt. (And worse, if that is not done immediately, the tables
> might become corrupt because some tuples aren't index-visible anymore
> due to the incorrect btree ordering.)

That's fairly normal in a glibc update.  glibc upstream prefers it
this way.  I've discussed it several times with other glibc
maintainers.

My understanding is that ICU provides versioned collation tables,
which would allow you to avoid this issue.

  

> I've been thinking about this for some time, and the best I could come
> up with so far is "raise a debconf note that people need to invoke REINDEX
> DATABASE". The open question about this plan is, how should this note
> be triggered.

That might not work for unique indices because locale data changes
could cause strings to sort the same that were distinct before the
update.
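
The dependency on locale data is easy to see from C: anything ordered
with strcoll inherits whatever the installed locale definition says.
A minimal sketch with made-up function names, which sorts an array
under a named locale:

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort callback: compare two strings under the current LC_COLLATE.  */
static int
collate_cmp (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcoll (*sa, *sb);
}

/* Hypothetical helper: sort STRINGS in place using the collation rules
   of LOCALE_NAME.  Returns 0 on success, -1 if the locale is not
   available on this system.  */
int
sort_with_locale (const char **strings, size_t count,
                  const char *locale_name)
{
  if (setlocale (LC_COLLATE, locale_name) == NULL)
    return -1;
  qsort (strings, count, sizeof *strings, collate_cmp);
  return 0;
}
```

An on-disk index built from such comparisons is only valid for as long
as strcoll keeps returning the same results, including which strings
compare as equal.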



Bug#924891: glibc: FTBFS: /<>/build-tree/amd64-libc/conform/UNIX98/ndbm.h/scratch/ndbm.h-test.c:1:10: fatal error: ndbm.h: No such file or directory

2019-03-22 Thread Florian Weimer
> About the archive rebuild: The rebuild was done on EC2 VM instances from
> Amazon Web Services, using a clean, minimal and up-to-date chroot. Every
> failed build was retried once to eliminate random failures.

I believe the actual test failure is tst-pkey.

Presumably, this rebuild was performed on some Xeon SP CPU.  Do you
know which model?  Do you have any information about the kernel and
hypervisor used?

32-bit protection key support has had issues from time to time.

Thanks.



Bug#924712: crypt() not available _XOPEN_SOURCE is defined

2019-03-21 Thread Florian Weimer
* Laurent Bigonville:

> On 19/03/19 at 19:43, Florian Weimer wrote:
>> * Laurent Bigonville:
>>
>>> Package: libc6-dev
>>> Version: 2.28-8
>>> Severity: serious
>>>
>>> Hi,
>>>
>>> The crypt.3 manpage states that _XOPEN_SOURCE should be defined for
>>> crypt() to be available.
>>>
>>> But it looks like it's currently the opposite: if _XOPEN_SOURCE is
>>> defined, the function cannot be found.

>> Can you compile the software using _DEFAULT_SOURCE (well, the default)
>> or _GNU_SOURCE instead?
>
> Yes, the software can be compiled when _XOPEN_SOURCE is not defined or 
> when _GNU_SOURCE is defined instead

Sorry, what I was trying to ask is whether this would be an acceptable
change for you.



Bug#924712: crypt() not available _XOPEN_SOURCE is defined

2019-03-19 Thread Florian Weimer
* Laurent Bigonville:

> Package: libc6-dev
> Version: 2.28-8
> Severity: serious
>
> Hi,
>
> The crypt.3 manpage states that _XOPEN_SOURCE should be defined for
> crypt() to be available.
>
> But it looks like it's currently the opposite: if _XOPEN_SOURCE is
> defined, the function cannot be found.

Can you compile the software using _DEFAULT_SOURCE (well, the default)
or _GNU_SOURCE instead?

We do not want to provide the CRYPT extension anymore because it
implies not just support for the crypt function, but also for the DES
encryption functions, which we definitely do not want.  In _XOPEN_SOURCE
mode, it's either all of these functions or none of them (and we chose
the latter because of DES), otherwise glibc wouldn't conform to the
interface specification.

We definitely should update the manual page, though.



Bug#923802: Acknowledgement (pthread: dead-lock while pthread_cond_destroy())

2019-03-06 Thread Florian Weimer
* Joël Krähemann:

> This happens as you call pthread_cond_destroy() twice on the very same
> cond variable.

Surely that's an application bug.  Why do you think this is a glibc
issue?
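
One way for an application to avoid the double destroy is to track the
lifetime of the condition variable itself.  The wrapper below is a
sketch with made-up names; the flag is not synchronized, so it only
helps when init and destroy calls are already serialized by the caller.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical wrapper that remembers whether the condition variable
   is currently initialized.  */
struct tracked_cond
{
  pthread_cond_t cond;
  bool live;
};

/* Initialize with default attributes.  Returns 0 or an error number.  */
int
tracked_cond_init (struct tracked_cond *tc)
{
  int err = pthread_cond_init (&tc->cond, NULL);
  tc->live = (err == 0);
  return err;
}

/* Destroy at most once.  A second call returns EINVAL instead of
   passing an already destroyed object to pthread_cond_destroy.  */
int
tracked_cond_destroy (struct tracked_cond *tc)
{
  if (!tc->live)
    return EINVAL;
  tc->live = false;
  return pthread_cond_destroy (&tc->cond);
}
```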

Thanks,
Florian



Bug#920047: glibc: CVE-2016-10739: getaddrinfo should reject IP addresses with trailing characters

2019-01-21 Thread Florian Weimer
* Salvatore Bonaccorso:

> CVE-2016-10739[0]:
> | In the GNU C Library (aka glibc or libc6) through 2.28, the getaddrinfo
> | function would successfully parse a string that contained an IPv4
> | address followed by whitespace and arbitrary characters, which could
> | lead applications to incorrectly assume that it had parsed a valid
> | string, without the possibility of embedded HTTP headers or other
> | potentially dangerous substrings.
>
> If you fix the vulnerability please also make sure to include the
> CVE (Common Vulnerabilities & Exposures) id in your changelog entry.
>
> For further information see:
>
> [0] https://security-tracker.debian.org/tracker/CVE-2016-10739
> https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-10739
> [1] https://sourceware.org/bugzilla/show_bug.cgi?id=20018
>
> Please adjust the affected versions in the BTS as needed.

Would it help if I put a backport on the 2.24 upstream branch?
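
Independently of the library fix, applications that need to know
whether a string is exactly an IPv4 address can check it with
inet_pton, which does not accept trailing characters.  A minimal
sketch; the function name is made up:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Returns 1 if TEXT is a dotted-quad IPv4 address and nothing else,
   0 otherwise.  */
int
is_strict_ipv4 (const char *text)
{
  struct in_addr addr;
  return inet_pton (AF_INET, text, &addr) == 1;
}
```

Strings such as "192.0.2.1 extra" are rejected by this check.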



Bug#906917: sem_timedwait could always block and returns ETIMEOUT but decrements the value on i686 architecture

2019-01-16 Thread Florian Weimer
* Андрей Доценко:

> The problem occurs only when using semaphores in a library that is not
> linked against pthread.

Yes, that's expected.  Sorry I didn't see this earlier—we have an
upstream bug about this:

  

In general, underlinking produces broken binaries.

Thanks,
Florian



Bug#761300: libc6: putchar does not follow stdio

2018-12-28 Thread Florian Weimer
* Sven Joachim:

> On 14.10.2018 at 13:38, Florian Weimer wrote:
>
>> * Sven Joachim:
>>
>>> This result is rather surprising.  After all, "putchar('x')" is supposed
>>> to do the same as "putc('x', stdout)", but here it does not.
>>
>> Can you reproduce this with something newer than 2.13-38+rpi2+deb7u3?
>> Or on something else besides armhf?
>
> Surely, I tested 2.27-6 on amd64.

Eh, right.  We have this in glibc:

int
putchar (int c)
{
  int result;
  _IO_acquire_lock (_IO_stdout);
  result = _IO_putc_unlocked (c, _IO_stdout);
  _IO_release_lock (_IO_stdout);
  return result;
}

_IO_stdout is the variable to which stdout points by default, so this
code was simply not adjusted when the assignment-to-stdout extension
was implemented.  This should use stdout instead.  The puts function
has the same problem.

It looks to me we can get away with fixing this bug.
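
The extension in question can be exercised with a small test.  The
sketch below assumes glibc, where stdout is an assignable variable, and
uses a made-up function name.  It goes through putc (c, stdout), which
follows the reassigned pointer; the report is that putchar did not.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>

/* Temporarily point stdout at an in-memory stream, write one character
   with putc, and return how many bytes arrived in that stream.  */
size_t
capture_putc (int c)
{
  char *buf = NULL;
  size_t len = 0;
  FILE *mem = open_memstream (&buf, &len);
  if (mem == NULL)
    return 0;

  FILE *saved = stdout;
  stdout = mem;               /* glibc extension: assignment to stdout */
  putc (c, stdout);
  stdout = saved;

  fclose (mem);               /* Finalizes buf and len.  */
  free (buf);
  return len;
}
```

Replacing the putc call with putchar (c) turns this into a check for
the bug described above.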



Bug#912665: (no subject)

2018-11-02 Thread Florian Weimer
* Frederik Himpe:

> FYI, this is where it crashes:
>
> #0  0x7fc0172239c6 in _IO_fgets (buf=0x7ffc9b2b1640 "
> /dev/aivmhost3-vg/ceph-node1-storage:ceph-ff35163d-b03f-4dbf-a6ce-155730069dc0:4194304:-1:8:8:-1:4096:511:0:511:ZNexTV-ylVb-ltMP-T8Zu-ZklU-cN1I-dETf5o\n",
> n=1024, fp=0x1a90030)
> at iofgets.c:47

This is the first dereference of the file stream pointer in fgets, so
this suggests a use-after-free application bug (or use-after-fclose in
this case).  It should be visible in valgrind as well.

Thanks,
Florian



Bug#761300: libc6: Printf("%c",'x') does not follow stdio

2018-10-14 Thread Florian Weimer
* Sven Joachim:

> This result is rather surprising.  After all, "putchar('x')" is supposed
> to do the same as "putc('x', stdout)", but here it does not.

Can you reproduce this with something newer than 2.13-38+rpi2+deb7u3?
Or on something else besides armhf?



Bug#785651: glibc: test run times out on ci.debian.net; maybe don't force a build every time

2018-07-30 Thread Florian Weimer
* Paul Gevers:

> Hi Florian,
>
> On 29-07-18 13:26, Florian Weimer wrote:
>> I'm not sure why it is necessary to build glibc three times (unless
>> it's impossible to get multi-arch packages into the buildroot).
>
> I am not sure if I understand what you mean, but currently having
> multiple arches available in the autopkgtest testbed isn't supported. I
> have seen packages try (gnupg2), but this goes easily wrong considering
> the unstable-to-testing migration setup. If there is a real need for
> this, it should come from autopkgtest.

Sorry, I never worked on the Debian toolchain, so my phrasing was
poor.

In concrete terms, what I meant was: Why build libc6-i386 on amd64
when there is a libc6:i386 package as well?

In Fedora, there's a restriction that buildds cannot install foreign
architecture packages.  Some packages need a 32-bit glibc on a 64-bit
builder, too.  (Typical gcc flags, for example, or fake amd64 packages
such as amd64).  That made me wonder if Debian has a similar
restriction for its buildds.



Bug#785651: glibc: test run times out on ci.debian.net; maybe don't force a build every time

2018-07-29 Thread Florian Weimer
* Paul Gevers:

> On Sun, 01 Apr 2018 21:56:33 +0200 Florian Weimer  wrote:
>> > I have no idea. On a fast 4-cores amd64 machine and for the 3 flavours
>> > built on amd64, the glibc takes around 20 minutes to build and the
>> > testsuite around 2h to run.
>> 
>> This is still rather slow.  I see native builds on relatively current
>> hardware taking 2 minutes, plus 12 to 15 minutes to build and run the
>> test suite (all with parallel make, although parallel make for tests
>> is disabled automatically for some subdirectories).  200 minutes on
>> current (amd64) hardware sounds quite excessive.
>
> I just did a retry on our infrastructure and it ran in 57 minutes. But
> it ran on one of the two big workers (8 cores and 30 GB memory). We want
> to make all workers equal and we are going down to 2 cores and 7.2 GB.
>
> Could it be that the memory is the actual problem and/or also an issue?

I looked at the build process, and the amd64 package actually builds
glibc three times (for amd64, i386 and x32).  So 57 minutes is
actually very close to the numbers I gave.

I'm not sure why it is necessary to build glibc three times (unless
it's impossible to get multi-arch packages into the buildroot).  If
you disable kernel support for the i386 and x32 subarchitectures,
those test suites will not run, which will speed up the build
somewhat.



Bug#902851: libc-bin: ldd stopped working with 32-bit binaries on amd64 machine

2018-07-07 Thread Florian Weimer
* Alexandra N. Kossovsky:

> Please close this bug.  I definitely saw the issue yesterday, but it has 
> somehow gone today.  I'll return to you if I see it again and understand 
> what triggers it.

The related Fedora bug

  https://bugzilla.redhat.com/show_bug.cgi?id=1596312

appears to have been a kernel regression (unconditional exit status
zero from 32-bit processes, or something like that).



Bug#861116: Fixed in glibc 2.28

2018-07-01 Thread Florian Weimer
The issue should be fixed with this upstream commit:

commit c402355dfa7807b8e0adb27c009135a7e2b9f1b0
Author: Florian Weimer 
Date:   Tue Jun 26 10:24:52 2018 +0200

libio: Disable vtable validation in case of interposition [BZ #23313]

<https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commitdiff;h=c402355dfa7807b8e0adb27c009135a7e2b9f1b0>



Re: Arm ports build machines (was Re: Arch qualification for buster: call for DSA, Security, toolchain concerns)

2018-06-29 Thread Florian Weimer
* Luke Kenneth Casson Leighton:

>  that is not a surprise to hear: the massive thrashing caused by the
> linker phase not being possible to be RAM-resident will be absolutely
> hammering the drives beyond reasonable wear-and-tear limits.  which is
> why i'm recommending people try "-Wl,--no-keep-memory".

Note that ld will sometimes stuff everything into a single RWX segment
as a result, which is not desirable.

Unfortunately, without significant investment into historic linker
technologies (with external sorting and that kind of stuff), I don't
think it is viable to build 32-bit software natively in the near
future.  Maybe next year only a few packages will need exceptions, but
the number will grow with each month.  Building on 64-bit kernels will
delay the inevitable because more address space is available to user
space, but that probably extends the life-time of native building by
12 to 18 months.



Re: Arch qualification for buster: call for DSA, Security, toolchain concerns

2018-06-28 Thread Florian Weimer
* Niels Thykier:

> armel/armhf:
> 
>
>  * Undesirable to keep the hardware running beyond 2020.  armhf VM
>    support uncertain. (DSA)
>    - Source: [DSA Sprint report]

Fedora is facing an issue running armhf under virtualization on arm64:

  
  

Unless the discussion has moved somewhere where I can't follow it, no
one seems to have a solid idea of what is going on.  It's also not clear
that this configuration has substantial vendor or community support.
This makes me doubt that virtualization is a viable path forward
here.

(The discussion on the GCC list started off with a misdirection, sorry
about that.  The brief assumption that this was a hardware quirk is
likely quite wrong.)

>  * Concern for mips, mips64el, mipsel and ppc64el: no upstream support
>    in GCC
>    (Raised by the GCC maintainer; carried over from stretch)

I'm surprised to read this.  ppc64el features prominently in the
toolchain work I do (though I personally do not work on the GCC side).
From my point of view, it's absolutely not in the same category as the
MIPS-based architectures.



Bug#880846: libc-bin: libnss_compat is deprecated and nsswitch should stop using it on new installation

2018-06-20 Thread Florian Weimer
* Laurent Bigonville:

> According to the release note of 2.26, the nss_compat module is
> deprecated[0].

> [0]https://sourceware.org/ml/libc-announce/2017/msg1.html

I think we made changes so that it is no longer deprecated, by
removing the hard NIS dependency.  It shouldn't be used by default
nevertheless.



Bug#900025: /usr/lib/x86_64-linux-gnu/libm.so: invalid ELF header

2018-05-25 Thread Florian Weimer
* Aurelien Jarno:

> Now I don't find this common/utils.c file in the simple-scan sources.

This looks like a known hplip bug:

https://bugzilla.redhat.com/show_bug.cgi?id=1347231

More recent sources try to load libm.so.6 as well.  Note that shared
objects which use libm (or any libc) functions and are not themselves
linked against the relevant shared objects have no ABI guarantees
whatsoever and can break with a glibc update.  (Exceptions are objects
which were linked against glibc 2.0, before symbol versioning was
introduced.)
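
For illustration only, here is a minimal sketch of looking up a libm
function at run time by soname.  The helper name lookup_cos and the
typedef are hypothetical.  The sketch assumes a glibc system where the
math library has the soname libm.so.6; the unversioned libm.so belongs
to the development files and may be a linker script rather than an ELF
object, which would explain the "invalid ELF header" error.  Linking the
shared object against libm directly remains the approach with ABI
guarantees.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical type for a one-argument math function such as cos.  */
typedef double (*unary_math_fn) (double);

/* Hypothetical helper: load libm by its soname and return a pointer to
   cos, or NULL if the library or the symbol cannot be found.  */
unary_math_fn
lookup_cos (void)
{
  /* Use the soname, not "libm.so".  */
  void *handle = dlopen ("libm.so.6", RTLD_NOW);
  if (handle == NULL)
    return NULL;
  /* The handle is intentionally kept open so that the returned
     function pointer stays valid.  */
  return (unary_math_fn) dlsym (handle, "cos");
}
```

On glibc versions before 2.34, a program using this needs to be linked
with -ldl.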



Bug#620887: Please add a shm_mkstemp() function

2018-05-18 Thread Florian Weimer

On 05/16/2018 10:25 PM, Jakub Wilk wrote:
> * Goswin von Brederlow, 2011-04-04, 23:33:
>> int shm_mkstemp(char *template);
>
> FWIW, this function is available on OpenBSD:
> https://man.openbsd.org/shm_mkstemp.3

We have memfd_create nowadays.  It's not exactly identical because it
creates an unnamed file, but perhaps it is close enough?  If not, we'd
need actual use cases to justify adding it to glibc.
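
A minimal sketch of the memfd_create approach, assuming Linux and a
glibc that provides the wrapper (2.27 or later).  The helper name
make_anon_shm is hypothetical and only serves as an illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create an unnamed, memory-backed file of SIZE
   bytes.  Returns a file descriptor, or -1 with errno set.  */
int
make_anon_shm (const char *debug_name, off_t size)
{
  /* The name is only used for debugging (it shows up under
     /proc/PID/fd) and does not have to be unique.  */
  int fd = memfd_create (debug_name, MFD_CLOEXEC);
  if (fd < 0)
    return -1;
  if (ftruncate (fd, size) != 0)
    {
      int saved_errno = errno;
      close (fd);
      errno = saved_errno;
      return -1;
    }
  return fd;
}
```

The descriptor can be mapped with mmap and MAP_SHARED, and handed to
another process by inheritance or over a Unix domain socket.  Because
there is no name in the file system, nothing has to be unlinked
afterwards.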


Thanks,
Florian



Bug#897373: libc6: feof(file) always false when forking after read

2018-05-10 Thread Florian Weimer
* David Beniamine:

> On Thu, May 10, 2018 at 07:06:03PM +0200, Florian Weimer wrote:
>> * David Beniamine:
>> 
>> > int do_fork() {
>> >     pid_t pid;
>> >
>> >     switch (pid = fork()) {
>> >     case -1:
>> >         fprintf(stderr, "Fork failed\n");
>> >         return -1;
>> >     case 0:
>> >         exit(-1);
>> 
>> Does the issue go away when you call _exit instead of exit?
>> 
> It goes away with _exit, indeed.

According to POSIX, exit is required to flush all streams, and fflush
is required to reset the position of the underlying file descriptor
(more precisely, the open file description) to the current read
position.  That affects the descriptor in the parent, too.  I'm not
sure if this is actually a glibc bug.
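
To illustrate the difference, here is a minimal sketch in which the
child terminates with _exit, so it never flushes the stdio streams it
inherited and cannot move the file offset shared with the parent.  The
function name count_lines_across_fork is hypothetical, and the sketch
assumes input lines shorter than the 256-byte buffer.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: read the first line of PATH, fork a child that
   terminates immediately, then read the remaining lines in the parent.
   Returns the number of lines seen, or -1 on error.  */
long
count_lines_across_fork (const char *path)
{
  FILE *fp = fopen (path, "r");
  if (fp == NULL)
    return -1;

  char line[256];
  long count = 0;
  /* Read one line first so that the stream has buffered input.  */
  if (fgets (line, sizeof line, fp) != NULL)
    ++count;

  pid_t pid = fork ();
  if (pid < 0)
    {
      fclose (fp);
      return -1;
    }
  if (pid == 0)
    /* _exit does not flush stdio streams, so the offset of the open
       file description shared with the parent is left alone.  */
    _exit (0);

  int status;
  if (waitpid (pid, &status, 0) < 0)
    {
      fclose (fp);
      return -1;
    }

  while (fgets (line, sizeof line, fp) != NULL)
    ++count;
  fclose (fp);
  return count;
}
```

With _exit in the child, the parent sees each line of the file exactly
once.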



Bug#897373: libc6: feof(file) always false when forking after read

2018-05-10 Thread Florian Weimer
* David Beniamine:

> int do_fork() {
>     pid_t pid;
>
>     switch (pid = fork()) {
>     case -1:
>         fprintf(stderr, "Fork failed\n");
>         return -1;
>     case 0:
>         exit(-1);

Does the issue go away when you call _exit instead of exit?



Bug#895981: please cleanup /var/cache/nscd on restart

2018-04-30 Thread Florian Weimer
* Carlos O'Donell:

> Then each registered file, like /etc/resolv.conf, is watched via
> inotify for any changes, and if a change is detected and
> finfo->call_res_init was true (and it's true only for resolv.conf)
> then we call res_init().

But res_init does not flush the nscd cache, does it?



Bug#895981: please cleanup /var/cache/nscd on restart

2018-04-29 Thread Florian Weimer
* Harald Dunkel:

> I am using both systemd and sysvinit-core, but I am not sure which one
> was active when I ran into this problem.
>
> Consider a split DNS setup for a remote network. I had started an IPsec
> connection to the remote side. /etc/resolv.conf was changed to include
> the new internal DNServer on the remote side, but a host lookup gave me
> still the old external address. Stopping nscd did not help, AFAIR.

That's arguably a bug in nscd.  It should flush the cache each time it
detects a change in /etc/resolv.conf (or /etc/gai.conf, for that
matter).
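
As a rough sketch of the detection side only (this is not nscd's actual
code), a file can be watched with inotify and polled for changes.  The
function names watch_config_file and config_file_changed are
hypothetical.  The sketch also simplifies: it watches the file itself,
so it has to be re-armed after the file is replaced by rename, which is
a common way of updating /etc/resolv.conf.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper: start watching PATH.  Returns a non-blocking
   inotify descriptor, or -1 on error.  */
int
watch_config_file (const char *path)
{
  int ifd = inotify_init1 (IN_NONBLOCK | IN_CLOEXEC);
  if (ifd < 0)
    {
      perror ("inotify_init1");
      return -1;
    }
  if (inotify_add_watch (ifd, path,
                         IN_MODIFY | IN_CLOSE_WRITE
                         | IN_MOVE_SELF | IN_DELETE_SELF) < 0)
    {
      perror ("inotify_add_watch");
      close (ifd);
      return -1;
    }
  return ifd;
}

/* Hypothetical helper: drain pending events.  Returns 1 if the watched
   file changed since the previous call, 0 if not, -1 on error.  A
   caller would flush its cache when this returns 1.  */
int
config_file_changed (int ifd)
{
  char buf[4096];
  int changed = 0;
  for (;;)
    {
      ssize_t n = read (ifd, buf, sizeof buf);
      if (n > 0)
        changed = 1;            /* At least one event was queued.  */
      else if (n < 0 && errno == EINTR)
        continue;
      else if (n < 0 && errno == EAGAIN)
        return changed;         /* Queue is empty.  */
      else
        return -1;
    }
}
```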



Bug#785651: glibc: test run times out on ci.debian.net; maybe don't force a build every time

2018-04-02 Thread Florian Weimer
* Aurelien Jarno:

> I have no idea. On a fast 4-cores amd64 machine and for the 3 flavours
> built on amd64, the glibc takes around 20 minutes to build and the
> testsuite around 2h to run.

This is still rather slow.  I see native builds on relatively current
hardware taking 2 minutes, plus 12 to 15 minutes to build and run the
test suite (all with parallel make, although parallel make for tests
is disabled automatically for some subdirectories).  200 minutes on
current (amd64) hardware sounds quite excessive.



Bug#887169: libc6: recent upgrade to 2.26-3 broke Steam games (Civ5)

2018-01-15 Thread Florian Weimer

On 01/15/2018 12:22 AM, Aurelien Jarno wrote:
> I don't think it is actually the consensus, only Arch Linux has chosen
> this solution, and building the whole glibc with this option will have
> an impact on performance for all binaries, not only the broken
> Steam ones. I therefore don't think it's the right way to fix the bug.


For Fedora, I disabled multiarch support for the i386 builds because 
it's not entirely unlikely that there will be similar issues.


If we could assume SSE2 support and built the distribution with 
-march=x86-64, we would likely have to go the -mstackrealign route.  But 
this is just an educated guess.


Thanks,
Florian



Bug#879093: Segfault in libc6 while using xrdp-sesman on Stretch

2017-10-24 Thread Florian Weimer
* Gilles MOREL:

> I reported this bug for the package libc6 because the kernel line led
> me to think the problem comes from libc6.

It's much more likely that xrdp-sesman calls a glibc function on an
invalid pointer.

> If you want me to provide more log or debugging, please tell me, I
> don't really understand the problem.

You will have to provide a backtrace at least, with debugging symbols
installed.  If you can reproduce the issue on buster, getting
debugging symbols may be easier (I don't know what the current state
of automatic debugging information packaging is on Debian).  Note that
you'll have to install packages with the debugging information for
xrdp-sesman and all its dependencies, not just libc6.



Bug#857909: [libc6-dev] getpid() in child process created using clone(CLONE_VM) returns parent's pid

2017-03-23 Thread Florian Weimer
* John Paul Adrian Glaubitz:

> I would suggest filing a bug report to glibc upstream or posting on
> their mailing list to ask for feedback.

Upstream has since removed the PID cache:

  
  




Bug#858529: libc6: fgets repeats content after fork on stretch only

2017-03-23 Thread Florian Weimer
tags 858529 upstream
forwarded 858529 https://sourceware.org/bugzilla/show_bug.cgi?id=20598
thanks

* Neil Spring:

>   if(fork() == 0) { exit(1); }

exit flushes the stdio buffers in the child.  Upstream concluded that
this leads to undefined behavior:

| Yes, this is about the exit actually.  But reading "2.5.1
| Interaction of File Descriptors and Standard I/O Streams", I think
| this is really undefined, because the required action is not
| performed before the call to fork, and the correct fix is to use
| _exit in the forked child.





Bug#839280: libc6: asprintf(, "%F", 1.0) puts 0.00000 in c on raspberry pi zero v1.3.

2016-10-01 Thread Florian Weimer
* Noah Williams:

>    * What led up to the situation? I was building a robot, and needed a
>      raspberry pi zero to send a floating point number formatted as a
>      string to an arduino.
>    * What exactly did you do (or not do) that was effective (or
>      ineffective)? I've tried passing bigger numbers than 1.0 (like
>      100.0), but it didn't seem to do anything different, "%F %F" does
>      however seem to get the second argument right.
>    * What was the outcome of this action? It printed out something like:
>      -0.1 1.0
>    * What outcome did you expect instead? 1.0 0.0

Please post complete, minimal source code, along with a description
how you compiled your program (and for what architecture).  Thanks.



Bug#595790: [Pkg-zfsonlinux-devel] Bug#595790: hostid: useless unless fixed

2016-09-29 Thread Florian Weimer
* Richard Laager:

> Getting back to ZFS and /etc/hostid... I would think that a
> randomly-generated /etc/hostid is probably sufficient. Whether that's
> done in the libc, spl, or zfs package makes no difference to me.

As I tried to explain, the risk of collisions without central
coordination looks rather high.  glibc's current approach, using the
IP address associated with the host name, provides a certain level of
coordination, avoiding duplicates.
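
To put a number on the collision risk, here is a small sketch of the
birthday-problem calculation.  It assumes that each host picks its
32-bit host ID uniformly at random and independently of the others; the
function name hostid_collision_probability is hypothetical.

```c
#include <stdint.h>

/* Probability that at least two out of HOSTS machines end up with the
   same ID when each one picks a uniformly random 32-bit value.  */
double
hostid_collision_probability (uint32_t hosts)
{
  const double space = 4294967296.0;    /* 2^32 possible host IDs.  */
  double all_distinct = 1.0;
  /* The host with index I must avoid the I values already taken.  */
  for (uint32_t i = 1; i < hosts; ++i)
    all_distinct *= 1.0 - (double) i / space;
  return 1.0 - all_distinct;
}
```

Under these assumptions, a population of around 77,000 hosts already has
roughly even odds of containing at least one duplicate.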



Bug#595790: [Pkg-zfsonlinux-devel] Bug#595790: hostid: useless unless fixed

2016-09-28 Thread Florian Weimer
* Michael Stone:

> Other platforms have deprecated gethostid, that's the best way forward
> for linux, IMO.

I agree.  It's the most likely outcome if this issue were reported to
glibc upstream.



Re: segfault in errx

2016-09-28 Thread Florian Weimer
* Florian Weimer:

>> I can reproduce something like this with 2.24-3 on amd64.  valgrind
>> isn't very helpful.
>
> And it needs a UTF-8 locale (C.UTF-8 will do).  Another multi-byte
> locale may work as well.

Bisecting this with the attached script leads to:

commit 18d26750dd8fd328a78cf639fd0ec2494680a2a4
Author: Paul Pluzhnikov <ppluzhni...@google.com>
Date:   Sun Mar 8 09:46:53 2015 -0700

Cleanup: in preparation for fixing BZ #16734, fix memory leaks exposed by
switching fopen()ed streams from mmap to malloc.

This commit has shown up in other context as well:

  <https://sourceware.org/bugzilla/show_bug.cgi?id=20598>

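#!/bin/bash
# Bisection test script.  Expects the glibc checkout in ../git and a
# scratch build directory in ../build.

set -x

# Start from a clean locale environment.
unset LANG LANGUAGE LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY \
  LC_MESSAGES LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT \
  LC_IDENTIFICATION LC_ALL

# An exit status above 127 makes "git bisect run" abort the bisection.
fatal () {
    exit 255
}

# Build failures abort as well.  ("git bisect run" would skip the
# commit instead if this used exit status 125.)
skip () {
    exit 255
}

# Run a program against the freshly built glibc.
run () {
    GCONV_PATH=./iconvdata LOCPATH=. ./elf/ld-linux-x86-64.so.2 --library-path \
      .:./math:./elf:./dlfcn:./nss:./nis:./rt:./resolv:./crypt:./mathvec:./nptl "$@"
}

cd ../build || fatal
rm -rf ../build/* || fatal
../git/configure --prefix=/usr --disable-werror || skip
make -j12 || skip
run locale/localedef -f ../git/localedata/charmaps/UTF-8 -i \
  ../git/localedata/locales/en_US "$PWD/en_US.UTF-8" || skip
LC_ALL=en_US.UTF-8 run /usr/bin/ul /tmp/ul.segf
case $? in
    1)
        # ul reported the unknown escape sequence and exited: good.
        exit 0
        ;;
    139)
        # 128 + SIGSEGV: bad.
        exit 1
        ;;
    *)
        fatal
        ;;
esac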



Re: segfault in errx

2016-09-28 Thread Florian Weimer
* Florian Weimer:

> * Michael Meskes:
>
>> I recently learned that ul (from bsdmainutils) segfaults when run
>> against the attached file. Some debugging shows that the segfault
>> happens when cleaning up in errx():
>>
>> michael@feivel:~$ ul ul.segf 
>> ul: unknown escape sequence in input: 33, 135
>> Speicherzugriffsfehler
>
> What's the glibc version and architecture?
>
> I can reproduce something like this with 2.24-3 on amd64.  valgrind
> isn't very helpful.

And it needs a UTF-8 locale (C.UTF-8 will do).  Another multi-byte
locale may work as well.

> I'll have to try this on a distribution with better debugging
> information.

Not much luck on Fedora, either.

Based on what ul does, I suspect it's this upstream bug:

  <https://sourceware.org/bugzilla/show_bug.cgi?id=20568>

Or perhaps this one:

  <https://sourceware.org/bugzilla/show_bug.cgi?id=20632>



Re: segfault in errx

2016-09-28 Thread Florian Weimer
* Michael Meskes:

> I recently learned that ul (from bsdmainutils) segfaults when run
> against the attached file. Some debugging shows that the segfault
> happens when cleaning up in errx():
>
> michael@feivel:~$ ul ul.segf 
> ul: unknown escape sequence in input: 33, 135
> Speicherzugriffsfehler

What's the glibc version and architecture?

I can reproduce something like this with 2.24-3 on amd64.  valgrind
isn't very helpful.

ul: unknown escape sequence in input: 33, 135
==16450== Conditional jump or move depends on uninitialised value(s)
==16450==    at 0x52A9A63: utf8_internal_loop (loop.c:298)
==16450==    by 0x52A9A63: __gconv_transform_utf8_internal (skeleton.c:609)
==16450==    by 0x52F26CD: do_length (iofwide.c:463)
==16450== 
==16450== Jump to the invalid address stated on the next line
==16450==    at 0x0: ???
==16450==    by 0x4224167: ???
==16450==    by 0x4024537: ???
==16450==    by 0x5833A3F: ???
==16450==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==16450== 
==16450== 
==16450== Process terminating with default action of signal 11 (SIGSEGV)
==16450==  Bad permissions for mapped region at address 0x0
==16450==    at 0x0: ???
==16450==    by 0x4224167: ???
==16450==    by 0x4024537: ???
==16450==    by 0x5833A3F: ???
==16450== Jump to the invalid address stated on the next line
==16450==    at 0x0: ???
==16450==    by 0x42239D7: ??? (in /lib/x86_64-linux-gnu/ld-2.24.so)
==16450==    by 0x4025477: ???
==16450==    by 0x5833A3F: ???
==16450==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==16450== 
==16450== 
==16450== Process terminating with default action of signal 11 (SIGSEGV)
==16450==  Bad permissions for mapped region at address 0x0
==16450==    at 0x0: ???
==16450==    by 0x42239D7: ??? (in /lib/x86_64-linux-gnu/ld-2.24.so)
==16450==    by 0x4025477: ???
==16450==    by 0x5833A3F: ???

I'll have to try this on a distribution with better debugging
information.


