Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Florian Weimer
* Lennart Poettering:

> On Wed, 12.07.17 09:53, Florian Weimer (f...@deneb.enyo.de) wrote:
>
>> * Lennart Poettering:
>> 
>> > On Tue, 11.07.17 21:24, Florian Weimer (f...@deneb.enyo.de) wrote:
>> >
>> >> * Lennart Poettering:
>> >> 
>> >> > This all stems from my experiences with PulseAudio back in the day:
>> >> > People do not grok the effect of fork(): it only duplicates the
>> >> > invoking thread, not any other threads of the process, moreover all
>> >> > data structures are copied as they are, and that's a time bomb really:
>> >> 
>> >> These days, the PID can change even without a fork, so the story is a
>> >> bit different.
>> >
>> > Can you elaborate?
>> 
>> I'm no longer sure that the PID can change with the current kernel,
>> but I cannot rule it out, either.  But other weirdness is possible:
>> for example, after a fork, getpid and getppid could be equal.
>
> Please be less vague. So I understand correctly that the PID cannot
> change without fork after all?

I don't know.  I don't see such a guarantee provided by the kernel.

I think the important aspect is that as far as glibc is concerned, the
PID *can* change even if its fork and vfork functions are never
called, so we cannot reliably invalidate a cache.

We could decide that it is undefined to call the fork/vfork/clone
system calls directly, but that would break systemd functionality, I
think.

>> > Are you talking about cases where you invoke clone() directly, instead
>> > of via glibc's wrappers? We do that too in systemd, but I am not sure
>> > this is really reason enough to introduce this regression in glibc:
>> > this is easily worked around (which we do in systemd), and given that
>> > the time between clone() and execve() should be short, and the
>> > code between the two minimal this isn't really much of a problem.
>> 
>> There are other uses of clone which do not immediately lead to an
>> execve.
>
> Can you elaborate? Have a real-life user of this?

src/nspawn/nspawn.c?  A lot of things happen in the inner_child
function.

>> > Where was this discussed in detail? Do you have any links about the
>> > discussions about this?
>> 
>> It was on libc-alpha and the glibc bug tracker.
>
> Link?

https://sourceware.org/ml/libc-alpha/2016-10/msg00136.html
https://sourceware.org/ml/libc-alpha/2016-10/msg00233.html
https://sourceware.org/ml/libc-alpha/2016-11/msg00247.html
https://sourceware.org/ml/libc-alpha/2016-11/msg00303.html
https://www.postgresql.org/message-id/e1ukzbn-0006c6...@gemulon.postgresql.org

The last reason explains why using getpid for fork detection does not
work in the way one would expect.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Florian Weimer
* Lennart Poettering:

> On Wed, 12.07.17 09:51, Florian Weimer (f...@deneb.enyo.de) wrote:
>
>> * Lennart Poettering:
>> 
>> > On Tue, 11.07.17 21:26, Florian Weimer (f...@deneb.enyo.de) wrote:
>> >
>> >> * Lennart Poettering:
>> >> 
>> >> > Apparently, this regressed between this version and
>> >> > glibc-2.24-9.fc25.x86_64 hence.
>> >> 
>> >> Yes, I backported the fork cache removal to Fedora 25.  There is no
>> >> longer a good way to main such a cache in userspace because glibc
>> >> cannot intercept anymore all the ways that can change the PID of the
>> >> current process because the kernel interfaces for process management
>> >> are incredibly rich these days.
>> >
>> > Please be more specific here. What is this all about?
>> 
>> We got many bug reports over the years about sandboxes and other heavy
>> users of namespaces and clone that the glibc PID cache got out of
>> sync, both in child and parent (!) processes.
>
> have any links?



You guys ran into this as well and wrote a raw_getpid function which
calls the system call.  (You should have reported the bug instead.)

>> > What triggered this specifically? is this about docker? docker is
>> > written in golang anyway, iirc, which doesn't bother with linking to
>> > libc anyway?
>> 
>> It needs glibc for access to the host and user databases.
>
> can you elaborate? I fail to see any relationship between
> unshare()/fork()/getpid() and NSS?

You asked why docker links against glibc.

>> > Is this a glibc upstream choice primarily? Were the regressions this
>> > causes considered?
>> 
>> I raised the problem of applications calling getpid frequently and
>> named OpenSSL as an example.
>
> Link?

See the collection of links in the other message.

> And I am pretty sure the usecase is very valid... And yes,
> even if checking getpid() misses some theoretical corner cases,
> pthread_atfork() or whatever else you propose will miss others too,
> and is much uglier codewise, introduces deps, yadda yadda...

We actually increased the accuracy of your fork detection logic (even
though it's still broken), so I'm puzzled why you keep calling this a
regression.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Reindl Harald



Am 12.07.2017 um 10:23 schrieb Lennart Poettering:

On Wed, 12.07.17 09:51, Florian Weimer (f...@deneb.enyo.de) wrote:

Fork detection using getpid is not reliable.  It gives false negatives
in the case of double-forks, where the process can be different but
the PID is the same due to reuse.  Considering that this use case is
broken, I don't think it's worthwhile to jump through hoops to support
code which is fundamentally broken anyway.


Uh, that's a bit non-chalant, no? Yes, the UNIX PID concept is awfully
designed, but if you argue on that level, you#d have to remove kill(),
and half of the other syscalls that take a PID from glibc too...


don't get me wrong: this is exactly how you argue to all others over 
years "fundamentally broken anyway, fix it"

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Florian Weimer
* Lennart Poettering:

> On Tue, 11.07.17 21:26, Florian Weimer (f...@deneb.enyo.de) wrote:
>
>> * Lennart Poettering:
>> 
>> > Apparently, this regressed between this version and
>> > glibc-2.24-9.fc25.x86_64 hence.
>> 
>> Yes, I backported the fork cache removal to Fedora 25.  There is no
>> longer a good way to main such a cache in userspace because glibc
>> cannot intercept anymore all the ways that can change the PID of the
>> current process because the kernel interfaces for process management
>> are incredibly rich these days.
>
> BTW, with this change you are breaking expressly documented behaviour:
>
> http://man7.org/linux/man-pages/man2/getpid.2.html
>
> "Since glibc version 2.3.4, the glibc wrapper function for
>  getpid() caches PIDs, so as to avoid additional system calls when
>  a process calls getpid() repeatedly."

Note that the manpages project, while extremely valuable, is not
official glibc documentation and sometimes explains library internals
which can change over time.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Michael Chapman

On Wed, 12 Jul 2017, Lennart Poettering wrote:

On Wed, 12.07.17 09:53, Florian Weimer (f...@deneb.enyo.de) wrote:

* Lennart Poettering:

Where was this discussed in detail? Do you have any links about the
discussions about this?


It was on libc-alpha and the glibc bug tracker.


Link?

Lennart


These seem quite relevant:

  https://sourceware.org/ml/libc-alpha/2016-10/msg00136.html
  https://sourceware.org/bugzilla/show_bug.cgi?id=19957

There's a bunch of other related bugs coming up in my searches too... it 
seems that this one change was able to fix many things simultaneously.

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Lennart Poettering
On Tue, 11.07.17 21:26, Florian Weimer (f...@deneb.enyo.de) wrote:

> * Lennart Poettering:
> 
> > Apparently, this regressed between this version and
> > glibc-2.24-9.fc25.x86_64 hence.
> 
> Yes, I backported the fork cache removal to Fedora 25.  There is no
> longer a good way to main such a cache in userspace because glibc
> cannot intercept anymore all the ways that can change the PID of the
> current process because the kernel interfaces for process management
> are incredibly rich these days.

BTW, with this change you are breaking expressly documented behaviour:

http://man7.org/linux/man-pages/man2/getpid.2.html

"Since glibc version 2.3.4, the glibc wrapper function for
 getpid() caches PIDs, so as to avoid additional system calls when
 a process calls getpid() repeatedly."

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Mantas Mikulėnas
On Tue, Jul 11, 2017 at 7:06 PM, Tomasz Torcz  wrote:

> On Tue, Jul 11, 2017 at 05:20:10PM +0200, Lennart Poettering wrote:
> > On Tue, 11.07.17 16:55, Tomasz Torcz (to...@pipebreaker.pl) wrote:
> > > > Forgot to mention:
> > > >
> > > > $ rpm -qa glibc
> > > > glibc-2.24-4.fc25.x86_64
> > > >
> > > > Apparently, this regressed between this version and
> > > > glibc-2.24-9.fc25.x86_64 hence.
> > > >
> > >
> > >   From glibc changelog:
> > >
> > > * Wed Jun 07 2017 Arjun Shankar  - 2.24-6
> > > - Auto-sync with upstream release/2.24/master,
> > >   commit 7b60553e360731338631ccdda71590ac5deca137, fixing:
> > > - Remove the PID cache  (#1443976)
> >
> > I commented on that bug now. It doesn#t really have a proper
> > explanation, all it says is that "The glibc PID cache negatively
> > interacts with setting up containers and namespaces."...
>
>   Upstream commit has longer rationale:
> http://repo.or.cz/glibc.git/commit/c579f48edba88380635ab98cb61203
> 0e3ed8691e


Looks like they're basically saying that the same could be done via VDSO in
a much simpler way.

-- 
Mantas Mikulėnas 
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Lennart Poettering
On Wed, 12.07.17 09:53, Florian Weimer (f...@deneb.enyo.de) wrote:

> * Lennart Poettering:
> 
> > On Tue, 11.07.17 21:24, Florian Weimer (f...@deneb.enyo.de) wrote:
> >
> >> * Lennart Poettering:
> >> 
> >> > This all stems from my experiences with PulseAudio back in the day:
> >> > People do not grok the effect of fork(): it only duplicates the
> >> > invoking thread, not any other threads of the process, moreover all
> >> > data structures are copied as they are, and that's a time bomb really:
> >> 
> >> These days, the PID can change even without a fork, so the story is a
> >> bit different.
> >
> > Can you elaborate?
> 
> I'm no longer sure that the PID can change with the current kernel,
> but I cannot rule it out, either.  But other weirdness is possible:
> for example, after a fork, getpid and getppid could be equal.

Please be less vague. So I understand correctly that the PID cannot
change without fork after all? So why then regress so badly? And what
does getppid have to do with all this?

> > Are you talking about cases where you invoke clone() directly, instead
> > of via glibc's wrappers? We do that too in systemd, but I am not sure
> > this is really reason enough to introduce this regression in glibc:
> > this is easily worked around (which we do in systemd), and given that
> > the time between clone() and execve() should be short, and the
> > code between the two minimal this isn't really much of a problem.
> 
> There are other uses of clone which do not immediately lead to an
> execve.

Can you elaborate? Have a real-life user of this?

> > Where was this discussed in detail? Do you have any links about the
> > discussions about this?
> 
> It was on libc-alpha and the glibc bug tracker.

Link?

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Lennart Poettering
On Wed, 12.07.17 09:51, Florian Weimer (f...@deneb.enyo.de) wrote:

> * Lennart Poettering:
> 
> > On Tue, 11.07.17 21:26, Florian Weimer (f...@deneb.enyo.de) wrote:
> >
> >> * Lennart Poettering:
> >> 
> >> > Apparently, this regressed between this version and
> >> > glibc-2.24-9.fc25.x86_64 hence.
> >> 
> >> Yes, I backported the fork cache removal to Fedora 25.  There is no
> >> longer a good way to main such a cache in userspace because glibc
> >> cannot intercept anymore all the ways that can change the PID of the
> >> current process because the kernel interfaces for process management
> >> are incredibly rich these days.
> >
> > Please be more specific here. What is this all about?
> 
> We got many bug reports over the years about sandboxes and other heavy
> users of namespaces and clone that the glibc PID cache got out of
> sync, both in child and parent (!) processes.

have any links?

> > What triggered this specifically? is this about docker? docker is
> > written in golang anyway, iirc, which doesn't bother with linking to
> > libc anyway?
> 
> It needs glibc for access to the host and user databases.

can you elaborate? I fail to see any relationship between
unshare()/fork()/getpid() and NSS?

> > Is this a glibc upstream choice primarily? Were the regressions this
> > causes considered?
> 
> I raised the problem of applications calling getpid frequently and
> named OpenSSL as an example.

Link?

> > I mean, the getpid() checking code is not only in use in systemd, but
> > in various other bits, in particular PulseAudio, where I started
> > adding these checks for a good reason. It sounds pretty strange to me
> > to just regress all that...
> 
> Fork detection using getpid is not reliable.  It gives false negatives
> in the case of double-forks, where the process can be different but
> the PID is the same due to reuse.  Considering that this use case is
> broken, I don't think it's worthwhile to jump through hoops to support
> code which is fundamentally broken anyway.

Uh, that's a bit non-chalant, no? Yes, the UNIX PID concept is awfully
designed, but if you argue on that level, you#d have to remove kill(),
and half of the other syscalls that take a PID from glibc too...

The primary intention of checking the PID in our calls is to filter
out cases where people assume they can use our context objects across
fork()s: a clean, early error code is a ton better than memory
corruption. And I am pretty sure the usecase is very valid... And yes,
even if checking getpid() misses some theoretical corner cases,
pthread_atfork() or whatever else you propose will miss others too,
and is much uglier codewise, introduces deps, yadda yadda...

Gah, this is all so ugly. I understand systemd is not a program you
are particularly interested in, but making us chase around your
regressions is just mean...

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Florian Weimer
* Lennart Poettering:

> On Tue, 11.07.17 21:24, Florian Weimer (f...@deneb.enyo.de) wrote:
>
>> * Lennart Poettering:
>> 
>> > This all stems from my experiences with PulseAudio back in the day:
>> > People do not grok the effect of fork(): it only duplicates the
>> > invoking thread, not any other threads of the process, moreover all
>> > data structures are copied as they are, and that's a time bomb really:
>> 
>> These days, the PID can change even without a fork, so the story is a
>> bit different.
>
> Can you elaborate?

I'm no longer sure that the PID can change with the current kernel,
but I cannot rule it out, either.  But other weirdness is possible:
for example, after a fork, getpid and getppid could be equal.

> Are you talking about cases where you invoke clone() directly, instead
> of via glibc's wrappers? We do that too in systemd, but I am not sure
> this is really reason enough to introduce this regression in glibc:
> this is easily worked around (which we do in systemd), and given that
> the time between clone() and execve() should be short, and the
> code between the two minimal this isn't really much of a problem.

There are other uses of clone which do not immediately lead to an
execve.

> Where was this discussed in detail? Do you have any links about the
> discussions about this?

It was on libc-alpha and the glibc bug tracker.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Florian Weimer
* Lennart Poettering:

> On Tue, 11.07.17 21:26, Florian Weimer (f...@deneb.enyo.de) wrote:
>
>> * Lennart Poettering:
>> 
>> > Apparently, this regressed between this version and
>> > glibc-2.24-9.fc25.x86_64 hence.
>> 
>> Yes, I backported the fork cache removal to Fedora 25.  There is no
>> longer a good way to main such a cache in userspace because glibc
>> cannot intercept anymore all the ways that can change the PID of the
>> current process because the kernel interfaces for process management
>> are incredibly rich these days.
>
> Please be more specific here. What is this all about?

We got many bug reports over the years about sandboxes and other heavy
users of namespaces and clone that the glibc PID cache got out of
sync, both in child and parent (!) processes.

> What triggered this specifically? is this about docker? docker is
> written in golang anyway, iirc, which doesn't bother with linking to
> libc anyway?

It needs glibc for access to the host and user databases.

> Is this a glibc upstream choice primarily? Were the regressions this
> causes considered?

I raised the problem of applications calling getpid frequently and
named OpenSSL as an example.

> I mean, the getpid() checking code is not only in use in systemd, but
> in various other bits, in particular PulseAudio, where I started
> adding these checks for a good reason. It sounds pretty strange to me
> to just regress all that...

Fork detection using getpid is not reliable.  It gives false negatives
in the case of double-forks, where the process can be different but
the PID is the same due to reuse.  Considering that this use case is
broken, I don't think it's worthwhile to jump through hoops to support
code which is fundamentally broken anyway.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Lennart Poettering
On Tue, 11.07.17 21:26, Florian Weimer (f...@deneb.enyo.de) wrote:

> * Lennart Poettering:
> 
> > Apparently, this regressed between this version and
> > glibc-2.24-9.fc25.x86_64 hence.
> 
> Yes, I backported the fork cache removal to Fedora 25.  There is no
> longer a good way to main such a cache in userspace because glibc
> cannot intercept anymore all the ways that can change the PID of the
> current process because the kernel interfaces for process management
> are incredibly rich these days.

Please be more specific here. What is this all about? What triggered
this specifically? is this about docker? docker is written in golang
anyway, iirc, which doesn't bother with linking to libc anyway?

Is this a glibc upstream choice primarily? Were the regressions this
causes considered?

I mean, the getpid() checking code is not only in use in systemd, but
in various other bits, in particular PulseAudio, where I started
adding these checks for a good reason. It sounds pretty strange to me
to just regress all that...

Were there considerations to solve this differently? (for example,
expose some concept how getpid() caching can be turned out
specifically for apps needing that? it appears to me that after all
the number of apps calling clone() directly are very much numbered,
while a ton of apps will benefit from getpid() caching...)

Please elaborate!

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-12 Thread Lennart Poettering
On Tue, 11.07.17 21:24, Florian Weimer (f...@deneb.enyo.de) wrote:

> * Lennart Poettering:
> 
> > This all stems from my experiences with PulseAudio back in the day:
> > People do not grok the effect of fork(): it only duplicates the
> > invoking thread, not any other threads of the process, moreover all
> > data structures are copied as they are, and that's a time bomb really:
> 
> These days, the PID can change even without a fork, so the story is a
> bit different.

Can you elaborate?

Are you talking about cases where you invoke clone() directly, instead
of via glibc's wrappers? We do that too in systemd, but I am not sure
this is really reason enough to introduce this regression in glibc:
this is easily worked around (which we do in systemd), and given that
the time between clone() and execve() should be short, and the
code between the two minimal this isn't really much of a problem.

Where was this discussed in detail? Do you have any links about the
discussions about this?

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Florian Weimer
* Lennart Poettering:

> Apparently, this regressed between this version and
> glibc-2.24-9.fc25.x86_64 hence.

Yes, I backported the fork cache removal to Fedora 25.  There is no
longer a good way to main such a cache in userspace because glibc
cannot intercept anymore all the ways that can change the PID of the
current process because the kernel interfaces for process management
are incredibly rich these days.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Florian Weimer
* Lennart Poettering:

> This all stems from my experiences with PulseAudio back in the day:
> People do not grok the effect of fork(): it only duplicates the
> invoking thread, not any other threads of the process, moreover all
> data structures are copied as they are, and that's a time bomb really:

These days, the PID can change even without a fork, so the story is a
bit different.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Florian Weimer
* Michael Chapman:

> It's a pity glibc doesn't provide an equivalent for pthread_atfork() 
> outside of the pthread library. Having a notification that a fork has just 
> occurred would allow us to do the PID caching ourselves.

In fact, it does, as a public symbol: __register_atfork.

It's just not really documented, so it lives a bit a vacuum, like the
old __secure_getenv call.  To fix this, we need to make the dynamic
linker disregard sonames when resolving non-interposed versioned
symbols, and then move the definition from libpthread to libc.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Tomasz Torcz
On Tue, Jul 11, 2017 at 05:20:10PM +0200, Lennart Poettering wrote:
> On Tue, 11.07.17 16:55, Tomasz Torcz (to...@pipebreaker.pl) wrote:
> > > Forgot to mention:
> > > 
> > > $ rpm -qa glibc
> > > glibc-2.24-4.fc25.x86_64
> > > 
> > > Apparently, this regressed between this version and
> > > glibc-2.24-9.fc25.x86_64 hence.
> > > 
> > 
> >   From glibc changelog:
> > 
> > * Wed Jun 07 2017 Arjun Shankar  - 2.24-6
> > - Auto-sync with upstream release/2.24/master,
> >   commit 7b60553e360731338631ccdda71590ac5deca137, fixing:
> > - Remove the PID cache  (#1443976)
> 
> I commented on that bug now. It doesn#t really have a proper
> explanation, all it says is that "The glibc PID cache negatively
> interacts with setting up containers and namespaces."...

  Upstream commit has longer rationale:
http://repo.or.cz/glibc.git/commit/c579f48edba88380635ab98cb612030e3ed8691e

But I'm not competent enough to judge t.


-- 
Tomasz TorczTo co nierealne -- tutaj jest normalne.
xmpp: zdzich...@chrome.pl  Ziomale na życie mają tu patenty specjalne.

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Lennart Poettering
On Tue, 11.07.17 16:55, Tomasz Torcz (to...@pipebreaker.pl) wrote:

> On Tue, Jul 11, 2017 at 04:10:38PM +0200, Lennart Poettering wrote:
> > On Tue, 11.07.17 16:07, Lennart Poettering (lenn...@poettering.net) wrote:
> > > Hmm, so I run a slightly older glibc, as I haven#t updated my system
> > > in a while:
> > > 
> > > $ strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
> > > % time seconds  usecs/call callserrors syscall
> > > -- --- --- - - 
> > >  25.950.001276   7   195   mmap
> > >  23.210.001141   7   16430 open
> > >  22.290.001096   9   119   munmap
> > >   6.530.000321   2   134   close
> > >   6.100.000300   2   135   fstat
> > >   5.150.000253   556   mprotect
> > >   4.880.000240   2   102   fstatfs
> > >   2.300.000113   432   read
> > > 
> > > getpid() is nowhere to be seen in this... Seems Fedora regressed on
> > > this too recently. Meh.
> > 
> > Forgot to mention:
> > 
> > $ rpm -qa glibc
> > glibc-2.24-4.fc25.x86_64
> > 
> > Apparently, this regressed between this version and
> > glibc-2.24-9.fc25.x86_64 hence.
> > 
> 
>   From glibc changelog:
> 
> * Wed Jun 07 2017 Arjun Shankar  - 2.24-6
> - Auto-sync with upstream release/2.24/master,
>   commit 7b60553e360731338631ccdda71590ac5deca137, fixing:
> - Remove the PID cache  (#1443976)

I commented on that bug now. It doesn#t really have a proper
explanation, all it says is that "The glibc PID cache negatively
interacts with setting up containers and namespaces."...

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Tomasz Torcz
On Tue, Jul 11, 2017 at 04:10:38PM +0200, Lennart Poettering wrote:
> On Tue, 11.07.17 16:07, Lennart Poettering (lenn...@poettering.net) wrote:
> > Hmm, so I run a slightly older glibc, as I haven#t updated my system
> > in a while:
> > 
> > $ strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
> > % time seconds  usecs/call callserrors syscall
> > -- --- --- - - 
> >  25.950.001276   7   195   mmap
> >  23.210.001141   7   16430 open
> >  22.290.001096   9   119   munmap
> >   6.530.000321   2   134   close
> >   6.100.000300   2   135   fstat
> >   5.150.000253   556   mprotect
> >   4.880.000240   2   102   fstatfs
> >   2.300.000113   432   read
> > 
> > getpid() is nowhere to be seen in this... Seems Fedora regressed on
> > this too recently. Meh.
> 
> Forgot to mention:
> 
> $ rpm -qa glibc
> glibc-2.24-4.fc25.x86_64
> 
> Apparently, this regressed between this version and
> glibc-2.24-9.fc25.x86_64 hence.
> 

  From glibc changelog:

* Wed Jun 07 2017 Arjun Shankar  - 2.24-6
- Auto-sync with upstream release/2.24/master,
  commit 7b60553e360731338631ccdda71590ac5deca137, fixing:
- Remove the PID cache  (#1443976)

-- 
Tomasz TorczTo co nierealne -- tutaj jest normalne.
xmpp: zdzich...@chrome.pl  Ziomale na życie mają tu patenty specjalne.

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Lennart Poettering
On Tue, 11.07.17 16:07, Lennart Poettering (lenn...@poettering.net) wrote:
> Hmm, so I run a slightly older glibc, as I haven#t updated my system
> in a while:
> 
> $ strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
> % time seconds  usecs/call callserrors syscall
> -- --- --- - - 
>  25.950.001276   7   195   mmap
>  23.210.001141   7   16430 open
>  22.290.001096   9   119   munmap
>   6.530.000321   2   134   close
>   6.100.000300   2   135   fstat
>   5.150.000253   556   mprotect
>   4.880.000240   2   102   fstatfs
>   2.300.000113   432   read
> 
> getpid() is nowhere to be seen in this... Seems Fedora regressed on
> this too recently. Meh.

Forgot to mention:

$ rpm -qa glibc
glibc-2.24-4.fc25.x86_64

Apparently, this regressed between this version and
glibc-2.24-9.fc25.x86_64 hence.

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Lennart Poettering
On Tue, 11.07.17 21:59, Michael Chapman (m...@very.puzzling.org) wrote:

> On Tue, 11 Jul 2017, Lennart Poettering wrote:
> > On Tue, 11.07.17 12:55, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > 
> > > On Tue, 2017-07-11 at 09:35 +0200, Lennart Poettering wrote:
> > > > Normally it's dead cheap to check that, it's just reading and
> > > > comparing one memory location. It's a pitty that this isn't the case
> > > > currently on Debian, but as it appears this is an oversight on their
> > > > side, and I am sure it will be eventually fixed there, if it hasn't
> > > > already.
> > > 
> > > Are you sure about those "Debian only" and "will be 'fixed'" parts? The
> > > Debian patch seems to be a cherry pick from upstream glibc. Is there
> > > evidence of some error that would cause effects only visible on Debian
> > > and nowhere else? And/or has the change been reverted or behavior
> > > otherwise modified upstream to limit the range of relevant versions?
> > 
> > See the links Vito provided:
> > 
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
> > https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d
> > 
> > i.e. Debian undid the PID caching to fix some issue that has been fix
> > properly now, and hence the PID caching should be turned on again.
> > 
> > On Fedora at least getpid() is not visible in strace, and is fully
> > cached, as it should be.
> 
> I just tested this on F25 and F26 beta, and it's certainly visible for me on
> both of them:
> 
>   # cat /etc/system-release
>   Fedora release 25 (Twenty Five)
>   # rpm -q glibc
>   glibc-2.24-9.fc25.x86_64
>   # strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
>   % time seconds  usecs/call callserrors syscall
>   -- --- --- - - 
>93.930.167020   2 83761   getpid
> 3.930.006983   2  3025   write
> 0.540.000953  1097   mmap
> 0.390.000696  1352 8 open
> 0.310.000558  1440   munmap
> 0.190.000332   842   mprotect
> 0.150.000264   645   fstat
> 0.140.000246   644   close

Hmm, so I run a slightly older glibc, as I haven#t updated my system
in a while:

$ strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
% time seconds  usecs/call callserrors syscall
-- --- --- - - 
 25.950.001276   7   195   mmap
 23.210.001141   7   16430 open
 22.290.001096   9   119   munmap
  6.530.000321   2   134   close
  6.100.000300   2   135   fstat
  5.150.000253   556   mprotect
  4.880.000240   2   102   fstatfs
  2.300.000113   432   read

getpid() is nowhere to be seen in this... Seems Fedora regressed on
this too recently. Meh.

Somebody should probably file a bug about this regression and get
clarification if this is going to remain slow, or if they are going to
fix that again... If this is going to remain slow, we shoud probably
find a different way to detect forks... we could install a
pthread_atfork() handler and set some flag...

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Zbigniew Jędrzejewski-Szmek
On Tue, Jul 11, 2017 at 09:59:45PM +1000, Michael Chapman wrote:
> On Tue, 11 Jul 2017, Lennart Poettering wrote:
> >On Tue, 11.07.17 12:55, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> >
> >>On Tue, 2017-07-11 at 09:35 +0200, Lennart Poettering wrote:
> >>>Normally it's dead cheap to check that, it's just reading and
> >>>comparing one memory location. It's a pitty that this isn't the case
> >>>currently on Debian, but as it appears this is an oversight on their
> >>>side, and I am sure it will be eventually fixed there, if it hasn't
> >>>already.
> >>
> >>Are you sure about those "Debian only" and "will be 'fixed'" parts? The
> >>Debian patch seems to be a cherry pick from upstream glibc. Is there
> >>evidence of some error that would cause effects only visible on Debian
> >>and nowhere else? And/or has the change been reverted or behavior
> >>otherwise modified upstream to limit the range of relevant versions?
> >
> >See the links Vito provided:
> >
> >https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
> >https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d
> >
> >i.e. Debian undid the PID caching to fix some issue that has been fix
> >properly now, and hence the PID caching should be turned on again.
> >
> >On Fedora at least getpid() is not visible in strace, and is fully
> >cached, as it should be.
> 
> I just tested this on F25 and F26 beta, and it's certainly visible
> for me on both of them:
> 
>   # cat /etc/system-release
>   Fedora release 25 (Twenty Five)
>   # rpm -q glibc
>   glibc-2.24-9.fc25.x86_64
>   # strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
>   % time seconds  usecs/call callserrors syscall
>   -- --- --- - - 
>93.930.167020   2 83761   getpid
> 3.930.006983   2  3025   write
> 0.540.000953  1097   mmap
> 0.390.000696  1352 8 open
> 0.310.000558  1440   munmap
> 0.190.000332   842   mprotect
> 0.150.000264   645   fstat
> 0.140.000246   644   close
> 
>   # cat /etc/system-release
>   Fedora release 26 (Twenty Six)
>   # rpm -q glibc
>   glibc-2.25-7.fc26.x86_64
>   # strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
>   % time seconds  usecs/call callserrors syscall
>   -- --- --- - - 
>62.840.007874   4  2063   getpid
> 7.960.000998  1286   mmap
> 7.850.000983  2048 8 open
> 3.850.000483   954   mprotect
> 2.800.000351   571   write
> 2.750.000345  1132   read
> 2.690.000337   841   fstat
> 2.660.000333   840   close
> 
> The second machine had just been started, which is why the numbers
> are a lot lower. Nevertheless, getpid is still taking by far the
> most amount of time in syscalls.
> 
> Both of these are on Fedora's testing branch, but I don't think
> Fedora's regular branch has a significantly different version of
> glibc.

Yep, I can confirm that: many many getpid() syscalls on F26 and rawhide,
and none on F24.

Zbyszek
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Michael Chapman

On Tue, 11 Jul 2017, Lennart Poettering wrote:

On Tue, 11.07.17 12:55, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:


On Tue, 2017-07-11 at 09:35 +0200, Lennart Poettering wrote:

Normally it's dead cheap to check that, it's just reading and
comparing one memory location. It's a pitty that this isn't the case
currently on Debian, but as it appears this is an oversight on their
side, and I am sure it will be eventually fixed there, if it hasn't
already.


Are you sure about those "Debian only" and "will be 'fixed'" parts? The
Debian patch seems to be a cherry pick from upstream glibc. Is there
evidence of some error that would cause effects only visible on Debian
and nowhere else? And/or has the change been reverted or behavior
otherwise modified upstream to limit the range of relevant versions?


See the links Vito provided:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d

i.e. Debian undid the PID caching to fix some issue that has been fix
properly now, and hence the PID caching should be turned on again.

On Fedora at least getpid() is not visible in strace, and is fully
cached, as it should be.


I just tested this on F25 and F26 beta, and it's certainly visible for me 
on both of them:


  # cat /etc/system-release
  Fedora release 25 (Twenty Five)
  # rpm -q glibc
  glibc-2.24-9.fc25.x86_64
  # strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
  % time seconds  usecs/call callserrors syscall
  -- --- --- - - 
   93.930.167020   2 83761   getpid
3.930.006983   2  3025   write
0.540.000953  1097   mmap
0.390.000696  1352 8 open
0.310.000558  1440   munmap
0.190.000332   842   mprotect
0.150.000264   645   fstat
0.140.000246   644   close

  # cat /etc/system-release
  Fedora release 26 (Twenty Six)
  # rpm -q glibc
  glibc-2.25-7.fc26.x86_64
  # strace -c journalctl --since -1hour 2>&1 >/dev/null | head -10
  % time seconds  usecs/call callserrors syscall
  -- --- --- - - 
   62.840.007874   4  2063   getpid
7.960.000998  1286   mmap
7.850.000983  2048 8 open
3.850.000483   954   mprotect
2.800.000351   571   write
2.750.000345  1132   read
2.690.000337   841   fstat
2.660.000333   840   close

The second machine had just been started, which is why the numbers are a 
lot lower. Nevertheless, getpid is still taking by far the most 
amount of time in syscalls.


Both of these are on Fedora's testing branch, but I don't think Fedora's 
regular branch has a significantly different version of glibc.

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Uoti Urpala
Resend with correct list address
On Tue, 2017-07-11 at 12:00 +0200, Lennart Poettering wrote:
> On Tue, 11.07.17 12:55, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:
> > Are you sure about those "Debian only" and "will be 'fixed'" parts? The
> > Debian patch seems to be a cherry pick from upstream glibc. Is there
> > evidence of some error that would cause effects only visible on Debian
> > and nowhere else? And/or has the change been reverted or behavior
> > otherwise modified upstream to limit the range of relevant versions?
> 
> See the links Vito provided:
> 
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d
> 
> i.e. Debian undid the PID caching to fix some issue that has been fix
> properly now, and hence the PID caching should be turned on again.

That seems backwards: the commit cherry-picked by Debian seems to be
c579f48edba88380635a, which is NEWER than above 0cb313f7cb0e418b3d56.
In other words, it seems 0cb313 was a failed attempt at a fix and the
patch cherry-picked by Debian was needed to properly fix things.

> On Fedora at least getpid() is not visible in strace, and is fully
> cached, as it should be.

Is that glibc 2.25? It seems to contain c579f48 at least; could be a
Fedora-specific change though?
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Uoti Urpala
On Tue, 2017-07-11 at 09:35 +0200, Lennart Poettering wrote:
> Normally it's dead cheap to check that, it's just reading and
> comparing one memory location. It's a pitty that this isn't the case
> currently on Debian, but as it appears this is an oversight on their
> side, and I am sure it will be eventually fixed there, if it hasn't
> already.

Are you sure about those "Debian only" and "will be 'fixed'" parts? The
Debian patch seems to be a cherry pick from upstream glibc. Is there
evidence of some error that would cause effects only visible on Debian
and nowhere else? And/or has the change been reverted or behavior
otherwise modified upstream to limit the range of relevant versions?

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Lennart Poettering
On Tue, 11.07.17 12:55, Uoti Urpala (uoti.urp...@pp1.inet.fi) wrote:

> On Tue, 2017-07-11 at 09:35 +0200, Lennart Poettering wrote:
> > Normally it's dead cheap to check that, it's just reading and
> > comparing one memory location. It's a pitty that this isn't the case
> > currently on Debian, but as it appears this is an oversight on their
> > side, and I am sure it will be eventually fixed there, if it hasn't
> > already.
> 
> Are you sure about those "Debian only" and "will be 'fixed'" parts? The
> Debian patch seems to be a cherry pick from upstream glibc. Is there
> evidence of some error that would cause effects only visible on Debian
> and nowhere else? And/or has the change been reverted or behavior
> otherwise modified upstream to limit the range of relevant versions?

See the links Vito provided:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d

i.e. Debian undid the PID caching to fix some issue that has been fix
properly now, and hence the PID caching should be turned on again.

On Fedora at least getpid() is not visible in strace, and is fully
cached, as it should be.

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-11 Thread Lennart Poettering
On Mon, 10.07.17 14:04, vcap...@pengaru.com (vcap...@pengaru.com) wrote:

> > > Except there's always a risk of these things regressing to normal 
> > > syscalls,
> > > and one has to weigh the utility against that.  It's unclear to me what
> > > significant utility having the sd-journal API police changing pids by
> > > calling getpid() at every public entrypoint is bringing to the table.
> > 
> > So it seems the issue has been fixed in glibc upstream more than a year
> > ago, and it doesn't seem to make sense to optimize current systemd git for
> > that.
> > 
> 
> Can you provide a commit id?  I took a glance at 
> sourcewaire.org/git/gitweb.cgi
> for getpid commits and didn't see anything relevant since the removal[1].
> 
> 
> > I see the argument that the getpid() checks are a bit excessive. Is their
> > overhead actually noticeable with current glibc?
> > 
> 
> On my spare arch system I still see gratuitous getpid() calls from
> journalctl, which is on glibc 2.5-2.
> 
> The pollution of strace output alone due to these checks is nuisance enough
> for me to want the checks removed, considering their only value is to catch
> programmer errors.  There's an abundance of potential programmer errors
> we're not making any effort to prevent, why is this one so privileged that
> it warrants policing?

strace doesn't show getpid() calls if they are properly cached. While
stracing journalctl I never have seen a single one of them...

> I appreciate Lennart's point about the hazards of forking from threaded
> programs.  It just doesn't seem like a valid rationalization for sprinkling
> a system library's entrypoints with getpid() calls to catch this in
> production.

Normally it's dead cheap to check that, it's just reading and
comparing one memory location. It's a pitty that this isn't the case
currently on Debian, but as it appears this is an oversight on their
side, and I am sure it will be eventually fixed there, if it hasn't already.

> Considering the associated potential costs, and the historic controversy
> surrounding the caching of this particular syscall[2] I'm a bit confused by
> the status quo.

Well, normally there's next to no cost to this, on pretty much any non-Debian
distro there isn't any price for this to pay...

I mean, we could certainly cache that value in our code too, but given
that glibc does that already in the normal case I think this is better
left to be fixed in glibc rather than our code.

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-10 Thread vcaputo
On Sat, Jul 08, 2017 at 03:49:11AM +, Zbigniew Jędrzejewski-Szmek wrote:
> On Fri, Jul 07, 2017 at 03:54:09PM -0700, vcap...@pengaru.com wrote:
> > On Fri, Jul 07, 2017 at 10:34:22PM +, Zbigniew Jędrzejewski-Szmek wrote:
> > > On Fri, Jul 07, 2017 at 02:35:16PM -0700, vcap...@pengaru.com wrote:
> > > > On Fri, Jul 07, 2017 at 01:49:54PM -0700, vcap...@pengaru.com wrote:
> > > > > On Fri, Jul 07, 2017 at 08:37:08PM +, Mantas Mikulėnas wrote:
> > > > > > Back when that commit was made, didn't glibc cache the getpid() 
> > > > > > result in
> > > > > > userspace? That would explain why it was not noticed.
> > > > >
> > > > > Hmm, this crossed my mind, and come to think of it I did a 
> > > > > dist-upgrade
> > > > > from Debian jessie to stretch overnight machine and haven't rebooted.
> > > > > 
> > > > > Perhaps the vdso isn't working and the costly getpid() is a red 
> > > > > herring, will
> > > > > reboot and retest to confirm.
> > > > > 
> > > > 
> > > > It appears Debian has a glibc patch to disable the caching (I was 
> > > > unaware
> > > > such an elaborate dance was being performed to cache this!)
> > > > 
> > > > https://anonscm.debian.org/cgit/pkg-glibc/glibc.git/commit/debian/patches/any?id=5850253f509604dd46a6131acc057ea26e1588ba
> > > 
> > > Do we know the justification for this patch?
> > > 
> > 
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
> > 
> > Which references this upstream glibc bug:
> > 
> > https://sourceware.org/bugzilla/show_bug.cgi?id=19957
> > https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d
> > 
> > 
> > > > Unsure where I stand on core system software assuming certain syscalls 
> > > > are
> > > > always going to be exceptionally cheap though...
> > > 
> > > Optimization is never in a vacuum. If glibc does something cheaply, it
> > > seems reasonable to take advantage of it.
> > > 
> > 
> > Except there's always a risk of these things regressing to normal syscalls,
> > and one has to weigh the utility against that.  It's unclear to me what
> > significant utility having the sd-journal API police changing pids by
> > calling getpid() at every public entrypoint is bringing to the table.
> 
> So it seems the issue has been fixed in glibc upstream more than a year
> ago, and it doesn't seem to make sense to optimize current systemd git for
> that.
> 

Can you provide a commit id?  I took a glance at sourcewaire.org/git/gitweb.cgi
for getpid commits and didn't see anything relevant since the removal[1].


> I see the argument that the getpid() checks are a bit excessive. Is their
> overhead actually noticeable with current glibc?
> 

On my spare arch system I still see gratuitous getpid() calls from
journalctl, which is on glibc 2.5-2.

The pollution of strace output alone due to these checks is nuisance enough
for me to want the checks removed, considering their only value is to catch
programmer errors.  There's an abundance of potential programmer errors
we're not making any effort to prevent, why is this one so privileged that
it warrants policing?

I appreciate Lennart's point about the hazards of forking from threaded
programs.  It just doesn't seem like a valid rationalization for sprinkling
a system library's entrypoints with getpid() calls to catch this in
production.

Considering the associated potential costs, and the historic controversy
surrounding the caching of this particular syscall[2] I'm a bit confused by
the status quo.

Cheers,
Vito Caputo

1: 
https://sourceware.org/git/gitweb.cgi?p=glibc.git=search=HEAD=commit=getpid
2: http://yarchive.net/comp/linux/getpid_caching.html

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-10 Thread Michael Chapman

On Mon, 10 Jul 2017, Lennart Poettering wrote:

On Mon, 10.07.17 21:51, Michael Chapman (m...@very.puzzling.org) wrote:


This all stems from my experiences with PulseAudio back in the day:
People do not grok the effect of fork(): it only duplicates the
invoking thread, not any other threads of the process, moreover all
data structures are copied as they are, and that's a time bomb really:
consider one of our context objects is being used by one thread at the
moment another thread invokes fork(): the thread using the object is
busy making changes to the object, rearranging some datastructure (for
example, rehashing a hash table, because it hit its fill limit) and
suchlike. Now the fork() happens while it is doing that: the data
structure will be copied in its half-written, half-updated status quo,
and in the child process there's no thread that could finish what has
been started, and there's neither a way to rollback the changes that
are in progress.

[...]

Thanks, that really does clear things up.

It's a pity glibc doesn't provide an equivalent for pthread_atfork() outside
of the pthread library. Having a notification that a fork has just occurred
would allow us to do the PID caching ourselves.


Well, pthread_atfork() is probably more a source of problems than a solution
for them.

Mutexes and fork() do not mix well: if you have a thread that acquired
a mutex right before a fork() then it will cease to exist but the
mutex remains locked. Now, you could use pthread_atfork() to unlock
it, but that really works only in trivial cases, with trivial data
structures, and otherwise creates ABBA problems and similar. I mean,
mutexes are supposed to make pieces of code atomic from the outside
view: but if you duplicate a process without the thread it will appear
aborted to the outside, and that's quite far from "atomic"...


I understand that... which is why I was only talking about PID caching. 
That is, it could be used to avoid the getpid() calls.


Anyway, it's all moot as I don't think we'd want to use pthread_atfork in 
any systemd APIs -- I'm not sure if they all link to libpthread yet 
anyway.



Of course, there's still a problem with people calling the clone syscall
directly... but I think once people start doing that we have to trust them
to know what they're doing.


Yes: if you invoke clone() directly, you should really invoke execve()
too soon, and in the time between these two syscalls you should not
invoke getpid() and limit yourself to known safe calls.

Lennart


___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-10 Thread Lennart Poettering
On Mon, 10.07.17 21:51, Michael Chapman (m...@very.puzzling.org) wrote:

> > This all stems from my experiences with PulseAudio back in the day:
> > People do not grok the effect of fork(): it only duplicates the
> > invoking thread, not any other threads of the process, moreover all
> > data structures are copied as they are, and that's a time bomb really:
> > consider one of our context objects is being used by one thread at the
> > moment another thread invokes fork(): the thread using the object is
> > busy making changes to the object, rearranging some datastructure (for
> > example, rehashing a hash table, because it hit its fill limit) and
> > suchlike. Now the fork() happens while it is doing that: the data
> > structure will be copied in its half-written, half-updated status quo,
> > and in the child process there's no thread that could finish what has
> > been started, and there's neither a way to rollback the changes that
> > are in progress.
> [...]
> 
> Thanks, that really does clear things up.
> 
> It's a pity glibc doesn't provide an equivalent for pthread_atfork() outside
> of the pthread library. Having a notification that a fork has just occurred
> would allow us to do the PID caching ourselves.

Well, pthread_atfork() is probably more a source of problems than a solution
for them.

Mutexes and fork() do not mix well: if you have a thread that acquired
a mutex right before a fork() then it will cease to exist but the
mutex remains locked. Now, you could use pthread_atfork() to unlock
it, but that really works only in trivial cases, with trivial data
structures, and otherwise creates ABBA problems and similar. I mean,
mutexes are supposed to make pieces of code atomic from the outside
view: but if you duplicate a process without the thread it will appear
aborted to the outside, and that's quite far from "atomic"...

> Of course, there's still a problem with people calling the clone syscall
> directly... but I think once people start doing that we have to trust them
> to know what they're doing.

Yes: if you invoke clone() directly, you should really invoke execve()
too soon, and in the time between these two syscalls you should not
invoke getpid() and limit yourself to known safe calls.

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-10 Thread Michael Chapman

On Mon, 10 Jul 2017, Lennart Poettering wrote:

On Sat, 08.07.17 16:24, Michael Chapman (m...@very.puzzling.org) wrote:


On Sat, 8 Jul 2017, vcap...@pengaru.com wrote:

In doing some casual journalctl profiling and stracing, it became apparent
that `journalctl -b --no-pager` runs across a significant quantity of logs,
~10% of the time was thrown away on getpid() calls due to commmit a65f06b.

As-is:
# time ./journalctl -b --no-pager > /dev/null

real0m11.033s
user0m10.084s
sys 0m0.943s


After changing journal_pid_changed() to simply return 1:
# time ./journalctl -b --no-pager > /dev/null

 real0m9.641s
 user0m9.449s
 sys 0m0.191s


[...]


As this is public sd-journal API, it's somewhat set in stone.


So it's arguable whether making an API work in _more_ situations than it
previously did is a "breaking" change.

I've tried to go through the history for the various *_pid_changed()
functions in the APIs systemd presents, and I'm struggling to find a good
justification for them. It seems like it was originally added for sd-bus in:

  
https://github.com/systemd/systemd/commit/d5a2b9a6f455468a0f29483303657ab4fd7013d8

And then other APIs copied it to be consistent with sd-bus:

  
https://github.com/systemd/systemd/commit/a65f06bb27688a6738f2f94b7f055f4c66768d63
  
https://github.com/systemd/systemd/commit/eaa3cbef3b8c214cd5c2d75b04e70ad477187e17
  
https://github.com/systemd/systemd/commit/adf412b9ec7292e0c83aaf9ab93e08c2c8bd524a

Unfortunately none of these commits describe what will go wrong if one of
these APIs is used across fork. Does anybody know what specifically is the
problem being addressed here? Can we detect this problem in some
other way?


This all stems from my experiences with PulseAudio back in the day:
People do not grok the effect of fork(): it only duplicates the
invoking thread, not any other threads of the process, moreover all
data structures are copied as they are, and that's a time bomb really:
consider one of our context objects is being used by one thread at the
moment another thread invokes fork(): the thread using the object is
busy making changes to the object, rearranging some datastructure (for
example, rehashing a hash table, because it hit its fill limit) and
suchlike. Now the fork() happens while it is doing that: the data
structure will be copied in its half-written, half-updated status quo,
and in the child process there's no thread that could finish what has
been started, and there's neither a way to rollback the changes that
are in progress.

[...]

Thanks, that really does clear things up.

It's a pity glibc doesn't provide an equivalent for pthread_atfork() 
outside of the pthread library. Having a notification that a fork has just 
occurred would allow us to do the PID caching ourselves.


Of course, there's still a problem with people calling the clone syscall 
directly... but I think once people start doing that we have to trust them 
to know what they're doing.

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-10 Thread Lennart Poettering
On Sat, 08.07.17 16:24, Michael Chapman (m...@very.puzzling.org) wrote:

> On Sat, 8 Jul 2017, vcap...@pengaru.com wrote:
> > In doing some casual journalctl profiling and stracing, it became apparent
> > that `journalctl -b --no-pager` runs across a significant quantity of logs,
> > ~10% of the time was thrown away on getpid() calls due to commmit a65f06b.
> > 
> > As-is:
> > # time ./journalctl -b --no-pager > /dev/null
> > 
> > real0m11.033s
> > user0m10.084s
> > sys 0m0.943s
> > 
> > 
> > After changing journal_pid_changed() to simply return 1:
> > # time ./journalctl -b --no-pager > /dev/null
> > 
> >  real0m9.641s
> >  user0m9.449s
> >  sys 0m0.191s
> 
> [...]
> 
> > As this is public sd-journal API, it's somewhat set in stone.
> 
> So it's arguable whether making an API work in _more_ situations than it
> previously did is a "breaking" change.
> 
> I've tried to go through the history for the various *_pid_changed()
> functions in the APIs systemd presents, and I'm struggling to find a good
> justification for them. It seems like it was originally added for sd-bus in:
> 
>   
> https://github.com/systemd/systemd/commit/d5a2b9a6f455468a0f29483303657ab4fd7013d8
> 
> And then other APIs copied it to be consistent with sd-bus:
> 
>   
> https://github.com/systemd/systemd/commit/a65f06bb27688a6738f2f94b7f055f4c66768d63
>   
> https://github.com/systemd/systemd/commit/eaa3cbef3b8c214cd5c2d75b04e70ad477187e17
>   
> https://github.com/systemd/systemd/commit/adf412b9ec7292e0c83aaf9ab93e08c2c8bd524a
> 
> Unfortunately none of these commits describe what will go wrong if one of
> these APIs is used across fork. Does anybody know what specifically is the
> problem being addressed here? Can we detect this problem in some
> other way?

This all stems from my experiences with PulseAudio back in the day:
People do not grok the effect of fork(): it only duplicates the
invoking thread, not any other threads of the process, moreover all
data structures are copied as they are, and that's a time bomb really:
consider one of our context objects is being used by one thread at the
moment another thread invokes fork(): the thread using the object is
busy making changes to the object, rearranging some datastructure (for
example, rehashing a hash table, because it hit its fill limit) and
suchlike. Now the fork() happens while it is doing that: the data
structure will be copied in its half-written, half-updated status quo,
and in the child process there's no thread that could finish what has
been started, and there's neither a way to rollback the changes that
are in progress.

In order to protect us against the time bomb we hence have these
checks that error our early and explicitly on all our calls, with
clean error codes as soon as a fork() is detected but people try to
reuse our objects.

Now, you could argue, that on purely single-threaded processes all
this is too restrictive, but it's really not like that anymore, as many
libraries (including our own, for example in sd-resolve) fork off
background threads, and invoke our code in them. Never forget that our
libraries such as sd-bus and so on or invoked by NSS, which is very
frequently invoked in threads (because NSS is synchronous, which is
very much incompatible with most graphical spps).

So yeah, we never know in which context our stuff is called, and hence
I think it's safe to refuse in a friendly way early on rather than
just wait for the time bomb to explode when our code touches those
objects again.

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-10 Thread Lennart Poettering
On Fri, 07.07.17 14:35, vcap...@pengaru.com (vcap...@pengaru.com) wrote:

> On Fri, Jul 07, 2017 at 01:49:54PM -0700, vcap...@pengaru.com wrote:
> > On Fri, Jul 07, 2017 at 08:37:08PM +, Mantas Mikulėnas wrote:
> > > Back when that commit was made, didn't glibc cache the getpid() result in
> > > userspace? That would explain why it was not noticed.
> >
> > Hmm, this crossed my mind, and come to think of it I did a dist-upgrade
> > from Debian jessie to stretch overnight machine and haven't rebooted.
> > 
> > Perhaps the vdso isn't working and the costly getpid() is a red herring, 
> > will
> > reboot and retest to confirm.
> > 
> 
> It appears Debian has a glibc patch to disable the caching (I was unaware
> such an elaborate dance was being performed to cache this!)
> 
> https://anonscm.debian.org/cgit/pkg-glibc/glibc.git/commit/debian/patches/any?id=5850253f509604dd46a6131acc057ea26e1588ba
> 
> Unsure where I stand on core system software assuming certain syscalls are
> always going to be exceptionally cheap though...

Hmm, we generally assume that a couple of syscalls are indeed very
cheap. That's getpid(), getuid() as well as the various ways to get
the system time.

Yes, glibc caching of the PID is very annoying in some cases (as it
complicates code that invokes clone() through a direct syscall
invocation), but still, I think it should be OK assume that it is
cheap to invoke it.

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-08 Thread Michael Chapman

On Sat, 8 Jul 2017, vcap...@pengaru.com wrote:

In doing some casual journalctl profiling and stracing, it became apparent
that `journalctl -b --no-pager` runs across a significant quantity of logs,
~10% of the time was thrown away on getpid() calls due to commmit a65f06b.

As-is:
# time ./journalctl -b --no-pager > /dev/null

real0m11.033s
user0m10.084s
sys 0m0.943s


After changing journal_pid_changed() to simply return 1:
# time ./journalctl -b --no-pager > /dev/null

 real0m9.641s
 user0m9.449s
 sys 0m0.191s


[...]


As this is public sd-journal API, it's somewhat set in stone.


So it's arguable whether making an API work in _more_ situations than it 
previously did is a "breaking" change.


I've tried to go through the history for the various *_pid_changed() 
functions in the APIs systemd presents, and I'm struggling to find a good 
justification for them. It seems like it was originally added for sd-bus 
in:


  
https://github.com/systemd/systemd/commit/d5a2b9a6f455468a0f29483303657ab4fd7013d8

And then other APIs copied it to be consistent with sd-bus:

  
https://github.com/systemd/systemd/commit/a65f06bb27688a6738f2f94b7f055f4c66768d63
  
https://github.com/systemd/systemd/commit/eaa3cbef3b8c214cd5c2d75b04e70ad477187e17
  
https://github.com/systemd/systemd/commit/adf412b9ec7292e0c83aaf9ab93e08c2c8bd524a

Unfortunately none of these commits describe what will go wrong if one of 
these APIs is used across fork. Does anybody know what specifically is the 
problem being addressed here? Can we detect this problem in some other 
way?


My guess is that it has something to do with epoll's boneheaded design, 
which basically requires code to recreate its epoll sets completely after 
a fork. But surely if sd-event is the only thing that uses epoll we can 
just have only it check the PID and let it propagate ECHILD back up to 
those APIs that use it, rather than testing the PID in each and every API 
individually?



   However,
there's nothing preventing the systemd-internal tooling from linking with
a less defensive/conformant underlying implementation shared with the public
API implementation where these kinds of overheads can be preserved.


This would be useful for external users of these APIs as well.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-07 Thread Zbigniew Jędrzejewski-Szmek
On Fri, Jul 07, 2017 at 03:54:09PM -0700, vcap...@pengaru.com wrote:
> On Fri, Jul 07, 2017 at 10:34:22PM +, Zbigniew Jędrzejewski-Szmek wrote:
> > On Fri, Jul 07, 2017 at 02:35:16PM -0700, vcap...@pengaru.com wrote:
> > > On Fri, Jul 07, 2017 at 01:49:54PM -0700, vcap...@pengaru.com wrote:
> > > > On Fri, Jul 07, 2017 at 08:37:08PM +, Mantas Mikulėnas wrote:
> > > > > Back when that commit was made, didn't glibc cache the getpid() 
> > > > > result in
> > > > > userspace? That would explain why it was not noticed.
> > > >
> > > > Hmm, this crossed my mind, and come to think of it I did a dist-upgrade
> > > > from Debian jessie to stretch overnight machine and haven't rebooted.
> > > > 
> > > > Perhaps the vdso isn't working and the costly getpid() is a red 
> > > > herring, will
> > > > reboot and retest to confirm.
> > > > 
> > > 
> > > It appears Debian has a glibc patch to disable the caching (I was unaware
> > > such an elaborate dance was being performed to cache this!)
> > > 
> > > https://anonscm.debian.org/cgit/pkg-glibc/glibc.git/commit/debian/patches/any?id=5850253f509604dd46a6131acc057ea26e1588ba
> > 
> > Do we know the justification for this patch?
> > 
> 
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909
> 
> Which references this upstream glibc bug:
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=19957
> https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d
> 
> 
> > > Unsure where I stand on core system software assuming certain syscalls are
> > > always going to be exceptionally cheap though...
> > 
> > Optimization is never in a vacuum. If glibc does something cheaply, it
> > seems reasonable to take advantage of it.
> > 
> 
> Except there's always a risk of these things regressing to normal syscalls,
> and one has to weigh the utility against that.  It's unclear to me what
> significant utility having the sd-journal API police changing pids by
> calling getpid() at every public entrypoint is bringing to the table.

So it seems the issue has been fixed in glibc upstream more than a year
ago, and it doesn't seem to make sense to optimize current systemd git for
that.

I see the argument that the getpid() checks are a bit excessive. Is their
overhead actually noticeable with current glibc?

Zbyszek

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-07 Thread vcaputo
On Fri, Jul 07, 2017 at 10:34:22PM +, Zbigniew Jędrzejewski-Szmek wrote:
> On Fri, Jul 07, 2017 at 02:35:16PM -0700, vcap...@pengaru.com wrote:
> > On Fri, Jul 07, 2017 at 01:49:54PM -0700, vcap...@pengaru.com wrote:
> > > On Fri, Jul 07, 2017 at 08:37:08PM +, Mantas Mikulėnas wrote:
> > > > Back when that commit was made, didn't glibc cache the getpid() result 
> > > > in
> > > > userspace? That would explain why it was not noticed.
> > >
> > > Hmm, this crossed my mind, and come to think of it I did a dist-upgrade
> > > from Debian jessie to stretch overnight machine and haven't rebooted.
> > > 
> > > Perhaps the vdso isn't working and the costly getpid() is a red herring, 
> > > will
> > > reboot and retest to confirm.
> > > 
> > 
> > It appears Debian has a glibc patch to disable the caching (I was unaware
> > such an elaborate dance was being performed to cache this!)
> > 
> > https://anonscm.debian.org/cgit/pkg-glibc/glibc.git/commit/debian/patches/any?id=5850253f509604dd46a6131acc057ea26e1588ba
> 
> Do we know the justification for this patch?
> 

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857909

Which references this upstream glibc bug:

https://sourceware.org/bugzilla/show_bug.cgi?id=19957
https://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commit;h=0cb313f7cb0e418b3d56f3a2ac69790522ab825d


> > Unsure where I stand on core system software assuming certain syscalls are
> > always going to be exceptionally cheap though...
> 
> Optimization is never in a vacuum. If glibc does something cheaply, it
> seems reasonable to take advantage of it.
> 

Except there's always a risk of these things regressing to normal syscalls,
and one has to weigh the utility against that.  It's unclear to me what
significant utility having the sd-journal API police changing pids by
calling getpid() at every public entrypoint is bringing to the table.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-07 Thread Zbigniew Jędrzejewski-Szmek
On Fri, Jul 07, 2017 at 02:35:16PM -0700, vcap...@pengaru.com wrote:
> On Fri, Jul 07, 2017 at 01:49:54PM -0700, vcap...@pengaru.com wrote:
> > On Fri, Jul 07, 2017 at 08:37:08PM +, Mantas Mikulėnas wrote:
> > > Back when that commit was made, didn't glibc cache the getpid() result in
> > > userspace? That would explain why it was not noticed.
> >
> > Hmm, this crossed my mind, and come to think of it I did a dist-upgrade
> > from Debian jessie to stretch overnight machine and haven't rebooted.
> > 
> > Perhaps the vdso isn't working and the costly getpid() is a red herring, 
> > will
> > reboot and retest to confirm.
> > 
> 
> It appears Debian has a glibc patch to disable the caching (I was unaware
> such an elaborate dance was being performed to cache this!)
> 
> https://anonscm.debian.org/cgit/pkg-glibc/glibc.git/commit/debian/patches/any?id=5850253f509604dd46a6131acc057ea26e1588ba

Do we know the justification for this patch?

> Unsure where I stand on core system software assuming certain syscalls are
> always going to be exceptionally cheap though...

Optimization is never in a vacuum. If glibc does something cheaply, it
seems reasonable to take advantage of it.

Zbyszek
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-07 Thread vcaputo
On Fri, Jul 07, 2017 at 01:49:54PM -0700, vcap...@pengaru.com wrote:
> On Fri, Jul 07, 2017 at 08:37:08PM +, Mantas Mikulėnas wrote:
> > Back when that commit was made, didn't glibc cache the getpid() result in
> > userspace? That would explain why it was not noticed.
>
> Hmm, this crossed my mind, and come to think of it I did a dist-upgrade
> from Debian jessie to stretch overnight machine and haven't rebooted.
> 
> Perhaps the vdso isn't working and the costly getpid() is a red herring, will
> reboot and retest to confirm.
> 

It appears Debian has a glibc patch to disable the caching (I was unaware
such an elaborate dance was being performed to cache this!)

https://anonscm.debian.org/cgit/pkg-glibc/glibc.git/commit/debian/patches/any?id=5850253f509604dd46a6131acc057ea26e1588ba

Unsure where I stand on core system software assuming certain syscalls are
always going to be exceptionally cheap though...
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-07 Thread vcaputo
On Fri, Jul 07, 2017 at 01:49:54PM -0700, vcap...@pengaru.com wrote:
> On Fri, Jul 07, 2017 at 08:37:08PM +, Mantas Mikulėnas wrote:
> > Back when that commit was made, didn't glibc cache the getpid() result in
> > userspace? That would explain why it was not noticed.
>
> Hmm, this crossed my mind, and come to think of it I did a dist-upgrade
> from Debian jessie to stretch overnight machine and haven't rebooted.
> 
> Perhaps the vdso isn't working and the costly getpid() is a red herring, will
> reboot and retest to confirm.

Nope, same getpid() performance hit after rebooting.
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-07 Thread Mantas Mikulėnas
Back when that commit was made, didn't glibc cache the getpid() result in
userspace? That would explain why it was not noticed.

On Fri, Jul 7, 2017, 23:18  wrote:

> In doing some casual journalctl profiling and stracing, it became apparent
> that `journalctl -b --no-pager` runs across a significant quantity of logs,
> ~10% of the time was thrown away on getpid() calls due to commmit a65f06b.
>
> As-is:
>  # time ./journalctl -b --no-pager > /dev/null
>
>  real0m11.033s
>  user0m10.084s
>  sys 0m0.943s
>
>
> After changing journal_pid_changed() to simply return 1:
>  # time ./journalctl -b --no-pager > /dev/null
>
>   real0m9.641s
>   user0m9.449s
>   sys 0m0.191s
>
>
> More system time is being expended in repeated getpid() calls than write(),
> see the strace:
>
>  12:51:56.939287 write(1, "Jul 07 09:25:23 x61s unknown"..., 57) = 57
> <0.001276>
>  12:51:56.940633 getpid()= 10713 <0.32>
>  12:51:56.940732 getpid()= 10713 <0.12>
>  12:51:56.940801 getpid()= 10713 <0.32>
>  12:51:56.940867 getpid()= 10713 <0.41>
>  12:51:56.940942 getpid()= 10713 <0.41>
>  12:51:56.941047 getpid()= 10713 <0.12>
>  12:51:56.941117 getpid()= 10713 <0.12>
>  12:51:56.941185 getpid()= 10713 <0.11>
>  12:51:56.941253 getpid()= 10713 <0.11>
>  12:51:56.941320 getpid()= 10713 <0.39>
>  12:51:56.941395 getpid()= 10713 <0.41>
>  12:51:56.941494 getpid()= 10713 <0.11>
>  12:51:56.941561 getpid()= 10713 <0.12>
>  12:51:56.941629 getpid()= 10713 <0.39>
>  12:51:56.942942 write(1, "Jul 07 09:25:23 x61s unknown"..., 57) = 57
> <0.58>
>  12:51:56.943052 getpid()= 10713 <0.39>
>  12:51:56.943156 getpid()= 10713 <0.17>
>  12:51:56.943230 getpid()= 10713 <0.18>
>  12:51:56.943305 getpid()= 10713 <0.12>
>  12:51:56.943374 getpid()= 10713 <0.17>
>  12:51:56.943449 getpid()= 10713 <0.11>
>  12:51:56.943517 getpid()= 10713 <0.11>
>  12:51:56.943585 getpid()= 10713 <0.11>
>  12:51:56.943652 getpid()= 10713 <0.11>
>  12:51:56.943721 getpid()= 10713 <0.30>
>  12:51:56.943796 getpid()= 10713 <0.41>
>  12:51:56.943870 getpid()= 10713 <0.41>
>  12:51:56.943944 getpid()= 10713 <0.41>
>  12:51:56.944061 getpid()= 10713 <0.001334>
>  12:51:56.945459 write(1, "Jul 07 09:25:23 x61s unknown"..., 56) = 56
> <0.18>
>  12:51:56.945544 getpid()= 10713 <0.17>
>  12:51:56.945620 getpid()= 10713 <0.17>
>  12:51:56.945694 getpid()= 10713 <0.12>
>  12:51:56.945763 getpid()= 10713 <0.11>
>  12:51:56.945832 getpid()= 10713 <0.12>
>  12:51:56.945901 getpid()= 10713 <0.11>
>  12:51:56.945969 getpid()= 10713 <0.11>
>  12:51:56.946048 getpid()= 10713 <0.13>
>  12:51:56.946118 getpid()= 10713 <0.24>
>  12:51:56.946188 getpid()= 10713 <0.47>
>  12:51:56.946277 getpid()= 10713 <0.41>
>  12:51:56.946353 getpid()= 10713 <0.41>
>  12:51:56.946428 getpid()= 10713 <0.40>
>  12:51:56.946539 getpid()= 10713 <0.001363>
>
> As this is public sd-journal API, it's somewhat set in stone.  However,
> there's nothing preventing the systemd-internal tooling from linking with
> a less defensive/conformant underlying implementation shared with the
> public
> API implementation where these kinds of overheads can be preserved.
>
> For the curious; the logs being processed for this boot are 48 * 8MiB on
> SSD,
> 1.8Ghz Core2 Duo, 4.12 kernel.
>
> Regards,
> Vito Caputo
> ___
> systemd-devel mailing list
> systemd-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/systemd-devel
>
-- 

Mantas Mikulėnas 
Sent from my phone
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Significant performance loss caused by commit a65f06b: journal: return -ECHILD after a fork

2017-07-07 Thread vcaputo
Hmm, this crossed my mind, and come to think of it I did a dist-upgrade
from Debian jessie to stretch overnight machine and haven't rebooted.

Perhaps the vdso isn't working and the costly getpid() is a red herring, will
reboot and retest to confirm.


On Fri, Jul 07, 2017 at 08:37:08PM +, Mantas Mikulėnas wrote:
> Back when that commit was made, didn't glibc cache the getpid() result in
> userspace? That would explain why it was not noticed.
> 
> On Fri, Jul 7, 2017, 23:18  wrote:
> 
> > In doing some casual journalctl profiling and stracing, it became apparent
> > that `journalctl -b --no-pager` runs across a significant quantity of logs,
> > ~10% of the time was thrown away on getpid() calls due to commmit a65f06b.
> >
> > As-is:
> >  # time ./journalctl -b --no-pager > /dev/null
> >
> >  real0m11.033s
> >  user0m10.084s
> >  sys 0m0.943s
> >
> >
> > After changing journal_pid_changed() to simply return 1:
> >  # time ./journalctl -b --no-pager > /dev/null
> >
> >   real0m9.641s
> >   user0m9.449s
> >   sys 0m0.191s
> >
> >
> > More system time is being expended in repeated getpid() calls than write(),
> > see the strace:
> >
> >  12:51:56.939287 write(1, "Jul 07 09:25:23 x61s unknown"..., 57) = 57
> > <0.001276>
> >  12:51:56.940633 getpid()= 10713 <0.32>
> >  12:51:56.940732 getpid()= 10713 <0.12>
> >  12:51:56.940801 getpid()= 10713 <0.32>
> >  12:51:56.940867 getpid()= 10713 <0.41>
> >  12:51:56.940942 getpid()= 10713 <0.41>
> >  12:51:56.941047 getpid()= 10713 <0.12>
> >  12:51:56.941117 getpid()= 10713 <0.12>
> >  12:51:56.941185 getpid()= 10713 <0.11>
> >  12:51:56.941253 getpid()= 10713 <0.11>
> >  12:51:56.941320 getpid()= 10713 <0.39>
> >  12:51:56.941395 getpid()= 10713 <0.41>
> >  12:51:56.941494 getpid()= 10713 <0.11>
> >  12:51:56.941561 getpid()= 10713 <0.12>
> >  12:51:56.941629 getpid()= 10713 <0.39>
> >  12:51:56.942942 write(1, "Jul 07 09:25:23 x61s unknown"..., 57) = 57
> > <0.58>
> >  12:51:56.943052 getpid()= 10713 <0.39>
> >  12:51:56.943156 getpid()= 10713 <0.17>
> >  12:51:56.943230 getpid()= 10713 <0.18>
> >  12:51:56.943305 getpid()= 10713 <0.12>
> >  12:51:56.943374 getpid()= 10713 <0.17>
> >  12:51:56.943449 getpid()= 10713 <0.11>
> >  12:51:56.943517 getpid()= 10713 <0.11>
> >  12:51:56.943585 getpid()= 10713 <0.11>
> >  12:51:56.943652 getpid()= 10713 <0.11>
> >  12:51:56.943721 getpid()= 10713 <0.30>
> >  12:51:56.943796 getpid()= 10713 <0.41>
> >  12:51:56.943870 getpid()= 10713 <0.41>
> >  12:51:56.943944 getpid()= 10713 <0.41>
> >  12:51:56.944061 getpid()= 10713 <0.001334>
> >  12:51:56.945459 write(1, "Jul 07 09:25:23 x61s unknown"..., 56) = 56
> > <0.18>
> >  12:51:56.945544 getpid()= 10713 <0.17>
> >  12:51:56.945620 getpid()= 10713 <0.17>
> >  12:51:56.945694 getpid()= 10713 <0.12>
> >  12:51:56.945763 getpid()= 10713 <0.11>
> >  12:51:56.945832 getpid()= 10713 <0.12>
> >  12:51:56.945901 getpid()= 10713 <0.11>
> >  12:51:56.945969 getpid()= 10713 <0.11>
> >  12:51:56.946048 getpid()= 10713 <0.13>
> >  12:51:56.946118 getpid()= 10713 <0.24>
> >  12:51:56.946188 getpid()= 10713 <0.47>
> >  12:51:56.946277 getpid()= 10713 <0.41>
> >  12:51:56.946353 getpid()= 10713 <0.41>
> >  12:51:56.946428 getpid()= 10713 <0.40>
> >  12:51:56.946539 getpid()= 10713 <0.001363>
> >
> > As this is public sd-journal API, it's somewhat set in stone.  However,
> > there's nothing preventing the systemd-internal tooling from linking with
> > a less defensive/conformant underlying implementation shared with the
> > public
> > API implementation where these kinds of overheads can be preserved.
> >
> > For the curious; the logs being processed for this boot are 48 * 8MiB on
> > SSD,
> > 1.8Ghz Core2 Duo, 4.12 kernel.
> >
> > Regards,
> > Vito Caputo
> > ___
> > systemd-devel mailing list
> > systemd-devel@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/systemd-devel
> >
> -- 
> 
> Mantas Mikulėnas 
> Sent from my phone
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org