On Mon, 10 Jul 2017, Lennart Poettering wrote:
On Sat, 08.07.17 16:24, Michael Chapman (m...@very.puzzling.org) wrote:

On Sat, 8 Jul 2017, vcap...@pengaru.com wrote:
In doing some casual journalctl profiling and stracing, it became apparent
that when `journalctl -b --no-pager` runs across a significant quantity of logs,
~10% of the time is thrown away on getpid() calls due to commit a65f06b.

As-is:
# time ./journalctl -b --no-pager > /dev/null

real    0m11.033s
user    0m10.084s
sys     0m0.943s


After changing journal_pid_changed() to simply return 1:
# time ./journalctl -b --no-pager > /dev/null

real    0m9.641s
user    0m9.449s
sys     0m0.191s

[...]

As this is public sd-journal API, it's somewhat set in stone.

So it's arguable whether making an API work in _more_ situations than it
previously did is a "breaking" change.

I've tried to go through the history for the various *_pid_changed()
functions in the APIs systemd presents, and I'm struggling to find a good
justification for them. It seems like it was originally added for sd-bus in:

  https://github.com/systemd/systemd/commit/d5a2b9a6f455468a0f29483303657ab4fd7013d8

And then other APIs copied it to be consistent with sd-bus:

  https://github.com/systemd/systemd/commit/a65f06bb27688a6738f2f94b7f055f4c66768d63
  https://github.com/systemd/systemd/commit/eaa3cbef3b8c214cd5c2d75b04e70ad477187e17
  https://github.com/systemd/systemd/commit/adf412b9ec7292e0c83aaf9ab93e08c2c8bd524a

Unfortunately none of these commits describe what will go wrong if one of
these APIs is used across fork. Does anybody know what specifically is the
problem being addressed here? Can we detect this problem in some
other way?

This all stems from my experiences with PulseAudio back in the day:
People do not grok the effect of fork(): it only duplicates the
invoking thread, not any other threads of the process, moreover all
data structures are copied as they are, and that's a time bomb really:
consider one of our context objects is being used by one thread at the
moment another thread invokes fork(): the thread using the object is
busy making changes to the object, rearranging some datastructure (for
example, rehashing a hash table, because it hit its fill limit) and
suchlike. Now the fork() happens while it is doing that: the data
structure will be copied in its half-written, half-updated status quo,
and in the child process there's no thread that could finish what has
been started, nor is there a way to roll back the changes that
are in progress.
[...]
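
For illustration, the kind of guard being discussed can be sketched roughly
as below. This is not the sd-journal source: `struct ctx`, `ctx_init()`,
`ctx_pid_changed()` and `ctx_next()` are hypothetical names, and returning
-ECHILD is an assumption modelled on how the sd-journal calls report a
journal object that is used from a different process than the one that
created it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical context object, standing in for something like sd_journal. */
struct ctx {
    pid_t original_pid;   /* PID of the process that created the object */
    int entries_read;     /* placeholder for real internal state */
};

static void ctx_init(struct ctx *c) {
    assert(c);
    c->original_pid = getpid();
    c->entries_read = 0;
}

/* True if the object is being used in a different process than the one
 * that created it, i.e. it was carried across a fork(). */
static bool ctx_pid_changed(const struct ctx *c) {
    assert(c);
    return c->original_pid != getpid();
}

/* Every public entry point refuses to touch the possibly half-updated
 * state in a forked child and reports an error instead. */
static int ctx_next(struct ctx *c) {
    assert(c);

    if (ctx_pid_changed(c))
        return -ECHILD;

    c->entries_read++;
    return 1;
}
```

The getpid() in ctx_pid_changed() runs on every call into the API, which is
where the cost shows up when a tool iterates over a large number of entries.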

Thanks, that really does clear things up.

It's a pity glibc doesn't provide an equivalent for pthread_atfork() outside of the pthread library. Having a notification that a fork has just occurred would allow us to do the PID caching ourselves.
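
To make the idea concrete, here is a rough sketch of PID caching that is
invalidated from a fork notification. It uses pthread_atfork() itself, so it
does pull in the pthread API; `getpid_cached_sketch()` and
`reset_cached_pid()` are made-up names for this example, not existing
functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <unistd.h>

/* 0 means "nothing cached yet"; a real PID is never 0. */
static _Atomic pid_t cached_pid = 0;
static pthread_once_t atfork_once = PTHREAD_ONCE_INIT;
static int atfork_rc = -1;

/* Runs in the child right after fork(): forget the parent's PID. */
static void reset_cached_pid(void) {
    atomic_store(&cached_pid, 0);
}

static void install_atfork(void) {
    atfork_rc = pthread_atfork(NULL, NULL, reset_cached_pid);
}

static pid_t getpid_cached_sketch(void) {
    pid_t p;

    /* Register the child handler exactly once. If that fails, caching
     * would be unsafe, so fall back to the real call every time. */
    if (pthread_once(&atfork_once, install_atfork) != 0 || atfork_rc != 0)
        return getpid();

    p = atomic_load(&cached_pid);
    if (p == 0) {
        p = getpid();
        atomic_store(&cached_pid, p);
    }
    return p;
}
```

With something like this, a *_pid_changed() style check could compare
against the cached value and only pay for the real getpid() once per
process.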

Of course, there's still a problem with people calling the clone syscall directly... but I think once people start doing that we have to trust them to know what they're doing.