Date: Sun, 11 Mar 2018 11:06:33 +0100
From: Martin Husemann <mar...@duskware.de>
| I don't get this part - how would we end up with the new process using
| the same pid?
From what I can see from glancing at the code, the issue is an attempt to
monitor an unrelated process - one that is neither a child, nor being ptrace'd.
That process can exit, its zombie be cleaned up, and then a new process
created which happens to have the same pid as the previous one had
(most of this is intended to happen quite quickly, but there's no guarantee
of that - the process doing the monitoring could be suspended and wake up
days after the process it was looking for vanished).
All this is inherently unreliable, and nothing that is done, beyond adding a
whole new mechanism to hold a process that some other process has an
interest in, will ever fix it.
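To make the reuse concrete, here is a toy sketch (not NetBSD code) of a pid
table so small that recycling happens almost at once. Everything named toy_*,
the TOY_PID_MAX limit and the "identity" field are invented for illustration;
the real allocator is more involved. The point is only that a watcher which
remembers nothing but the pid cannot tell the old process from the new one.

```c
#include <assert.h>
#include <string.h>

#define TOY_PID_MAX 4	/* hypothetical, tiny so that reuse happens quickly */

/*
 * One slot per pid.  identity[pid] == 0 means "no such process".
 * The identity value stands in for "which process this really is";
 * a watcher that only knows the pid never gets to see it.
 */
struct toy_pidtable {
	unsigned long identity[TOY_PID_MAX + 1];
	unsigned long next_identity;
	int last_pid;
};

static void
toy_init(struct toy_pidtable *t)
{
	memset(t, 0, sizeof(*t));
	t->next_identity = 1;
}

/* Hand out the next free pid after last_pid, wrapping around; -1 if full. */
static int
toy_fork(struct toy_pidtable *t)
{
	for (int i = 0; i < TOY_PID_MAX; i++) {
		int pid = (t->last_pid + i) % TOY_PID_MAX + 1;

		if (t->identity[pid] == 0) {
			t->identity[pid] = t->next_identity++;
			t->last_pid = pid;
			return pid;
		}
	}
	return -1;
}

/* The process exits and its zombie is collected: the pid becomes free. */
static void
toy_exit_and_reap(struct toy_pidtable *t, int pid)
{
	assert(pid >= 1 && pid <= TOY_PID_MAX);
	t->identity[pid] = 0;
}

/* All that a pid-only observer can ask: "is there a process with this pid?" */
static int
toy_pid_exists(const struct toy_pidtable *t, int pid)
{
	assert(pid >= 1 && pid <= TOY_PID_MAX);
	return t->identity[pid] != 0;
}
```

If the watched process is reaped and toy_fork() then hands the same number to
a newcomer, toy_pid_exists() says "yes" again, although the identity stored in
the slot has changed.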
Kamil: What I don't understand is how you were ever getting the process
returned twice? You're using the sysctl to look for a specific pid, right?
When that pid is found, the sysctl code should simply copy out the relevant
data, and return. There's no point searching further in the lists: one pid
can only exist once at a time - once found, it is found.
If that is not the way the sysctl lookup code is working then we should
probably fix it. There cannot be 2 processes with pid N at any one
instant, so looking for a specific pid should only ever be able to return one
process (or "not found", of course).
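A minimal sketch of the behaviour I'd expect, assuming a simple singly linked
list: this is not the actual sysctl code, and struct toy_proc and
toy_lookup_pid() are made-up names. The lookup copies out the first match and
returns, so it can never report more than one entry for a pid, whatever the
rest of the list looks like at that moment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a process list entry. */
struct toy_proc {
	int pid;
	struct toy_proc *next;
};

/*
 * Look for one specific pid.  Copies the first match into *out and
 * stops right there.  Returns the number of entries copied: 0 or 1.
 */
static int
toy_lookup_pid(const struct toy_proc *head, int pid, struct toy_proc *out)
{
	assert(out != NULL);

	for (const struct toy_proc *p = head; p != NULL; p = p->next) {
		if (p->pid == pid) {
			*out = *p;	/* copy out the relevant data ... */
			return 1;	/* ... and return, no further search */
		}
	}
	return 0;			/* not found */
}
```

Even if a list being modified underneath the search briefly showed the same
pid twice, a lookup written this way would still return it only once.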
It is possible to miss the process even though it exists, depending on what
kind of locking the lookup code does (for this, being just an "observation"
interface, I'd assume the minimum possible), if the lists are changing
underneath the search. But given the nature of what happens to a process, a
search of zombproc, allproc, zombproc (stopping when found) will either find
the process or the process does not exist - and possibly just allproc followed
by zombproc searches would work as well.
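The search order above could be sketched like this. It is only an
illustration under the assumption that a process moves in one direction, from
allproc to zombproc, when it exits; struct sketch_proc, sketch_scan() and
sketch_find() are invented names, the lists are plain linked lists, and no
locking is shown at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a process list entry. */
struct sketch_proc {
	int pid;
	struct sketch_proc *next;
};

/* Scan one list, stopping at the first entry with a matching pid. */
static const struct sketch_proc *
sketch_scan(const struct sketch_proc *head, int pid)
{
	for (const struct sketch_proc *p = head; p != NULL; p = p->next) {
		if (p->pid == pid)
			return p;
	}
	return NULL;
}

/*
 * Search zombproc, then allproc, then zombproc again, stopping as soon
 * as the pid is found.  The final pass is there for a process that was
 * still alive during the first pass and had already left allproc by the
 * time the second pass looked for it.
 */
static const struct sketch_proc *
sketch_find(const struct sketch_proc *zombproc,
    const struct sketch_proc *allproc, int pid)
{
	const struct sketch_proc *const order[] = { zombproc, allproc, zombproc };

	assert(pid > 0);

	for (size_t i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
		const struct sketch_proc *p = sketch_scan(order[i], pid);

		if (p != NULL)
			return p;	/* once found it is found */
	}
	return NULL;			/* no such process */
}
```

Dropping the first element of order[] gives the shorter allproc-then-zombproc
variant.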