https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=293183
Bug ID: 293183
Summary: pwait returns before process actually terminates
Product: Base System
Version: 15.0-RELEASE
Hardware: amd64
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: bin
Assignee: [email protected]
Reporter: [email protected]
Perhaps I'm "holding it wrong," but it looks like pwait exits a split-second
before the target process has actually terminated.
I have a few custom rc.d scripts to handle various daemons. These scripts use
the built-in stop/restart functionality of rc.subr, which uses wait_for_pids to
wait until the daemon process has fully stopped before trying to start it
again.
Occasionally (maybe a bit less than half the time?), running `service foo
restart` fails during the "start" phase with an error saying that my daemon is
already running, and it shows the old PID, which should have been stopped.
After some investigation, it looks like pwait returns a split second before
the target process has terminated and been removed from the process table.
This appears to be some kind of race condition.
I can reproduce this fairly reliably.
I have a process running under pid 97891:
# ps -p 97891
PID TT STAT TIME COMMAND
97891 - SJ 0:01.19 /usr/local/invidious/invidious.git/invidious
Now, let's terminate it and use pwait to wait until it's terminated. Right
after the pwait returns, we'll run `ps` to show that the process is, in fact,
STILL THERE!
# kill 97891; pwait 97891; ps -p 97891
PID TT STAT TIME COMMAND
97891 - REJ 0:01.20 /usr/local/invidious/invidious.git/invidious
One second later, the process is indeed gone:
# kill -0 97891
kill: 97891: No such process
Unfortunately, this breaks rc.subr's "restart" functionality, because "service
foo stop" returns before the daemon has actually terminated, causing "service
foo start" to fail.
I have hacked around the problem with the following poststop function in my
rc.d script:
invidious_companion_poststop() {
    for i in $(seq 5); do
        pid=$(check_pidfile "$pidfile" "$procname")
        if [ -z "$pid" ]; then
            return
        else
            echo "pwait bug..."
            sleep 1
        fi
    done
}
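A more generic variant of this workaround is sketched below. It polls the PID
directly with `kill -0` (the same check used above) instead of going through
the pidfile. `wait_until_gone` is a hypothetical helper name, not something
rc.subr provides, and the one-second poll interval and default of 5 attempts
are arbitrary choices. It assumes the caller is allowed to signal the target
(e.g. it runs as root), since `kill -0` also fails on a permission error.

```sh
# Hypothetical helper (not part of rc.subr): poll until a PID has left the
# process table, or give up after a number of one-second attempts.
# Usage: wait_until_gone PID [ATTEMPTS]
# Returns 0 once the PID is gone, 1 if it is still present after ATTEMPTS.
wait_until_gone() {
    _pid=$1
    _tries=${2:-5}
    while [ "$_tries" -gt 0 ]; do
        # kill -0 delivers no signal; it only reports whether the PID can
        # still be signalled. Failure is treated as "process is gone".
        kill -0 "$_pid" 2>/dev/null || return 0
        sleep 1
        _tries=$((_tries - 1))
    done
    return 1
}
```

In an rc.d script this could be called from the poststop function as
`wait_until_gone "$old_pid" 5`, assuming the PID was saved in a variable
(here the made-up name `old_pid`) before the stop was issued.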