On Tue, 30 Apr 2019 at 5:55, Laurent Bercot wrote:
>
> >haven't you claimed process #1 should supervise long running
> >child processes ? runit fulfils exactly this requirement by
> >supervising the supervisor.
>
>  Not exactly, no.
>  If something kills runsvdir, then runit immediately enters
> stage 3, and reboots the system. This is an acceptable response
> to the scanner dying, but is not the same thing as supervising
> it. If runsvdir's death is accidental, the system goes through
> an unnecessary reboot.

If the /etc/runit/2 process exits with code 111 or gets killed by a
signal, the runit program is actually supposed to respawn it, according
to its man page. I believe this counts as supervising at least one
process, so it would put runit in the "correct init" camp :)

There is code that checks the 'wstat' value filled in by the
wait_nohang(&wstat) call that reaps the /etc/runit/2 process. However,
it is executed only if wait_exitcode(wstat) != 0. On my computer,
wait_exitcode() returns 0 if its argument is the wstat of a process
killed by a signal, so runit indeed spawns /etc/runit/3 instead of
respawning /etc/runit/2 when, for example, I point a gun at runsvdir on
purpose and use a kill -int command specifying its PID.

Changing the condition to

  wait_crashed(wstat) || (wait_exitcode(wstat) != 0)

makes things work as intended.

G.
