Hi Ludo!

Ludovic Courtès <[email protected]> writes:

> Hello,
>
> Maxim Cournoyer <[email protected]> skribis:
>
>> + herd -s t-socket-1651 status root
>> Started:
>> + root
>> + herd -s t-socket-1651 stop root
>> ++ cat t-pid-1651
>> + kill 1896
>> + exit 1
>> + rm -f t-socket-1651
>> + test -f t-pid-1651
>> ++ cat t-pid-1651
>> + kill 1896
>> + rm -f t-pid-1651
>> FAIL tests/no-home.sh (exit status: 1)
>
> What happens here is that the shepherd process is still alive after
> ‘herd stop root’ has completed, contrary to what’s expected:
>
>   $herd stop root
>
>   if kill `cat "$pid"`
>   then
>       exit 1
>   fi

Yes!

[...]

> Maybe there’s a chance that the shell hasn’t processed the shepherd’s
> SIGCHLD when it evaluates the “if kill `cat "$pid"`” condition; in that
> case, the shepherd process still exists as a zombie.
>
> A more robust approach might be to use the shell’s builtin ‘wait’,
> because then I suppose the shell will be forced to process pending
> SIGCHLDs:
>
> diff --git a/tests/no-home.sh b/tests/no-home.sh
> index 85b6116..5a8c278 100644
> --- a/tests/no-home.sh
> +++ b/tests/no-home.sh
> @@ -1,5 +1,5 @@
>  # GNU Shepherd --- Make sure shepherd doesn't fail when $HOME is not writable.
> -# Copyright © 2014, 2016 Ludovic Courtès <[email protected]>
> +# Copyright © 2014, 2016, 2022 Ludovic Courtès <[email protected]>
>  #
>  # This file is part of the GNU Shepherd.
>  #
> @@ -46,7 +46,4 @@ kill -0 `cat "$pid"`
>  $herd status root
>  $herd stop root
>
> -if kill `cat "$pid"`
> -then
> -    exit 1
> -fi
> +wait `cat "$pid"`

As I wrote, I was also unable to reproduce this (but when I had a high
load of packages to build at the same time, I could get it to happen a
couple times upon retrying).

Your analysis (and the narrow window which would allow for a failure)
makes sense to me, along with the proposed fix.

I think you should commit it and tentatively mark this bug as fixed :-).

Thank you for looking into it!

Maxim
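
P.S. For illustration only, here is a minimal standalone sketch of the
race discussed above. It is not taken from tests/no-home.sh: the
‘sleep 60’ child is a hypothetical stand-in for the shepherd process,
and the variable names are made up. The idea is that a terminated child
that has not been reaped yet can still be signalled, whereas after the
shell’s ‘wait’ returns the child is gone for good.

```shell
#!/bin/sh
# Hypothetical stand-in for shepherd: a background child of this shell.
sleep 60 &
child=$!

# Ask the child to terminate (in the real test, 'herd stop root' does this).
kill "$child"

# At this point 'kill "$child"' may still succeed if the child has exited
# but the shell has not reaped it yet (it lingers as a zombie).

# 'wait' blocks until the child has terminated and reaps it.
wait "$child"
status=$?

# After 'wait' has returned, signalling the PID is expected to fail.
if kill -0 "$child" 2>/dev/null
then
    result="still alive"
else
    result="gone"
fi
echo "$result (wait status: $status)"
```

Note that ‘wait’ only works for children of the invoking shell, which I
assume is the case for the shepherd process started by the test.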
