Re: s6-svscan does not terminate after SIGTERM under amd64 (Docker) qemulation

2022-02-06 Thread Saj Goonatilleke via skaware

On 4/2/2022 23:00, Laurent Bercot wrote:

>  So, SIGTERM does nothing because posix_spawn() is lying to s6-svscan,
> pretending to have succeeded when it doesn't know it yet (and is going
> to fail), and goading it into not doing anything. And it's lying
> because something in qemu is messing with the semantics of CLONE_VFORK.


Amazing.  Thank you, Laurent!  It probably would have taken me a while 
to turn my suspicions on such a crucial library function.


For any future s6 users who happen to stumble upon this thread in an 
archive search, the upstream qemu bug appears to be here:


https://gitlab.com/qemu-project/qemu/-/issues/140#note_567553059


>  I'm sorry, but I don't have an easy solution for you


No problem at all.  Given the proximate cause, I can avoid the fault 
without too much trouble.


Sometimes, the little devil on my shoulder yelling 'that is not right' 
makes it hard to ignore even the most practically insignificant bugs. 
(Not that posix_spawn() is, in any way, insignificant, but we do not use 
qemu in production.)  Your comprehensive reply helped put my mind at ease.


Thanks again -- for this, and for s6.


Re: s6-svscan does not terminate after SIGTERM under amd64 (Docker) qemulation

2022-02-04 Thread Laurent Bercot

> I stumbled on what might be an odd bug.


 Hi Saj,

 Thank you for such a detailed bug-report!



> The call to posix_spawn(3) appears evident there, but there is no evidence of
> any follow-up to the failure.  s6-svscan should call term() and die soon
> afterwards but that never happens.
>
> Repeating the test with no .s6-svscan/SIGTERM file produced a similar result
> (albeit with errno=2 on exec).
>
> At this stage, I am unsure where the problem lies.  It may not be in s6.


 Congratulations: you have found a bug in qemu! (unfortunately, there
are many of those.)

 posix_spawn(3) uses clone(CLONE_VM|CLONE_VFORK) + execve().
 Normally, when clone() is run with the CLONE_VFORK option, the
parent immediately stops execution (the clone() call doesn't even
return) until the child has completed, or failed, its execve().
Then the parent can resume, and check whether the child has succeeded
and is now running the new process.

 You can check the glibc source code doing it here:
https://elixir.bootlin.com/glibc/latest/source/sysdeps/unix/sysv/linux/spawni.c#L373
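
 For anyone who wants to check whether their environment shows this
behaviour, here is a minimal sketch of a probe. The function name
spawn_result() is made up for illustration and is not part of s6 or
skalibs. It asks posix_spawn() to run a path and hands back whatever
posix_spawn() returned. POSIX allows an implementation to report an
exec failure only later, through an exit status of 127, so a return
of 0 for a missing file is not a standards violation in itself; it is
simply not what glibc is expected to do on native Linux.

```c
#include <errno.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: returns 0 if posix_spawn() reported success,
 * otherwise the error number posix_spawn() returned.
 * When a child was created, it is reaped before returning. */
int spawn_result(const char *path)
{
    char *const argv[] = { (char *)path, NULL };
    pid_t pid;
    int e = posix_spawn(&pid, path, NULL, NULL, argv, environ);

    if (e == 0) {
        int status;
        /* Reap the child so the probe leaves no zombie behind. */
        while (waitpid(pid, &status, 0) < 0 && errno == EINTR) { }
    }
    return e;
}
```

 Calling spawn_result("/nonexistent/program") should give ENOENT when
the failure is reported synchronously. A result of 0 for such a path
would suggest the parent resumed before the child's execve() had
failed, which is the situation described in this thread.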

 But this is not what is happening here. Your strace shows that
clone() returns 25 before the child thread runs, and the
parent execution returns to s6-svscan: read(6, 0x1808550,128) is
s6-svscan reading on its signalfd, checking for another signal,
and returning to its ppoll() loop when it fails.

 From s6-svscan's point of view, posix_spawn() has succeeded, so
the SIGTERM is being handled and the default term() routine should
not be called - even though the child hasn't execve()'d yet. And when
the execve() happens and fails, it's just the child process terminating,
nothing to see here, nothing to do but wait4() it.
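
 The decision described above can be sketched roughly as follows.
This is not the s6-svscan source: handle_sigterm(), default_term()
and default_action_ran are hypothetical names, and the real program
does more work around this. The sketch only shows why a falsely
successful posix_spawn() means the default action is never taken.

```c
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>

extern char **environ;

/* Hypothetical stand-in for the built-in default action. */
int default_action_ran = 0;

void default_term(void)
{
    default_action_ran = 1;
}

/* Hypothetical handler: try the user-supplied script first and fall
 * back to the default action only if the spawn reports failure.
 * Returns 1 if the script was (reportedly) started, 0 otherwise. */
int handle_sigterm(const char *script)
{
    char *const argv[] = { (char *)script, NULL };
    pid_t pid;

    if (posix_spawn(&pid, script, NULL, NULL, argv, environ) == 0) {
        /* Spawn reported success: the signal counts as handled.
         * The child is expected to be reaped later by the main loop. */
        return 1;
    }

    /* Spawn reported failure: take the default action instead. */
    default_term();
    return 0;
}
```

 If posix_spawn() returns 0 for a script that cannot be executed, the
first branch is taken, default_term() is never called, and the only
trace left is a child that exits on its own.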

 So, SIGTERM does nothing because posix_spawn() is lying to s6-svscan,
pretending to have succeeded when it doesn't know it yet (and is going
to fail), and goading it into not doing anything. And it's lying
because something in qemu is messing with the semantics of CLONE_VFORK.

 I'm sorry, but I don't have an easy solution for you, apart from
fixing qemu (which I'm sure everyone here has already done twice
before breakfast).

 The only workaround I can suggest is to rebuild the s6 stack
(skalibs/execline/s6) with posix_spawn() disabled; to do that, add
"--with-sysdep-posixspawn=no" to the ./configure command line when
building skalibs. But of course, that makes it impossible to use
prebuilt packages.
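
 To illustrate how a spawn built on plain fork() can still report an
exec failure to the parent without relying on CLONE_VFORK, here is a
sketch of the well-known close-on-exec pipe technique. It is not
necessarily what skalibs does when posix_spawn() is disabled, and
fork_exec_checked() is a name invented for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork and exec `path`. Returns 0 and stores the
 * child's pid in *pid_out if execv() succeeded, otherwise returns the
 * errno of the step that failed. */
int fork_exec_checked(const char *path, char *const argv[], pid_t *pid_out)
{
    int p[2];
    int e;

    if (pipe(p) < 0) return errno;
    /* Both ends close automatically on a successful execv(). */
    if (fcntl(p[0], F_SETFD, FD_CLOEXEC) < 0 ||
        fcntl(p[1], F_SETFD, FD_CLOEXEC) < 0) {
        e = errno; close(p[0]); close(p[1]);
        return e;
    }

    pid_t pid = fork();
    if (pid < 0) {
        e = errno; close(p[0]); close(p[1]);
        return e;
    }

    if (pid == 0) {
        /* Child: if execv() returns, it failed; send errno to the parent. */
        close(p[0]);
        execv(path, argv);
        e = errno;
        ssize_t w = write(p[1], &e, sizeof e);
        (void)w;
        _exit(127);
    }

    /* Parent: EOF means exec succeeded, data means it failed. */
    close(p[1]);
    int child_errno = 0;
    ssize_t r;
    do {
        r = read(p[0], &child_errno, sizeof child_errno);
    } while (r < 0 && errno == EINTR);
    e = errno;
    close(p[0]);

    if (r == (ssize_t)sizeof child_errno) {
        /* Exec failed: reap the child and report its errno. */
        while (waitpid(pid, NULL, 0) < 0 && errno == EINTR) { }
        return child_errno;
    }

    *pid_out = pid;
    /* A read error is reported, but the child may still be running. */
    return r < 0 ? e : 0;
}
```

 The parent blocks in read() until the child has either exec'd
(closing the pipe) or written its errno, so the result is known
before the function returns, whatever the emulator does with vfork
semantics. The cost is a full fork() rather than a vfork-style clone.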

--
 Laurent