On Fri, Dec 6, 2013 at 3:13 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote: > On Thu, Dec 05, 2013 at 10:12:00AM -0600, Will Drewry wrote: >> On Thu, Dec 5, 2013 at 7:15 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote: >> > On Wed, Dec 04, 2013 at 11:21:12AM -0200, Eduardo Otubo wrote: >> >> On 12/04/2013 07:39 AM, Stefan Hajnoczi wrote: >> >> >On Fri, Nov 22, 2013 at 11:00:24AM -0500, Paul Moore wrote: >> >> >>>Developers will only be happy with seccomp if it's easy and rewarding >> >> >>>to >> >> >>>support/debug. >> >> >> >> >> >>Agreed. >> >> >> >> >> >>As a developer, how do you feel about the audit/syslog based approach I >> >> >>mentioned earlier? >> >> > >> >> >I used the commands you posted (I think that's what you mean). They >> >> >produce useful output. >> >> > >> >> >The problem is that without an error message on stderr or from the >> >> >shell, no one will think "QEMU process dead and hung == check seccomp" >> >> >immediately. It's frustrating to deal with a "silent" failure. >> >> >> >> The process dies with a SIGKILL, and sig handling in Qemu is hard to >> >> implement due to dozen of external linked libraries that has their >> >> own signal masks and conflicts with seccomp. I've already tried this >> >> approach in the past (you can find in the list by searching for >> >> debug mode) >> > >> > I now realize we may be talking past each other. Dying with >> > SIGKILL/SIGSYS is perfectly reasonable and I would be happy with that >> > :-). >> > >> > But I think there's a bug in seccomp: a multi-threaded process can be >> > left in a zombie state. In my case the primary thread was killed by >> > seccomp but another thread was deadlocked on a futex. >> > >> > The result is the process isn't quite dead yet. The shell will not reap >> > it and we're stuck with a zombie. >> > >> > I can reproduce it reliably when I run "qemu-system-x86_64 -sandbox on" >> > on Fedora 20 (qemu-system-x86-1.6.1-2). >> > >> > Should seccomp use do_group_exit() for SIGKILL? >> >> Is the problem that the SECCOMP_RET_KILL didn't take down the thread >> group (which would be a departure from how seccomp(mode=1) worked) and >> causes the deadlock somehow, or is it that the other thread is >> deadlocked? > > The former. > > When the first thread is killed by seccomp, the second thread in the > process is left waiting on a futex forever. Therefore the process never > exits after the seccomp violation occurs. > > Directing the signal at a thread makes perfect sense for > SECCOMP_RET_TRAP since the thread can handle the signal and recover. > But for SECCOMP_RET_KILL it's probably more useful to kill the entire > process rather than just a single thread.
Would it be possible to just have the offending thread die with SECCOMP_RET_TRAP then have a SIGSYS handler that calls tgkill? > >> Regardless, adding a SECCOMP_RET_TGKILL probably isn't a bad idea :) > > Yes. Do you have time for that or would you like me to send a patch? A straight SECCOMP_RET_TGKILL code could be a bit awkward, but it might make sense to add tgkill as "data" OR'd with RET_KILL since those 16 bits of data are still unused. I didn't have a clear plan for those bits with RET_KILL, but I think they would be fair game for this sort of extension if it is really impractical to use a trap. I'll play with a patch next week, but I'd be happy to see alternative approaches too! (I know I've been kicking around a separate patch for apply per-task behavioral flags for filters that this could fit in too, but there will be some ABI challenges that have led me to approach it _very_ slowly.) thanks!