On Thu, Dec 05, 2013 at 10:12:00AM -0600, Will Drewry wrote:
> On Thu, Dec 5, 2013 at 7:15 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> > On Wed, Dec 04, 2013 at 11:21:12AM -0200, Eduardo Otubo wrote:
> >> On 12/04/2013 07:39 AM, Stefan Hajnoczi wrote:
> >> >On Fri, Nov 22, 2013 at 11:00:24AM -0500, Paul Moore wrote:
> >> >>>Developers will only be happy with seccomp if it's easy and rewarding to
> >> >>>support/debug.
> >> >>
> >> >>Agreed.
> >> >>
> >> >>As a developer, how do you feel about the audit/syslog based approach I
> >> >>mentioned earlier?
> >> >
> >> >I used the commands you posted (I think that's what you mean). They
> >> >produce useful output.
> >> >
> >> >The problem is that without an error message on stderr or from the
> >> >shell, no one will think "QEMU process dead and hung == check seccomp"
> >> >immediately. It's frustrating to deal with a "silent" failure.
> >>
> >> The process dies with a SIGKILL, and sig handling in Qemu is hard to
> >> implement due to dozen of external linked libraries that has their
> >> own signal masks and conflicts with seccomp. I've already tried this
> >> approach in the past (you can find in the list by searching for
> >> debug mode)
> >
> > I now realize we may be talking past each other. Dying with
> > SIGKILL/SIGSYS is perfectly reasonable and I would be happy with that
> > :-).
> >
> > But I think there's a bug in seccomp: a multi-threaded process can be
> > left in a zombie state. In my case the primary thread was killed by
> > seccomp but another thread was deadlocked on a futex.
> >
> > The result is the process isn't quite dead yet. The shell will not reap
> > it and we're stuck with a zombie.
> >
> > I can reproduce it reliably when I run "qemu-system-x86_64 -sandbox on"
> > on Fedora 20 (qemu-system-x86-1.6.1-2).
> >
> > Should seccomp use do_group_exit() for SIGKILL?
>
> Is the problem that the SECCOMP_RET_KILL didn't take down the thread
> group (which would be a departure from how seccomp(mode=1) worked) and
> causes the deadlock somehow, or is it that the other thread is
> deadlocked?

The former. When the first thread is killed by seccomp, the second
thread in the process is left waiting on a futex forever. Therefore the
process never exits after the seccomp violation occurs.

Directing the signal at a thread makes perfect sense for
SECCOMP_RET_TRAP since the thread can handle the signal and recover.
But for SECCOMP_RET_KILL it's probably more useful to kill the entire
process rather than just a single thread.

> Regardless, adding a SECCOMP_RET_TGKILL probably isn't a bad idea :)

Yes. Do you have time for that or would you like me to send a patch?

Stefan
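
For anyone following along, the thread-versus-thread-group difference
described above can be seen without seccomp at all. What follows is only
a rough sketch, assuming Linux with pthreads. It does not install a
seccomp filter; the raw exit and exit_group syscalls stand in for the
per-thread and whole-group exit paths, and the helper name
process_lingers() is made up for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Second thread: blocks on a mutex that the main thread never releases,
 * so it ends up sleeping in a futex wait. */
static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    return NULL;
}

/*
 * Hypothetical helper for this illustration only.
 *
 * Forks a child with two threads. The child's main thread then leaves via
 * exit (calling thread only) or exit_group (whole thread group).
 * Returns 1 if the child could not be reaped within roughly 500 ms,
 * 0 if it was reaped normally.
 */
int process_lingers(int use_group_exit)
{
    pid_t pid = fork();
    assert(pid >= 0);

    if (pid == 0) {
        pthread_t t;

        pthread_mutex_lock(&lock);            /* held until the thread dies */
        if (pthread_create(&t, NULL, waiter, NULL) != 0)
            _exit(2);

        /* Stand-in for the kernel killing one thread vs. the group. */
        syscall(use_group_exit ? SYS_exit_group : SYS_exit, 0);
        _exit(3);                             /* not reached */
    }

    /* Parent: poll for the child without blocking forever. */
    for (int i = 0; i < 50; i++) {
        struct timespec ts = { 0, 10 * 1000 * 1000 };   /* 10 ms */

        if (waitpid(pid, NULL, WNOHANG) == pid)
            return 0;
        nanosleep(&ts, NULL);
    }

    /* Still not reapable: the leader is gone but the waiter keeps the
     * thread group alive. Clean up so nothing is left behind. */
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return 1;
}
```

On a Linux box I would expect process_lingers(0) to report the child as
stuck and process_lingers(1) to report it as reaped, which is the same
shape as the qemu case: the leader is dead, but the parent cannot reap
the process while the other thread sits on the futex.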