Re: [Qemu-devel] [PATCH for-1.7] seccomp: setting "-sandbox on" by default

Will Drewry Fri, 06 Dec 2013 07:43:07 -0800

On Fri, Dec 6, 2013 at 3:13 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> On Thu, Dec 05, 2013 at 10:12:00AM -0600, Will Drewry wrote:
>> On Thu, Dec 5, 2013 at 7:15 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>> > On Wed, Dec 04, 2013 at 11:21:12AM -0200, Eduardo Otubo wrote:
>> >> On 12/04/2013 07:39 AM, Stefan Hajnoczi wrote:
>> >> >On Fri, Nov 22, 2013 at 11:00:24AM -0500, Paul Moore wrote:
>> >> >>>Developers will only be happy with seccomp if it's easy and rewarding to
>> >> >>>support/debug.
>> >> >>
>> >> >>Agreed.
>> >> >>
>> >> >>As a developer, how do you feel about the audit/syslog based approach I
>> >> >>mentioned earlier?
>> >> >
>> >> >I used the commands you posted (I think that's what you mean).  They
>> >> >produce useful output.
>> >> >
>> >> >The problem is that without an error message on stderr or from the
>> >> >shell, no one will think "QEMU process dead and hung == check seccomp"
>> >> >immediately.  It's frustrating to deal with a "silent" failure.
>> >>
>> >> The process dies with a SIGKILL, and signal handling in QEMU is hard to
>> >> implement because dozens of externally linked libraries have their
>> >> own signal masks and conflict with seccomp. I've already tried this
>> >> approach in the past (you can find it on the list by searching for
>> >> "debug mode").
>> >
>> > I now realize we may be talking past each other.  Dying with
>> > SIGKILL/SIGSYS is perfectly reasonable and I would be happy with that
>> > :-).
>> >
>> > But I think there's a bug in seccomp: a multi-threaded process can be
>> > left in a zombie state.  In my case the primary thread was killed by
>> > seccomp but another thread was deadlocked on a futex.
>> >
>> > The result is the process isn't quite dead yet.  The shell will not reap
>> > it and we're stuck with a zombie.
>> >
>> > I can reproduce it reliably when I run "qemu-system-x86_64 -sandbox on"
>> > on Fedora 20 (qemu-system-x86-1.6.1-2).
>> >
>> > Should seccomp use do_group_exit() for SIGKILL?
>>
>> Is the problem that the SECCOMP_RET_KILL didn't take down the thread
>> group (which would be a departure from how seccomp(mode=1) worked) and
>> causes the deadlock somehow, or is it that the other thread is
>> deadlocked?
>
> The former.
>
> When the first thread is killed by seccomp, the second thread in the
> process is left waiting on a futex forever.  Therefore the process never
> exits after the seccomp violation occurs.
>
> Directing the signal at a thread makes perfect sense for
> SECCOMP_RET_TRAP since the thread can handle the signal and recover.
> But for SECCOMP_RET_KILL it's probably more useful to kill the entire
> process rather than just a single thread.


Would it be possible to just have the offending thread hit
SECCOMP_RET_TRAP and then have a SIGSYS handler that calls tgkill?
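
A minimal sketch of that idea, assuming the filter returns
SECCOMP_RET_TRAP for the disallowed syscalls and still allows write,
getpid, gettid, tgkill and exit_group. The function names are
hypothetical and not taken from QEMU, and the demo raises SIGSYS by hand
instead of installing a real filter:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs in the thread that received SIGSYS (from SECCOMP_RET_TRAP). */
static void sigsys_handler(int signo, siginfo_t *info, void *ctx)
{
    static const char msg[] = "seccomp: blocked syscall, killing process\n";
    ssize_t n;

    (void)signo;
    (void)info;
    (void)ctx;

    /* write(2) is async-signal-safe; the filter has to allow it. */
    n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)n;

    /* SIGKILL cannot be caught or blocked, so it ends every thread. */
    syscall(SYS_tgkill, getpid(), (pid_t)syscall(SYS_gettid), SIGKILL);
    _exit(1); /* only reached if tgkill failed */
}

/* Hypothetical helper: returns 0 on success, -1 on error. */
int install_sigsys_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigsys_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSYS, &sa, NULL);
}

/*
 * Demo: fork a child, deliver SIGSYS by hand in place of a real filter
 * trap, and return the signal that terminated the child (-1 on error).
 */
int demo_child_term_signal(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_sigsys_handler() != 0)
            _exit(2);
        raise(SIGSYS);
        _exit(3); /* not reached if the handler worked */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Because the handler sends SIGKILL rather than a catchable signal, the
whole thread group goes away and the shell can reap the process. The
message on stderr also addresses the "silent failure" complaint earlier
in the thread.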

>
>> Regardless, adding a SECCOMP_RET_TGKILL probably isn't a bad idea :)
>
> Yes.  Do you have time for that or would you like me to send a patch?

A straight SECCOMP_RET_TGKILL code could be a bit awkward, but it
might make sense to add tgkill as "data" OR'd with RET_KILL since
those 16 bits of data are still unused.  I didn't have a clear plan
for those bits with RET_KILL, but I think they would be fair game for
this sort of extension if it is really impractical to use a trap.
I'll play with a patch next week, but I'd be happy to see alternative
approaches too!  (I know I've been kicking around a separate patch for
applying per-task behavioral flags for filters that this could fit in
too, but there will be some ABI challenges that have led me to
approach it _very_ slowly.)
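
To illustrate the "data OR'd with RET_KILL" idea, here is a rough sketch
of the bit layout. SECCOMP_RET_KILL, SECCOMP_RET_ACTION and
SECCOMP_RET_DATA come from <linux/seccomp.h>. The flag
SKETCH_KILL_TGKILL and both helpers are hypothetical names invented for
this example and are not part of any kernel ABI:

```c
#include <linux/seccomp.h>
#include <stdint.h>

/* Hypothetical flag living in the low 16 "data" bits. */
#define SKETCH_KILL_TGKILL 0x0001U

/* Filter side: build a RET_KILL value that asks for a group kill. */
static inline uint32_t sketch_ret_kill_group(void)
{
    return SECCOMP_RET_KILL | (SKETCH_KILL_TGKILL & SECCOMP_RET_DATA);
}

/* Consumer side: is this a RET_KILL with the hypothetical flag set? */
static inline int sketch_wants_group_kill(uint32_t ret)
{
    return (ret & SECCOMP_RET_ACTION) == SECCOMP_RET_KILL &&
           (ret & SECCOMP_RET_DATA & SKETCH_KILL_TGKILL) != 0;
}
```

Existing filters that return a bare SECCOMP_RET_KILL leave the data bits
at zero, so under this scheme they would keep killing only the offending
thread.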

thanks!
