On Fri, Jan 27, 2017 at 01:48:30PM -0800, Kees Cook wrote: > On Wed, Jan 25, 2017 at 12:05 PM, Kees Cook <keesc...@chromium.org> wrote: > > On Tue, Jan 24, 2017 at 4:53 PM, Andrei Vagin <ava...@virtuozzo.com> wrote: > >> Hi, > >> > >> One of CRIU tests fails with this patch: > >> https://github.com/xemul/criu/blob/master/test/zdtm/static/seccomp_filter_tsync.c > >> > >> Before this patch only a thread which called a "wrong" syscall is killed. > >> Now a whole process is killed if one of threads called a "wrong" syscall. > > > > Oh ew. I wonder what is causing this? In other do_coredump() callers, > > they explicitly call do_group_exit(). Hmmm > > We need to find a way to fix this or remove the coredump change from > -next. We have a few things that have come up recently (coincident > with the coredump change): > > - some folks would like seccomp kills to kill the entire process not > just the thread > - on a full-process kill, there needs to be a way to get a coredump > - on a kill, it would be nice to have reliable logging > > Getting a coredump requires a full-process kill. It is possible to do > this already with RET_TRAP and just not catch the SIGSYS. However, > this isn't sufficient if you want to be _sure_ the entire process gets > killed since RET_TRAP depends on cooperation from the process. > > Getting reliable logging out of seccomp for non-RET_KILL is > non-trivial because syscall-audit doesn't track forks. > > The RET_* values are part of the UAPI, so changing or adding to them > requires care. > > Right now we have very little room in the RET_* values (the lower > bytes are for the RET_DATA which is ignored for RET_KILL and the > semantics of changing that is very difficult): > > #define SECCOMP_RET_KILL 0x00000000U /* kill the task immediately */ > #define SECCOMP_RET_TRAP 0x00030000U /* disallow and force a SIGSYS */ > > Killing the entire process is more aggressive than RET_KILL currently, > so the question becomes, should we upgrade RET_KILL to > RET_KILL_PROCESS and add RET_KILL_THREAD? Are there people that WANT > only a thread to be killed? Andrei, does CRIU depend on this behavior, > or is it "just" a regression test detail?
It is just a regression test detail. We can change the test if you decide to kill a whole process. > > We can add two more RET_* values... however, it seems like only thread > vs process is a filter-level concept. Whether or not to dump core > seems to be an external aspect of the process (think ulimit -c), and > logging seems to be a system-level thing. > > For logging, I think audit needs to grow fork-tracking, and/or have a > new "is under seccomp" test that can be exposed to auditctl. Then the > system owner can issue either "tell me about all seccomp kills" or > "tell me about seccomp kills in this process tree". As such, I don't > think we should be making filter-level changes to deal with the needs > of seccomp logging. > > For coredumping, I remain a bit stumped. Strictly speaking, > coredumping exposes more kernel code to the running process, so it is > "less safe" than the instant kill RET_KILL performs. Though the > current patch is actually more aggressive in that it causes the entire > process to die as part of the coredumping. > > I think that if there is a move to make RET_KILL kill the process, > then we can add coredumping. If not, I'm less sure how to proceed... > > -Kees > > -- > Kees Cook > Nexus Security