Hi!

I have noticed that when a QEMU build from git master is started with
"-sandbox on", the seccomp policy is only applied to the main thread
and to threads created after the filter is installed, such as the vcpu
thread and the VNC worker thread (I'm using VNC in my config). It is
not applied to e.g. the RCU thread, because that thread is created
before the filter is installed and SECCOMP_FILTER_FLAG_TSYNC isn't
used. In practice, this makes the seccomp policy essentially useless:
all threads share one address space, so an attacker who controls a
sandboxed thread can likely get an unsandboxed thread to make syscalls
on its behalf. You'll probably want to add something like
seccomp_attr_set(ctx, SCMP_FLTATR_CTL_TSYNC, 1) for systems with
kernel >=3.17; if you have to support seccomp on older kernels, you'll
have to either create the other threads after initializing seccomp or
manually activate seccomp on those threads.
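
To illustrate what thread synchronization looks like at the syscall
level, here is a minimal sketch that installs a filter on every thread
of the calling process. It uses the raw seccomp(2) syscall rather than
libseccomp so that it has no library dependency. The function name is
made up for this example, the allow-everything filter is only a
placeholder and not QEMU's policy, and mapping a failed sync to -ESRCH
is an arbitrary choice of mine.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: install a placeholder allow-everything filter
 * on ALL threads of the calling process via SECCOMP_FILTER_FLAG_TSYNC
 * (kernel >= 3.17). Returns 0 on success, a negative errno on failure.
 */
int install_allow_all_filter_tsync(void)
{
    /* Placeholder policy: a single "allow" instruction. */
    struct sock_filter insns[] = {
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(insns) / sizeof(insns[0])),
        .filter = insns,
    };
    long rc;

    /* Needed so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -errno;

    rc = syscall(SYS_seccomp, SECCOMP_SET_MODE_FILTER,
                 SECCOMP_FILTER_FLAG_TSYNC, &prog);
    if (rc < 0)
        return -errno;
    if (rc > 0)
        return -ESRCH; /* rc is the TID of a thread that could not sync */
    return 0;
}
```

Called once from any thread, this should leave every existing thread
in filter mode, including ones started earlier such as the RCU thread.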


This shows up in strace as follows:
$ strace -f -e trace=seccomp,clone,prctl \
    x86_64-softmmu/qemu-system-x86_64 -enable-kvm \
    -drive file=isos/debian-live-8.7.1-amd64-standard.iso,format=raw \
    -m 4G -sandbox on -name testvm,debug-threads=on
clone(child_stack=0x7f199af49bf0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7f199af4a9d0, tls=0x7f199af4a700,
child_tidptr=0x7f199af4a9d0) = 10948
strace: Process 10948 attached
[pid 10947] prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0) = 0
[pid 10947] clone(child_stack=0x7f199a748bf0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7f199a7499d0, tls=0x7f199a749700,
child_tidptr=0x7f199a7499d0) = 10949
strace: Process 10949 attached
[pid 10947] prctl(PR_SET_TIMERSLACK, 1) = 0
[pid 10947] prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) = 0
[pid 10947] seccomp(SECCOMP_SET_MODE_STRICT, 1, NULL) = -1 EINVAL
(Invalid argument)
[pid 10947] seccomp(SECCOMP_SET_MODE_FILTER, 0, {len=39,
filter=0x55a968927830}) = 0
[pid 10947] clone(strace: Process 10950 attached
child_stack=0x7f1999d65bf0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7f1999d669d0, tls=0x7f1999d66700,
child_tidptr=0x7f1999d669d0) = 10950
[pid 10950] prctl(PR_SET_NAME, "CPU 0/KVM") = 0
[pid 10947] clone(strace: Process 10952 attached
child_stack=0x7f1993bfebf0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7f1993bff9d0, tls=0x7f1993bff700,
child_tidptr=0x7f1993bff9d0) = 10952
[pid 10952] prctl(PR_SET_NAME, "worker") = 0
[pid 10947] clone(strace: Process 10953 attached
child_stack=0x7f19933fdbf0,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x7f19933fe9d0, tls=0x7f19933fe700,
child_tidptr=0x7f19933fe9d0) = 10953
[pid 10953] prctl(PR_SET_NAME, "vnc_worker") = 0
VNC server running on ::1:5900
[pid 10952] +++ exited with 0 +++


Checking in procfs confirms that two of the threads aren't sandboxed:

$ for task in /proc/10947/task/*; do cat "$task/comm"; \
    grep Seccomp "$task/status"; done
qemu-system-x86
Seccomp: 2
qemu-system-x86
Seccomp: 0
qemu-system-x86
Seccomp: 0
CPU 0/KVM
Seccomp: 2
vnc_worker
Seccomp: 2
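
The same procfs check could be done programmatically, for example in a
regression test. The following is only a rough sketch: both function
names are made up for illustration, and it assumes the "Seccomp:" field
of /proc/<pid>/task/<tid>/status (0 = disabled, 1 = strict, 2 = filter).

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Parse the "Seccomp:" field from the text of a procfs status file.
 * Returns the mode (0, 1 or 2), or -1 if the field is not present.
 */
int parse_seccomp_mode(const char *status_text)
{
    const char *p = status_text;

    while (p && *p) {
        int mode;

        /* p always points at the start of a line here. */
        if (sscanf(p, "Seccomp: %d", &mode) == 1)
            return mode;
        p = strchr(p, '\n');
        if (p)
            p++;
    }
    return -1;
}

/*
 * Count the threads of `pid` whose Seccomp mode is 0 (not sandboxed).
 * Returns -1 if the task directory cannot be opened.
 */
int count_unsandboxed_threads(pid_t pid)
{
    char dir_path[64];
    struct dirent *entry;
    int count = 0;
    DIR *dir;

    snprintf(dir_path, sizeof(dir_path), "/proc/%d/task", (int)pid);
    dir = opendir(dir_path);
    if (!dir)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        char status_path[512];
        char buf[8192];
        size_t n;
        FILE *f;

        if (entry->d_name[0] == '.')
            continue; /* skip "." and ".." */
        snprintf(status_path, sizeof(status_path), "%s/%s/status",
                 dir_path, entry->d_name);
        f = fopen(status_path, "r");
        if (!f)
            continue; /* the thread may have exited in the meantime */
        n = fread(buf, 1, sizeof(buf) - 1, f);
        fclose(f);
        buf[n] = '\0';
        if (parse_seccomp_mode(buf) == 0)
            count++;
    }
    closedir(dir);
    return count;
}
```

For the process shown above, count_unsandboxed_threads(10947) would be
expected to report the two threads with "Seccomp: 0".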
