TL;DR: As we've said a few times already, privileged containers shouldn't be considered root-safe, and here's one more example of that. Please use unprivileged containers whenever possible or, if you can't, don't trust anyone with root in your containers!
Hey everyone,

I'm sure some of you saw the exploit posted at:
http://stealth.openwall.net/xSports/shocker.c

This was designed to show how to escape a standard Docker container (running Docker 0.11) with a standard kernel. It can be adapted to apply to LXC by changing the /.dockerinit to some other valid path inside your container.

Now as for how this affects LXC 1.0 and higher:
 - The exploit doesn't work with unprivileged containers, which are the only kind of containers in which you should ever give root access to users you wouldn't trust with root access to the host. In those containers, the kernel returns EPERM as expected and they are therefore NOT AFFECTED.
 - The exploit will work with privileged containers if:
   - There's any bind-mount set up from the partition the exploit is trying to access. That means that if you have a separate /home partition on your host and bind-mount /home/user inside the container, this attack will only let you access files within /home of the host.
   - The open_by_handle_at syscall isn't blocked by a seccomp policy.
   - The CAP_DAC_READ_SEARCH capability wasn't dropped.

Due to the need to use AppArmor in disconnected mode to work around some limitations of its policies, there's currently no way for us to prevent this kind of access. However, the AppArmor team has been contacted and they have work scheduled to address this kind of issue in the near future.

I haven't been able to check whether using SELinux prevents this attack.

Recommended ways to mitigate this specific issue are:
 - If at all possible, run your workloads in unprivileged containers.
 - If using privileged containers, assume root in the container is the same as root outside of it, so avoid running tasks as root.
 - If you need to run untrusted tasks as root in the container, either use seccomp to block open_by_handle_at (make a blacklist policy file and set lxc.seccomp to its path) or add lxc.cap.drop = dac_read_search to your config.
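
One way to confirm that the lxc.cap.drop mitigation took effect is to look at the effective capability mask from inside the container. The following is a minimal sketch, not something shipped with LXC: the helper names has_capability and read_cap_mask are made up for illustration, and it assumes a Linux /proc that exposes the CapEff field in /proc/self/status. CAP_DAC_READ_SEARCH is capability number 2 in linux/capability.h.

```python
# Hypothetical helper script, not part of LXC.
CAP_DAC_READ_SEARCH = 2  # bit number from <linux/capability.h>


def has_capability(mask_hex, cap_bit):
    """Return True if bit cap_bit is set in a hex mask as printed in /proc/<pid>/status."""
    return bool(int(mask_hex, 16) & (1 << cap_bit))


def read_cap_mask(field="CapEff", status_path="/proc/self/status"):
    """Return the hex mask for the given field, or None if it cannot be read."""
    try:
        with open(status_path) as status:
            for line in status:
                name, _, value = line.partition(":")
                if name == field:
                    return value.strip()
    except OSError:
        # No readable /proc here (assumption: we only care about Linux).
        return None
    return None


if __name__ == "__main__":
    mask = read_cap_mask()
    if mask is None:
        print("Could not read CapEff from /proc/self/status")
    elif has_capability(mask, CAP_DAC_READ_SEARCH):
        print("CAP_DAC_READ_SEARCH is still available:", mask)
    else:
        print("CAP_DAC_READ_SEARCH has been dropped:", mask)
```

Run it as root inside the container after restarting it with the new config. It only reports on the capability; it says nothing about whether a seccomp policy is blocking open_by_handle_at.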

Note that both of those options may cause some userspace failures. In my tests I didn't spot any obvious ones, but that was basically just creating, starting and stopping a container.

In general, if your distribution supports it, make sure to run privileged LXC containers under AppArmor, as it does prevent all the other attacks we've been made aware of so far (though we expect there are a few more we haven't heard of yet...).

The same is probably true of SELinux; however, my knowledge there is pretty limited, so maybe Dwight can give a quick update on the state of things.

Additionally, Serge is currently working on a default seccomp profile which will block syscalls we know to be problematic in privileged containers. I'm planning on getting this changeset into the stable branch and tagging 1.0.5 once we're happy with it.

Unless template or distro maintainers object, I'd like that seccomp profile to be set by default by all templates (or for those using the new style configs, in their respective .common.conf). This would only apply to privileged containers.

-- 
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com
_______________________________________________
lxc-users mailing list
http://lists.linuxcontainers.org/listinfo/lxc-users
