On Thu, Jul 10, 2014 at 08:19:36AM -0700, James Bottomley wrote:
> On Thu, 2014-07-10 at 14:47 +0100, Daniel P. Berrange wrote:
> > On Thu, Jul 10, 2014 at 05:36:59PM +0400, Dmitry Guryanov wrote:
> > > I have a question about mounts - in the OpenVZ project each container
> > > has its own filesystem in an image file. So to start a container we
> > > mount this filesystem in the host OS (because all containers share the
> > > same Linux kernel). Is it a security problem from the OpenStack
> > > developers' point of view?
> > >
> > > I have this question because libvirt's driver uses libguestfs to copy
> > > some files into the guest filesystem instead of a simple mount on the
> > > host. Mounting with libguestfs is slower than a mount on the host, so
> > > there should be strong reasons why the libvirt driver does it.
> >
> > We consider mounting untrusted filesystems on the host kernel to be
> > an unacceptable security risk. A user can craft a malicious filesystem
> > that exploits bugs in the kernel filesystem drivers. This is particularly
> > bad if you allow the kernel to probe for filesystem type since Linux
> > has many many many filesystem drivers most of which are likely not
> > audited enough to be considered safe against malicious data. Even the
> > mainstream ext4 driver had a crasher bug present for many years
> >
> >   https://lwn.net/Articles/538898/
> >   http://libguestfs.org/guestfs.3.html#security-of-mounting-filesystems
>
> Actually, there's a hidden assumption here that makes this statement not
> necessarily correct for containers. You're assuming the container has
> to have raw access to the device it's mounting. For hypervisors, this
> is true, but it doesn't have to be for containers because the mount
> operation is separate from raw read and write so we can allow or deny
> them granularly.

I wasn't, actually. In the Libvirt LXC case, Nova takes an image from
Glance and mounts it on the host, and then sets up the container to have
its root at the filesystem on the host where it mounted the image. So the
container does not have any raw block access, but Nova is still mounting
an untrusted image from Glance in the host, which is a risk.

> Consider the old use case, where the container root is actually a
> subdirectory of the host filesystem which gets bind mounted. The
> container has no possibility of altering the underlying block device
> there. For block roots, which we also do, at least in the VPS world,
> they're mostly initialised by the hosting provider and the VPS
> environment doesn't actually get to read or write directly to them
> (there's often a block on this). Of course, they *can* be set up so the
> VPS has raw access and I believe some are, but it's a choice not a
> requirement.

Where you could avoid the risk is if the image you're getting from Glance
is not in fact a filesystem, but rather a tar.gz of the container
filesystem. Then Nova would simply be extracting the contents of the tar
archive and not accessing an untrusted filesystem image from Glance. IIUC,
this is more or less what Docker does.

Regards,
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

_______________________________________________
OpenStack-dev mailing list
OpenStackfirstname.lastname@example.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
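
To illustrate the tar.gz idea above, here is a rough sketch in Python of unpacking a container root filesystem from an archive without ever asking the host kernel to mount anything. This is not Nova or Docker code: `extract_rootfs` and its arguments are hypothetical names made up for this example. It is deliberately simplified, in that it only unpacks regular files and directories and refuses any entry whose path would land outside the target directory. A real root filesystem also needs symlinks, hard links, ownership and device nodes handled with care, so treat this as a starting point only.

```python
import os
import tarfile


def extract_rootfs(archive_path, dest_dir):
    """Unpack a root filesystem tarball into dest_dir (hypothetical helper).

    Only regular files and directories are extracted; links, device nodes
    and FIFOs are skipped to keep the sketch small. Returns the list of
    member names that were extracted.
    """
    os.makedirs(dest_dir, exist_ok=True)
    dest_root = os.path.realpath(dest_dir)

    with tarfile.open(archive_path, "r:*") as tar:
        safe_members = []
        for member in tar.getmembers():
            # Resolve where this entry would land; absolute names and
            # "../" components end up outside dest_root and are rejected.
            target = os.path.normpath(os.path.join(dest_root, member.name))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError("unsafe path in archive: %r" % member.name)

            # Simplification: keep only plain files and directories.
            if member.isfile() or member.isdir():
                safe_members.append(member)

        tar.extractall(dest_root, members=safe_members)

    return [m.name for m in safe_members]
```

The point of the sketch is that the untrusted data is parsed by a userspace tar reader rather than by a kernel filesystem driver, so a malformed archive can at worst fail the extraction. The archive contents still have to be treated as untrusted, which is why the path check comes before anything is written to disk.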