On Tue, Feb 03, 2015 at 03:41:22PM +0100, Lennart Poettering wrote:
> On Tue, 30.12.14 06:49, Simon Peeters (peeters.si...@gmail.com) wrote:
>
> > 2014-12-29 14:14 GMT+00:00 Tom Gundersen <t...@jklm.no>:
> > > On Mon, Dec 29, 2014 at 2:34 PM, Lennart Poettering
> > > <lenn...@poettering.net> wrote:
> > <snip>
> > >> I am open to adding support for this, but I think the allocation of
> > >> the UID ranges should really happen automatically, and not be
> > >> something the admin has to manually assign.
> > >>
> > >> Which means we'd enter dynamic UID allocation territory, and that
> > >> opens a huge can of worms...
> > >
> > > Would we not also need to support explicit assignment, in case someone
> > > has a preexisting image they want to match in a specific way? In that
> > > case we could start off without the dynamic allocation and add that
> > > later. It certainly would make testing a lot simpler if we had userns
> > > support sooner rather than later (at least in the case of netlink it
> > > appears to be quite a mess).
> >
> > Inspired by this topic I wrote a quick'n'dirty uid allocator[1]
> > this allocator manages the upper 2G uid's, which using Matthias Urlichs
> > example of 2048 uid's per container, still allows for 1M containers.
> >
> > It currently can't persist these allocations, but that is on my
> > "0.0.1" todolist.
>
> Hmm, so, I thought a lot about this in the past weeks. I think the way
> I'd really like to see this work in the end is that we never have to
> persist the UID mappings. This could work if the kernel would provide
> us with the ability to bind mount a file system into the container
> applying a fixed UID shift. That way, the shifted UIDs would never hit
> the actual disk, and hence we wouldn't have to persist their mappings.
>
> Instead on each container startup we'd look for a new UID range, and
> release it entirely when the container shuts down. The bind mount with
> UID shift would then shift the UIDs up, the userns stuff would shift
> it down from inside the container again.
>
> Of course, this all depends on whether the kernel will get an
> extension to apply uid shifts to bind mounts. I hear they want to
> provide this, but let's see.
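
To make the arithmetic in the scheme above concrete, here is a minimal
sketch of how I read it, not anything that exists in systemd or the
kernel. The class and function names are hypothetical; the only figures
taken from the thread are the upper 2G UID space and 2048 UIDs per
container. Ranges are handed out at container startup, returned at
shutdown, and never written to disk.

```python
import heapq

UID_BASE = 2**31      # first UID of the "upper 2G" space mentioned in the thread
UID_SPACE = 2**31     # number of UIDs in that space
RANGE_SIZE = 2048     # UIDs per container (the figure quoted in the thread)
MAX_RANGES = UID_SPACE // RANGE_SIZE  # 2**20, i.e. the "1M containers"
# Caveat: the very last slot would contain 4294967295, i.e. (uid_t) -1,
# which is not a usable UID. A real allocator would have to skip it.


class EphemeralUidAllocator:
    """Hypothetical in-memory allocator: a range lives only while its container runs."""

    def __init__(self):
        self._slot_of = {}     # container name -> slot index
        self._released = []    # min-heap of slots freed by release()
        self._next_slot = 0    # lowest slot that was never handed out

    def acquire(self, container):
        """Reserve a range at container startup and return its first host UID."""
        if container in self._slot_of:
            raise ValueError(f"{container!r} already holds a range")
        if self._released:
            slot = heapq.heappop(self._released)
        elif self._next_slot < MAX_RANGES:
            slot = self._next_slot
            self._next_slot += 1
        else:
            raise RuntimeError("no free UID range")
        self._slot_of[container] = slot
        return UID_BASE + slot * RANGE_SIZE

    def release(self, container):
        """Give the range back at container shutdown; nothing is persisted."""
        heapq.heappush(self._released, self._slot_of.pop(container))


def shift_up(disk_uid, range_base):
    """What the proposed shifting mount would do: on-disk UID -> host UID."""
    if not 0 <= disk_uid < RANGE_SIZE:
        raise ValueError("UID outside the container's range")
    return range_base + disk_uid


def shift_down(host_uid, range_base):
    """What the userns mapping does: host UID -> UID seen inside the container."""
    if not range_base <= host_uid < range_base + RANGE_SIZE:
        raise ValueError("host UID not mapped into this container")
    return host_uid - range_base
```

The point of the two shift functions is that they cancel out: a file
owned by UID 1000 on disk is seen as UID 1000 inside the container,
whichever range the container happened to get on this boot, so the
range itself never needs to be remembered.
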

I would dearly love to see that happen. Having to recursively change
the UID/GID on entire filesystem sub-trees given to containers with
userns is a really unpleasant thing to have to deal with.

I would not want the filesystem UID shift to apply only to bind mounts,
though. It is not uncommon to use a disk image[1] for a container's
filesystem, so being able to request a UID shift on *any* filesystem
mount is pretty desirable, rather than having to mount the image and
then bind mount it onto itself just to apply the UID shift.

Regards,
Daniel

[1] Using a separate disk image per container means a container can't
    DoS other containers by exhausting inodes, for example with
    $millions of small files.
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|