On Feb 16, 2014, at 8:30 PM, Stéphane Graber <stgra...@ubuntu.com> wrote:
> On Sun, Feb 16, 2014 at 08:22:40PM -0500, Brian Campbell wrote:
>>
>> On Feb 16, 2014, at 12:53 PM, Stéphane Graber <stgra...@ubuntu.com> wrote:
>>
>>> On Sun, Feb 16, 2014 at 12:49:44PM -0500, Brian Campbell wrote:
>>>> On Feb 16, 2014, at 12:23 PM, Stéphane Graber <stgra...@ubuntu.com> wrote:
>>>>
>>>>> On Sun, Feb 16, 2014 at 03:51:50AM -0500, Brian Campbell wrote:
>>>>>> I'm running Debian Jessie (testing), and compiled lxc from a fresh git
>>>>>> clone (7da8ab1: close inherited fds when we still have proc mounted). I
>>>>>> would like to create a user container without using root privileges, so
>>>>>> I set up UID mappings such that my user ID would map to root within the
>>>>>> container. From what I can tell, this is all that should be necessary to
>>>>>> get it to use user namespaces to operate unprivileged:
>>>>>>
>>>>>> lambda@gherkin:lxc$ cat ~/.config/lxc/default.conf
>>>>>> lxc.id_map = u 0 1000 9999
>>>>>> lxc.id_map = g 0 1000 9999
>>>>>> lambda@gherkin:lxc$ id
>>>>>> uid=1000(lambda) gid=1000(lambda)
>>>>>> groups=1000(lambda),24(cdrom),25(floppy),27(sudo),29(audio),30(dip),44(video),46(plugdev),104(scanner),109(bluetooth),112(netdev),125(vboxusers)
>>>>>
>>>>> From the above, it seems like you didn't configure /etc/subuid and
>>>>> /etc/subgid. Without those (and a version of the shadow package which
>>>>> supports them), you won't be able to switch to those UID ranges.
>>>>
>>>> Nope, I haven't done anything with them, and it looks like Debian's passwd
>>>> doesn't have subuid/subgid support. Taking a look at the Ubuntu changelog,
>>>> it looks like they were added as a patch to the Ubuntu package in
>>>> 1:4.1.5.1-1ubuntu5. Is there a Debian package already available for this,
>>>> or should I try to extract the patches from the Ubuntu package and build
>>>> my own?
>>>>
>>>> Ah, looks like I should have read this:
>>>> https://s3hh.wordpress.com/2013/07/19/creating-and-using-containers-without-privilege/
>>>> before trying this; all I had seen was
>>>> https://www.mail-archive.com/lxc-users@lists.sourceforge.net/msg05859.html
>>>> which didn't mention anything about /etc/subuid and /etc/subgid.
>>>
>>> The shadow change was submitted to Debian at the same time we pushed it
>>> to Ubuntu, but last I checked it was still in an unreleased git
>>> branch...
>>
>> Ah, the Debian packaging is in Git. The package metadata still refers to
>> Subversion, and I'd looked there but not seen any recent updates. But I've
>> now tracked down the Debian git tree in
>> git://git.debian.org/git/pkg-shadow/shadow. However, it looks like this git
>> tree is based on an unreleased upstream version 4.2, and only contains the
>> packaging changes, not the actual source changes.
>>
>> After some further sleuthing, I discovered that there's a new upstream
>> repository at https://github.com/shadow-maint/shadow
>>
>> I was able to then build a new package by running "make dist" in the
>> upstream repo to generate shadow-4.2.tar.bz2, then in the packaging repo
>> using git-buildpackage
>>
>> $ gbp import-orig ../shadow-4.2.tar.bz2 --no-pristine-tar --no-sign-tags
>> $ gbp buildpackage -us -uc
>>
>> Just documenting this here so that if anyone else finds this thread they'll
>> be able to do the same.
>>
>> However, even after installing the above, it still gives the same error. I
>> tried rerunning configure and rebuilding lxc after installing the new shadow
>> package in case its configuration depended on that existing, but that made
>> no difference.
>>
>> lambda@gherkin:lxc$ grep lambda /etc/sub* 2>/dev/null
>> /etc/subgid:lambda:100000:65536
>> /etc/subuid:lambda:100000:65536
>> lambda@gherkin:lxc$ cat ~/.config/lxc/default.conf
>> lxc.id_map = u 0 100000 65536
>> lxc.id_map = g 0 100000 65536
>> lambda@gherkin:lxc$ lxc-create -l DEBUG -o lxc.log --name precise-test -t
>> download -- -d ubuntu -r precise -a amd64
>> unshare: Operation not permitted
>> read pipe: No such file or directory
>> lxc-create: Error chowning
>> /home/lambda/.local/share/lxc/precise-test/rootfs to container root
>> lxc-create: Error creating backing store type (none) for precise-test
>> lxc-create: Error creating container precise-test
>> lambda@gherkin:lxc$ cat lxc.log
>> lxc-create 1392583412.774 WARN lxc_log - lxc_log_init called with
>> log already initialized
>> lxc-create 1392583412.774 INFO lxc_confile - read uid map: type u
>> nsid 0 hostid 100000 range 65536
>> lxc-create 1392583412.774 INFO lxc_confile - read uid map: type g
>> nsid 0 hostid 100000 range 65536
>> lxc-create 1392583412.776 ERROR lxc_container - Error chowning
>> /home/lambda/.local/share/lxc/precise-test/rootfs to container root
>> lxc-create 1392583412.776 ERROR lxc_container - Error creating
>> backing store type (none) for precise-test
>> lxc-create 1392583412.776 ERROR lxc_create_ui - Error creating
>> container precise-test
>>
>> I've tried stracing to see if I could get any more useful information.
>> I think this is the right incantation; I need to run strace as root in order
>> to be able to trace any suid root executables that lxc-create might call,
>> but I need to set up the environment like mine since sudo clears it and run
>> the actual process as my UID with strace -u in order to use the unprivileged
>> code paths:
>>
>> lambda@gherkin:lxc$ sudo env HOME=/home/lambda
>> LD_LIBRARY_PATH=/usr/local/lib strace -u lambda -v -tt -f -o lxc.trace
>> lxc-create -l DEBUG -o lxc.log --name precise-test -t download -- -d ubuntu
>> -r precise -a amd64
>> unshare: Operation not permitted
>> read pipe: No such file or directory
>> lxc-create: Error chowning /home/lambda/.local/share/lxc/precise-test/rootfs
>> to container root
>> lxc-create: Error creating backing store type (none) for precise-test
>> lxc-create: Error creating container precise-test
>>
>> The full trace is available at http://ephemera.continuation.org/lxc.trace;
>> this looks like the relevant bit:
>>
>> 6969 19:51:07.737140 pipe([3, 5]) = 0
>> 6969 19:51:07.737163 pipe([6, 7]) = 0
>> 6969 19:51:07.737186 clone(child_stack=0,
>> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
>> child_tidptr=0x7f06b20f2b10) = 6970
>> 6970 19:51:07.737276 set_robust_list(0x7f06b20f2b20, 0x18 <unfinished ...>
>> 6969 19:51:07.737285 close(5 <unfinished ...>
>> 6970 19:51:07.737293 <... set_robust_list resumed> ) = 0
>> 6969 19:51:07.737300 <... close resumed> ) = 0
>> 6969 19:51:07.737312 close(6) = 0
>> 6969 19:51:07.737334 read(3, <unfinished ...>
>> 6970 19:51:07.737344 close(3) = 0
>> 6970 19:51:07.737366 close(7) = 0
>> 6970 19:51:07.737389 open("/dev/pts/13", O_RDWR|O_NONBLOCK) = 3
>> 6970 19:51:07.737421 fcntl(3, F_GETFL) = 0x8802 (flags
>> O_RDWR|O_NONBLOCK|O_LARGEFILE)
>> 6970 19:51:07.737444 fcntl(3, F_SETFL, O_RDWR|O_LARGEFILE) = 0
>> 6970 19:51:07.737466 close(0) = 0
>> 6970 19:51:07.737487 close(1) = 0
>> 6970 19:51:07.737507 close(2) = 0
>> 6970 19:51:07.737535 dup2(3, 0) = 0
>> 6970 19:51:07.737557 dup2(3, 1) = 1
>> 6970 19:51:07.737578 dup2(3, 2) = 2
>> 6970 19:51:07.737599 close(3) = 0
>> 6970 19:51:07.737623 unshare(CLONE_NEWNS|0x10000000) = -1 EPERM (Operation
>> not permitted)
>> 6970 19:51:07.737649 dup(2) = 3
>> 6970 19:51:07.737673 fcntl(3, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
>> 6970 19:51:07.737708 fstat(3, {st_dev=makedev(0, 11), st_ino=16,
>> st_mode=S_IFCHR|0620, st_nlink=1, st_uid=1000, st_gid=5, st_blksize=1024,
>> st_blocks=0, st_rdev=makedev(136, 13), st_atime=2014/02/16-19:51:04,
>> st_mtime=2014/02/16-19:51:04, st_ctime=2014/02/16-15:34:12}) = 0
>> 6970 19:51:07.737745 mmap(NULL, 4096, PROT_READ|PROT_WRITE,
>> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f06b2111000
>> 6970 19:51:07.737771 lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
>> 6970 19:51:07.737812 write(3, "unshare: Operation not permitted"..., 33) =
>> 33
>> 6970 19:51:07.737849 close(3) = 0
>> 6970 19:51:07.737871 munmap(0x7f06b2111000, 4096) = 0
>> 6970 19:51:07.737964 exit_group(1) = ?
>>
>>
>>> For unprivileged containers with current kernel and LXC (and a distro
>>> with the new shadow), there's also an article I wrote a little while
>>> back at:
>>> https://www.stgraber.org/2014/01/17/lxc-1-0-unprivileged-containers/
>>
>> Thanks! I had been looking around to see if there was any documentation on
>> this or an explanation for how it should work, but hadn't found that post.
>> That's quite helpful.
>>
>> I note from the above that you mention the following are necessary to get
>> this to fully work:
>>
>>> • Kernel: 3.13 + a couple of staging patches (which Ubuntu has in its
>>> kernel)
>>> • User namespaces enabled in the kernel
>>> • A very recent version of shadow that supports subuid/subgid
>>> • Per-user cgroups on all controllers (which I turned on a couple of
>>> weeks ago)
>>> • LXC 1.0 beta2 or higher (released two days ago)
>>> • A version of PAM with a loginuid patch that’s yet to be in any
>>> released version
>>
>>
>> I have user namespaces enabled in the kernel, after finding and building the
>> package I have the new version of shadow, and I'm building lxc from Git, so
>> those bases should be covered. So I'm wondering if one of those other items
>> is what I'm missing.
>
> Did you install uidmap from that new version of shadow? It may be that
> you have support for the options and the configfile but not the setuid
> tools to actually set the uid ranges.
>
> In Ubuntu this is done with two separate setuid tools (newuidmap and
> newgidmap) which are both contained in the uidmap package from that
> newer shadow source.
>
> LXC in Ubuntu depends on this (well, Recommends technically) but if you
> built everything by hand, it's possible you somehow missed this.

Yes, I installed it after building it:

lambda@gherkin:lxc$ dpkg -l uidmap
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                               Version                Architecture           Description
+++-==================================-======================-======================-=========================================================================
ii  uidmap                             1:4.2-1                amd64                  programs to help use subuids
lambda@gherkin:lxc$ which newuidmap
/usr/bin/newuidmap
lambda@gherkin:lxc$ which newgidmap
/usr/bin/newgidmap

From the strace, it doesn't look like these are ever being called by
lxc-create.
Taking a look through the source, they are supposed to be called
by lxc-usernsexec in the parent process after forking the child process and
the child process has called unshare(). The parent waits for the child to
write to the pipe before calling map_child_uids():

    close(pipe1[1]);
    close(pipe2[0]);
    if (read(pipe1[0], buf, 1) < 1) {
        perror("read pipe");
        exit(1);
    }
    buf[0] = '1';
    if (map_child_uids(pid, active_map)) {
        fprintf(stderr, "error mapping child\n");
        ret = 0;
    }
    if (write(pipe2[1], buf, 1) < 0) {
        perror("write to pipe");
        exit(1);
    }

And before the child writes to the pipe, it calls
unshare(CLONE_NEWUSER | CLONE_NEWNS):

    // Child.
    close(pipe1[0]);
    close(pipe2[1]);
    opentty(ttyname);

    ret = unshare(flags);
    if (ret < 0) {
        perror("unshare");
        return 1;
    }
    buf[0] = '1';
    if (write(pipe1[1], buf, 1) < 1) {
        perror("write pipe");
        exit(1);
    }
    if (read(pipe2[0], buf, 1) < 1) {
        perror("read pipe");
        exit(1);
    }
    if (buf[0] != '1') {
        fprintf(stderr, "parent had an error, child exiting\n");
        exit(1);
    }

    close(pipe1[1]);
    close(pipe2[0]);
    return do_child((void*)argv);

So this is failing before it's even had a chance to map the UIDs/GIDs. I
tried the demo_userns.c example code from this LWN article
https://lwn.net/Articles/532593/ and got the same result:

lambda@gherkin:userns$ ./demo_userns
clone: Operation not permitted

So it looks like something is preventing me from calling
clone(CLONE_NEWUSER) or unshare(CLONE_NEWUSER). I can't find any
documentation on CLONE_NEWUSER outside of that LWN article, and it indicates
that as of 3.8, no privilege should be needed to call clone(CLONE_NEWUSER),
so I'm somewhat puzzled as to why this is failing.

>>
>> I'm running 3.12, not 3.13, nor whatever patches Ubuntu has. Looking through
>> the kernel history for anything mentioning "namespace" between 3.12 and 3.13
>> it looks like there are some features relevant to networking but not to
>> basic functionality like this. Is there anything I need from the Ubuntu
>> patches?
>> I also don't have the PAM loginuid patch mentioned, but looking at
>> that it seems to only affect SSHing into a user container. Can you provide
>> more details on the per-user cgroups on all controllers? What handles that
>> on Ubuntu?
>>
>> -- Brian
>>
>> _______________________________________________
>> lxc-devel mailing list
>> lxc-devel@lists.linuxcontainers.org
>> http://lists.linuxcontainers.org/listinfo/lxc-devel
>
> --
> Stéphane Graber
> Ubuntu developer
> http://www.ubuntu.com