Re: [systemd-devel] Unable to run systemd in an LXC / cgroup container.
On Mon, 2012-10-22 at 16:11 +0200, Lennart Poettering wrote:

Note that there are reports that LXC has issues with the fact that newer systemd enables shared mount propagation for all mounts by default (this should actually be beneficial for containers as this ensures that new mounts appear in the containers). LXC when run on such a system fails as soon as it tries to use pivot_root(), as that is incompatible with shared mount propagation. This needs fixing in LXC: it should use MS_MOVE or MS_BIND to place the new root dir in / instead. A short term work-around is to simply remount the root tree to private before invoking LXC.

In another thread, Serge had some heartburn over this shared mount propagation which then rang a bell in my head about past problems we have seen.

On Mon, 2012-11-05 at 08:51 -0600, Serge Hallyn wrote:
Quoting Michael H. Warfield (m...@wittsend.com):
...

This was from another thread with the systemd guys.

On Mon, 2012-10-22 at 16:11 +0200, Lennart Poettering wrote:

Note that there are reports that LXC has issues with the fact that newer systemd enables shared mount propagation for all mounts by default (this should actually be beneficial for containers as this ensures that new mounts appear in the containers). LXC when run on such a system fails

MS_SLAVE does this as well. MS_SHARED means container mounts also propagate into the host, which is less desirable in most cases.

Here's where we've seen some problems in the past. It's not just mounts that are propagated but remounts as well. The problem arose that some of us had our containers on a separate partition. When we would shut a container down, that container tried to remount its file systems ro, which then propagated back into the host, causing the host's file system to be ro (this doesn't happen if you are running on the host's root fs for the containers), and from there across into the other containers.

Are you using MS_SHARED or MS_SLAVE for this?

If you are using MS_SHARED, do you create a potential security problem where actions in the container can bleed into the state of the host and into other containers? That's highly undesirable. If a mount in a container propagates back into the host and is then reflected to another container sharing that same mount tree (I have shared partitions specific to that sort of thing), does that create an information disclosure situation if one container mounts a new file system and the other container sees the new mount? I don't know if the mount propagation would reflect back up the shared tree or not but I have certainly seen remounts do this. I don't see that as desirable. Maybe I'm misunderstanding how this is supposed to work but I intend to test out those scenarios when I have a chance. I do know that, when testing that ro problem, I was able to remount a partition ro in one container and it would switch in the host and the other container, and I could then remount it rw in the other container and have it propagate back. Not good.

Can you offer any clarity on this?

as soon as it tries to use pivot_root(), as that is incompatible with shared mount propagation. This needs fixing in LXC: it should use MS_MOVE or MS_BIND to place the new root dir in / instead. A short term

Actually not quite sure how this would work. It should be possible to set up a set of conditions to work around this, but the kernel checks at do_pivotroot are pretty harsh - mnt->mnt_parent of both the new root and current root have to be not shared. So perhaps we actually first chroot into a dir whose parent is non-shared, then pivot_root from there? :) (Simple chroot in place of pivot_root still does not suffice, not only because of chroot escapes, but also different results in /proc/pid/mountinfo and friends)

Comments on Serge's points?

At this point, we see where this will become problematical in Fedora 18, but it appears to already be problematical in NixOS, which another user is running and which contains systemd 195 in the host. We've had problems with chroot in the past due to chroot escapes and other problems years ago, as Serge mentioned.

Lennart
--
Lennart Poettering - Red Hat, Inc.

Regards,
Mike
--
Michael H. Warfield (AI4NB) | (770) 985-6132 | m...@wittsend.com
/\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/
NIC whois: MHW9 | An optimist believes we live in the best of all
PGP Key: 0x674627FF| possible worlds. A pessimist is sure of it!

signature.asc Description: This is a digitally signed message part
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
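The shared / slave / private distinction discussed in this message can be read from the optional fields of /proc/<pid>/mountinfo. The following is a minimal sketch, not anything LXC or systemd ships: the function names propagation_of and root_propagation are hypothetical, and a mount that is both shared and slave is simply reported as "shared".

```shell
#!/bin/sh
# Classify the propagation type of one mountinfo line.
# Optional fields ("shared:N", "master:N", "unbindable") sit between
# field 6 (mount options) and the "-" separator; none means private.
propagation_of() {
    # $1: one full line from /proc/<pid>/mountinfo
    set -f                      # no globbing while the line is split
    set -- $1
    set +f
    [ $# -gt 6 ] || return 1    # not a mountinfo line
    shift 6                     # drop ID, parent, dev, root, mount point, options
    result=private
    while [ $# -gt 0 ] && [ "$1" != "-" ]; do
        case "$1" in
            shared:*)   result=shared ;;
            master:*)   [ "$result" = shared ] || result=slave ;;
            unbindable) result=unbindable ;;
        esac
        shift
    done
    echo "$result"
}

# Print the propagation type of the mount currently visible at "/".
root_propagation() {
    # $1: mountinfo file to read (defaults to the current process's)
    file=${1:-/proc/self/mountinfo}
    found=
    while IFS= read -r line; do
        set -f; set -- $line; set +f
        # Later entries for "/" are stacked on top of earlier ones.
        [ "$5" = "/" ] && found=$(propagation_of "$line")
    done < "$file"
    [ -n "$found" ] && echo "$found"
}
```

Calling root_propagation with no argument on the host reports how the root mount is currently set up. If it reports "shared", the short-term work-around quoted above corresponds to running `mount --make-rprivate /` as root before starting the container.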
Re: [systemd-devel] One of my fundamental problems with systemd...
Lennart,

I'm going to quickly top post here so forgive me for being rude.

Serge and I worked out some patches that seem to address the devtmpfs issues nicely. Still got a few oddities to iron out but the big fundamental one got fixed and is working. Many thanks for that.

The problem with error messages and stderr never did get resolved. The error output from the script did not show up on stderr or in the journal but showed up when the command was issued to manually restart the network. That problem seems to have evaporated once the devtmpfs problem was resolved and may well have been related.

The console vs gettys on ttys remains problematical but far less severe now that we can get the network started normally. If there is some way to enable those, that would help as it would make our lxc-console program useful again when you need more than one console because the network is non-functional or an access service (sshd) fails.

I have started experimenting with 195 from Fedora Rawhide (Fedora 17 is at 44 with a whole lotta back commits) and haven't tried the git version as yet. I do have 44 working as of right now in a Fedora 17 container, so that's a really good thing as we do not want to have to customize any other packages to get LXC to work if at all possible.

Serge has some patches he's going to work into the code tree and I'm continuing to test and debug at this point, but we have made some serious progress over the last week.

Thank you very much!

Regards,
Mike

On Tue, 2012-10-30 at 03:08 +0100, Lennart Poettering wrote:
On Fri, 26.10.12 18:39, Michael H. Warfield (m...@wittsend.com) wrote:

My most fundamental problem with systemd is its insistence in hiding and obfuscating errors in ways that makes debugging almost impossible. Almost every upgrade problem I've had in Fedora has been related to systemd's failure to provide comprehensible error messages for things like errors in fstab (#1 fsck up).

We have been trying hard to make the boot of systemd actually as understandable as possible, and are still working on that. The main reason why the journal exists is that we can collect stdout/stderr of all services cleanly. We also are working hard on making systems boot cleanly in containers. In fact, the systemd test system will create an OS image that we boot up both in a KVM and a linux container, and verify that it boots up cleanly.

Now, it's of course disappointing when that work didn't really become visible to you yet. But a couple of notes on that:

a) of course, the more recent versions of systemd will have the most complete support for these things, so, please before reporting issues, check if things are fixed already, there's a very good chance they are.

b) For containers we focus on systemd-nspawn and libvirt-lxc as container managers (which is entirely different from LXC, actually, and shares no code, just the name!), but not 'classic' LXC. The non-libvirt LXC tool set is a very low-level tool-set that gives you plenty of rope to hang yourself with, you can use it to set up containers, but you need to know a lot of things for that, about low-level system stuff in systemd and elsewhere. We tried to remove a lot of complexity in this area, which is why we came up with the container iface doc I already linked. The requirements are implemented implicitly in nspawn and libvirt-lxc, but if you use raw LXC then you have to do this yourself, which is why we documented that stuff.

c) LXC made a couple of questionable choices that are not compatible with the way systemd expects things. For example, the attempt at redirecting /dev/tty1 (and friends) and /dev/console is a bit misguided if I may say so. The former is problematic, since /dev/tty1 is just one interface to a kernel object that is visible in other ways too, and used that way, for example in /sys, /dev/vcs1, and so on. Just redirecting one part of the iface will break stuff that tries to do more than just the most basic things on TTYs, which systemd actually tries to do. It also has the effect that things like $TERM are incorrectly initialized. Now, the existing guidelines for LXC ignore these issues and sysvinit due to its static design works well enough on that, but systemd doesn't. That doesn't mean LXC was generally incompatible with systemd, but you probably need to do more stuff manually, that will work out-of-the-box with the other container managers.

Anyway, please have a look at the newer versions of systemd, and possibly nspawn or libvirt-lxc. Things might already work much better if you use those.

Lennart
--
Lennart Poettering - Red Hat, Inc.
Re: [systemd-devel] One of my fundamental problems with systemd...
On Sat, 2012-10-27 at 13:01 +0200, Zbigniew Jędrzejewski-Szmek wrote:
On Sat, Oct 27, 2012 at 11:04:42AM +0200, Kay Sievers wrote:
On Sat, Oct 27, 2012 at 6:02 AM, Michael H. Warfield m...@wittsend.com wrote:
On Sat, 2012-10-27 at 05:24 +0200, Olav Vitters wrote:

sent too quickly..

On Sat, Oct 27, 2012 at 05:22:30AM +0200, Olav Vitters wrote:
On Fri, Oct 26, 2012 at 10:16:47PM -0400, Michael H. Warfield wrote:

BTW... Not to drop names (which I'm about to do) or anything and I know in a big organization not everybody knows everyone but...

I prefer judging myself.

And at the moment I see you using words like:
- crap
- POS
- barf
- fsck (though meaning fuck)

It is not needed to tell what you did. Just be nice and people will show the same courtesy.

Offense not intended. They're the terms we use in several other forums with no offense taken. In fact, in the Openswan forum barf is a specific command with a specific meaning. If you have a problem with Openswan you are quite often asked for the barf (dump). Maybe I'm too old and too abrasive and too used to the old culture where this was common.

We are actually proud here not to act or need to talk like people on LKML. We generally enjoy talking like respectful and civilized people, and want to keep it that way.

Hi, Kay, Olav,

I think that we're being a bit unfair towards Michael. There's an implication that his posts were offensive, but they weren't. Overly verbose, yes, repetitive, yes, agitated, etc, but not intended as rude. Michael started by reporting valid problems, with a setup in which systemd is involved, even if not the culprit. So closing the discussion by concentrating on style is not the right way, imho.

Michael, please step back, compile the latest (git) version of systemd, and report _specific_ things that can be improved.

You got it. Will do. A good night's sleep helped. I've been working on getting this container running for the last two days and was getting overly frustrated by it. My thanks to you all. I've got to do the same thing with the latest git version of lxc as well. Par for the course on the bleeding edge.

Zbyszek

Regards,
Mike
Re: [systemd-devel] One of my fundamental problems with systemd...
On Sat, 2012-10-27 at 17:24 +0200, Olav Vitters wrote:
On Sat, Oct 27, 2012 at 01:01:36PM +0200, Zbigniew Jędrzejewski-Szmek wrote:

Kay, Olav, I think that we're being a bit unfair towards Michael. There's an implication that his posts were offensive, but they weren't. Overly verbose, yes, repetitive, yes, agitated, etc, but not intended as rude. Michael started by reporting valid problems, with a

I wasn't trying to imply that it was offensive. Anyway, this is not about systemd, so I prefer to end it here..

My apologies to all for being so brusque last night. It's been a frustrating couple of days trying to get this to work and this is done on my own personal time.

--
Regards,
Olav

Regards,
Mike
Re: [systemd-devel] [Lxc-users] Unable to run systemd in an LXC / cgroup container.
On Fri, 2012-10-26 at 09:07 -0500, Serge Hallyn wrote:
Quoting Michael H. Warfield (m...@wittsend.com):

Adding in the lxc-devel list.

On Thu, 2012-10-25 at 22:59 -0400, Michael H. Warfield wrote:
On Thu, 2012-10-25 at 15:42 -0400, Michael H. Warfield wrote:
On Thu, 2012-10-25 at 14:02 -0500, Serge Hallyn wrote:
Quoting Michael H. Warfield (m...@wittsend.com):
On Thu, 2012-10-25 at 13:23 -0400, Michael H. Warfield wrote:

Hey Serge,

On Thu, 2012-10-25 at 11:19 -0500, Serge Hallyn wrote:
...

Oh, sorry - I take back that suggestion :) Note that we have mount hooks, so templates could install a mount hook to mount a tmpfs onto /dev and populate it.

Ok... I've done some cursory search and turned up nothing but some comments about pre mount hooks. Where is the documentation about this feature and how I might use / implement it? Some examples would probably suffice. Is there a required release version of lxc-utils?

I think I found what I needed in the changelog here:

http://www.mail-archive.com/lxc-devel@lists.sourceforge.net/msg01490.html

I'll play with it and report back.

Also the Lifecycle management hooks section in https://help.ubuntu.com/12.10/serverguide/lxc.html

This isn't working... Based on what was in both of those articles, I added this entry to another container (Plover) to test...

    lxc.hook.mount = /var/lib/lxc/Plover/mount

When I run lxc-start -n Plover, I see this:

    [root@forest ~]# lxc-start -n Plover
    lxc-start: unknow key lxc.hook.mount
    lxc-start: failed to read configuration file

I'm running the latest rc...

    [root@forest ~]# rpm -qa | grep lxc
    lxc-0.8.0.rc2-1.fc16.x86_64
    lxc-libs-0.8.0.rc2-1.fc16.x86_64
    lxc-doc-0.8.0.rc2-1.fc16.x86_64

Is it something in git that hasn't made it to a release yet?

nm... I see it. It's in git and hasn't made it to a release. I'm working on a git build to test now. If this is something that solves some of this, we need to move things along here and get these things moved out. According to git, 0.8.0rc2 was 7 months ago? What are the show stoppers here?

While the git repo says 7 months ago, the date stamp on the lxc-0.8.0-rc2 tarball is from July 10, so about 3-1/2 months ago. Sounds like we've accumulated some features (like the hooks) we are going to need like months ago to deal with this systemd debacle. How close are we to either 0.8.0rc3 or 0.8.0? Any blockers or are we just waiting on some more features?

Daniel has simply been too busy. Stéphane has made a new branch which cherrypicks 50 bugfixes for 0.8.0, with the remaining patches (about twice as many) left for 0.9.0. I'm hoping we get 0.8.0 next week :)

Trying to build latest from git. This is not good...

    checking sys/apparmor.h usability... no
    checking sys/apparmor.h presence... no
    checking for sys/apparmor.h... no
    configure: error: You must install the AppArmor development package in order to compile lxc

What am I supposed to do on Fedora where we don't have that package? Is it available in another repo somewhere? I'm looking and not finding.

Regards,
Mike

___
Lxc-users mailing list
lxc-us...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [systemd-devel] [Lxc-users] Unable to run systemd in an LXC / cgroup container.
On Sat, 2012-10-27 at 12:45 -0400, Michael H. Warfield wrote:
On Fri, 2012-10-26 at 09:07 -0500, Serge Hallyn wrote:
Quoting Michael H. Warfield (m...@wittsend.com):

Adding in the lxc-devel list.

On Thu, 2012-10-25 at 22:59 -0400, Michael H. Warfield wrote:
On Thu, 2012-10-25 at 15:42 -0400, Michael H. Warfield wrote:
On Thu, 2012-10-25 at 14:02 -0500, Serge Hallyn wrote:
Quoting Michael H. Warfield (m...@wittsend.com):
On Thu, 2012-10-25 at 13:23 -0400, Michael H. Warfield wrote:

Hey Serge,

On Thu, 2012-10-25 at 11:19 -0500, Serge Hallyn wrote:
...

Oh, sorry - I take back that suggestion :) Note that we have mount hooks, so templates could install a mount hook to mount a tmpfs onto /dev and populate it.

Ok... I've done some cursory search and turned up nothing but some comments about pre mount hooks. Where is the documentation about this feature and how I might use / implement it? Some examples would probably suffice. Is there a required release version of lxc-utils?

I think I found what I needed in the changelog here:

http://www.mail-archive.com/lxc-devel@lists.sourceforge.net/msg01490.html

I'll play with it and report back.

Also the Lifecycle management hooks section in https://help.ubuntu.com/12.10/serverguide/lxc.html

This isn't working... Based on what was in both of those articles, I added this entry to another container (Plover) to test...

    lxc.hook.mount = /var/lib/lxc/Plover/mount

When I run lxc-start -n Plover, I see this:

    [root@forest ~]# lxc-start -n Plover
    lxc-start: unknow key lxc.hook.mount
    lxc-start: failed to read configuration file

I'm running the latest rc...

    [root@forest ~]# rpm -qa | grep lxc
    lxc-0.8.0.rc2-1.fc16.x86_64
    lxc-libs-0.8.0.rc2-1.fc16.x86_64
    lxc-doc-0.8.0.rc2-1.fc16.x86_64

Is it something in git that hasn't made it to a release yet?

nm... I see it. It's in git and hasn't made it to a release. I'm working on a git build to test now. If this is something that solves some of this, we need to move things along here and get these things moved out. According to git, 0.8.0rc2 was 7 months ago? What are the show stoppers here?

While the git repo says 7 months ago, the date stamp on the lxc-0.8.0-rc2 tarball is from July 10, so about 3-1/2 months ago. Sounds like we've accumulated some features (like the hooks) we are going to need like months ago to deal with this systemd debacle. How close are we to either 0.8.0rc3 or 0.8.0? Any blockers or are we just waiting on some more features?

Daniel has simply been too busy. Stéphane has made a new branch which cherrypicks 50 bugfixes for 0.8.0, with the remaining patches (about twice as many) left for 0.9.0. I'm hoping we get 0.8.0 next week :)

Trying to build latest from git. This is not good...

    checking sys/apparmor.h usability... no
    checking sys/apparmor.h presence... no
    checking for sys/apparmor.h... no
    configure: error: You must install the AppArmor development package in order to compile lxc

What am I supposed to do on Fedora where we don't have that package? Is it available in another repo somewhere? I'm looking and not finding.

nm... I see that --enable-apparmor defaults to on. I just had to add the --disable-apparmor option. Sorry for the noise.

Regards,
Mike
Re: [systemd-devel] Unable to run systemd in an LXC / cgroup container.
On Sat, 2012-10-27 at 19:44 +0100, Colin Guthrie wrote:
'Twas brillig, and Michael H. Warfield at 26/10/12 18:18 did gyre and gimble:

What the hell is this? /var/run is symlinked to /run and is mounted with a tmpfs.

Yup, that's how /var/run and /run are being handled these days. It provides a consistent space to pass info from the initrd over to the main system and has various other uses also.

Interesting. I hadn't considered that aspect of it before. Very interesting.

If you want to ensure files are created in this folder, just drop a config file into /usr/lib/tmpfiles.d/ in the package in question. See man systemd-tmpfiles for more info.

NOW THAT is something else I needed to know about! Thank you very very much! Learned something new. This whole thing has been a massive learning experience getting this container kick started.

Could be some packages are not fully upgraded to this concept in F17.

As a non-fedora user, I can't really comment on that specifically.

As it turns out, the kernel has had some of our patches applied that I wasn't aware of vis-a-vis reboot/halt and this should no longer be an issue. I'm still struggling with the tmpfs on /dev thing and have run into a catch-22 with regards to that. I can mount tmpfs on /dev just fine and can populate it just fine in a post mount hook but, then, we're trying to mount a devpts file system on /dev/pts before we've had a chance to populate it and it's then crashing on the mount. Sigh... I think that's going to now have to wait for Serge or Daniel to comment on.

Col
--
Colin Guthrie
gmane(at)colin.guthr.ie
http://colin.guthr.ie/
Day Job: Tribalogic Limited http://www.tribalogic.net/
Open Source: Mageia Contributor http://www.mageia.org/
PulseAudio Hacker http://www.pulseaudio.org/
Trac Hacker http://trac.edgewall.org/

Regards,
Mike
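As an illustration of Colin's tmpfiles.d suggestion, here is a minimal sketch of what a package might drop in place so that a directory under /run is recreated on every boot. The function name write_tmpfiles_conf and the package name "mydaemon" are hypothetical; the mode and ownership are only example values.

```shell
#!/bin/sh
# Write a tmpfiles.d snippet asking systemd-tmpfiles to create a
# runtime directory under /run at boot.
write_tmpfiles_conf() {
    # $1: directory to write the snippet into (e.g. /usr/lib/tmpfiles.d)
    # $2: package name, used for the file name and the /run subdirectory
    conf_dir=$1
    name=$2
    mkdir -p "$conf_dir" || return 1
    # Columns: type (d = directory), path, mode, user, group,
    # age ("-" means the directory is never cleaned up automatically).
    printf 'd /run/%s 0755 root root -\n' "$name" > "$conf_dir/$name.conf"
}
```

A package would ship the resulting file, and `systemd-tmpfiles --create` (run by systemd at boot) then creates /run/mydaemon on the fresh tmpfs.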
Re: [systemd-devel] [Lxc-users] Unable to run systemd in an LXC / cgroup container.
Adding in the lxc-devel list.

On Thu, 2012-10-25 at 22:59 -0400, Michael H. Warfield wrote:
On Thu, 2012-10-25 at 15:42 -0400, Michael H. Warfield wrote:
On Thu, 2012-10-25 at 14:02 -0500, Serge Hallyn wrote:
Quoting Michael H. Warfield (m...@wittsend.com):
On Thu, 2012-10-25 at 13:23 -0400, Michael H. Warfield wrote:

Hey Serge,

On Thu, 2012-10-25 at 11:19 -0500, Serge Hallyn wrote:
...

Oh, sorry - I take back that suggestion :) Note that we have mount hooks, so templates could install a mount hook to mount a tmpfs onto /dev and populate it.

Ok... I've done some cursory search and turned up nothing but some comments about pre mount hooks. Where is the documentation about this feature and how I might use / implement it? Some examples would probably suffice. Is there a required release version of lxc-utils?

I think I found what I needed in the changelog here:

http://www.mail-archive.com/lxc-devel@lists.sourceforge.net/msg01490.html

I'll play with it and report back.

Also the Lifecycle management hooks section in https://help.ubuntu.com/12.10/serverguide/lxc.html

This isn't working... Based on what was in both of those articles, I added this entry to another container (Plover) to test...

    lxc.hook.mount = /var/lib/lxc/Plover/mount

When I run lxc-start -n Plover, I see this:

    [root@forest ~]# lxc-start -n Plover
    lxc-start: unknow key lxc.hook.mount
    lxc-start: failed to read configuration file

I'm running the latest rc...

    [root@forest ~]# rpm -qa | grep lxc
    lxc-0.8.0.rc2-1.fc16.x86_64
    lxc-libs-0.8.0.rc2-1.fc16.x86_64
    lxc-doc-0.8.0.rc2-1.fc16.x86_64

Is it something in git that hasn't made it to a release yet?

nm... I see it. It's in git and hasn't made it to a release. I'm working on a git build to test now. If this is something that solves some of this, we need to move things along here and get these things moved out. According to git, 0.8.0rc2 was 7 months ago? What are the show stoppers here?

While the git repo says 7 months ago, the date stamp on the lxc-0.8.0-rc2 tarball is from July 10, so about 3-1/2 months ago. Sounds like we've accumulated some features (like the hooks) we are going to need like months ago to deal with this systemd debacle. How close are we to either 0.8.0rc3 or 0.8.0? Any blockers or are we just waiting on some more features?

Note that I'm thinking that having lxc-start guess how to fill in /dev is wrong, because different distros and even different releases of the same distros have different expectations. For instance Ubuntu lucid wants /dev/shm to be a directory, while precise+ wants a symlink. So somehow the template should get involved, be it by adding a hook, or simply specifying a configuration file which lxc uses internally to decide how to create /dev.

I agree this needs to be by some sort of convention or template that we can adjust.

Personally I'd prefer if /dev were always populated by the templates, and containers (i.e. userspace) didn't mount a fresh tmpfs for /dev. But that does complicate userspace, and we've seen it in debian/ubuntu as well (i.e. at certain package upgrades which rely on /dev being cleared after a reboot).

-serge

Regards,
Mike
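Serge's /dev/shm example shows why the template, not lxc-start, has to decide how /dev is filled in. A rough sketch of what a per-template helper might look like follows. The function name setup_dev_shm and its arguments are hypothetical, and the default symlink target /run/shm is an assumption that is not stated in the thread.

```shell
#!/bin/sh
# Create <devdir>/shm in the form a given guest release expects.
setup_dev_shm() {
    # $1: path to the container's /dev as seen from the host
    # $2: "dir" (e.g. Ubuntu lucid) or "symlink" (e.g. precise and later)
    # $3: symlink target; /run/shm is an assumed default, adjust per distro
    devdir=$1
    style=$2
    target=${3:-/run/shm}
    case "$style" in
        dir)     mkdir -p "$devdir/shm" && chmod 1777 "$devdir/shm" ;;
        symlink) ln -s "$target" "$devdir/shm" ;;
        *)       echo "unknown style: $style" >&2; return 1 ;;
    esac
}
```

A template would pick the style once, when it generates the container, so that lxc-start itself never has to guess.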
Re: [systemd-devel] [Lxc-users] Unable to run systemd in an LXC / cgroup container.
On Fri, 2012-10-26 at 09:07 -0500, Serge Hallyn wrote:
Quoting Michael H. Warfield (m...@wittsend.com):

nm... I see it. It's in git and hasn't made it to a release. I'm working on a git build to test now. If this is something that solves some of this, we need to move things along here and get these things moved out. According to git, 0.8.0rc2 was 7 months ago? What are the show stoppers here?

While the git repo says 7 months ago, the date stamp on the lxc-0.8.0-rc2 tarball is from July 10, so about 3-1/2 months ago. Sounds like we've accumulated some features (like the hooks) we are going to need like months ago to deal with this systemd debacle. How close are we to either 0.8.0rc3 or 0.8.0? Any blockers or are we just waiting on some more features?

Daniel has simply been too busy.

Don't I know THAT feeling all too well. Over on the Samba Team (where I'm the chief security consultant on the team) we're all too busy with juggling our domain and our web cert. On top of that, I've got my day job (of course). On top of that, I've got about six other OpenSource projects I'm juggling (including this one). On top of that, I've got a consulting customer that's going through fits. And the beat goes on. I'll test out things as fast as I can. I need this. This suddenly got very interesting as soon as we had a thread to pick at on the systemd ball of yarn.

Stéphane has made a new branch which cherrypicks 50 bugfixes for 0.8.0, with the remaining patches (about twice as many) left for 0.9.0. I'm hoping we get 0.8.0 next week :)

I'm hoping the hook patches are in that cherry picked basket. We really need them if that's what it takes to make this work. Looking forward to it. :-)=)

I'm going to look further into this whole redirect /dev/console to a log hang thing. That's not good and may need to be resolved soon as well. I can live with losing the vtys although I disagree with Stéphane's arguments. They (systemd) are behaving significantly differently from sysvinit and upstart and they claim they want to be transparent? Not. No matter. We need to make that work properly as well, agree with them or disagree with them.

Regards,
Mike
Re: [systemd-devel] Unable to run systemd in an LXC / cgroup container.
On Thu, 2012-10-25 at 23:38 +0200, Lennart Poettering wrote:
> On Thu, 25.10.12 11:59, Michael H. Warfield (m...@wittsend.com) wrote:
>
> > > http://wiki.1tux.org/wiki/Lxc/Installation#Additional_notes
> > >
> > > Unfortunately, in our case, merely getting a mount in there is a complication in that it also has to be populated but, at least, we understand the problem set now.
> >
> > Ok... Serge and I were corresponding on the lxc-users list and he had a suggestion that worked but which I consider a sub-optimal workaround. Ironically, it was to mount devtmpfs on /dev. We don't (currently) have a method to auto-populate a tmpfs mount with the needed devices and this provided it. It does have a problem that makes me uncomfortable in that the container now has visibility into the host's /dev system. I'm a security expert and I'm not comfortable with that solution even with the controls we have. We can control access but still, not happy with that.
>
> That's a pretty bad idea, access control to the device nodes in devtmpfs is controlled by the host's udev instance. That means if your group/user lists in the container and the host differ you already lost. Also access control in udev is dynamic, due to stuff like uaccess and similar. You really don't want to have that in the container, i.e. where devices change ownership all the time with UIDs/GIDs that make no sense at all in the container.

Concur.

> In general I think it's a good idea not to expose any real devices to the container, but only the virtual ones that are programming APIs. That means: no /dev/sda, or /dev/ttyS0, but /dev/null, /dev/zero, /dev/random, /dev/urandom. And creating the latter in a tmpfs is quite simple.
>
> > If I run lxc-console (which attaches to one of the vtys) it gives me nothing. Under sysvinit and upstart I get vty login prompts because they have started getty on those vtys. This is important in case network access has not started for one reason or another and the container was started detached in the background.
>
> The getty behaviour of systemd in containers is documented here:
>
> http://www.freedesktop.org/wiki/Software/systemd/ContainerInterface

Sorry. This is unacceptable. We need some way that these will be active and you will be consistent with other containers.

> If LXC mounts ptys on top of the VT devices that's a really bad idea too, since /dev/tty1 and friends expose a number of APIs beyond the mere tty device that you cannot emulate with that. It includes files in /sys, as well as /dev/vcs and /dev/vcsa, various ioctls, and so on. Heck, even the most superficial of things, the $TERM variable, will be incorrect. LXC shouldn't do that.

REGARDLESS. I'm in this situation now testing what I thought was a hang condition (which is proving to be something else). I started a container detached, redirecting the console to a file (a parameter I was missing) and the log to another file (which I had been doing). But, for some reason, sshd is not starting up. I have no way to attach to the bloody console of the container, I have no gettys on a vty I can attach to using lxc-console, and I can't remotely access a container which, for all other intents and purposes, appears to be running fine.

Parameterize this bloody thing so we can have control over it.

> LXC really shouldn't pretend a pty was a VT tty, it's not. From the libvirt guys it has been proposed that we introduce a new env var to pass to PID 1 of the container, that simply lists ptys to start gettys on. That way we don't pretend anything about ttys that the ttys can't hold and have a clean setup.
>
> > I SUSPECT the hang condition is something to do with systemd trying to start an interactive console on /dev/console, which sysvinit and upstart do not do.
>
> Yes, this is documented, please see the link I already posted, and which I linked above a second time.
>
> > I've got some more problems relating to shutting down containers, some of which may be related to mounting tmpfs on /run, to which /var/run is symlinked. We're doing halt / restart detection by monitoring utmp in that directory but it looks like utmp isn't even in that directory anymore and mounting tmpfs on it was always problematical. We may have to have a more generic method to detect when a container has shut down or is restarting in that case.
>
> I can't parse this. The system call reboot() is virtualized for containers just fine and the container manager (i.e. LXC) can check for that easily.
>
> Lennart
>
> --
> Lennart Poettering - Red Hat, Inc.

--
Michael H. Warfield (AI4NB) | (770) 985-6132 | m...@wittsend.com
/\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/
NIC whois: MHW9 | An optimist believes we live in the best of all
PGP Key: 0x674627FF| possible worlds. A pessimist is sure of it!

signature.asc
Description: This is a digitally signed message part
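Lennart's remark above, that the virtual device nodes are simple to create in a tmpfs, can be sketched in shell. This is only an illustration and not what LXC or systemd actually runs: the function name print_dev_plan and the rootfs path are made up for the example, and the script prints the mount and mknod commands rather than executing them, since running them for real needs root. The major/minor pairs are the standard Linux numbers for these memory devices.

```shell
#!/bin/sh
# Sketch only: print (do not run) the commands that would build a minimal
# /dev on a tmpfs. print_dev_plan and the path below are hypothetical names.

print_dev_plan() {
    dev_root=$1
    echo "mount -t tmpfs -o mode=0755,nosuid tmpfs $dev_root"
    # name:major:minor for the four virtual devices named in the thread
    for spec in null:1:3 zero:1:5 random:1:8 urandom:1:9; do
        name=${spec%%:*}
        rest=${spec#*:}
        major=${rest%%:*}
        minor=${rest#*:}
        echo "mknod -m 0666 $dev_root/$name c $major $minor"
    done
}

print_dev_plan /path/to/rootfs/dev
```

Reviewing the printed plan first and only then piping it to a root shell keeps the sketch harmless on a host; a real setup would also need the console and pts entries that LXC sets up itself.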
Re: [systemd-devel] Unable to run systemd in an LXC / cgroup container.
On Thu, 2012-10-25 at 23:38 +0200, Lennart Poettering wrote:
> On Thu, 25.10.12 11:59, Michael H. Warfield (m...@wittsend.com) wrote:
>
> > I've got some more problems relating to shutting down containers, some of which may be related to mounting tmpfs on /run, to which /var/run is symlinked. We're doing halt / restart detection by monitoring utmp in that directory but it looks like utmp isn't even in that directory anymore and mounting tmpfs on it was always problematical. We may have to have a more generic method to detect when a container has shut down or is restarting in that case.
>
> I can't parse this. The system call reboot() is virtualized for containers just fine and the container manager (i.e. LXC) can check for that easily.

I strongly suspect that the condition I'm dealing with (not being able to restart the container) is an artifact of the devtmpfs kludge. I'm seeing some errors relating to /dev/loop* being busy that seem to be related to the hung resources, resulting in the inability to remove the zombie container. Disregard until I can get further information following a switch to a template-based setup.

> Lennart

Regards,
Mike

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] Unable to run systemd in an LXC / cgroup container.
On Fri, 2012-10-26 at 11:58 -0400, Michael H. Warfield wrote:
> [...]
>
> REGARDLESS. I'm in this situation now testing what I thought was a hang condition (which is proving to be something else). I started a container detached, redirecting the console to a file (a parameter I was missing) and the log to another file (which I had been doing). But, for some reason, sshd is not starting up. I have no way to attach to the bloody console of the container, I have no gettys on a vty I can attach to using lxc-console, and I can't remotely access a container which, for all other intents and purposes, appears to be running fine.
>
> Parameterize this bloody thing so we can have control over it.

Here's another weirdism that's in your camp...

The reason that sshd did not start was that the network did not start (IPv6 was up but IPv4 was not, and the startup of several services failed as a consequence). Trying to restart the network manually resulted in this:

  [root@alcove mhw]# ifdown eth0
  ./network-functions: line 237: cd: /var/run/netreport: No such file or directory
  [root@alcove mhw]# ifup eth0
  ./network-functions: line 237: cd: /var/run/netreport: No such file or directory
  [root@alcove mhw]# ls /var/run/
  dbus  messagebus.pid  rpcbind.sock  systemd  user
  log   mount           syslogd.pid   udev

What the hell is this? /var/run is symlinked to /run and is mounted with a tmpfs.

So I created that directory and could ifup the network and start sshd.

So I did a little check on the run levels... Hmmm... F17 container (Alcove) in an F17 host (Forest). WHAT is going ON here? Is this why the network didn't start?

  [root@forest mhw]# runlevel
  N 5

  [root@alcove mhw]# runlevel
  unknown

  [root@alcove mhw]# chkconfig

  Note: This output shows SysV services only and does not include native systemd services. SysV configuration data might be overridden by native systemd configuration.

  modules_dep  0:off  1:off  2:on   3:on   4:on   5:on   6:off
  netconsole   0:off  1:off  2:off  3:off  4:off  5:off  6:off
  network      0:off  1:off  2:off  3:on   4:off  5:off  6:off

Mike
Re: [systemd-devel] Unable to run systemd in an LXC / cgroup container.
On Fri, 2012-10-26 at 12:11 -0400, Michael H. Warfield wrote:
> On Thu, 2012-10-25 at 23:38 +0200, Lennart Poettering wrote:
> > On Thu, 25.10.12 11:59, Michael H. Warfield (m...@wittsend.com) wrote:
> >
> > > I SUSPECT the hang condition is something to do with systemd trying to start an interactive console on /dev/console, which sysvinit and upstart do not do.
> >
> > Yes, this is documented, please see the link I already posted, and which I linked above a second time.
>
> This may have been my fault. I was using the -o option to lxc-start (output logfile) and failed to specify the -c (console output redirect) option. It seems to fire up nicely (albeit with other problems) with that additional option. Continuing my research.

Confirming. Using the -c option for the console file works.

Unfortunately, with no gettys on the ttys (so lxc-console does not work), no way to connect to that console redirect, and the failure of the network to start, I'm still trying to figure out just what is face planting in a container I can not access. :-/=/

Punch out the punch list one PUNCH at a time here.

> > > I've got some more problems relating to shutting down containers, some of which may be related to mounting tmpfs on /run, to which /var/run is symlinked. We're doing halt / restart detection by monitoring utmp in that directory but it looks like utmp isn't even in that directory anymore and mounting tmpfs on it was always problematical. We may have to have a more generic method to detect when a container has shut down or is restarting in that case.
> >
> > I can't parse this. The system call reboot() is virtualized for containers just fine and the container manager (i.e. LXC) can check for that easily.
>
> Apparently, in recent kernels, we can. Unfortunately, I'm still finding that I can not restart a container I have previously halted. I have no problem with sysvinit and upstart systems on this host, so it is a container problem peculiar to systemd containers. Continuing to research that problem.
>
> > Lennart
> >
> > --
> > Lennart Poettering - Red Hat, Inc.

Regards,
Mike
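The lxc-start invocation described in the message above (detached, with both a log file and a console file) might look like the sketch below. The container name and both file paths are placeholders invented for illustration; only the -o and -c options come from the thread, and -n and -d are the usual name and detach options. The command is assembled and printed rather than run, so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical container name and file locations; adjust to the local setup.
name=alcove
log=/var/log/lxc/$name.log
console=/var/log/lxc/$name.console

# -d: start detached, -o: LXC's own log file,
# -c: file that receives the container's console output
start_cmd="lxc-start -n $name -d -o $log -c $console"
echo "$start_cmd"
```

With the console going to a file, following that file with tail -f is a read-only way to watch boot messages, which is presumably why it does not help once an interactive login is needed.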
[systemd-devel] One of my fundamental problems with systemd...
My most fundamental problem with systemd is its insistence on hiding and obfuscating errors in ways that make debugging almost impossible. Almost every upgrade problem I've had in Fedora has been related to systemd's failure to provide comprehensible error messages for things like errors in fstab (#1 fsck up).

The most recent problem has been an issue trying to get LXC containers to work. The networking is not coming up in the container at boot. It's not starting. What do I get? I finally dug it out of the console barf. A message that says this:

  [FAILED] Failed to start LSB: Bring up/down networking.
  See 'systemctl status network.service' for details.

Ok fine... So I get logged in and I run this...

  [root@alcove mhw]# systemctl status network.service
  network.service - LSB: Bring up/down networking
    Loaded: loaded (/etc/rc.d/init.d/network)
    Active: failed (Result: exit-code) since Wed, 24 Oct 2012 18:23:07 +0400; 1min 57s ago
    Process: 91 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=209/STDOUT)
    CGroup: name=systemd:/system/lxc/Alcove/system/network.service

Tells me nothing. Does not tell me where the problem is...

If I then try to manually start the network I get this...

  [root@alcove mhw]# service network start
  Starting network (via systemctl):
  network[275]: Bringing up loopback interface: ./network-functions: line 237: cd: /var/run/netreport: No such file or directory
  network[275]: [ OK ]
  network[275]: Bringing up interface eth0: ./network-functions: line 237: cd: /var/run/netreport: No such file or directory
  network[275]: [ OK ]
  network[275]: touch: cannot touch `/var/lock/subsys/network': No such file or directory
  [ OK ]

OK... This I can understand. There are missing directories in /var/run and in /var/lock. Don't tell me how that script should have done this or that or the other. That's NOT the problem. The problem is that systemd did not communicate back WHAT THE REAL PROBLEM WAS.

Why is it so difficult for systemd to respond with an intelligent error message??? The message said to run systemctl status network.service but that result was worthless.

I'll now edit that startup script to fix this nonsense but it's pointing to a fundamental failure in systemd in communicating errors back to administrators in an actionable manner.

Mike
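The two missing paths in the output above are /var/run/netreport and /var/lock/subsys. A minimal sketch of recreating them by hand follows; it is not the fix the author applied to the startup script, which the message does not show. The function name ensure_runtime_dirs and its optional prefix argument are invented here so the sketch can be tried against a scratch directory instead of the live container.

```shell
#!/bin/sh
# Sketch: create the directories the network script complained about.
# ensure_runtime_dirs is a hypothetical helper. Its optional argument is a
# prefix; with no argument it acts on the real paths, which requires root.
ensure_runtime_dirs() {
    root=${1:-}
    for d in /var/run/netreport /var/lock/subsys; do
        mkdir -p "$root$d" || return 1
    done
}

# Try it against a throwaway directory.
scratch=$(mktemp -d)
ensure_runtime_dirs "$scratch" && echo "created under $scratch"
```

Because /var/run points at a tmpfs here, anything created this way disappears at the next container start, so a manual mkdir is at best a stopgap.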
Re: [systemd-devel] One of my fundamental problems with systemd...
On Fri, 2012-10-26 at 16:51 -0700, Kok, Auke-jan H wrote:
> On Fri, Oct 26, 2012 at 3:39 PM, Michael H. Warfield m...@wittsend.com wrote:
>
> > My most fundamental problem with systemd is its insistence on hiding and obfuscating errors in ways that make debugging almost impossible. Almost every upgrade problem I've had in Fedora has been related to systemd's failure to provide comprehensible error messages for things like errors in fstab (#1 fsck up).
> >
> > The most recent problem has been an issue trying to get LXC containers to work. The networking is not coming up in the container at boot. It's not starting. What do I get? I finally dug it out of the console barf. A message that says this:
> >
> >   [FAILED] Failed to start LSB: Bring up/down networking.
> >   See 'systemctl status network.service' for details.
> >
> > Ok fine... So I get logged in and I run this...
> >
> >   [root@alcove mhw]# systemctl status network.service
> >   network.service - LSB: Bring up/down networking
> >     Loaded: loaded (/etc/rc.d/init.d/network)
> >     Active: failed (Result: exit-code) since Wed, 24 Oct 2012 18:23:07 +0400; 1min 57s ago
> >     Process: 91 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=209/STDOUT)
> >     CGroup: name=systemd:/system/lxc/Alcove/system/network.service
> >
> > Tells me nothing. Does not tell me where the problem is...
>
> On the contrary, it does quite clearly indicate where the problem is. The information may be represented differently than you expect, but it is clear on the status of the service - including the return value of the execution.
>
> > If I then try to manually start the network I get this...
> >
> >   [root@alcove mhw]# service network start
> >   Starting network (via systemctl):
> >   network[275]: Bringing up loopback interface: ./network-functions: line 237: cd: /var/run/netreport: No such file or directory
> >   network[275]: [ OK ]
> >   network[275]: Bringing up interface eth0: ./network-functions: line 237: cd: /var/run/netreport: No such file or directory
> >   network[275]: [ OK ]
> >   network[275]: touch: cannot touch `/var/lock/subsys/network': No such file or directory
> >   [ OK ]
> >
> > OK... This I can understand. There are missing directories in /var/run and in /var/lock. Don't tell me how that script should have done this or that or the other. That's NOT the problem. The problem is that systemd did not communicate back WHAT THE REAL PROBLEM WAS.
> >
> > Why is it so difficult for systemd to respond with an intelligent error message??? The message said to run systemctl status network.service but that result was worthless.
> >
> > I'll now edit that startup script to fix this nonsense but it's pointing to a fundamental failure in systemd in communicating errors back to administrators in an actionable manner.
>
> Try and be reasonable here. Here's what I read in your message:
>
> >   [FAILED] Failed to start LSB: Bring up/down networking.
> >   See 'systemctl status network.service' for details.
>
> Ahh, so let's read that output:
>
> >   [root@alcove mhw]# systemctl status network.service
> >   network.service - LSB: Bring up/down networking
>
> ok, just describes the service
>
> >   Loaded: loaded (/etc/rc.d/init.d/network)
>
> ok, it's loaded. it's a sysV init script. Maybe the script is old and wasn't written for systemd?
>
> >   Active: failed (Result: exit-code) since Wed, 24 Oct 2012 18:23:07 +0400; 1min 57s ago
>
> hmm, EXIT CODE failure. It exited with a non-zero status.
>
> >   Process: 91 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=209/STDOUT)
>
> ok so, the script returned a strange error code.
>
> >   CGroup: name=systemd:/system/lxc/Alcove/system/network.service
>
> shrug, doesn't seem to matter.
>
> ...
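On the status=209/STDOUT value discussed above: codes in the 200 range are produced by systemd itself while it prepares to run a service, and 209 (STDOUT) is reported when it could not set up the service's standard output, so the number need not have come from the init script at all. Whether that ties in with the console trouble in this container is a guess, not something the thread establishes. The helper below, describe_exec_status, is a made-up name and covers only a few codes.

```shell
#!/bin/sh
# Sketch: translate a handful of systemd's exec-setup exit codes into text.
# describe_exec_status is a hypothetical helper, not a systemd tool.
describe_exec_status() {
    case $1 in
        203) echo "EXEC: could not execute the command" ;;
        208) echo "STDIN: could not set up standard input" ;;
        209) echo "STDOUT: could not set up standard output" ;;
        222) echo "STDERR: could not set up standard error" ;;
        *)   echo "not covered by this sketch" ;;
    esac
}

describe_exec_status 209
```

The full list of these codes is in the systemd.exec documentation on process exit codes.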
> Now, from just this I can conclude that your `/etc/rc.d/init.d/network start` produced an error. How is that useless information? It is exactly what the status of the network service is - failed, with an error code.
>
> Now, from what I remember newer versions of systemd produce a short 'tail' of the service's error log in case it fails, looking like this:
>
>   # systemctl status connman.service
>   connman.service - Connection service
>     Loaded: loaded (/etc/systemd/system/connman.service; disabled)
>     Active: inactive (dead)
>     CGroup: name=systemd:/system/connman.service
>
>   Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: Remove interface wlan0 [ wifi ]
>   Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: Remove interface eth1 [ ethernet ]
>   Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: eth0 {remove} index 2
>   Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: wlan0 {remove} index 3
>   Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: eth1 {remove} index 4
>   Oct 19 10:44:01 htpc connmand[3168]: connmand[3168]: Exit
>   Oct 19 10:44:01 htpc connmand[3168]: eth0 {remove} index 2
>   Oct 19 10:44:01
Re: [systemd-devel] One of my fundamental problems with systemd...
On Fri, 2012-10-26 at 18:32 -0700, Kok, Auke-jan H wrote:
> On Fri, Oct 26, 2012 at 6:06 PM, Michael H. Warfield m...@wittsend.com wrote:
> > On Fri, 2012-10-26 at 16:51 -0700, Kok, Auke-jan H wrote:
> >
> > > This should help. Obviously, journalctl should help you a lot as well.
> >
> > And journalctl gave me jack shit trying to figure this out and I really DON'T need yet another obscure command to dig out errors that should have been presented in the first place.
>
> fine, be that way. Please stop using systemd. You're obviously not interested in actually talking on a normal level to developers that can, and may actually want to help you.

Excuse me? I'm a kernel maintainer and a member of the Samba team. I have dedicated almost 2 decades to promoting and developing several dozen open source projects I will not enumerate here.

I'm willing to help but I'm extremely frustrated at this point. When things don't work and I can't drill down to the reason and people give me platitudes I get frustrated.

Would you like access to this container? If you are on IPv6, I'll give you immediate root shell access. Send me an RSA key and I will send you the FQDN and give you sudo access! Then you can tell me! IPv4 would take a little longer. I've done this with VMware and others. I believe in solving problems, not making excuses for why it didn't work.

> Auke

Regards,
Mike
Re: [systemd-devel] One of my fundamental problems with systemd...
On Fri, 2012-10-26 at 21:50 -0400, Michael H. Warfield wrote:
> On Fri, 2012-10-26 at 18:32 -0700, Kok, Auke-jan H wrote:
> > On Fri, Oct 26, 2012 at 6:06 PM, Michael H. Warfield m...@wittsend.com wrote:
> > > On Fri, 2012-10-26 at 16:51 -0700, Kok, Auke-jan H wrote:
> > >
> > > > This should help. Obviously, journalctl should help you a lot as well.
> > >
> > > And journalctl gave me jack shit trying to figure this out and I really DON'T need yet another obscure command to dig out errors that should have been presented in the first place.
> >
> > fine, be that way. Please stop using systemd. You're obviously not interested in actually talking on a normal level to developers that can, and may actually want to help you.

Obviously, if I am to maintain an environment based on these distributions, that is impossible. Fedora has made it abjectly impossible to dispose of systemd and to even drop back to upstart at this point. That is not an option.

> Excuse me? I'm a kernel maintainer and a member of the Samba team. I have dedicated almost 2 decades to promoting and developing several dozen open source projects I will not enumerate here.
>
> I'm willing to help but I'm extremely frustrated at this point. When things don't work and I can't drill down to the reason and people give me platitudes I get frustrated.
>
> Would you like access to this container? If you are on IPv6, I'll give you immediate root shell access. Send me an RSA key and I will send you the FQDN and give you sudo access! Then you can tell me! IPv4 would take a little longer. I've done this with VMware and others. I believe in solving problems, not making excuses for why it didn't work.

Hmmm... I'll tell you what. I'll even set up your very own LXC container running systemd at my colo facility (I have a rack at an ISP) so you have lots of bandwidth to play with. I'll get it up and running and you can access it and tell me what causes the errors that force me to go through those manual steps. I'll provide you with a complete development and test bed to make this work. I'll work with you. All in my infrastructure and my resources. All I want is for this to work so I can move beyond Fedora 15.

How does that sound? I'm at your disposal.

> > Auke

Regards,
Mike
Re: [systemd-devel] One of my fundamental problems with systemd...
On Fri, 2012-10-26 at 22:05 -0400, Michael H. Warfield wrote:
> On Fri, 2012-10-26 at 21:50 -0400, Michael H. Warfield wrote:
>
> [...]
>
> > Excuse me? I'm a kernel maintainer and a member of the Samba team. I have dedicated almost 2 decades to promoting and developing several dozen open source projects I will not enumerate here.

BTW... Not to drop names (which I'm about to do) or anything, and I know in a big organization not everybody knows everyone, but...

Just for reference, if there is any question as to who I am, please check with Jeff Boerio jeff.boe...@intel.com or Bruce Monroe bruce.mon...@intel.com on your CERT teams. I think they can provide sufficient references for me.

> [...]

Regards,
Mike
Re: [systemd-devel] One of my fundamental problems with systemd...
On Sat, 2012-10-27 at 05:24 +0200, Olav Vitters wrote:
> sent too quickly..
>
> On Sat, Oct 27, 2012 at 05:22:30AM +0200, Olav Vitters wrote:
> > On Fri, Oct 26, 2012 at 10:16:47PM -0400, Michael H. Warfield wrote:
> > > BTW... Not to drop names (which I'm about to do) or anything and I know in a big organization not everybody knows everyone but...
> >
> > I prefer judging myself. And at the moment I see you using words like:
>
> - crap
> - POS
> - barf
> - fsck (though meaning fuck)
>
> It is not needed to tell what you did. Just be nice and people will show the same courtesy.

Offense not intended. These are the terms we use in several other forums with no offense taken. In fact, in the Openswan forum barf is a specific command with a specific meaning. If you have a problem with Openswan you are quite often asked for the barf (dump). Maybe I'm too old and too abrasive and too used to the old culture where this was common.

> --
> Regards,
> Olav

Regards,
Mike
Re: [systemd-devel] [Lxc-users] Unable to run systemd in an LXC / cgroup container.
On Thu, 2012-10-25 at 11:19 -0500, Serge Hallyn wrote:
Quoting Michael H. Warfield (m...@wittsend.com):
Sorry for taking a few days to get back on this. I was delivering a guest lecture up at Fordham University last Tuesday so I was out of pocket a couple of days or I would have responded sooner...

On Mon, 2012-10-22 at 16:59 -0400, Michael H. Warfield wrote:
On Mon, 2012-10-22 at 22:50 +0200, Lennart Poettering wrote:
On Mon, 22.10.12 11:48, Michael H. Warfield (m...@wittsend.com) wrote:
To summarize the problem... The LXC startup binary sets up various things for /dev and /dev/pts for the container to run properly and this works perfectly fine for SystemV start-up scripts and/or Upstart. Unfortunately, systemd has mounts of devtmpfs on /dev and devpts on /dev/pts which then break things horribly. This is because the kernel currently lacks namespaces for devices and won't for some time to come (in design). When devtmpfs gets mounted over top of /dev in the container, it then hijacks the host's console tty and several other devices which had been set up through bind mounts by LXC and should have been LEFT ALONE.

Please initialize a minimal tmpfs on /dev. systemd will then work fine.

My containers have a reasonable /dev that works with Upstart just fine but they are not on tmpfs. Is mounting tmpfs on /dev and recreating that minimal /dev required?

Well, it can be any kind of mount really. Just needs to be a mount. And the idea is to use tmpfs for this. What /dev are you currently using? It's probably not a good idea to reuse the host's /dev, since it contains so many device nodes that should not be accessible/visible to the container.

Got it. And that explains the problems we're seeing but also what I'm seeing in some libvirt-lxc related pages, which is a separate and distinct project in spite of the similarities in the name...
http://wiki.1tux.org/wiki/Lxc/Installation#Additional_notes
Unfortunately, in our case, merely getting a mount in there is a complication in that it also has to be populated but, at least, we understand the problem set now.

Ok... Serge and I were corresponding on the lxc-users list and he had a suggestion that worked but I consider to be a bit of a sub-optimal workaround. Ironically, it was to mount devtmpfs on /dev. We don't

Oh, sorry - I take back that suggestion :)

Well, it worked (sort of) and reinforced what the problem was and where the solution lay so there's no need to be sorry for it. We learned and we know why it's not the right solution. This is good. We made a lot of progress on this just in the last week. This is very good.

Note that we have mount hooks, so templates could install a mount hook to mount a tmpfs onto /dev and populate it.

Ah, now that is interesting. I haven't looked at that before. I need to explore that further.

Or, if everyone is going to need it, we could just add a 'lxc.populatedevs = 1' option which does that without needing a hook.

Eventually, with Fedora (and later RHEL / CentOS / SL), Arch Linux, and others going to systemd, I think this is going to be needed sooner rather than later.

devtmpfs should not be used in containers :)

Concur!

-serge

Regards, Mike
Re: [systemd-devel] [Lxc-users] Unable to run systemd in an LXC / cgroup container.
Hey Serge,

On Thu, 2012-10-25 at 11:19 -0500, Serge Hallyn wrote:
Quoting Michael H. Warfield (m...@wittsend.com):
Sorry for taking a few days to get back on this. I was delivering a guest lecture up at Fordham University last Tuesday so I was out of pocket a couple of days or I would have responded sooner...

On Mon, 2012-10-22 at 16:59 -0400, Michael H. Warfield wrote:
On Mon, 2012-10-22 at 22:50 +0200, Lennart Poettering wrote:
On Mon, 22.10.12 11:48, Michael H. Warfield (m...@wittsend.com) wrote:
To summarize the problem... The LXC startup binary sets up various things for /dev and /dev/pts for the container to run properly and this works perfectly fine for SystemV start-up scripts and/or Upstart. Unfortunately, systemd has mounts of devtmpfs on /dev and devpts on /dev/pts which then break things horribly. This is because the kernel currently lacks namespaces for devices and won't for some time to come (in design). When devtmpfs gets mounted over top of /dev in the container, it then hijacks the host's console tty and several other devices which had been set up through bind mounts by LXC and should have been LEFT ALONE.

Please initialize a minimal tmpfs on /dev. systemd will then work fine.

My containers have a reasonable /dev that works with Upstart just fine but they are not on tmpfs. Is mounting tmpfs on /dev and recreating that minimal /dev required?

Well, it can be any kind of mount really. Just needs to be a mount. And the idea is to use tmpfs for this. What /dev are you currently using? It's probably not a good idea to reuse the host's /dev, since it contains so many device nodes that should not be accessible/visible to the container.

Got it. And that explains the problems we're seeing but also what I'm seeing in some libvirt-lxc related pages, which is a separate and distinct project in spite of the similarities in the name...
http://wiki.1tux.org/wiki/Lxc/Installation#Additional_notes
Unfortunately, in our case, merely getting a mount in there is a complication in that it also has to be populated but, at least, we understand the problem set now.

Ok... Serge and I were corresponding on the lxc-users list and he had a suggestion that worked but I consider to be a bit of a sub-optimal workaround. Ironically, it was to mount devtmpfs on /dev. We don't

Oh, sorry - I take back that suggestion :)

Note that we have mount hooks, so templates could install a mount hook to mount a tmpfs onto /dev and populate it.

Ok... I've done some cursory searching and turned up nothing but some comments about pre-mount hooks. Where is the documentation about this feature and how might I use / implement it? Some examples would probably suffice. Is there a required release version of lxc-utils?

Or, if everyone is going to need it, we could just add a 'lxc.populatedevs = 1' option which does that without needing a hook.

devtmpfs should not be used in containers :)

-serge

Regards, Mike
Re: [systemd-devel] [Lxc-users] Unable to run systemd in an LXC / cgroup container.
On Thu, 2012-10-25 at 13:23 -0400, Michael H. Warfield wrote:
Hey Serge,

On Thu, 2012-10-25 at 11:19 -0500, Serge Hallyn wrote:
...
Oh, sorry - I take back that suggestion :)

Note that we have mount hooks, so templates could install a mount hook to mount a tmpfs onto /dev and populate it.

Ok... I've done some cursory searching and turned up nothing but some comments about pre-mount hooks. Where is the documentation about this feature and how might I use / implement it? Some examples would probably suffice. Is there a required release version of lxc-utils?

I think I found what I needed in the changelog here:
http://www.mail-archive.com/lxc-devel@lists.sourceforge.net/msg01490.html
I'll play with it and report back.

Or, if everyone is going to need it, we could just add a 'lxc.populatedevs = 1' option which does that without needing a hook.

devtmpfs should not be used in containers :)

-serge

Regards, Mike
Re: [systemd-devel] [Lxc-users] Unable to run systemd in an LXC / cgroup container.
On Thu, 2012-10-25 at 14:02 -0500, Serge Hallyn wrote:
Quoting Michael H. Warfield (m...@wittsend.com):
On Thu, 2012-10-25 at 13:23 -0400, Michael H. Warfield wrote:
Hey Serge,

On Thu, 2012-10-25 at 11:19 -0500, Serge Hallyn wrote:
...
Oh, sorry - I take back that suggestion :)

Note that we have mount hooks, so templates could install a mount hook to mount a tmpfs onto /dev and populate it.

Ok... I've done some cursory searching and turned up nothing but some comments about pre-mount hooks. Where is the documentation about this feature and how might I use / implement it? Some examples would probably suffice. Is there a required release version of lxc-utils?

I think I found what I needed in the changelog here:
http://www.mail-archive.com/lxc-devel@lists.sourceforge.net/msg01490.html
I'll play with it and report back.

Also the Lifecycle management hooks section in https://help.ubuntu.com/12.10/serverguide/lxc.html

This isn't working... Based on what was in both of those articles, I added this entry to another container (Plover) to test...

    lxc.hook.mount = /var/lib/lxc/Plover/mount

When I run lxc-start -n Plover, I see this:

    [root@forest ~]# lxc-start -n Plover
    lxc-start: unknow key lxc.hook.mount
    lxc-start: failed to read configuration file

I'm running the latest rc...

    [root@forest ~]# rpm -qa | grep lxc
    lxc-0.8.0.rc2-1.fc16.x86_64
    lxc-libs-0.8.0.rc2-1.fc16.x86_64
    lxc-doc-0.8.0.rc2-1.fc16.x86_64

Is it something in git that hasn't made it to a release yet?

Note that I'm thinking that having lxc-start guess how to fill in /dev is wrong, because different distros and even different releases of the same distros have different expectations. For instance Ubuntu Lucid wants /dev/shm to be a directory, while Precise+ wants a symlink. So somehow the template should get involved, be it by adding a hook, or simply specifying a configuration file which lxc uses internally to decide how to create /dev.

I agree this needs to be by some sort of convention or template that we can adjust.

Personally I'd prefer if /dev were always populated by the templates, and containers (i.e. userspace) didn't mount a fresh tmpfs for /dev. But that does complicate userspace, and we've seen it in Debian/Ubuntu as well (i.e. at certain package upgrades which rely on /dev being cleared after a reboot).

-serge

Regards, Mike
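
A mount hook of the kind Serge describes could be sketched roughly as follows. This is only an illustration, not anything shipped with LXC: the function names and the device list are assumptions made for the example, and, as Serge points out, the exact contents of /dev differ between distros and releases, so a real template would supply its own list. The major/minor numbers are the standard Linux ones for these character devices. The sketch assumes a tmpfs has already been mounted on the target directory and that the hook runs as root.

```python
import os
import stat

# Hypothetical minimal device set: (name, major, minor, permissions).
# A real template would adjust this per distro and release.
MINIMAL_DEVICES = [
    ("null", 1, 3, 0o666),
    ("zero", 1, 5, 0o666),
    ("full", 1, 7, 0o666),
    ("random", 1, 8, 0o666),
    ("urandom", 1, 9, 0o666),
    ("tty", 5, 0, 0o666),
]


def plan_dev(dev_dir):
    """Return (path, mode, device_number) tuples without touching the filesystem."""
    return [
        (os.path.join(dev_dir, name),
         stat.S_IFCHR | perms,
         os.makedev(major, minor))
        for name, major, minor, perms in MINIMAL_DEVICES
    ]


def populate_dev(dev_dir):
    """Create the planned character devices under dev_dir.

    Needs root (CAP_MKNOD). Assumes a tmpfs is already mounted on dev_dir.
    """
    old_umask = os.umask(0)  # keep the umask from masking the permissions
    try:
        for path, mode, device in plan_dev(dev_dir):
            if not os.path.exists(path):
                os.mknod(path, mode, device)
    finally:
        os.umask(old_umask)
```

Keeping the plan separate from the mknod calls means the list can be inspected or logged without root. Entries such as /dev/pts, /dev/shm and /dev/console are deliberately left out here, since how they should look is exactly the part that varies between distros.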
Re: [systemd-devel] Unable to run systemd in an LXC / cgroup container.
On Thu, 2012-10-25 at 23:38 +0200, Lennart Poettering wrote:
On Thu, 25.10.12 11:59, Michael H. Warfield (m...@wittsend.com) wrote:
I've got some more problems relating to shutting down containers, some of which may be related to mounting tmpfs on /run, to which /var/run is symlinked. We're doing halt / restart detection by monitoring utmp in that directory but it looks like utmp isn't even in that directory anymore and mounting tmpfs on it was always problematical. We may have to have a more generic method to detect when a container has shut down or is restarting in that case.

I can't parse this. The system call reboot() is virtualized for containers just fine and the container manager (i.e. LXC) can check for that easily.

The problem we have had was with differentiating between reboot and halt, to either shut the container down cold or restart it. You say easily and yet we never came up with an easy solution and monitored utmp instead for the next runlevel change. What is your easy solution for that problem?

Lennart -- Lennart Poettering - Red Hat, Inc.

Regards, Mike
Re: [systemd-devel] [Lxc-users] Unable to run systemd in an LXC / cgroup container.
On Thu, 2012-10-25 at 20:30 -0500, Serge Hallyn wrote:
Quoting Michael H. Warfield (m...@wittsend.com):
On Thu, 2012-10-25 at 23:38 +0200, Lennart Poettering wrote:
On Thu, 25.10.12 11:59, Michael H. Warfield (m...@wittsend.com) wrote:
I've got some more problems relating to shutting down containers, some of which may be related to mounting tmpfs on /run, to which /var/run is symlinked. We're doing halt / restart detection by monitoring utmp in that directory but it looks like utmp isn't even in that directory anymore and mounting tmpfs on it was always problematical. We may have to have a more generic method to detect when a container has shut down or is restarting in that case.

I can't parse this. The system call reboot() is virtualized for containers just fine and the container manager (i.e. LXC) can check for that easily.

The problem we have had was with differentiating between reboot and halt, to either shut the container down cold or restart it. You say easily and yet we never came up with an easy solution and monitored utmp instead for the next runlevel change. What is your easy solution for that problem?

I think you're on older kernels, where we had to resort to that. Pretty recently Daniel Lezcano's patch was finally accepted upstream, which lets a container call reboot() and lets the parent of init tell whether it called reboot or shutdown by looking at WTERMSIG(status).

Now THAT is wonderful news! I hadn't realized that had been accepted. So we no longer need to rely on the old utmp kludge?

-serge

Regards, Mike
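
The WTERMSIG(status) check Serge mentions could look roughly like the sketch below. It relies on the behaviour documented in reboot(2) for PID namespaces: when a process inside the container calls reboot() to restart, the container's init is killed with SIGHUP, and when it asks for halt or power-off, init is killed with SIGINT. The function name and the returned strings are assumptions made for the example; this is not LXC's actual code.

```python
import os
import signal


def classify_init_exit(status):
    """Interpret the wait status of a container's init process.

    `status` is the raw value returned by os.waitpid() in the
    process that started the container's init.
    """
    if os.WIFSIGNALED(status):
        sig = os.WTERMSIG(status)
        if sig == signal.SIGHUP:
            return "reboot"   # container asked to be restarted
        if sig == signal.SIGINT:
            return "halt"     # container asked to halt or power off
        return "killed"       # some other signal ended init
    if os.WIFEXITED(status):
        return "exited"       # init returned on its own
    return "unknown"
```

A container manager would call this on the status from `os.waitpid(init_pid, 0)` and restart the container only on "reboot", which removes the need to watch utmp for runlevel changes on kernels that carry the patch.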
Re: [systemd-devel] [Lxc-users] Unable to run systemd in an LXC / cgroup container.
On Mon, 2012-10-22 at 09:06 +0100, John wrote:
On 22/10/12 03:06, Michael H. Warfield wrote:
On Mon, 2012-10-22 at 02:53 +0200, Kay Sievers wrote:
On Sun, Oct 21, 2012 at 11:25 PM, Michael H. Warfield m...@wittsend.com wrote:
This is being directed to the systemd-devel community but I'm cc'ing the lxc-users community and the Fedora community on this for their input as well. I know it's not always good to cross post between multiple lists but this is of interest to all three communities who may have valuable input. I'm new to this particular list, just having joined after tracking a problem down to some systemd internals...

Several people over the last year or two on the lxc-users list have been in discussions about running certain distros (notably Fedora 16 and above, recent Arch Linux and possibly others) in LXC containers, virtualizing entire servers this way. This is very similar to Virtuozzo / OpenVZ only it's using the native Linux cgroups for the containers (primary reason I dumped OpenVZ was to avoid their custom patched kernels). These recent distros have switched to systemd for the main init process and this has proven to be disastrous for those of us using LXC and trying to install or update our containers. To put it bluntly, it doesn't work and causes all sorts of problems on the host.

To summarize the problem... The LXC startup binary sets up various things for /dev and /dev/pts for the container to run properly and this works perfectly fine for SystemV start-up scripts and/or Upstart. Unfortunately, systemd has mounts of devtmpfs on /dev and devpts on /dev/pts which then break things horribly. This is because the kernel currently lacks namespaces for devices and won't for some time to come (in design). When devtmpfs gets mounted over top of /dev in the container, it then hijacks the host's console tty and several other devices which had been set up through bind mounts by LXC and should have been LEFT ALONE.

Yes!
I recognize that this problem with devtmpfs and lack of namespaces is a potential security problem anyways that could (and does) cause serious container-to-host problems. We're just not going to get that fixed right away in the linux cgroups and namespaces. How do we work around this problem in systemd where it has hard coded mounts in the binary that we can't override or configure? Or is it there and I'm just missing it trying to examine the sources? That's how I found where the problem lay. As a first step, this probably explains most of it: http://www.freedesktop.org/wiki/Software/systemd/ContainerInterface A very long ways, yeah. That looks like it could be just what we've been looking for. Just gotta figure out how to set that environment variable but that's up to a couple of others to comment on in the lxc-users list. Then we'll see where we go from there. Many thanks! Kay Regards, Mike I've just performed a very quick check on my Arch Linux system here. on host (running systemd): # cat /proc/1/environ TERM=linuxRD_TIMESTAMP= In a container (running sysvinit): # cat /proc/1/environ STY=623.systemd-lithiumTERM=screenTERMCAP=SC|screen|VT 100/ANSI X3.64 virtual terminal:\ :DO=\E[%dB:LE=\E[%dD:RI=\E[%dC:UP=\E[%dA:bs:bt=\E[Z:\ :cd=\E[J:ce=\E[K:cl=\E[H\E[J:cm=\E[%i%d;%dH:ct=\E[3g:\ :do=^J:nd=\E[C:pt:rc=\E8:rs=\Ec:sc=\E7:st=\EH:up=\EM:\ :le=^H:bl=^G:cr=^M:it#8:ho=\E[H:nw=\EE:ta=^I:is=\E)0:\ :li#24:co#80:am:xn:xv:LP:sr=\EM:al=\E[L:AL=\E[%dL:\ :cs=\E[%i%d;%dr:dl=\E[M:DL=\E[%dM:dc=\E[P:DC=\E[%dP:\ :im=\E[4h:ei=\E[4l:mi:IC=\E[%d@:ks=\E[?1h\E=:\ :ke=\E[?1l\E:vi=\E[?25l:ve=\E[34h\E[?25h:vs=\E[34l:\ :ti=\E[?1049h:te=\E[?1049l:k0=\E[10~:k1=\EOP:k2=\EOQ:\ :k3=\EOR:k4=\EOS:k5=\E[15~:k6=\E[17~:k7=\E[18~:\ :k8=\E[19~:k9=\E[20~:k;=\E[21~:F1=\E[23~:F2=\E[24~:\ :kh=\E[1~:@1=\E[1~:kH=\E[4~:@7=\E[4~:kN=\E[6~:kP=\E[5~:\ :kI=\E[2~:kD=\E[3~:ku=\EOA:kd=\EOB:kr=\EOC:kl=\EOD:WINDOW=0SHELL=/bin/shPATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/binLANG=en_GB.UTF-8container=lxc 
So it looks like that container environment variable is already set on PID 1.

Yeah, I saw that myself last night. Testing that out and it's still not working here (although it doesn't seem to be grabbing the host console now) if I use systemd but upstart fires right up and I see that container variable set. Looked like a number of mounts listed on that wiki page. Maybe something is missing. Right now it's just hanging trying to start the container and, when I subsequently try to shut the container down, it results in a hung resource and it can't delete the cgroups directory because it's busy. Only thing I did was change the link to /sbin/init from upstart to systemd and it's now dead and I'll have to reboot the host to free the resource. :-P

Regards, John

Regards, Mike
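
The `cat /proc/1/environ` output above runs together (`TERM=linuxRD_TIMESTAMP=`) because the kernel separates the entries with NUL bytes, which a terminal does not display. A small sketch of reading it properly and checking for the `container` variable follows. The helper names are made up for this example, and reading /proc/1/environ normally requires root.

```python
def parse_environ(raw):
    """Split the NUL-separated bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue  # skip the empty piece after the trailing NUL
        key, sep, value = entry.partition(b"=")
        if sep:
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def container_type(raw):
    """Return the value of the 'container' variable, or None if unset."""
    return parse_environ(raw).get("container")


# Typical use, as root:
#     with open("/proc/1/environ", "rb") as f:
#         print(container_type(f.read()))
```

On John's host this would give None, and inside his container it would give "lxc", matching the `container=lxc` at the end of his dump.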
Re: [systemd-devel] Unable to run systemd in an LXC / cgroup container.
On Mon, 2012-10-22 at 16:11 +0200, Lennart Poettering wrote:
On Sun, 21.10.12 17:25, Michael H. Warfield (m...@wittsend.com) wrote:
Hello,

This is being directed to the systemd-devel community but I'm cc'ing the lxc-users community and the Fedora community on this for their input as well. I know it's not always good to cross post between multiple lists but this is of interest to all three communities who may have valuable input. I'm new to this particular list, just having joined after tracking a problem down to some systemd internals...

Several people over the last year or two on the lxc-users list have been in discussions about running certain distros (notably Fedora 16 and above, recent Arch Linux and possibly others) in LXC containers, virtualizing entire servers this way. This is very similar to Virtuozzo / OpenVZ only it's using the native Linux cgroups for the containers (primary reason I dumped OpenVZ was to avoid their custom patched kernels). These recent distros have switched to systemd for the main init process and this has proven to be disastrous for those of us using LXC and trying to install or update our containers.

Note that it is explicitly our intention to make running systemd inside of containers as smooth as possible. The notes Kay linked summarize what the container manager needs to do for best integration.

To summarize the problem... The LXC startup binary sets up various things for /dev and /dev/pts for the container to run properly and this works perfectly fine for SystemV start-up scripts and/or Upstart. Unfortunately, systemd has mounts of devtmpfs on /dev and devpts on /dev/pts which then break things horribly. This is because the kernel currently lacks namespaces for devices and won't for some time to come (in design). When devtmpfs gets mounted over top of /dev in the container, it then hijacks the host's console tty and several other devices which had been set up through bind mounts by LXC and should have been LEFT ALONE.

Please initialize a minimal tmpfs on /dev. systemd will then work fine.

My containers have a reasonable /dev that works with Upstart just fine but they are not on tmpfs. Is mounting tmpfs on /dev and recreating that minimal /dev required?

Yes! I recognize that this problem with devtmpfs and lack of namespaces is a potential security problem anyways that could (and does) cause serious container-to-host problems. We're just not going to get that fixed right away in the Linux cgroups and namespaces.

No, devtmpfs really doesn't need updating, containers simply shouldn't use it.

Ok, yeah. That seems to be at the heart of the problem we're trying to solve.

How do we work around this problem in systemd where it has hard coded mounts in the binary that we can't override or configure? Or is it there and I'm just missing it trying to examine the sources? That's how I found where the problem lay.

systemd will make use of pre-existing mounts if they exist, and only mount something new if they don't exist.

So you're saying that, if we have something mounted on /dev, that's what prevents systemd from mounting devtmpfs on /dev? That could be problematical. Tested out a couple of options there that didn't work. That's going to take some effort.

Note that there are reports that LXC has issues with the fact that newer systemd enables shared mount propagation for all mounts by default (this should actually be beneficial for containers as this ensures that new mounts appear in the containers). LXC when run on such a system fails as soon as it tries to use pivot_root(), as that is incompatible with shared mount propagation. This needs fixing in LXC: it should use MS_MOVE or MS_BIND to place the new root dir in / instead. A short term work-around is to simply remount the root tree to private before invoking LXC.

But, I have systemd running on my host system (F17) and containers with sysvinit or upstart inits are all starting just fine. That sounds like it should impact all containers as pivot_root() is issued before systemd in the container is started. Or am I missing something here? That sounds like a problem for Serge and others to investigate further. I'll see about trying that workaround though.

Lennart -- Lennart Poettering - Red Hat, Inc.

Regards, Mike
Re: [systemd-devel] Unable to run systemd in an LXC / cgroup container.
On Mon, 2012-10-22 at 22:50 +0200, Lennart Poettering wrote:
On Mon, 22.10.12 11:48, Michael H. Warfield (m...@wittsend.com) wrote:
To summarize the problem... The LXC startup binary sets up various things for /dev and /dev/pts for the container to run properly and this works perfectly fine for SystemV start-up scripts and/or Upstart. Unfortunately, systemd has mounts of devtmpfs on /dev and devpts on /dev/pts which then break things horribly. This is because the kernel currently lacks namespaces for devices and won't for some time to come (in design). When devtmpfs gets mounted over top of /dev in the container, it then hijacks the host's console tty and several other devices which had been set up through bind mounts by LXC and should have been LEFT ALONE.

Please initialize a minimal tmpfs on /dev. systemd will then work fine.

My containers have a reasonable /dev that works with Upstart just fine but they are not on tmpfs. Is mounting tmpfs on /dev and recreating that minimal /dev required?

Well, it can be any kind of mount really. Just needs to be a mount. And the idea is to use tmpfs for this. What /dev are you currently using? It's probably not a good idea to reuse the host's /dev, since it contains so many device nodes that should not be accessible/visible to the container.

Got it. And that explains the problems we're seeing but also what I'm seeing in some libvirt-lxc related pages, which is a separate and distinct project in spite of the similarities in the name...
http://wiki.1tux.org/wiki/Lxc/Installation#Additional_notes
Unfortunately, in our case, merely getting a mount in there is a complication in that it also has to be populated but, at least, we understand the problem set now.

systemd will make use of pre-existing mounts if they exist, and only mount something new if they don't exist.

So you're saying that, if we have something mounted on /dev, that's what prevents systemd from mounting devtmpfs on /dev?

Yes.

But, I have systemd running on my host system (F17) and containers with sysvinit or upstart inits are all starting just fine. That sounds like it should impact all containers as pivot_root() is issued before systemd in the container is started. Or am I missing something here? That sounds like a problem for Serge and others to investigate further. I'll see about trying that workaround though.

The shared issue is F18, and it's about running LXC on a systemd system, not about running systemd inside of LXC.

Whew! I'll deal with F18 when I need to deal with F18. That explains why my F17 hosts are running and gives Serge and others a chance to address this, forewarned. Thanks for that info.

Lennart -- Lennart Poettering - Red Hat, Inc.

Regards, Mike
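
Both points in this exchange, whether something is already mounted on /dev and whether the root mount uses shared propagation, can be checked from /proc/self/mountinfo. The sketch below parses that file's documented layout (the optional fields such as `shared:N` or `master:N` sit between the mount options and the ` - ` separator). The function name and the shape of the returned dictionary are assumptions made for this example.

```python
def parse_mountinfo(text):
    """Map each mount point in mountinfo-formatted text to basic facts.

    Returns {mount_point: {"fstype": str, "shared": bool, "slave": bool}}.
    If a mount point appears more than once, the last entry wins.
    """
    mounts = {}
    for line in text.splitlines():
        if " - " not in line:
            continue  # not a well-formed mountinfo line
        left, right = line.split(" - ", 1)
        fields = left.split()
        if len(fields) < 6:
            continue
        optional = fields[6:]  # e.g. ["shared:1"] or ["master:2"]
        mounts[fields[4]] = {
            "fstype": right.split()[0],
            "shared": any(f.startswith("shared:") for f in optional),
            "slave": any(f.startswith("master:") for f in optional),
        }
    return mounts


# Typical use:
#     with open("/proc/self/mountinfo") as f:
#         mounts = parse_mountinfo(f.read())
#     print("/dev" in mounts, mounts["/"]["shared"])
```

If `/` reports shared propagation on the host, that is the situation Lennart describes for F18. One common way to apply the work-around he mentions is `mount --make-rprivate /` before starting the container, though whether that suits a given host is a judgment call.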
[systemd-devel] Unable to run systemd in an LXC / cgroup container.
Hello,

This is being directed to the systemd-devel community but I'm cc'ing the lxc-users community and the Fedora community on this for their input as well. I know it's not always good to cross post between multiple lists but this is of interest to all three communities who may have valuable input. I'm new to this particular list, just having joined after tracking a problem down to some systemd internals...

Several people over the last year or two on the lxc-users list have been in discussions about running certain distros (notably Fedora 16 and above, recent Arch Linux and possibly others) in LXC containers, virtualizing entire servers this way. This is very similar to Virtuozzo / OpenVZ only it's using the native Linux cgroups for the containers (primary reason I dumped OpenVZ was to avoid their custom patched kernels). These recent distros have switched to systemd for the main init process and this has proven to be disastrous for those of us using LXC and trying to install or update our containers. To put it bluntly, it doesn't work and causes all sorts of problems on the host.

To summarize the problem... The LXC startup binary sets up various things for /dev and /dev/pts for the container to run properly and this works perfectly fine for SystemV start-up scripts and/or Upstart. Unfortunately, systemd has mounts of devtmpfs on /dev and devpts on /dev/pts which then break things horribly. This is because the kernel currently lacks namespaces for devices and won't for some time to come (in design). When devtmpfs gets mounted over top of /dev in the container, it then hijacks the host's console tty and several other devices which had been set up through bind mounts by LXC and should have been LEFT ALONE.

Yes! I recognize that this problem with devtmpfs and lack of namespaces is a potential security problem anyways that could (and does) cause serious container-to-host problems. We're just not going to get that fixed right away in the Linux cgroups and namespaces.

How do we work around this problem in systemd where it has hard coded mounts in the binary that we can't override or configure? Or is it there and I'm just missing it trying to examine the sources? That's how I found where the problem lay.

Regards, Mike