Re: [Lxc-users] lxc-start leaves temporary pivot dir behind

2010-05-13 Thread Ferenc Wagner
Daniel Lezcano daniel.lezc...@free.fr writes:

 Ferenc Wagner wrote:

 Daniel Lezcano daniel.lezc...@free.fr writes:
   
 Ferenc Wagner wrote:
 
 Daniel Lezcano daniel.lezc...@free.fr writes:
 
 Ferenc Wagner wrote:
 
 Actually, I'm not sure you can fully solve this.  If rootfs is a
 separate file system, this is only much ado about nothing.  If rootfs
 isn't a separate filesystem, you can't automatically find a good
 place and also clean it up.

 Maybe a single /tmp/lxc directory could be used, as the mount points are
 private to the container. So it would be acceptable to have a single
 directory for N containers, no?

 Then why not /usr/lib/lxc/pivotdir or something like that?  Such a
 directory could belong to the lxc package and not clutter up /tmp.  As
 you pointed out, this directory would always be empty in the outer
 namespace, so a single one would suffice.  Thus there would be no need
 to clean it up, either.

 Agree. Shall we consider $(prefix)/var/run/lxc ?

 Hmm, /var/run/lxc is inconvenient, because it disappears on each reboot
 if /var/run is on tmpfs.  This isn't variable data either; that's why I
 recommended /usr above.

 Good point. I will change that to /usr/$(libdir)/lxc and let the
 distro maintainer choose a better place via the configure option
 if he wants.

I'm not sure what libdir is; doesn't this conflict with lxc-init?
That's in the /usr/lib/lxc directory, at least on Debian.  I'd vote for
/usr/lib/lxc/oldroot in this setting.
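
To make the idea concrete, here is a minimal sketch of the pivot dance
using one fixed, always-empty directory (this is NOT the actual
lxc-start code; it assumes rootfs is already a mount point and that the
pivot directory exists both on the host and inside each container's
rootfs):

#define _GNU_SOURCE
#include <limits.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

#define PIVOT_DIR "/usr/lib/lxc/oldroot"	/* assumed fixed pivot point */

static int pivot_into(const char *rootfs)
{
	char put_old[PATH_MAX];

	/* put_old must live underneath the new root */
	snprintf(put_old, sizeof(put_old), "%s%s", rootfs, PIVOT_DIR);

	/* pivot_root(2) has no glibc wrapper, hence the raw syscall */
	if (syscall(SYS_pivot_root, rootfs, put_old)) {
		perror("pivot_root");
		return -1;
	}
	if (chdir("/")) {
		perror("chdir");
		return -1;
	}
	/* Lazily detach the old root: in this (private) mount namespace
	 * the pivot dir empties out again, and in the outer namespace it
	 * was never anything but an empty directory, so there is nothing
	 * to clean up. */
	if (umount2(PIVOT_DIR, MNT_DETACH)) {
		perror("umount2");
		return -1;
	}
	return 0;
}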
-- 
Regards,
Feri.



Re: [Lxc-users] lxc-start leaves temporary pivot dir behind

2010-05-13 Thread Ferenc Wagner
Michael H. Warfield m...@wittsend.com writes:

 On Wed, 2010-05-12 at 23:18 +0200, Daniel Lezcano wrote: 

 Ferenc Wagner wrote:

 Daniel Lezcano daniel.lezc...@free.fr writes:
   
 Ferenc Wagner wrote:

 Daniel Lezcano daniel.lezc...@free.fr writes:
   
 Ferenc Wagner wrote:
 
 Actually, I'm not sure you can fully solve this.  If rootfs is a
 separate file system, this is only much ado about nothing.  If rootfs
 isn't a separate filesystem, you can't automatically find a good
 place and also clean it up.

 Maybe a single /tmp/lxc directory could be used, as the mount points are
 private to the container. So it would be acceptable to have a single
 directory for N containers, no?

 Then why not /usr/lib/lxc/pivotdir or something like that?  Such a
 directory could belong to the lxc package and not clutter up /tmp.  As
 you pointed out, this directory would always be empty in the outer
 namespace, so a single one would suffice.  Thus there would be no need
 to clean it up, either.

 Agree. Shall we consider $(prefix)/var/run/lxc ?

 Hmm, /var/run/lxc is inconvenient, because it disappears on each reboot
 if /var/run is on tmpfs.  This isn't variable data either; that's why I
 recommended /usr above.

 Good point. I will change that to /usr/$(libdir)/lxc and let the distro
 maintainer choose a better place via the configure option if he wants.

 Are you SURE you want /usr/${libdir}/lxc for this?  Some high security
 systems might mount /usr as a separate read-only partition (OK - I'm an
 old school old fart).  Part of the standard allows for /usr to be an RO
 file system.

Read-only /usr is a good thing, and stays perfectly possible with this
choice.  We're talking about an absolutely static directory, which
serves as a temporary mount point only.

 Wouldn't this be more appropriate in /var/${libdir}/lxc instead?  Maybe
 create a .tmp directory under it or .tmp.${CTID} or something?  Or,
 maybe, something under /var/${libdir}/lxc/${CTID}/tmp instead?  /var is
 for things that change and vary.  Wouldn't that be a better location?
 You've already got control of the /var/${libdir}/lxc location, don't
 you?

There's nothing variable in this directory, and we need only a single
one, and only when rootfs is on the same file system as the current root
(looking ahead a little bit).

I don't know the FHS by heart; maybe it has something to say about this.
I'd certainly be fine with /var/lib/lxc/oldroot or something like that
as well.
-- 
Regards,
Feri.



Re: [Lxc-users] lxc-unshare woes and signal forwarding in lxc-start

2010-05-13 Thread Ferenc Wagner
Daniel Lezcano daniel.lezc...@free.fr writes:

 Ferenc Wagner wrote:

 Daniel Lezcano daniel.lezc...@free.fr writes:
   
 Ferenc Wagner wrote:
 
 Daniel Lezcano daniel.lezc...@free.fr writes:
 
 Ferenc Wagner wrote:
 
 I'd like to use lxc-start as a wrapper, invisible to the parent and
 the (jailed) child.  Of course I could hack around this by not
 exec-ing lxc-start but keeping the shell around, trapping all signals
 and forwarding them with lxc-kill.  But that's kind of ugly in my opinion.

 Ok, got it. I think it makes sense to forward the signals,
 especially for job management.  Which signals do you want to
 forward?

 Basically all of them.  I couldn't find a definitive list of signals
 used for job control in SGE, but the following is probably a good
 approximation: SIGTTOU, SIGTTIN, SIGUSR1, SIGUSR2, SIGCONT, SIGWINCH and
 SIGTSTP.

 Yes, that could be a good starting point. I was wondering about
 SIGSTOP being sent to lxc-start, which of course cannot be forwarded;
 is that a problem?

 I suppose not; SIGSTOP and SIGKILL are impossible to use in application-
 specific ways.  On the other hand, SIGXCPU and SIGXFSZ should probably
 be forwarded, too.  Naturally, this business can't be perfected, but a
 good enough solution could still be valuable.

 Agree.

I attached a proof-of-concept patch which seems to work well enough for
me.  The function names are somewhat off now, but I'll leave that for later.

 Looking at the source, the SIGCHLD mechanism could be
 mimicked, but LXC_TTY_ADD_HANDLER may get in the way.

 We should remove LXC_TTY_ADD_HANDLER and do everything in the SIGCHLD
 signal handler by extending it. I have a pending fix that changes the
 signal handler function a bit.

What's the purpose of LXC_TTY_ADD_HANDLER anyway?  I didn't dig into it.

 I'm also worried about signals sent to the whole process group: they
 may be impossible to distinguish from the targeted signals and thus
 can't propagate correctly.

 Good point. Maybe we can setpgrp the first process of the container?

 We've got three options:
   A) do nothing, as now
   B) forward to our child
   C) forward to our child's process group

 The signal could arrive because it was sent to
   1) the PID of lxc-start
   2) the process group of lxc-start

 If we don't put the first process of the container into a new process
 group (as now), this is what happens:

          A           B                     C
  1   swallowed       OK                    others also killed
  2   OK              child gets extra      everybody gets extra

 If we put the first process of the container into a new process group:

          A           B                     C
  1   swallowed       OK                    others also killed
  2   swallowed       only the child killed OK

 Neither is a clear winner, although the latter is somewhat more
 symmetrical.  I'm not sure we want all this to be configurable...

 hmm ... Maybe Greg (he's an expert on signals and processes) has
 an idea on how to deal with that.

I'd say we should setpgrp the container init, forward all signals we
can to it, and have a configuration option for the set of signals which
should be forwarded to the full process group of the container init.
Or does it make sense to swallow anything?
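
For illustration, here is a rough sketch of the forwarding step
(assuming a signalfd already set up with the mask from the attached
patch, and the container init put into its own process group with
setpgid; the function name is made up, this is not the patch itself):

#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* pid is the container init's pid and, after setpgid(pid, pid),
 * also its process group id. */
static int forward_signal(int sigfd, pid_t pid)
{
	struct signalfd_siginfo si;

	if (read(sigfd, &si, sizeof(si)) != (ssize_t)sizeof(si))
		return -1;

	if (si.ssi_signo == SIGCHLD)
		return 0;	/* left to the normal exit bookkeeping */

	/* kill(-pid, si.ssi_signo) would target the whole process group
	 * instead; which of the two is right is exactly the open
	 * question above. */
	return kill(pid, si.ssi_signo);
}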
-- 
Cheers,
Feri.

From 8ba413c1c19cf188d1d1bf1ed72fe26f265c192b Mon Sep 17 00:00:00 2001
From: Ferenc Wagner wf...@niif.hu
Date: Thu, 13 May 2010 11:33:59 +0200
Subject: [PATCH] forward control signals to the container init


Signed-off-by: Ferenc Wagner wf...@niif.hu
---
 src/lxc/start.c |   43 ++-
 1 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/src/lxc/start.c b/src/lxc/start.c
index 7e34cce..58b747f 100644
--- a/src/lxc/start.c
+++ b/src/lxc/start.c
@@ -198,6 +198,16 @@ static int setup_sigchld_fd(sigset_t *oldmask)
 		return -1;
 	}
 
+	sigaddset(&mask, SIGUSR1);
+	sigaddset(&mask, SIGUSR2);
+	sigaddset(&mask, SIGTERM);
+	sigaddset(&mask, SIGCONT);
+	sigaddset(&mask, SIGTSTP);
+	sigaddset(&mask, SIGTTIN);
+	sigaddset(&mask, SIGTTOU);
+	sigaddset(&mask, SIGXCPU);
+	sigaddset(&mask, SIGXFSZ);
+	sigaddset(&mask, SIGWINCH);
 	if (sigaddset(&mask, SIGCHLD) || sigprocmask(SIG_BLOCK, &mask, oldmask)) {
 		SYSERROR("failed to set mask signal");
 		return -1;
@@ -238,22 +248,29 @@ static int sigchld_handler(int fd, void *data,
 		return -1;
 	}
 
-	if (siginfo.ssi_code == CLD_STOPPED ||
-	    siginfo.ssi_code == CLD_CONTINUED) {
-		INFO("container init process was stopped/continued");
-		return 0;
-	}
+	switch (siginfo.ssi_signo) {
+	case SIGCHLD:
+		if (siginfo.ssi_code == CLD_STOPPED ||
+		    siginfo.ssi_code == CLD_CONTINUED) {
+			INFO("container init process was stopped/continued");
+			return 0;
+		}
 
-	/* more robustness, protect ourself from a SIGCHLD sent
-	 * by a process different from the container init
-	 */
-	if (siginfo.ssi_pid != *pid) {
-		WARN("invalid pid for SIGCHLD");
+		/* more robustness, protect ourself from a SIGCHLD 

[Lxc-users] LXC a feature complete replacement of OpenVZ?

2010-05-13 Thread Christian Haintz
Hi,

First of all, LXC seems to be great work from what we have read so far.

There are still a few open questions for us (we are currently running
dozens of OpenVZ hardware nodes).

1) OpenVZ in the long term seems to be a dead end. Will LXC be a
feature-complete replacement for OpenVZ in version 1.0?

As of the current version:
2) is there iptables support, i.e. any sort of control like the OpenVZ
iptables config?
3) is there support for tun/tap devices?
4) is there support for correct memory info and disk space info (do
df and top show the container's resources or the resources of the
hardware node)?
5) is there something comparable to the fine-grained control over
memory resources like vmguarpages/privmpages/oomguarpages in LXC?
6) is LXC production ready?

Thanks in advance; we are looking forward to switching to Linux
Containers when all questions are answered with yes :-)

Regards,
Christian

--
Christian Haintz
Student of Software Development and Business Management
Graz, University of Technology




Re: [Lxc-users] lxc-start leaves temporary pivot dir behind

2010-05-13 Thread Daniel Lezcano
Ferenc Wagner wrote:
 Daniel Lezcano daniel.lezc...@free.fr writes:
 Ferenc Wagner wrote:
 Daniel Lezcano daniel.lezc...@free.fr writes:
 Ferenc Wagner wrote:
 Daniel Lezcano daniel.lezc...@free.fr writes:
 Ferenc Wagner wrote:

 Actually, I'm not sure you can fully solve this.  If rootfs is a
 separate file system, this is only much ado about nothing.  If rootfs
 isn't a separate filesystem, you can't automatically find a good
 place and also clean it up.

 Maybe a single /tmp/lxc directory could be used, as the mount points are
 private to the container. So it would be acceptable to have a single
 directory for N containers, no?

 Then why not /usr/lib/lxc/pivotdir or something like that?  Such a
 directory could belong to the lxc package and not clutter up /tmp.  As
 you pointed out, this directory would always be empty in the outer
 namespace, so a single one would suffice.  Thus there would be no need
 to clean it up, either.

 Agree. Shall we consider $(prefix)/var/run/lxc ?

 Hmm, /var/run/lxc is inconvenient, because it disappears on each reboot
 if /var/run is on tmpfs.  This isn't variable data either; that's why I
 recommended /usr above.

 Good point. I will change that to /usr/$(libdir)/lxc and let the
 distro maintainer choose a better place via the configure option
 if he wants.


 I'm not sure what libdir is; doesn't this conflict with lxc-init?
 That's in the /usr/lib/lxc directory, at least on Debian.  I'd vote for
 /usr/lib/lxc/oldroot in this setting.

$(libdir) is the variable defined by the configure option --libdir=path.
Usually it is /usr/lib on 32-bit systems or /usr/lib64 on 64-bit systems.

lxc-init is located in $(libexecdir), that is /usr/libexec or /libexec,
depending on the configure setting.



Re: [Lxc-users] LXC a feature complete replacement of OpenVZ?

2010-05-13 Thread Gordon Henderson
On Thu, 13 May 2010, Christian Haintz wrote:

 Hi,

 First of all, LXC seems to be great work from what we have read so far.

 There are still a few open questions for us (we are currently running
 dozens of OpenVZ hardware nodes).

I can't answer for the developers, but here are my answers/observations
based on what I've seen and used ...

 1) OpenVZ in the long term seems to be a dead end. Will LXC be a
 feature-complete replacement for OpenVZ in version 1.0?

I looked at OpenVZ and while it looked promising, it didn't seem to be
going anywhere. I also struggled to get their patches into a recent kernel,
and it looked like there was no Debian support for it. LXC is in the kernel
as standard - I doubt it'll come out now... (and there is a back-ported
lxc Debian package that works fine under Lenny)


 As of the current version:
 2) is there iptables support, i.e. any sort of control like the OpenVZ
 iptables config?

I run iptables - and in some cases different iptables setups in each 
container on a host (which also has its own iptables).

Seems to just work. Each container has an eth0 and the host has a br0 
(as well as an eth0).

Logging is at the kernel level though, so it goes into the log files on the
server host rather than in the container - it may be possible to isolate
that, but it's not something I'm too bothered about.

My iptables are just shell-scripts that get called as part of the boot 
sequence - I really don't know what sort of control OpenVZ gives you.


 3) Is there support for tun/tap devices?

Doesn't look like it yet...

http://www.mail-archive.com/lxc-users@lists.sourceforge.net/msg00239.html


 4) is there support for correct memory info and disk space info (do
 df and top show the container's resources or the resources of the
 hardware node)?

Something I'm looking at myself - top gives your own processes, but cpu
usage is for the whole machine. 'df' I can get by manipulating /etc/mtab -
then I get the size of the entire partition my host is running under. I'm
not doing anything 'clever' like creating a file and loopback mounting it -
all my containers on a host are currently on the same partition. I'm not
looking to give fixed-size disks to each container though. YMMV.

However, gathering cpu stats for each container is something I am
interested in - and was about to post to the list about it. I think there
are files (on the host) under /cgroup/container-name/cpuacct.stat and a
few others which might help me, but I'm going to have to look them
up...
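
For what it's worth, a minimal sketch of reading those files (assuming
the cgroup hierarchy is mounted at /cgroup and the group is named after
the container, as in the paths above; values are in USER_HZ ticks,
typically 1/100 s):

#include <stdio.h>

static int print_cpuacct(const char *name)
{
	char path[256], line[64];
	FILE *f;

	snprintf(path, sizeof(path), "/cgroup/%s/cpuacct.stat", name);
	f = fopen(path, "r");
	if (!f) {
		perror(path);
		return -1;
	}
	/* two lines: "user <ticks>" and "system <ticks>" */
	while (fgets(line, sizeof(line), f))
		printf("%s: %s", name, line);
	fclose(f);
	return 0;
}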

 5) is there something comparable to the fine-grained control over
 memory resources like vmguarpages/privmpages/oomguarpages in LXC?

Pass..

 6) is LXC production ready?

Not sure who could make that definitive decision ;-)

It sounds like the lack of tun/tap might be a show-stopper for you though. 
(come back next week ;-)

However, I'm using it in production - I've got a dozen LAMPy type boxes
running it so far with several containers inside, and a small number of
asterisk hosts. (I'm not mixing the LAMP and asterisk hosts though.) My
clients haven't noticed any changes, which makes me happy. I don't think
what I'm doing is very stressful to the systems, but so far I'm very happy
with it.

I did test it to my own satisfaction before I committed myself to it on
servers 300 miles away though. One test was to create 20 containers on an
old 1.8GHz Celeron box, each running asterisk with one connected to the
next and so on - then place a call into the first. It managed 3 loops
playing media before it had any problems - and those were due to kernel
context/network switching rather than anything to do with the LXC setup.
(I suspect there is more network overhead though due to the bridge and
vlan nature of the underlying plumbing.)

So right now, I'm happy with LXC - I've no need for other virtualisation
as I'm purely running Linux, so don't need to host Windows, different
kernels, etc. And for me, it's a management tool - I can now take a
container and move it to different hardware (not yet a proper live
migration, but the final rsync currently takes only a few minutes and I
can live with that). I have also saved myself a headache or two by moving
old servers with OSes I couldn't upgrade onto new hardware - so I have one
server running Debian Lenny, kernel 2.6.33.1, hosting an old Debian Woody
server inside a container running the customer's custom application which
they developed 6 years ago... They're happy as they got new hardware and I'm
happy as I didn't have to worry about migrating their code to a new version
of Debian on new hardware... And I can also take that entire image now and
move it to another server if I needed to load-balance, upgrade, cater for
h/w failure, etc.

I'm using kernel 2.6.33.x (which I custom compile for the server hardware) 
and Debian Lenny FWIW.

I'm trying not to sound like a complete fanboi, but until the start of 
this year, I had no interest in virtualisation at all, but once 

Re: [Lxc-users] LXC a feature complete replacement of OpenVZ?

2010-05-13 Thread Daniel Lezcano
On 05/13/2010 06:17 PM, Christian Haintz wrote:
 Hi,

 First of all, LXC seems to be great work from what we have read so far.

 There are still a few open questions for us (we are currently running
 dozens of OpenVZ hardware nodes).

 1) OpenVZ in the long term seems to be a dead end. Will LXC be a
 feature-complete replacement for OpenVZ in version 1.0?

Theoretically speaking, LXC is not planned to be a replacement for
OpenVZ. When a specific piece of functionality is missing, it is added.
Sometimes that needs kernel development, implying an attempt at mainline
inclusion.

When the users of LXC want a new functionality, they send a patchset or
ask if it is possible to implement it. Often the modifications need a
kernel change, and that takes some time to reach the upstream kernel
(e.g. sysfs per namespace).

Practically speaking, LXC evolves following the needs of its users (e.g.
entering a container), and that may lead to a replacement for OpenVZ.

Version 1.0 is planned to be a stable release, with documentation
and a frozen API.

 As of the current version:
 2) is there iptables support, i.e. any sort of control like the OpenVZ
 iptables config?

The iptables support in the container depends on the kernel version
you are using. AFAICS, iptables per namespace is implemented now.

 3) Is there support for tun/tap devices?

The drivers are ready to be used in a container, but sysfs is not, and
that unfortunately prevents creating a tun/tap device in a container.

sysfs per namespace is on its way to being merged upstream.

 4) is there support for correct memory info and disk space info (do
 df and top show the container's resources or the resources of the
 hardware node)?

No, and that will not be supported by the kernel, but it is possible to
do with FUSE. I did a prototype here:

http://lxc.sourceforge.net/download/procfs/procfs.tar.gz

But I gave up on it because I have too many things to do with lxc and
not enough free time. Anyone is welcome to improve it ;)

 5) is there something comparable to the fine-grained control over
 memory resources like vmguarpages/privmpages/oomguarpages in LXC?

I don't know the controls you are talking about, but LXC is plugged
into the cgroups. One of the subsystems of the cgroup is the memory
controller, which allows assigning an amount of physical memory and swap
space to the container. There are some mechanisms for notification as
well. There are some other resource controllers like io (new), freezer,
cpuset, net_cls and the device whitelist (googling one of these names +
lwn may help).
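
As an illustration, a minimal sketch of capping a container's physical
memory through the cgroup filesystem (assuming the hierarchy is mounted
at /cgroup and the group is named after the container; lxc can also do
this declaratively with an lxc.cgroup.memory.limit_in_bytes line in the
container configuration):

#include <stdio.h>

static int set_memory_limit(const char *name, const char *limit)
{
	char path[256];
	FILE *f;

	snprintf(path, sizeof(path), "/cgroup/%s/memory.limit_in_bytes", name);
	f = fopen(path, "w");
	if (!f) {
		perror(path);
		return -1;
	}
	fputs(limit, f);	/* e.g. "256M" or "268435456" */
	return fclose(f) ? -1 : 0;
}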

 6) is LXC production ready?

yes and no :)

If you plan to run several webservers (not a full system) or non-root
applications, then yes, IMHO it is ready for production.

If you plan to run a full system and you have very aggressive users
inside with root privileges, then it may not be ready yet. If you set up
a full system and you plan to have only the administrator of the host be
the administrator of the containers, and the users inside the container
are never root, then IMHO it is ready, if you accept for example that
the iptables logs go to the host system.

Really, it depends on what you want to do ...

I don't know OpenVZ very well, but AFAIK it is focused on system
containers, while LXC can set up different levels of isolation, allowing
you to run an application sharing a filesystem or a network, for example,
as well as running a full system. But this flexibility is a drawback too,
because the administrator of the container needs a bit of knowledge of
system administration and of the container technology.

 Thanks in advance; we are looking forward to switching to Linux
 Containers when all questions are answered with yes :-)

Hope that helped.

Thanks
   -- Daniel
