Re: [lxc-devel] [PATCH 0/5] Signal stuff v2 and some documentation

2010-07-15 Thread Ferenc Wagner
Daniel Lezcano daniel.lezc...@free.fr writes:

> On 06/09/2010 07:56 PM, Ferenc Wagner wrote:
>
>> here are basically the same patches, with some obvious errors corrected
>> and some unrelated documentation added.  It actually survived some
>> targeted testing in the past days and seems to behave as expected, i.e.
>>
>> # lxc-start -n s -- sh -c "trap 'echo TERM' TERM; sleep 10"
>>
>> can be interrupted by Ctrl-C from the terminal (the sleep process does
>> not ignore the SIGINT sent to the foreground process group by the OS),
>> while a
>>
>> # pkill lxc-start
>>
>> does not terminate the sleep as the SIGTERM gets forwarded to the shell
>> only, which reports it after the sleep expires.  This forwarding
>> mechanism makes it possible to plug lxc into our batch queueing system.
>
> is it your last version or can I investigate with this patchset?

Yes, this is the version I've been using since I posted it.  I haven't
ported it to latest git, but it shouldn't be hard.  It seems to do what
I intended, although it obviously interferes with the console handling;
that should be rethought anyway, as I see it.  Basically, I feel that the
container console, from the user space PoV, should be an alias for a
terminal device, just like on a real system.  /dev/console isn't
virtualized by the kernel, so it shouldn't be accessible from a
container, although bind mounting it to some tty is an option in case
some program uses it explicitly.  In any case, the console presented
by lxc-start should always be detachable, preferably even detached by
default.
-- 
Regards,
Feri.

--
This SF.net email is sponsored by Sprint
What will you do first with EVO, the first 4G phone?
Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel


Re: [lxc-devel] [PATCH 0/5] Signal stuff v2 and some documentation

2010-07-15 Thread Daniel Lezcano
On 07/15/2010 10:07 PM, Ferenc Wagner wrote:
> Daniel Lezcano daniel.lezc...@free.fr writes:
>
>> On 06/09/2010 07:56 PM, Ferenc Wagner wrote:
>>
>>> here are basically the same patches, with some obvious errors corrected
>>> and some unrelated documentation added.  It actually survived some
>>> targeted testing in the past days and seems to behave as expected, i.e.
>>>
>>> # lxc-start -n s -- sh -c "trap 'echo TERM' TERM; sleep 10"
>>>
>>> can be interrupted by Ctrl-C from the terminal (the sleep process does
>>> not ignore the SIGINT sent to the foreground process group by the OS),
>>> while a
>>>
>>> # pkill lxc-start
>>>
>>> does not terminate the sleep as the SIGTERM gets forwarded to the shell
>>> only, which reports it after the sleep expires.  This forwarding
>>> mechanism makes it possible to plug lxc into our batch queueing system.
>>
>> is it your last version or can I investigate with this patchset?
>
> Yes, this is the version I've been using since I posted it.  I haven't
> ported it to latest git, but it shouldn't be hard.  It seems to do what
> I intended, but obviously interferes with the console handling, but that
> should be rethought anyway, as I see it.

Ok, thanks.  I will take the first two patches, so signal forwarding is
done, but without [tc]setpgrp for the moment.
I have a couple of patches on top of yours: when lxc-init receives a
SIGTERM, it behaves like the usual 'init' process, sending kill(-1,
SIGTERM) followed by kill(-1, SIGKILL) if the processes have not all
exited after a short time.

I just figured out that, in your use case, you are using 'lxc-start -n foo
prog'. That way you lose the child reaping (the kernel reparents
orphaned processes to the container's init). The purpose of lxc-init is to
reap children, mount /proc and /dev/shm, forward signals to process 2, and
support daemons. Maybe you already noticed that, but perhaps you should
use 'lxc-execute -n foo prog' (which spawns lxc-init). In that
case, it would be more convenient to do [tc]setpgrp in lxc-init, so we
solve the problem with the console.


> Basically, I feel like the container console from the user space PoV
> should be an alias for a terminal device, just like on a real system.
> /dev/console isn't virtualized by the kernel, so it shouldn't be
> accessible from a container, although bind mounting it to some tty is
> an option in case some program uses it explicitly.

That was the first implementation, but the '/sbin/init' process issues
the TIOCSCTTY ioctl, taking the tty away from the current terminal.

> In any case, the console presented
> by lxc-start should always be detachable, preferable even detached by
> default.


Yep, I will send a matrix covering lxc-execute vs. lxc-start vs. the
common start() function vs. the console, and hopefully we can find a nice
way to fix this mess.

Thanks Ferenc,

   -- Daniel
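
The child reaping that lxc-init is said to provide above comes down to a wait() loop that keeps running until no children remain, while remembering the exit status of the one process that was started on purpose. The following is only a minimal sketch of that idea, not the lxc-init source; `reap_until_empty` and its parameters are hypothetical names chosen for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: reap every child of the calling process until
 * none is left.  The wait status of main_pid (the application that was
 * started deliberately) is stored in *main_status; orphans reparented
 * to us are reaped and otherwise ignored.
 * Returns the number of children reaped, or -1 on an unexpected error.
 */
int reap_until_empty(pid_t main_pid, int *main_status)
{
	int reaped = 0;

	assert(main_status);

	for (;;) {
		int status;
		pid_t pid = wait(&status);

		if (pid < 0) {
			if (errno == ECHILD)	/* no children left */
				return reaped;
			if (errno == EINTR)	/* interrupted by a signal */
				continue;
			return -1;
		}

		reaped++;
		if (pid == main_pid)
			*main_status = status;
	}
}
```

A caller would fork and exec the application, pass its pid as `main_pid`, and use WEXITSTATUS() on the stored status once the function returns.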




Re: [lxc-devel] [PATCH 1/2]: Ensure freezer state has changed

2010-07-15 Thread Daniel Lezcano
On 07/15/2010 02:59 AM, Matt Helsley wrote:
> On Fri, Jul 09, 2010 at 07:51:32PM -0700, Sukadev Bhattiprolu wrote:
>
>> From: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
>> Subject: [PATCH 1/2] Ensure freezer state has changed
>>
>> A write to the freezer.state file does not guarantee that the state has
>> changed. To ensure that the freezer state is either FROZEN or THAWED,
>> read the freezer state and if it has not changed, repeat the write.
>
> Technically this is only necessary for the THAWED -> FROZEN
> transition. In other words, if we're FROZEN and write THAWED then
> we don't need to read the state. However, it doesn't hurt to check.
>
> Reviewed-by: Matt Helsley matth...@us.ibm.com

Thanks Matt for the comments.
Suka, I pushed your patch, but if you have time, it would be nice if
you could address Matt's comments.

Thanks
   -- Daniel
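
The write-then-verify approach described in the quoted patch can be sketched roughly as follows. This is an illustration rather than Suka's actual patch: the function names, the bounded retry count, and passing the state file path as an argument are all assumptions made for the example. On a real freezer cgroup the file may also report an intermediate FREEZING state, which this sketch simply treats as "not there yet".

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Read the state file into buf and strip the trailing newline. */
static int read_state(const char *path, char *buf, size_t len)
{
	int fd = open(path, O_RDONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = read(fd, buf, len - 1);
	close(fd);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Hypothetical helper: write `want` ("FROZEN" or "THAWED") to the state
 * file, read the state back, and repeat the write until the state
 * matches or max_tries is exhausted.
 * Returns 0 once the state matches, -1 on error or after too many tries.
 */
int set_state_checked(const char *path, const char *want, int max_tries)
{
	char cur[32];
	int i;

	assert(path && want);

	for (i = 0; i < max_tries; i++) {
		int fd = open(path, O_WRONLY);
		ssize_t n;

		if (fd < 0)
			return -1;
		n = write(fd, want, strlen(want));
		close(fd);
		if (n < 0)
			return -1;

		if (read_state(path, cur, sizeof(cur)))
			return -1;
		if (!strcmp(cur, want))
			return 0;	/* the state really changed */
	}
	return -1;
}
```

Bounding the retries is a design choice of this sketch; it keeps a caller from spinning forever if a task can never be frozen.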



[lxc-devel] [patch -lxc 2/4] generalize the name of the signal handler

2010-07-15 Thread Daniel Lezcano
From: Ferenc Wagner wf...@niif.hu

Signed-off-by: Ferenc Wagner wf...@niif.hu
Signed-off-by: Daniel Lezcano dlezc...@fr.ibm.com
---
 src/lxc/start.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/lxc/start.c b/src/lxc/start.c
index 92f44e3..1d4087c 100644
--- a/src/lxc/start.c
+++ b/src/lxc/start.c
@@ -190,7 +190,7 @@ int lxc_check_inherited(int fd_to_ignore)
 	return ret;
 }
 
-static int setup_sigchld_fd(sigset_t *oldmask)
+static int setup_signal_fd(sigset_t *oldmask)
 {
 	sigset_t mask;
 	int fd;
@@ -222,7 +222,7 @@ static int setup_sigchld_fd(sigset_t *oldmask)
 	return fd;
 }
 
-static int sigchld_handler(int fd, void *data,
+static int signal_handler(int fd, void *data,
 			   struct lxc_epoll_descr *descr)
 {
 	struct signalfd_siginfo siginfo;
@@ -305,7 +305,7 @@ int lxc_poll(const char *name, struct lxc_handler *handler)
 		goto out_sigfd;
 	}
 
-	if (lxc_mainloop_add_handler(&descr, sigfd, sigchld_handler, &pid)) {
+	if (lxc_mainloop_add_handler(&descr, sigfd, signal_handler, &pid)) {
 		ERROR("failed to add handler for the signal");
 		goto out_mainloop_open;
 	}
@@ -371,7 +371,7 @@ struct lxc_handler *lxc_init(const char *name, struct lxc_conf *conf)
 	/* the signal fd has to be created before forking otherwise
 	 * if the child process exits before we setup the signal fd,
 	 * the event will be lost and the command will be stuck */
-	handler->sigfd = setup_sigchld_fd(&handler->oldmask);
+	handler->sigfd = setup_signal_fd(&handler->oldmask);
 	if (handler->sigfd < 0) {
 		ERROR("failed to set sigchild fd handler");
 		goto out_delete_console;
@@ -402,7 +402,7 @@ void lxc_fini(const char *name, struct lxc_handler *handler)
 	lxc_set_state(name, handler, STOPPING);
 	lxc_set_state(name, handler, STOPPED);
 
-	/* reset mask set by setup_sigchld_fd */
+	/* reset mask set by setup_signal_fd */
 	if (sigprocmask(SIG_SETMASK, &handler->oldmask, NULL))
 		WARN("failed to restore sigprocmask");
 
-- 
1.7.0.4




[lxc-devel] [patch -lxc 3/4] lxc-init kills all processes with SIGTERM

2010-07-15 Thread Daniel Lezcano
When lxc-init receives a SIGTERM, let's kill all the processes of
the pid namespace with kill -1. The container then exits gracefully,
through a cascade of process deaths.

Signed-off-by: Daniel Lezcano dlezc...@fr.ibm.com
---
 src/lxc/lxc_init.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/lxc/lxc_init.c b/src/lxc/lxc_init.c
index 5e0da5e..d91a3a1 100644
--- a/src/lxc/lxc_init.c
+++ b/src/lxc/lxc_init.c
@@ -154,11 +154,21 @@ int main(int argc, char *argv[])
 		int orphan = 0;
 		pid_t waited_pid;
 
-		if (was_interrupted) {
+		switch (was_interrupted) {
+
+		case 0:
+			break;
+
+		case SIGTERM:
+			kill(-1, SIGTERM);
+			break;
+
+		default:
 			kill(pid, was_interrupted);
-			was_interrupted = 0;
+			break;
 		}
 
+		was_interrupted = 0;
 		waited_pid = wait(&status);
 		if (waited_pid < 0) {
 			if (errno == ECHILD)
-- 
1.7.0.4




[lxc-devel] [patch -lxc 4/4] lxc-init finishes the remaining processes with SIGKILL

2010-07-15 Thread Daniel Lezcano
If lxc-init receives a SIGALRM (a timeout), it kills all the processes
of the container with SIGKILL. That prevents the container from getting
stuck when a process ignores the SIGTERM signal.

Each time a process exits, the timeout is reset.

Signed-off-by: Daniel Lezcano dlezc...@fr.ibm.com
---
 src/lxc/lxc_init.c |   36 +++++++++++++++++++++++++++++++-----
 1 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/src/lxc/lxc_init.c b/src/lxc/lxc_init.c
index d91a3a1..5c264c6 100644
--- a/src/lxc/lxc_init.c
+++ b/src/lxc/lxc_init.c
@@ -82,7 +82,7 @@ int main(int argc, char *argv[])
 	int err = -1;
 	char **aargv;
 	sigset_t mask, omask;
-	int i;
+	int i, shutdown = 0;
 
 	while (1) {
 		int ret = getopt_long_only(argc, argv, "", options, NULL);
@@ -106,6 +106,10 @@ int main(int argc, char *argv[])
 	aargv = &argv[optind];
 	argc -= nbargs;
 
+	/*
+	 * mask all the signals so we are safe to install a
+	 * signal handler and to fork
+	 */
 	sigfillset(&mask);
 	sigprocmask(SIG_SETMASK, &mask, &omask);
 
@@ -113,6 +117,9 @@ int main(int argc, char *argv[])
 		struct sigaction act;
 
 		sigfillset(&act.sa_mask);
+		sigdelset(&mask, SIGILL);
+		sigdelset(&mask, SIGSEGV);
+		sigdelset(&mask, SIGBUS);
 		act.sa_flags = 0;
 		act.sa_handler = interrupt_handler;
 		sigaction(i, &act, NULL);
@@ -131,8 +138,10 @@ int main(int argc, char *argv[])
 
 	if (!pid) {
 
+		/* restore default signal handlers */
 		for (i = 1; i < NSIG; i++)
 			signal(i, SIG_DFL);
+
 		sigprocmask(SIG_SETMASK, &omask, NULL);
 
 		NOTICE("about to exec '%s'", aargv[0]);
@@ -142,6 +151,8 @@ int main(int argc, char *argv[])
 		exit(err);
 	}
 
+	/* let's process the signals now */
+	sigdelset(&omask, SIGALRM);
 	sigprocmask(SIG_SETMASK, &omask, NULL);
 
 	/* no need of other inherited fds but stderr */
@@ -160,7 +171,15 @@ int main(int argc, char *argv[])
 			break;
 
 		case SIGTERM:
-			kill(-1, SIGTERM);
+			if (!shutdown) {
+				shutdown = 1;
+				kill(-1, SIGTERM);
+				alarm(1);
+			}
+			break;
+
+		case SIGALRM:
+			kill(-1, SIGKILL);
 			break;
 
 		default:
@@ -175,13 +194,20 @@ int main(int argc, char *argv[])
 				goto out;
 			if (errno == EINTR)
 				continue;
-			ERROR("failed to wait child : %s", strerror(errno));
+
+			ERROR("failed to wait child : %s",
+			      strerror(errno));
 			goto out;
 		}
 
+		/* reset timer each time a process exited */
+		if (shutdown)
+			alarm(1);
+
 		/*
-		 * keep the exit code of started application (not wrapped pid)
-		 * and continue to wait for the end of the orphan group.
+		 * keep the exit code of started application
+		 * (not wrapped pid) and continue to wait for
+		 * the end of the orphan group.
 		 */
 		if ((waited_pid != pid) || (orphan ==1))
 			continue;
-- 
1.7.0.4
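
The escalation this patch describes (SIGTERM first, SIGKILL once a timeout expires) can be tried out in isolation with a sketch like the one below. It is deliberately simpler than the patch: it targets a single child pid instead of using kill(-1, ...) on the whole pid namespace, it does not re-arm the timer on each exit, and it sends SIGKILL straight from the SIGALRM handler. `stop_with_grace`, `on_alarm` and `victim` are hypothetical names, and the sketch assumes SIGALRM is not blocked in the caller (the patch takes care of that with sigdelset(&omask, SIGALRM)).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* pid to SIGKILL when the grace period expires (0 = nothing to do) */
static volatile sig_atomic_t victim;

static void on_alarm(int signo)
{
	(void)signo;
	if (victim > 0)
		kill((pid_t)victim, SIGKILL);	/* kill() is async-signal-safe */
}

/*
 * Hypothetical helper: ask `pid` to terminate with SIGTERM, and force
 * it with SIGKILL if it is still alive after `grace` seconds.
 * The child's wait status is stored in *status.
 * Returns 0 once the child has been reaped, -1 on error.
 */
int stop_with_grace(pid_t pid, unsigned int grace, int *status)
{
	struct sigaction act, old;
	pid_t ret;

	assert(pid > 0 && grace > 0 && status);

	sigemptyset(&act.sa_mask);
	act.sa_flags = 0;
	act.sa_handler = on_alarm;
	if (sigaction(SIGALRM, &act, &old))
		return -1;

	victim = pid;
	kill(pid, SIGTERM);
	alarm(grace);

	/* waitpid() may be interrupted by SIGALRM; just retry */
	do {
		ret = waitpid(pid, status, 0);
	} while (ret < 0 && errno == EINTR);

	alarm(0);
	victim = 0;
	sigaction(SIGALRM, &old, NULL);

	return ret == pid ? 0 : -1;
}
```

Sending SIGKILL from the handler avoids a window in which the alarm could fire just before waitpid() is entered and leave the caller blocked.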




[lxc-devel] [patch -lxc 1/4] forward signals to the container init

2010-07-15 Thread Daniel Lezcano
From: Ferenc Wagner wf...@niif.hu

Signed-off-by: Ferenc Wagner wf...@niif.hu
Signed-off-by: Daniel Lezcano dlezc...@fr.ibm.com
---
 src/lxc/start.c |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/src/lxc/start.c b/src/lxc/start.c
index dc57bea..92f44e3 100644
--- a/src/lxc/start.c
+++ b/src/lxc/start.c
@@ -195,13 +195,13 @@ static int setup_sigchld_fd(sigset_t *oldmask)
 	sigset_t mask;
 	int fd;
 
-	if (sigprocmask(SIG_BLOCK, NULL, &mask)) {
-		SYSERROR("failed to get mask signal");
-		return -1;
-	}
-
-	if (sigaddset(&mask, SIGCHLD) || sigprocmask(SIG_BLOCK, &mask, oldmask)) {
-		SYSERROR("failed to set mask signal");
+	/* Block everything except serious error signals */
+	if (sigfillset(&mask) ||
+	    sigdelset(&mask, SIGILL) ||
+	    sigdelset(&mask, SIGSEGV) ||
+	    sigdelset(&mask, SIGBUS) ||
+	    sigprocmask(SIG_BLOCK, &mask, oldmask)) {
+		SYSERROR("failed to set signal mask");
 		return -1;
 	}
 
@@ -231,7 +231,7 @@ static int sigchld_handler(int fd, void *data,
 
 	ret = read(fd, &siginfo, sizeof(siginfo));
 	if (ret < 0) {
-		ERROR("failed to read sigchld info");
+		ERROR("failed to read signal info");
 		return -1;
 	}
 
@@ -240,6 +240,12 @@ static int sigchld_handler(int fd, void *data,
 		return -1;
 	}
 
+	if (siginfo.ssi_signo != SIGCHLD) {
+		kill(*pid, siginfo.ssi_signo);
+		INFO("forwarded signal %d to pid %d", siginfo.ssi_signo, *pid);
+		return 0;
+	}
+
 	if (siginfo.ssi_code == CLD_STOPPED ||
 	    siginfo.ssi_code == CLD_CONTINUED) {
 		INFO("container init process was stopped/continued");
-- 
1.7.0.4
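
For anyone who wants to experiment with the forwarding mechanism outside of lxc, here is a small standalone sketch of the same idea: block nearly everything, receive the blocked signals through a signalfd, and pass anything that is not SIGCHLD on to a target pid. It is an approximation written for illustration, not the code in start.c; `open_signal_fd` and `forward_next_signal` are hypothetical names, and the return-value convention is an assumption of this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Block every signal except the serious error ones and return a
 * signalfd on which the blocked signals can be read.  The previous
 * mask is saved in *oldmask so the caller can restore it later.
 */
int open_signal_fd(sigset_t *oldmask)
{
	sigset_t mask;

	assert(oldmask);

	if (sigfillset(&mask) ||
	    sigdelset(&mask, SIGILL) ||
	    sigdelset(&mask, SIGSEGV) ||
	    sigdelset(&mask, SIGBUS) ||
	    sigprocmask(SIG_BLOCK, &mask, oldmask))
		return -1;

	return signalfd(-1, &mask, SFD_CLOEXEC);
}

/*
 * Read one pending signal from sigfd (blocks until one arrives).
 * SIGCHLD is left for the caller to handle: returns 0.
 * Any other signal is forwarded to `pid`: returns the signal number.
 * Returns -1 on error.
 */
int forward_next_signal(int sigfd, pid_t pid)
{
	struct signalfd_siginfo si;
	int signo;

	if (read(sigfd, &si, sizeof(si)) != (ssize_t)sizeof(si))
		return -1;

	signo = (int)si.ssi_signo;
	if (signo == SIGCHLD)
		return 0;

	if (kill(pid, signo))
		return -1;

	return signo;
}
```

In a real event loop the signalfd would be registered with epoll next to the other descriptors, and `forward_next_signal` would be called whenever it becomes readable.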

