Re: [lxc-devel] [PATCH 0/5] Signal stuff v2 and some documentation
Daniel Lezcano <daniel.lezc...@free.fr> writes:

> On 06/09/2010 07:56 PM, Ferenc Wagner wrote:
>
>> here are basically the same patches, with some obvious errors corrected
>> and some unrelated documentation added. It actually survived some
>> targeted testing in the past days and seems to behave as expected, ie.
>>
>> # lxc-start -n s -- sh -c "trap 'echo TERM' TERM; sleep 10"
>>
>> can be interrupted by Ctrl-C from the terminal (the sleep process does
>> not ignore the SIGINT sent to the foreground process group by the OS),
>> while a
>>
>> # pkill lxc-start
>>
>> does not terminate the sleep as the SIGTERM gets forwarded to the shell
>> only, which reports it after the sleep expires. This forwarding
>> mechanism makes it possible to plug lxc into our batch queueing system.
>
> is it your last version or can I investigate with this patchset ?

Yes, this is the version I've been using since I posted it. I haven't
ported it to latest git, but it shouldn't be hard. It seems to do what
I intended, but obviously interferes with the console handling, but that
should be rethought anyway, as I see it.

Basically, I feel like the container console from the user space PoV
should be an alias for a terminal device, just like on a real system.
/dev/console isn't virtualized by the kernel, so it shouldn't be
accessible from a container, although bind mounting it to some tty is an
option in case some program uses it explicitly.

In any case, the console presented by lxc-start should always be
detachable, preferably even detached by default.
-- 
Regards,
Feri.

--
This SF.net email is sponsored by Sprint
What will you do first with EVO, the first 4G phone?
Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 0/5] Signal stuff v2 and some documentation
On 07/15/2010 10:07 PM, Ferenc Wagner wrote:
> Daniel Lezcano <daniel.lezc...@free.fr> writes:
>
>> On 06/09/2010 07:56 PM, Ferenc Wagner wrote:
>>
>>> here are basically the same patches, with some obvious errors corrected
>>> and some unrelated documentation added. It actually survived some
>>> targeted testing in the past days and seems to behave as expected, ie.
>>>
>>> # lxc-start -n s -- sh -c "trap 'echo TERM' TERM; sleep 10"
>>>
>>> can be interrupted by Ctrl-C from the terminal (the sleep process does
>>> not ignore the SIGINT sent to the foreground process group by the OS),
>>> while a
>>>
>>> # pkill lxc-start
>>>
>>> does not terminate the sleep as the SIGTERM gets forwarded to the shell
>>> only, which reports it after the sleep expires. This forwarding
>>> mechanism makes it possible to plug lxc into our batch queueing system.
>>
>> is it your last version or can I investigate with this patchset ?
>
> Yes, this is the version I've been using since I posted it. I haven't
> ported it to latest git, but it shouldn't be hard. It seems to do what
> I intended, but obviously interferes with the console handling, but that
> should be rethought anyway, as I see it.

Ok, thanks.

I will take the first 2 patches, so signal forwarding is done but
without [tc]setpgrp for the moment. I have a couple of patches on top of
yours where, when lxc-init receives a SIGTERM, it does like the usual
'init' process by sending a kill(-1, SIGTERM) followed by a
kill(-1, SIGKILL) if all the processes do not exit after a small amount
of time.

I just figured out that, in your use case, you are using
'lxc-start -n foo prog'. You are getting rid of the child reaping (the
kernel reparents orphan processes to the container's init). The purpose
of lxc-init is to reap children, mount /proc and /dev/shm, forward
signals to process 2 and support daemons. Maybe you already noticed
that, but maybe you should use 'lxc-execute -n foo prog' (which spawns
lxc-init). In this case, it would be more convenient to do [tc]setpgrp
in lxc-init, so we solve the problem with the console.

> Basically, I feel like the container console from the user space PoV
> should be an alias for a terminal device, just like on a real system.
> /dev/console isn't virtualized by the kernel, so it shouldn't be
> accessible from a container, although bind mounting it to some tty is an
> option in case some program uses it explicitly.

That was the first implementation but the '/sbin/init' process calls
TIOCSCTTY, borrowing the tty from the current terminal.

> In any case, the console presented by lxc-start should always be
> detachable, preferably even detached by default.

Yep, I will send a matrix with a lxc-execute vs lxc-start vs start()
common function vs console and hopefully we can find a nice way to fix
this mess.

Thanks Ferenc,
-- Daniel
Re: [lxc-devel] [PATCH 1/2]: Ensure freezer state has changed
On 07/15/2010 02:59 AM, Matt Helsley wrote:
> On Fri, Jul 09, 2010 at 07:51:32PM -0700, Sukadev Bhattiprolu wrote:
>> From: Sukadev Bhattiprolu <suka...@linux.vnet.ibm.com>
>> Subject: [PATCH 1/2] Ensure freezer state has changed
>>
>> A write to the freezer.state file does not guarantee that the state has
>> changed. To ensure that the freezer state is either FROZEN or THAWED,
>> read the freezer state and if it has not changed, repeat the write.
>
> Technically this is only necessary for the THAWED -> FROZEN transition.
> In other words, if we're FROZEN and write THAWED then we don't need to
> read the state. However, it doesn't hurt to check.
>
> Reviewed-by: Matt Helsley <matth...@us.ibm.com>

Thanks Matt for the comments.

Suka, I pushed your patch, but if you have time, it would be nice if you
could address Matt's comments.

Thanks
-- Daniel
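For readers without the patch at hand, the write-then-verify loop described in the commit message could look roughly like the sketch below. It is not Suka's code: the function name `set_freezer_state`, the `max_tries` bound and the 100 ms retry delay are all assumptions made for illustration, and the only behaviour taken from the thread is "write the state, read it back, repeat the write if it has not changed".

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `state` ("FROZEN" or "THAWED") to the
 * freezer.state file at `path`, read the file back and repeat the
 * write until the requested state is reported.
 * Returns 0 on success, -1 on error or after `max_tries` attempts.
 */
int set_freezer_state(const char *path, const char *state, int max_tries)
{
	struct timespec delay = { 0, 100 * 1000 * 1000 }; /* assumed: 100 ms */
	size_t len = strlen(state);
	char buf[32];
	int i;

	for (i = 0; i < max_tries; i++) {
		ssize_t n;
		int fd;

		/* request the state change */
		fd = open(path, O_WRONLY);
		if (fd < 0)
			return -1;
		n = write(fd, state, len);
		close(fd);
		if (n != (ssize_t)len)
			return -1;

		/* read the state back; the write alone proves nothing */
		fd = open(path, O_RDONLY);
		if (fd < 0)
			return -1;
		n = read(fd, buf, sizeof(buf) - 1);
		close(fd);
		if (n < 0)
			return -1;
		buf[n] = '\0';

		/* the file content ends with a newline: compare the name only */
		if (!strncmp(buf, state, len))
			return 0;

		nanosleep(&delay, NULL);
	}

	return -1;
}
```

As Matt points out, the read-back only matters for the THAWED -> FROZEN transition; the sketch checks both directions because, in his words, it doesn't hurt to check.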
[lxc-devel] [patch -lxc 2/4] generalize the name of the signal handler
From: Ferenc Wagner <wf...@niif.hu>

Signed-off-by: Ferenc Wagner <wf...@niif.hu>
Signed-off-by: Daniel Lezcano <dlezc...@fr.ibm.com>
---
 src/lxc/start.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/lxc/start.c b/src/lxc/start.c
index 92f44e3..1d4087c 100644
--- a/src/lxc/start.c
+++ b/src/lxc/start.c
@@ -190,7 +190,7 @@ int lxc_check_inherited(int fd_to_ignore)
 	return ret;
 }
 
-static int setup_sigchld_fd(sigset_t *oldmask)
+static int setup_signal_fd(sigset_t *oldmask)
 {
 	sigset_t mask;
 	int fd;
@@ -222,7 +222,7 @@ static int setup_sigchld_fd(sigset_t *oldmask)
 	return fd;
 }
 
-static int sigchld_handler(int fd, void *data,
+static int signal_handler(int fd, void *data,
 			   struct lxc_epoll_descr *descr)
 {
 	struct signalfd_siginfo siginfo;
@@ -305,7 +305,7 @@ int lxc_poll(const char *name, struct lxc_handler *handler)
 		goto out_sigfd;
 	}
 
-	if (lxc_mainloop_add_handler(&descr, sigfd, sigchld_handler, &pid)) {
+	if (lxc_mainloop_add_handler(&descr, sigfd, signal_handler, &pid)) {
 		ERROR("failed to add handler for the signal");
 		goto out_mainloop_open;
 	}
@@ -371,7 +371,7 @@ struct lxc_handler *lxc_init(const char *name, struct lxc_conf *conf)
 	/* the signal fd has to be created before forking otherwise
 	 * if the child process exits before we setup the signal fd,
 	 * the event will be lost and the command will be stuck */
-	handler->sigfd = setup_sigchld_fd(&handler->oldmask);
+	handler->sigfd = setup_signal_fd(&handler->oldmask);
 	if (handler->sigfd < 0) {
 		ERROR("failed to set sigchild fd handler");
 		goto out_delete_console;
@@ -402,7 +402,7 @@ void lxc_fini(const char *name, struct lxc_handler *handler)
 	lxc_set_state(name, handler, STOPPING);
 	lxc_set_state(name, handler, STOPPED);
 
-	/* reset mask set by setup_sigchld_fd */
+	/* reset mask set by setup_signal_fd */
 	if (sigprocmask(SIG_SETMASK, &handler->oldmask, NULL))
 		WARN("failed to restore sigprocmask");
 
-- 
1.7.0.4
[lxc-devel] [patch -lxc 3/4] lxc-init kills all processes with SIGTERM
When lxc-init receives a SIGTERM, let's kill all the processes of the
pid namespace with kill -1, so that the container exits gracefully
through a cascade of process deaths.

Signed-off-by: Daniel Lezcano <dlezc...@fr.ibm.com>
---
 src/lxc/lxc_init.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/lxc/lxc_init.c b/src/lxc/lxc_init.c
index 5e0da5e..d91a3a1 100644
--- a/src/lxc/lxc_init.c
+++ b/src/lxc/lxc_init.c
@@ -154,11 +154,21 @@ int main(int argc, char *argv[])
 		int orphan = 0;
 		pid_t waited_pid;
 
-		if (was_interrupted) {
+		switch (was_interrupted) {
+
+		case 0:
+			break;
+
+		case SIGTERM:
+			kill(-1, SIGTERM);
+			break;
+
+		default:
 			kill(pid, was_interrupted);
-			was_interrupted = 0;
+			break;
 		}
 
+		was_interrupted = 0;
 		waited_pid = wait(&status);
 		if (waited_pid < 0) {
 			if (errno == ECHILD)
-- 
1.7.0.4
[lxc-devel] [patch -lxc 4/4] lxc-init finishes the remaining processes with SIGKILL
If lxc-init receives a SIGALRM, i.e. a timeout, it kills all the
processes of the container with SIGKILL. That prevents the container
from being stuck when a process ignores the SIGTERM signal. Each time a
process exits, the timeout is reset.

Signed-off-by: Daniel Lezcano <dlezc...@fr.ibm.com>
---
 src/lxc/lxc_init.c |   36 +++++++++++++++++++++++++++++++-----
 1 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/src/lxc/lxc_init.c b/src/lxc/lxc_init.c
index d91a3a1..5c264c6 100644
--- a/src/lxc/lxc_init.c
+++ b/src/lxc/lxc_init.c
@@ -82,7 +82,7 @@ int main(int argc, char *argv[])
 	int err = -1;
 	char **aargv;
 	sigset_t mask, omask;
-	int i;
+	int i, shutdown = 0;
 
 	while (1) {
 		int ret = getopt_long_only(argc, argv, "", options, NULL);
@@ -106,6 +106,10 @@ int main(int argc, char *argv[])
 	aargv = &argv[optind];
 	argc -= nbargs;
 
+	/*
+	 * mask all the signals so we are safe to install a
+	 * signal handler and to fork
+	 */
 	sigfillset(&mask);
 	sigprocmask(SIG_SETMASK, &mask, &omask);
 
@@ -113,6 +117,9 @@ int main(int argc, char *argv[])
 		struct sigaction act;
 
 		sigfillset(&act.sa_mask);
+		sigdelset(&mask, SIGILL);
+		sigdelset(&mask, SIGSEGV);
+		sigdelset(&mask, SIGBUS);
 		act.sa_flags = 0;
 		act.sa_handler = interrupt_handler;
 		sigaction(i, &act, NULL);
@@ -131,8 +138,10 @@ int main(int argc, char *argv[])
 
 	if (!pid) {
 
+		/* restore default signal handlers */
 		for (i = 1; i < NSIG; i++)
 			signal(i, SIG_DFL);
+
 		sigprocmask(SIG_SETMASK, &omask, NULL);
 
 		NOTICE("about to exec '%s'", aargv[0]);
@@ -142,6 +151,8 @@ int main(int argc, char *argv[])
 		exit(err);
 	}
 
+	/* let's process the signals now */
+	sigdelset(&omask, SIGALRM);
 	sigprocmask(SIG_SETMASK, &omask, NULL);
 
 	/* no need of other inherited fds but stderr */
@@ -160,7 +171,15 @@ int main(int argc, char *argv[])
 			break;
 
 		case SIGTERM:
-			kill(-1, SIGTERM);
+			if (!shutdown) {
+				shutdown = 1;
+				kill(-1, SIGTERM);
+				alarm(1);
+			}
+			break;
+
+		case SIGALRM:
+			kill(-1, SIGKILL);
 			break;
 
 		default:
@@ -175,13 +194,20 @@ int main(int argc, char *argv[])
 				goto out;
 			if (errno == EINTR)
 				continue;
-			ERROR("failed to wait child : %s", strerror(errno));
+
+			ERROR("failed to wait child : %s",
+			      strerror(errno));
 			goto out;
 		}
 
+		/* reset timer each time a process exited */
+		if (shutdown)
+			alarm(1);
+
 		/*
-		 * keep the exit code of started application (not wrapped pid)
-		 * and continue to wait for the end of the orphan group.
+		 * keep the exit code of started application
+		 * (not wrapped pid) and continue to wait for
+		 * the end of the orphan group.
 		 */
 		if ((waited_pid != pid) || (orphan ==1))
 			continue;
-- 
1.7.0.4
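The SIGTERM-then-SIGKILL escalation of patches 3/4 and 4/4 can be tried outside a container with a sketch like the one below. It deliberately differs from the patch in two ways, both assumptions made for the demo: it signals a process group with kill(-pgid, ...) instead of the whole pid namespace with kill(-1, ...), so it is safe to run on a normal system, and the SIGALRM handler sends the SIGKILL itself rather than recording the signal for the main loop. The names `shutdown_group`, `shutdown_demo` and `target_pgid` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile pid_t target_pgid;	/* group killed by the SIGALRM handler */

/* SIGALRM handler: the grace period is over, kill whatever is left */
static void on_alarm(int sig)
{
	int saved_errno = errno;

	(void)sig;
	kill(-target_pgid, SIGKILL);
	errno = saved_errno;
}

/*
 * Send SIGTERM to process group `pgid`, then SIGKILL once `timeout`
 * seconds pass without any child exiting. Returns the number of
 * children reaped (-1 on error); *last_status gets the last status.
 */
int shutdown_group(pid_t pgid, unsigned int timeout, int *last_status)
{
	struct sigaction act, oldact;
	int reaped = 0;

	if (pgid <= 1)		/* never turn this into kill(0) or kill(-1) */
		return -1;

	target_pgid = pgid;
	act.sa_handler = on_alarm;
	act.sa_flags = 0;
	sigemptyset(&act.sa_mask);
	if (sigaction(SIGALRM, &act, &oldact))
		return -1;

	kill(-pgid, SIGTERM);
	alarm(timeout);

	for (;;) {
		pid_t pid = waitpid(-pgid, last_status, 0);

		if (pid < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by SIGALRM */
			break;			/* ECHILD: nobody is left */
		}

		/* reset the timer each time a process exited */
		reaped++;
		alarm(timeout);
	}

	alarm(0);
	sigaction(SIGALRM, &oldact, NULL);
	return reaped;
}

/* Demo: one child ignoring SIGTERM; returns the signal that ended it */
int shutdown_demo(void)
{
	/* ignore SIGTERM before forking so the child never sees it unignored */
	void (*old_term)(int) = signal(SIGTERM, SIG_IGN);
	int status = 0;
	pid_t pid = fork();

	if (pid == 0) {
		setpgid(0, 0);
		for (;;)
			pause();
	}

	signal(SIGTERM, old_term);
	if (pid < 0)
		return -1;

	setpgid(pid, pid);	/* also done in the child, to avoid a race */
	if (shutdown_group(pid, 1, &status) != 1)
		return -1;

	return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Sending SIGKILL from the handler is a choice made for this sketch only: it avoids having to worry about an alarm that fires just before the blocking wait is entered. The patch instead handles SIGALRM in the main loop through was_interrupted.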
[lxc-devel] [patch -lxc 1/4] forward signals to the container init
From: Ferenc Wagner <wf...@niif.hu>

Signed-off-by: Ferenc Wagner <wf...@niif.hu>
Signed-off-by: Daniel Lezcano <dlezc...@fr.ibm.com>
---
 src/lxc/start.c |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/src/lxc/start.c b/src/lxc/start.c
index dc57bea..92f44e3 100644
--- a/src/lxc/start.c
+++ b/src/lxc/start.c
@@ -195,13 +195,13 @@ static int setup_sigchld_fd(sigset_t *oldmask)
 	sigset_t mask;
 	int fd;
 
-	if (sigprocmask(SIG_BLOCK, NULL, &mask)) {
-		SYSERROR("failed to get mask signal");
-		return -1;
-	}
-
-	if (sigaddset(&mask, SIGCHLD) || sigprocmask(SIG_BLOCK, &mask, oldmask)) {
-		SYSERROR("failed to set mask signal");
+	/* Block everything except serious error signals */
+	if (sigfillset(&mask) ||
+	    sigdelset(&mask, SIGILL) ||
+	    sigdelset(&mask, SIGSEGV) ||
+	    sigdelset(&mask, SIGBUS) ||
+	    sigprocmask(SIG_BLOCK, &mask, oldmask)) {
+		SYSERROR("failed to set signal mask");
 		return -1;
 	}
 
@@ -231,7 +231,7 @@ static int sigchld_handler(int fd, void *data,
 
 	ret = read(fd, &siginfo, sizeof(siginfo));
 	if (ret < 0) {
-		ERROR("failed to read sigchld info");
+		ERROR("failed to read signal info");
 		return -1;
 	}
 
@@ -240,6 +240,12 @@ static int sigchld_handler(int fd, void *data,
 		return -1;
 	}
 
+	if (siginfo.ssi_signo != SIGCHLD) {
+		kill(*pid, siginfo.ssi_signo);
+		INFO("forwarded signal %d to pid %d", siginfo.ssi_signo, *pid);
+		return 0;
+	}
+
 	if (siginfo.ssi_code == CLD_STOPPED ||
 	    siginfo.ssi_code == CLD_CONTINUED) {
 		INFO("container init process was stopped/continued");
-- 
1.7.0.4
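The forwarding path added by this patch can be exercised in isolation with a small standalone sketch. It is only an illustration of the technique (block everything except SIGILL, SIGSEGV and SIGBUS, read the blocked signals from a signalfd, pass anything that is not SIGCHLD on to the child); the function names `open_signal_fd`, `forward_one_signal` and `forward_demo` are hypothetical, and the lxc mainloop, logging and error reporting are left out. Linux only, since it relies on signalfd(2).

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Block everything except serious error signals and return a signalfd
 * for the blocked set; the previous mask is saved in *oldmask. */
int open_signal_fd(sigset_t *oldmask)
{
	sigset_t mask;

	if (sigfillset(&mask) ||
	    sigdelset(&mask, SIGILL) ||
	    sigdelset(&mask, SIGSEGV) ||
	    sigdelset(&mask, SIGBUS) ||
	    sigprocmask(SIG_BLOCK, &mask, oldmask))
		return -1;

	return signalfd(-1, &mask, SFD_CLOEXEC);
}

/* Read one signal from fd and forward it to pid unless it is SIGCHLD.
 * Returns the signal number, or -1 if the read failed. */
int forward_one_signal(int fd, pid_t pid)
{
	struct signalfd_siginfo siginfo;

	if (read(fd, &siginfo, sizeof(siginfo)) != (ssize_t)sizeof(siginfo))
		return -1;

	if (siginfo.ssi_signo != SIGCHLD)
		kill(pid, (int)siginfo.ssi_signo);

	return (int)siginfo.ssi_signo;
}

/* Demo: the child exits with status 42 once SIGUSR1 reaches it. The
 * parent signals itself, then forwards what it reads from the signalfd.
 * Returns the child's exit status, or -1 on error. */
int forward_demo(void)
{
	sigset_t oldmask, wanted;
	int fd, sig, status, ret = -1;
	pid_t pid;

	fd = open_signal_fd(&oldmask);
	if (fd < 0)
		return -1;

	pid = fork();
	if (pid == 0) {
		/* the blocked mask is inherited, so SIGUSR1 stays pending
		 * in the child until sigwait() picks it up */
		sigemptyset(&wanted);
		sigaddset(&wanted, SIGUSR1);
		if (sigwait(&wanted, &sig) || sig != SIGUSR1)
			_exit(1);
		_exit(42);
	}

	if (pid > 0) {
		kill(getpid(), SIGUSR1);	/* stands in for "pkill lxc-start" */
		if (forward_one_signal(fd, pid) == SIGUSR1 &&
		    forward_one_signal(fd, pid) == SIGCHLD &&
		    waitpid(pid, &status, 0) == pid && WIFEXITED(status))
			ret = WEXITSTATUS(status);
	}

	close(fd);
	sigprocmask(SIG_SETMASK, &oldmask, NULL);
	return ret;
}
```

In the demo the first read yields the SIGUSR1 the parent sent to itself, which is forwarded; the second read yields the SIGCHLD produced by the child's exit, which is not forwarded, mirroring the `ssi_signo != SIGCHLD` test in the patch.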