Re: [lxc-devel] [PATCH 2/2] init: Fix whitespace damage
Quoting Richard Weinberger (rich...@nod.at): While we are here, fix the whitespace damage. Signed-off-by: Richard Weinberger rich...@nod.at Acked-by: Serge E. Hallyn serge.hal...@ubuntu.com --- src/lxc/lxc_init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/lxc/lxc_init.c b/src/lxc/lxc_init.c index e4c9a32..84e1293 100644 --- a/src/lxc/lxc_init.c +++ b/src/lxc/lxc_init.c @@ -47,7 +47,7 @@ static struct option options[] = { { 0, 0, 0, 0 }, }; -static int was_interrupted = 0; +static int was_interrupted = 0; static void interrupt_handler(int sig) { -- 1.8.1.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 1/2] init: unnest interrupt_handler
Quoting Richard Weinberger (rich...@nod.at): There is no need to use nested functions voodoo. I see no downside to this, but what is the upside? Signed-off-by: Richard Weinberger rich...@nod.at --- src/lxc/lxc_init.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/src/lxc/lxc_init.c b/src/lxc/lxc_init.c index c83c2f1..e4c9a32 100644 --- a/src/lxc/lxc_init.c +++ b/src/lxc/lxc_init.c @@ -49,15 +49,14 @@ static struct option options[] = { static int was_interrupted = 0; -int main(int argc, char *argv[]) +static void interrupt_handler(int sig) { + if (!was_interrupted) + was_interrupted = sig; +} - void interrupt_handler(int sig) - { - if (!was_interrupted) - was_interrupted = sig; - } - +int main(int argc, char *argv[]) +{ pid_t pid; int nbargs = 0; int err = -1; -- 1.8.1.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] start: Detect early failure of the new child
Quoting Richard Weinberger (rich...@nod.at): If the process in the new namespace dies very early we have currently no chance to detect this. The parent process will just die due to SIGPIPE if it write to the fd used for synchronisation and nobody will notice the real cause of the problem. Install a SIGCHLD handler to detect the death. Later when the child does execve() to the init within the new namespace the handler will be disabled automatically. Signed-off-by: Richard Weinberger rich...@nod.at Acked-by: Serge E. Hallyn serge.hal...@ubuntu.com Thanks Richard, useful for debugging. It also might help (as a follow-on patch) to use waitpid and print the exited pid alongside the init pid. --- src/lxc/start.c | 28 1 file changed, 28 insertions(+) diff --git a/src/lxc/start.c b/src/lxc/start.c index aefccd6..a58737a 100644 --- a/src/lxc/start.c +++ b/src/lxc/start.c @@ -198,6 +198,7 @@ static int setup_signal_fd(sigset_t *oldmask) sigdelset(mask, SIGILL) || sigdelset(mask, SIGSEGV) || sigdelset(mask, SIGBUS) || + sigdelset(mask, SIGCHLD) || sigprocmask(SIG_BLOCK, mask, oldmask)) { SYSERROR(failed to set signal mask); return -1; @@ -739,10 +740,29 @@ int save_phys_nics(struct lxc_conf *conf) return 0; } +static void sigchild_handler(int sig) +{ + int status; + pid_t child; + + child = wait(status); + if (child 0) { + SYSERROR(SIGCHLD caught but wait() failed: %m\n); + return; + } + + if (WIFSIGNALED(status)) + ERROR(Process in the new namespace died before execve() +due to signal: %i, WTERMSIG(status)); + else if (WIFEXITED(status)) + ERROR(Process in the new namespace died before execve() +with exit code: %i, WIFEXITED(status)); +} int lxc_spawn(struct lxc_handler *handler) { int failed_before_rename = 0; + struct sigaction act; const char *name = handler-name; if (lxc_sync_init(handler)) @@ -793,6 +813,14 @@ int lxc_spawn(struct lxc_handler *handler) goto out_delete_net; } + /* + * Install a SIGCHLD handler to detect the death of the new child between + * clone() and execve(). + */ + memset(act, 0, sizeof(act)); + act.sa_handler = sigchild_handler; + sigaction(SIGCHLD, act, NULL); + /* Create a process in a new set of namespaces */ handler-pid = lxc_clone(do_start, handler, handler-clone_flags); if (handler-pid 0) { -- 1.8.1.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 1/2] Support starting containers concurrently
Quoting S.Çağlar Onur (cag...@10ur.org): From: S.Çağlar Onur cag...@10ur.org Trying to start multiple containers concurrently may cause lxc_monitor_read_timeout to fail as select call could be interrupted by a signal, handle it. Hi, so if I understand right we're waiting on this for a redesign of the monitor, right? How are you generating your patches? Could you add a Signed-off-by automatically? --- src/lxc/state.c |9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/lxc/state.c b/src/lxc/state.c index 95707ac..c50ef00 100644 --- a/src/lxc/state.c +++ b/src/lxc/state.c @@ -230,8 +230,13 @@ extern int lxc_wait(const char *lxcname, const char *states, int timeout, const goto out_close; curtime = tv.tv_sec; } - if (lxc_monitor_read_timeout(fd, msg, timeout) 0) - goto out_close; + if (lxc_monitor_read_timeout(fd, msg, timeout) 0) { + /* continue if select interrupted by signal */ + if (errno == EINTR) + continue; + else + goto out_close; + } if (timeout != -1) { retval = gettimeofday(tv, NULL); -- 1.7.10.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
[lxc-devel] [PATCH 2/2] init: Fix whitespace damage
While we are here, fix the whitespace damage. Signed-off-by: Richard Weinberger rich...@nod.at --- src/lxc/lxc_init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/lxc/lxc_init.c b/src/lxc/lxc_init.c index e4c9a32..84e1293 100644 --- a/src/lxc/lxc_init.c +++ b/src/lxc/lxc_init.c @@ -47,7 +47,7 @@ static struct option options[] = { { 0, 0, 0, 0 }, }; -static int was_interrupted = 0; +static int was_interrupted = 0; static void interrupt_handler(int sig) { -- 1.8.1.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
[lxc-devel] [PATCH 1/2] init: unnest interrupt_handler
There is no need to use nested functions voodoo. Signed-off-by: Richard Weinberger rich...@nod.at --- src/lxc/lxc_init.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/src/lxc/lxc_init.c b/src/lxc/lxc_init.c index c83c2f1..e4c9a32 100644 --- a/src/lxc/lxc_init.c +++ b/src/lxc/lxc_init.c @@ -49,15 +49,14 @@ static struct option options[] = { static int was_interrupted = 0; -int main(int argc, char *argv[]) +static void interrupt_handler(int sig) { + if (!was_interrupted) + was_interrupted = sig; +} - void interrupt_handler(int sig) - { - if (!was_interrupted) - was_interrupted = sig; - } - +int main(int argc, char *argv[]) +{ pid_t pid; int nbargs = 0; int err = -1; -- 1.8.1.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
[lxc-devel] [PATCH] start: Detect early failure of the new child
If the process in the new namespace dies very early we have currently no chance to detect this. The parent process will just die due to SIGPIPE if it write to the fd used for synchronisation and nobody will notice the real cause of the problem. Install a SIGCHLD handler to detect the death. Later when the child does execve() to the init within the new namespace the handler will be disabled automatically. Signed-off-by: Richard Weinberger rich...@nod.at --- src/lxc/start.c | 28 1 file changed, 28 insertions(+) diff --git a/src/lxc/start.c b/src/lxc/start.c index aefccd6..a58737a 100644 --- a/src/lxc/start.c +++ b/src/lxc/start.c @@ -198,6 +198,7 @@ static int setup_signal_fd(sigset_t *oldmask) sigdelset(mask, SIGILL) || sigdelset(mask, SIGSEGV) || sigdelset(mask, SIGBUS) || + sigdelset(mask, SIGCHLD) || sigprocmask(SIG_BLOCK, mask, oldmask)) { SYSERROR(failed to set signal mask); return -1; @@ -739,10 +740,29 @@ int save_phys_nics(struct lxc_conf *conf) return 0; } +static void sigchild_handler(int sig) +{ + int status; + pid_t child; + + child = wait(status); + if (child 0) { + SYSERROR(SIGCHLD caught but wait() failed: %m\n); + return; + } + + if (WIFSIGNALED(status)) + ERROR(Process in the new namespace died before execve() + due to signal: %i, WTERMSIG(status)); + else if (WIFEXITED(status)) + ERROR(Process in the new namespace died before execve() + with exit code: %i, WIFEXITED(status)); +} int lxc_spawn(struct lxc_handler *handler) { int failed_before_rename = 0; + struct sigaction act; const char *name = handler-name; if (lxc_sync_init(handler)) @@ -793,6 +813,14 @@ int lxc_spawn(struct lxc_handler *handler) goto out_delete_net; } + /* +* Install a SIGCHLD handler to detect the death of the new child between +* clone() and execve(). +*/ + memset(act, 0, sizeof(act)); + act.sa_handler = sigchild_handler; + sigaction(SIGCHLD, act, NULL); + /* Create a process in a new set of namespaces */ handler-pid = lxc_clone(do_start, handler, handler-clone_flags); if (handler-pid 0) { -- 1.8.1.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] utils: reimplement/fix mkdir_p()
Quoting Richard Weinberger (rich...@nod.at): Reimplement mkdir_p() such that it: ...handles relativ paths correctly. (currently it crashes) ...does not rely on dirname(). ...is not recursive. ...is shorter. ;-) Looks good, thanks. Yeah I prefer non-recursive. Three comments though, Signed-off-by: Richard Weinberger rich...@nod.at --- src/lxc/utils.c | 48 +--- 1 file changed, 17 insertions(+), 31 deletions(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index e07ca7b..9794553 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -95,39 +95,25 @@ extern int get_u16(unsigned short *val, const char *arg, int base) return 0; } -static int is_all_slashes(char *path) -{ - while (*path *path == '/') - path++; - if (*path) - return 0; - return 1; -} - extern int mkdir_p(char *dir, mode_t mode) { - int ret; - char *d; - - if (is_all_slashes(dir)) - return 0; - - d = strdup(dir); - if (!d) - return -1; - - ret = mkdir_p(dirname(d), mode); - free(d); - if (ret) - return -1; - - if (!access(dir, F_OK)) - return 0; - - if (mkdir(dir, mode)) { - SYSERROR(failed to create directory '%s'\n, dir); - return -1; - } + char *tmp = dir; + char *orig = dir; + char *makeme; + + do { + dir = tmp + strspn(tmp, /); + tmp = dir + strcspn(dir, /); + makeme = strndupa(orig, dir - orig); strndupa *can* fail and return NULL. + if (*makeme) { + if (!access(makeme, F_OK)) + return 0; did you mean to continue here? + if (mkdir(makeme, mode)) { As people are starting to want to thread, I think it's worth making sure all mkdirs and creats check for errno=-EEXIST even if they've just checked with access. Certainly with the cgroup setup a race is possible when starting two simultaneous containers first time after boot using the api. + SYSERROR(failed to create directory '%s'\n, makeme); + return -1; + } + } + } while(tmp != dir); return 0; } -- 1.8.1.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] Use container specific domain socket name
Hi Serge, Yeah you are correct we need regular users to be able to monitor their own containes. I guess we can encrypt the messages but I'm not going there :) Cheers, On Wed, Apr 17, 2013 at 8:52 AM, Serge Hallyn serge.hal...@ubuntu.comwrote: Quoting S.Çağlar Onur (cag...@10ur.org): Hi there, What about using AF_INET but binding a restricted port while adding a new field to the message? As an example we can start to create a hmac (or something like that) per container in the creation time and save that into LXCPATH/CONTAINERNAME/hmac. Then both client (can add that value to message) and server (can read from filesystem to check authenticity of the file) can use it. By binding a restricted port we guarantee that regular users cannot sniff the traffic and by using the filesystem permissions we provide the desired separation? But we want regular users to be able to monitor their own containers. Now I suppose we could require an extra netns layer where an unprivileged user must first create a new userns, create a new netns in that, and start containers from there. Then he has privilege over restricted ports in that netns, so he can monitor containers created from there. It also gives a somewhat simple way to provide networking to unprivileged-user-created containers- simply have a privileged init script create the userns+netns for the user, keeping it open, create a NIC in there and hook it into a host bridge (since this init job is privileged on the host), then hand the setns fd to the user (by bind-mounting into a DAC-protected directory like /lxc/ns/$USER/). Now the user can setns into /lxc/ns/$user before running any lxc commands. It's quite different from what I was earlier envisioning, but doable. Disclaimer: I'm groggy this morning, so might be talking sillyness. -serge -- S.Çağlar Onur cag...@10ur.org -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] utils: reimplement/fix mkdir_p()
Quoting Richard Weinberger (rich...@nod.at): Am 17.04.2013 18:00, schrieb Serge Hallyn: Quoting Richard Weinberger (rich...@nod.at): Reimplement mkdir_p() such that it: ...handles relativ paths correctly. (currently it crashes) ...does not rely on dirname(). ...is not recursive. ...is shorter. ;-) Looks good, thanks. Yeah I prefer non-recursive. Three comments though, Signed-off-by: Richard Weinberger rich...@nod.at --- src/lxc/utils.c | 48 +--- 1 file changed, 17 insertions(+), 31 deletions(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index e07ca7b..9794553 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -95,39 +95,25 @@ extern int get_u16(unsigned short *val, const char *arg, int base) return 0; } -static int is_all_slashes(char *path) -{ - while (*path *path == '/') - path++; - if (*path) - return 0; - return 1; -} - extern int mkdir_p(char *dir, mode_t mode) { - int ret; - char *d; - - if (is_all_slashes(dir)) - return 0; - - d = strdup(dir); - if (!d) - return -1; - - ret = mkdir_p(dirname(d), mode); - free(d); - if (ret) - return -1; - - if (!access(dir, F_OK)) - return 0; - - if (mkdir(dir, mode)) { - SYSERROR(failed to create directory '%s'\n, dir); - return -1; - } + char *tmp = dir; + char *orig = dir; + char *makeme; + + do { + dir = tmp + strspn(tmp, /); + tmp = dir + strcspn(dir, /); + makeme = strndupa(orig, dir - orig); strndupa *can* fail and return NULL. How? I was hoping that smart libc on embedded would return NULL if it sees there isn't enough space. I misread the strndupa manpage to say it returns NULL on failure - I see there is no such promise. strndupa() uses alloca(), which allocates memory on the stack. *If* it fails we have already overrun the stack and are on a one-way trip to lala land. :-) + if (*makeme) { + if (!access(makeme, F_OK)) + return 0; did you mean to continue here? No, if the access() we have to stop. = If the access() does what? access(/var, F_OK) should immediately return 0, so you're saying mkdir_p(/var/lib/lxc/c1) should immediately fail? It's entirely possible I'm not thinking right, but... mkdir -p /foo/bar also just exists if it is not allowed to access one parent directory. + if (mkdir(makeme, mode)) { As people are starting to want to thread, I think it's worth making sure all mkdirs and creats check for errno=-EEXIST even if they've just checked with access. Certainly with the cgroup setup a race is possible when starting two simultaneous containers first time after boot using the api. Yes, you are right. I'll update my patch to take care of this. Thanks, //richard -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 1/2] Support starting containers concurrently
Hi, That's up to you, I can keep them in my branch till we endup with a new monitor code or you can merge them now and they start to work with new monitor code later. I'm using git to send those patches and I can easily add signed-off lines (I even don't know why they are missing as I believe I use --signoff argument) and sent it again. Best, On Wed, Apr 17, 2013 at 9:10 AM, Serge Hallyn serge.hal...@ubuntu.comwrote: Quoting S.Çağlar Onur (cag...@10ur.org): From: S.Çağlar Onur cag...@10ur.org Trying to start multiple containers concurrently may cause lxc_monitor_read_timeout to fail as select call could be interrupted by a signal, handle it. Hi, so if I understand right we're waiting on this for a redesign of the monitor, right? How are you generating your patches? Could you add a Signed-off-by automatically? --- src/lxc/state.c |9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/lxc/state.c b/src/lxc/state.c index 95707ac..c50ef00 100644 --- a/src/lxc/state.c +++ b/src/lxc/state.c @@ -230,8 +230,13 @@ extern int lxc_wait(const char *lxcname, const char *states, int timeout, const goto out_close; curtime = tv.tv_sec; } - if (lxc_monitor_read_timeout(fd, msg, timeout) 0) - goto out_close; + if (lxc_monitor_read_timeout(fd, msg, timeout) 0) { + /* continue if select interrupted by signal */ + if (errno == EINTR) + continue; + else + goto out_close; + } if (timeout != -1) { retval = gettimeofday(tv, NULL); -- 1.7.10.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel -- S.Çağlar Onur cag...@10ur.org -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] Use container specific domain socket name
Well if we really want to go that route, we can a. specify a monitor-port in $lxcpath/lxc.conf b. write a world-unreadable $lxcpath/monitor-secret file c. require catting $lxcpath/monitor-secret at initial connection so /var/lib/lxc/lxc.conf can have monitor-port=9998, while /home/serge/lxcbase/lxc.conf has , and /var/lib/lxc/monitor-secret has uber-secure password 'god' while /home/serge/lxcbase/monitor-secret contains 'mortal'. Now user serge can monitor his own containers on port , but can't listen on port 9998. Of course we can still use a port 1024 for root-owned containers for a bit more peace of mind. But what I don't like about this is that it requires more setup, whereas the AF_UNIX based solution requires no hand-cranking. I suppose we could make monitor-port, if unspecified, default to some-hash($lxcpath), and just say that the secret doesnt' have to be specified (I'm not sure everyone will care about container event spoofing). -serge Quoting S.Çağlar Onur (cag...@10ur.org): Hi Serge, Yeah you are correct we need regular users to be able to monitor their own containes. I guess we can encrypt the messages but I'm not going there :) Cheers, On Wed, Apr 17, 2013 at 8:52 AM, Serge Hallyn serge.hal...@ubuntu.comwrote: Quoting S.Çağlar Onur (cag...@10ur.org): Hi there, What about using AF_INET but binding a restricted port while adding a new field to the message? As an example we can start to create a hmac (or something like that) per container in the creation time and save that into LXCPATH/CONTAINERNAME/hmac. Then both client (can add that value to message) and server (can read from filesystem to check authenticity of the file) can use it. By binding a restricted port we guarantee that regular users cannot sniff the traffic and by using the filesystem permissions we provide the desired separation? But we want regular users to be able to monitor their own containers. Now I suppose we could require an extra netns layer where an unprivileged user must first create a new userns, create a new netns in that, and start containers from there. Then he has privilege over restricted ports in that netns, so he can monitor containers created from there. It also gives a somewhat simple way to provide networking to unprivileged-user-created containers- simply have a privileged init script create the userns+netns for the user, keeping it open, create a NIC in there and hook it into a host bridge (since this init job is privileged on the host), then hand the setns fd to the user (by bind-mounting into a DAC-protected directory like /lxc/ns/$USER/). Now the user can setns into /lxc/ns/$user before running any lxc commands. It's quite different from what I was earlier envisioning, but doable. Disclaimer: I'm groggy this morning, so might be talking sillyness. -serge -- S.Çağlar Onur cag...@10ur.org -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] utils: reimplement/fix mkdir_p()
Am 17.04.2013 18:00, schrieb Serge Hallyn: Quoting Richard Weinberger (rich...@nod.at): Reimplement mkdir_p() such that it: ...handles relativ paths correctly. (currently it crashes) ...does not rely on dirname(). ...is not recursive. ...is shorter. ;-) Looks good, thanks. Yeah I prefer non-recursive. Three comments though, Signed-off-by: Richard Weinberger rich...@nod.at --- src/lxc/utils.c | 48 +--- 1 file changed, 17 insertions(+), 31 deletions(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index e07ca7b..9794553 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -95,39 +95,25 @@ extern int get_u16(unsigned short *val, const char *arg, int base) return 0; } -static int is_all_slashes(char *path) -{ -while (*path *path == '/') -path++; -if (*path) -return 0; -return 1; -} - extern int mkdir_p(char *dir, mode_t mode) { -int ret; -char *d; - -if (is_all_slashes(dir)) -return 0; - -d = strdup(dir); -if (!d) -return -1; - -ret = mkdir_p(dirname(d), mode); -free(d); -if (ret) -return -1; - -if (!access(dir, F_OK)) -return 0; - -if (mkdir(dir, mode)) { -SYSERROR(failed to create directory '%s'\n, dir); -return -1; -} +char *tmp = dir; +char *orig = dir; +char *makeme; + +do { +dir = tmp + strspn(tmp, /); +tmp = dir + strcspn(dir, /); +makeme = strndupa(orig, dir - orig); strndupa *can* fail and return NULL. How? strndupa() uses alloca(), which allocates memory on the stack. *If* it fails we have already overrun the stack and are on a one-way trip to lala land. :-) +if (*makeme) { +if (!access(makeme, F_OK)) +return 0; did you mean to continue here? No, if the access() we have to stop. mkdir -p /foo/bar also just exists if it is not allowed to access one parent directory. +if (mkdir(makeme, mode)) { As people are starting to want to thread, I think it's worth making sure all mkdirs and creats check for errno=-EEXIST even if they've just checked with access. Certainly with the cgroup setup a race is possible when starting two simultaneous containers first time after boot using the api. Yes, you are right. I'll update my patch to take care of this. Thanks, //richard -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 1/2] Support starting containers concurrently
Quoting S.Çağlar Onur (cag...@10ur.org): Hi, That's up to you, I can keep them in my branch till we endup with a new monitor code or you can merge them now and they start to work with new monitor code later. I'm using git to send those patches and I can easily add signed-off lines (I even don't know why they are missing as I believe I use --signoff argument) and sent it again. Sure - please resend with signed-off-by, and I'll ack and push. I just do 'git commit -s' (by reflex) and git format-patch (well, git fp cause i'm lazy and most of my kbds suck) to make the patches - not sure what could be goign wrong for you. -serge -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
[lxc-devel] [PATCH 1/2] Support starting containers concurrently
From: S.Çağlar Onur cag...@10ur.org Trying to start multiple containers concurrently may cause lxc_monitor_read_timeout to fail as select call could be interrupted by a signal, handle it. Signed-off-by: S.Çağlar Onur cag...@10ur.org --- src/lxc/state.c |9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/lxc/state.c b/src/lxc/state.c index 437f11a..0bb307d 100644 --- a/src/lxc/state.c +++ b/src/lxc/state.c @@ -231,8 +231,13 @@ extern int lxc_wait(const char *lxcname, const char *states, int timeout, const goto out_close; curtime = tv.tv_sec; } - if (lxc_monitor_read_timeout(fd, msg, timeout) 0) - goto out_close; + if (lxc_monitor_read_timeout(fd, msg, timeout) 0) { + /* continue if select interrupted by signal */ + if (errno == EINTR) + continue; + else + goto out_close; + } if (timeout != -1) { retval = gettimeofday(tv, NULL); -- 1.7.10.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
[lxc-devel] [PATCH 2/2] Support stopping containers concurrently
From: S.Çağlar Onur cag...@10ur.org Trying to stop multiple containers concurrently ends up with cgroup is not mounted errors as multiple threads corrupts the shared variables. Fix that stack corruption and start to use getmntent_r to support stopping multiple containers concurrently. Signed-off-by: S.Çağlar Onur cag...@10ur.org --- src/lxc/cgroup.c | 152 +++-- src/lxc/freezer.c | 18 +-- src/lxc/state.c | 15 -- 3 files changed, 126 insertions(+), 59 deletions(-) diff --git a/src/lxc/cgroup.c b/src/lxc/cgroup.c index 368214f..0739477 100644 --- a/src/lxc/cgroup.c +++ b/src/lxc/cgroup.c @@ -54,6 +54,11 @@ lxc_log_define(lxc_cgroup, lxc); #define MTAB /proc/mounts +/* In the case of a bind mount, there could be two long pathnames in the + * mntent plus options so use large enough buffer size + */ +#define LARGE_MAXPATHLEN 4 * MAXPATHLEN + /* Check if a mount is a cgroup hierarchy for any subsystem. * Return the first subsystem found (or NULL if none). */ @@ -100,29 +105,31 @@ static char *mount_has_subsystem(const struct mntent *mntent) */ static int get_cgroup_mount(const char *subsystem, char *mnt) { - struct mntent *mntent; + struct mntent *mntent, mntent_r; FILE *file = NULL; int ret, err = -1; + char buf[LARGE_MAXPATHLEN] = {0}; + file = setmntent(MTAB, r); if (!file) { SYSERROR(failed to open %s, MTAB); return -1; } - while ((mntent = getmntent(file))) { - if (strcmp(mntent-mnt_type, cgroup)) + while ((mntent = getmntent_r(file, mntent_r, buf, sizeof(buf { + if (strcmp(mntent_r.mnt_type, cgroup) != 0) continue; - + if (subsystem) { - if (!hasmntopt(mntent, subsystem)) + if (!hasmntopt(mntent_r, subsystem)) continue; } else { - if (!mount_has_subsystem(mntent)) + if (!mount_has_subsystem(mntent_r)) continue; } - ret = snprintf(mnt, MAXPATHLEN, %s, mntent-mnt_dir); + ret = snprintf(mnt, MAXPATHLEN, %s, mntent_r.mnt_dir); if (ret 0 || ret = MAXPATHLEN) goto fail; @@ -148,22 +155,33 @@ out: * * Returns 0 on success, -1 on error. * - * The answer is written in a static char[MAXPATHLEN] in this function and - * should not be freed. */ extern int cgroup_path_get(char **path, const char *subsystem, const char *cgpath) { - static charbuf[MAXPATHLEN]; - static charretbuf[MAXPATHLEN]; int rc; + char *buf = NULL; + char *retbuf = NULL; + + buf = malloc(MAXPATHLEN * sizeof(char)); + if (!buf) { + ERROR(malloc failed); + goto fail; + } + + retbuf = malloc(MAXPATHLEN * sizeof(char)); + if (!retbuf) { + ERROR(malloc failed); + goto fail; + } + /* lxc_cgroup_set passes a state object for the subsystem, * so trim it to just the subsystem part */ if (subsystem) { rc = snprintf(retbuf, MAXPATHLEN, %s, subsystem); if (rc 0 || rc = MAXPATHLEN) { ERROR(subsystem name too long); - return -1; + goto fail; } char *s = index(retbuf, '.'); if (s) @@ -172,19 +190,28 @@ extern int cgroup_path_get(char **path, const char *subsystem, const char *cgpat } if (get_cgroup_mount(subsystem ? retbuf : NULL, buf)) { ERROR(cgroup is not mounted); - return -1; + goto fail; } rc = snprintf(retbuf, MAXPATHLEN, %s/%s, buf, cgpath); if (rc 0 || rc = MAXPATHLEN) { ERROR(name too long); - return -1; + goto fail; } DEBUG(%s: returning %s for subsystem %s, __func__, retbuf, subsystem); + if(buf) + free(buf); + *path = retbuf; return 0; +fail: + if (buf) + free(buf); + if (retbuf) + free(retbuf); + return -1; } /* @@ -292,20 +319,25 @@ static int do_cgroup_set(const char *path, const char *value) int lxc_cgroup_set_bypath(const char *cgpath, const char *filename, const char *value) { int ret; - char *dirpath; + char *dirpath = NULL; char path[MAXPATHLEN]; ret = cgroup_path_get(dirpath, filename, cgpath); if (ret) - return -1; + goto fail; ret = snprintf(path, MAXPATHLEN, %s/%s, dirpath, filename); if (ret 0 || ret = MAXPATHLEN) { ERROR(pathname too long); -
Re: [lxc-devel] [PATCH] utils: reimplement/fix mkdir_p()
Quoting Richard Weinberger (rich...@nod.at): Reimplement mkdir_p() such that it: ...handles relativ paths correctly. (currently it crashes) ...does not rely on dirname(). ...is not recursive. ...is shorter. ;-) Signed-off-by: Richard Weinberger rich...@nod.at Thanks, this looks great. Acked-by: Serge E. Hallyn serge.hal...@ubuntu.com --- src/lxc/utils.c | 46 +++--- 1 file changed, 15 insertions(+), 31 deletions(-) diff --git a/src/lxc/utils.c b/src/lxc/utils.c index e07ca7b..6a154f9 100644 --- a/src/lxc/utils.c +++ b/src/lxc/utils.c @@ -95,39 +95,23 @@ extern int get_u16(unsigned short *val, const char *arg, int base) return 0; } -static int is_all_slashes(char *path) -{ - while (*path *path == '/') - path++; - if (*path) - return 0; - return 1; -} - extern int mkdir_p(char *dir, mode_t mode) { - int ret; - char *d; - - if (is_all_slashes(dir)) - return 0; - - d = strdup(dir); - if (!d) - return -1; - - ret = mkdir_p(dirname(d), mode); - free(d); - if (ret) - return -1; - - if (!access(dir, F_OK)) - return 0; - - if (mkdir(dir, mode)) { - SYSERROR(failed to create directory '%s'\n, dir); - return -1; - } + char *tmp = dir; + char *orig = dir; + char *makeme; + + do { + dir = tmp + strspn(tmp, /); + tmp = dir + strcspn(dir, /); + makeme = strndupa(orig, dir - orig); + if (*makeme) { + if (mkdir(makeme, mode) errno != EEXIST) { + SYSERROR(failed to create directory '%s'\n, makeme); + return -1; + } + } + } while(tmp != dir); return 0; } -- 1.8.1.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 2/2] Support stopping containers concurrently
Quoting S.Çağlar Onur (cag...@10ur.org): From: S.Çağlar Onur cag...@10ur.org Trying to stop multiple containers concurrently ends up with cgroup is not mounted errors as multiple threads corrupts the shared variables. Fix that stack corruption and start to use getmntent_r to support stopping multiple containers concurrently. Signed-off-by: S.Çağlar Onur cag...@10ur.org --- src/lxc/cgroup.c | 152 +++-- src/lxc/freezer.c | 18 +-- src/lxc/state.c | 15 -- 3 files changed, 126 insertions(+), 59 deletions(-) diff --git a/src/lxc/cgroup.c b/src/lxc/cgroup.c index 368214f..0739477 100644 --- a/src/lxc/cgroup.c +++ b/src/lxc/cgroup.c @@ -54,6 +54,11 @@ lxc_log_define(lxc_cgroup, lxc); #define MTAB /proc/mounts +/* In the case of a bind mount, there could be two long pathnames in the + * mntent plus options so use large enough buffer size + */ +#define LARGE_MAXPATHLEN 4 * MAXPATHLEN + /* Check if a mount is a cgroup hierarchy for any subsystem. * Return the first subsystem found (or NULL if none). */ @@ -100,29 +105,31 @@ static char *mount_has_subsystem(const struct mntent *mntent) */ static int get_cgroup_mount(const char *subsystem, char *mnt) { - struct mntent *mntent; + struct mntent *mntent, mntent_r; FILE *file = NULL; int ret, err = -1; + char buf[LARGE_MAXPATHLEN] = {0}; Ah yes, this must be what I thought we were waiting on - a response from Stéphane on this. I'm still worried about this stack usage, especially in something which is rather commonly called. Stéphane, is this a non-issue for arm? -serge -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH 1/2] Support starting containers concurrently
Hi Serge, On Wed, Apr 17, 2013 at 6:02 PM, Serge Hallyn serge.hal...@ubuntu.comwrote: Quoting S.Çağlar Onur (cag...@10ur.org): From: S.Çağlar Onur cag...@10ur.org Trying to start multiple containers concurrently may cause lxc_monitor_read_timeout to fail as select call could be interrupted by a signal, handle it. Signed-off-by: S.Çağlar Onur cag...@10ur.org --- src/lxc/state.c |9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/lxc/state.c b/src/lxc/state.c index 437f11a..0bb307d 100644 --- a/src/lxc/state.c +++ b/src/lxc/state.c @@ -231,8 +231,13 @@ extern int lxc_wait(const char *lxcname, const char *states, int timeout, const goto out_close; curtime = tv.tv_sec; } - if (lxc_monitor_read_timeout(fd, msg, timeout) 0) - goto out_close; + if (lxc_monitor_read_timeout(fd, msg, timeout) 0) { + /* continue if select interrupted by signal */ + if (errno == EINTR) + continue; Hm, wait, sorry - you need to recalculate the timeout here (if not -1)? Ah, good catch, I'll wait Stéphane's response to other patch and submit them together. + else + goto out_close; + } if (timeout != -1) { retval = gettimeofday(tv, NULL); -- 1.7.10.4 -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel -- S.Çağlar Onur cag...@10ur.org -- Precog is a next-generation analytics platform capable of advanced analytics on semi-structured data. The platform includes APIs for building apps and a phenomenal toolset for data science. Developers can use our toolset for easy data analysis visualization. Get a free account! http://www2.precog.com/precogplatform/slashdotnewsletter___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel