Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
On 06/27/2011 03:53 PM, Michael H. Warfield wrote: Cool. I now have it running in 3 environments. Two are Fedora 12 (I know, I know, it's on my todo list) i686 systems with single cgroup mounts while the other is my Fedora 15 x86_64 platform. All are running quite smoothly and lxc-info is working. That's cool :) Daniel: Did you pick up my little patch from a couple of messages back? That message did not including the -devel list. Should I also send it there? Thank Michael for investigating and fixing the bug. I have to look at Serge patch to check if it does not already fix the problem, otherwise I will fix it with your patch on top of it. Thanks -- Daniel -- All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2 ___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
On Fri, 2011-07-01 at 10:12 +0200, Daniel Lezcano wrote: On 06/27/2011 03:53 PM, Michael H. Warfield wrote: Cool. I now have it running in 3 environments. Two are Fedora 12 (I know, I know, it's on my todo list) i686 systems with single cgroup mounts while the other is my Fedora 15 x86_64 platform. All are running quite smoothly and lxc-info is working. That's cool :) Daniel: Did you pick up my little patch from a couple of messages back? That message did not including the -devel list. Should I also send it there? Thank Michael for investigating and fixing the bug. I have to look at Serge patch to check if it does not already fix the problem, otherwise I will fix it with your patch on top of it. I looked at his revised patch and it does incorporate my fix. I'll test it out a little later today but it looks good to me. Only different between what he did and what I did was in style. I tend to be a little more pedantic about braces and use them even when it's not necessary around a single line. Thanks -- Daniel Regards, Mike -- Michael H. Warfield (AI4NB) | (770) 985-6132 | m...@wittsend.com /\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/ NIC whois: MHW9 | An optimist believes we live in the best of all PGP Key: 0x674627FF| possible worlds. A pessimist is sure of it! signature.asc Description: This is a digitally signed message part -- All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
On Sun, 2011-06-26 at 18:49 -0500, Serge E. Hallyn wrote: Quoting Michael H. Warfield (m...@wittsend.com): On Sun, 2011-06-26 at 14:00 -0500, Serge E. Hallyn wrote: Quoting Michael H. Warfield (m...@wittsend.com): Thanks, Michael, good catch. Now wait a minute. Is that a typo here: No it's not, but: char *s = index(retbuf, '.'); If you're doing, in effect, a dirname here should that be this: char *s = index(retbuf, '/'); IAC... That *s = '\0'; should include a NULL check. Adding the NULL check and lxc-info works. Looks like that subsystem name in the call to that routine is not what Serge thought it was. I threw a print above the snprintf about just for giggles to print out the subsystem name being passed to it and this is what I got back... [mhw@forest SPECS]$ sudo lxc-info -n Alcove subsystem name: freezer 'Alcove' is RUNNING No wonder s was null. No dot and no /. I applied this patch and it got lxc-info working. But it was a quick hack just to address the NULL pointer. Is it the correct fix? No, it's not. For the calls to this function that come from cgroup.c itself, '.' is the right thing. The problem is that lxc_cgroup_set() and lxc_cgroup_get() pass in things like 'devices.allow'. I was going to make the index conditional, but all the callers of this function pass in either a filename (with a '.' in it) or NULL. I failed to notice these: src/lxc/freezer.c: ret = lxc_cgroup_path_get(nsgroup, freezer, name); src/lxc/state.c:err = lxc_cgroup_path_get(nsgroup, freezer, name); :) These are what you are running into. So the thing to do is leave it searching for index(s, '.') but do nothing if s is NULL. And that's what I believe results with my little hack. Only truncate if Oops, sorry, I didn't look closely enough and assumed your patch was switching to checking for '/'. there was a hit and s was non-null. I see now from your comments that the check on '.' was correct. I was uncertain about the inputs and outputs in this routine. Checking for the NILL condition may not be the solution, in this case, but it is still a best common practice to check pointers like that. Never can tell what may crop up in the future. Really it would be cleaner to have lxc_cgroup_{sg}et() do the index, so that lxc_cgorup_path_get() always gets a subsystem or NULL. I'm not doing that patch right now, though, trivial as it ought to be. I hear you. So Acked-by: Serge Hallyn serge.hal...@ubuntu.com to your patch to fix my bug, and let's leave it at that for now until it gets more testing? Cool. I now have it running in 3 environments. Two are Fedora 12 (I know, I know, it's on my todo list) i686 systems with single cgroup mounts while the other is my Fedora 15 x86_64 platform. All are running quite smoothly and lxc-info is working. Daniel: Did you pick up my little patch from a couple of messages back? That message did not including the -devel list. Should I also send it there? Thanks again for testing and looking into it! -serge Regards, Mike -- Michael H. Warfield (AI4NB) | (770) 985-6132 | m...@wittsend.com /\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/ NIC whois: MHW9 | An optimist believes we live in the best of all PGP Key: 0x674627FF| possible worlds. A pessimist is sure of it! signature.asc Description: This is a digitally signed message part -- All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
Quoting Michael H. Warfield (m...@wittsend.com): cd /sys/fs/cgroup for d in `/bin/ls`; do echo 1 $d/cgroup.clone_children done Doing this step alone broke lxc totally for me, with or without the patch below. This was on Fedora 15 testing with lxc 0.7.4.2 as well as Do you have the ns cgroup mounted? The above is only for without ns cgroup. -serge -- All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2 ___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
On 06/26/2011 05:06 PM, Serge E. Hallyn wrote: Quoting Michael H. Warfield (m...@wittsend.com): cd /sys/fs/cgroup for d in `/bin/ls`; do echo 1 $d/cgroup.clone_children done Doing this step alone broke lxc totally for me, with or without the patch below. This was on Fedora 15 testing with lxc 0.7.4.2 as well as Do you have the ns cgroup mounted? The above is only for without ns cgroup. Yes, the ns_cgroup and clone_children are mutually exclusive. If you try to create a namespace with a ns cgroup mounted and the clone_children flag is set. The 'clone' will return EINVAL. I noticed with the patch applied because there was a bug in it which was setting this flag even if the cgroup was ns. It is possible, you ran lxc with the patch, the clone_children was set, and then you remove the patch and try to run lxc again. As the clone_children is still set, that will make impossible to clone a new namespace. With the patch I sent previously on top of Serge's patch, that should be fixed. -- All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2 ___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
On 06/26/2011 05:52 PM, Michael H. Warfield wrote: On Sun, 2011-06-26 at 17:27 +0200, Daniel Lezcano wrote: On 06/26/2011 05:06 PM, Serge E. Hallyn wrote: Quoting Michael H. Warfield (m...@wittsend.com): cd /sys/fs/cgroup for d in `/bin/ls`; do echo 1 $d/cgroup.clone_children done Doing this step alone broke lxc totally for me, with or without the patch below. This was on Fedora 15 testing with lxc 0.7.4.2 as well as Do you have the ns cgroup mounted? The above is only for without ns cgroup. Yes, the ns_cgroup and clone_children are mutually exclusive. If you try to create a namespace with a ns cgroup mounted and the clone_children flag is set. The 'clone' will return EINVAL. I noticed with the patch applied because there was a bug in it which was setting this flag even if the cgroup was ns. It is possible, you ran lxc with the patch, the clone_children was set, and then you remove the patch and try to run lxc again. As the clone_children is still set, that will make impossible to clone a new namespace. With the patch I sent previously on top of Serge's patch, that should be fixed. Ok... Got that an threw it into the mix. I'm still seeing the segfaults on lxc-info. Seems to fault even if I specify the name of a non-existent machine as well. Oh, seems I missed this info :s If you launch lxc-info -n foo, that segfaults ? On what architecture does it fault ? -- All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2 ___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
On Sun, 2011-06-26 at 18:27 +0200, Daniel Lezcano wrote: On 06/26/2011 05:52 PM, Michael H. Warfield wrote: On Sun, 2011-06-26 at 17:27 +0200, Daniel Lezcano wrote: On 06/26/2011 05:06 PM, Serge E. Hallyn wrote: Quoting Michael H. Warfield (m...@wittsend.com): cd /sys/fs/cgroup for d in `/bin/ls`; do echo 1 $d/cgroup.clone_children done Doing this step alone broke lxc totally for me, with or without the patch below. This was on Fedora 15 testing with lxc 0.7.4.2 as well as Do you have the ns cgroup mounted? The above is only for without ns cgroup. Yes, the ns_cgroup and clone_children are mutually exclusive. If you try to create a namespace with a ns cgroup mounted and the clone_children flag is set. The 'clone' will return EINVAL. I noticed with the patch applied because there was a bug in it which was setting this flag even if the cgroup was ns. It is possible, you ran lxc with the patch, the clone_children was set, and then you remove the patch and try to run lxc again. As the clone_children is still set, that will make impossible to clone a new namespace. With the patch I sent previously on top of Serge's patch, that should be fixed. Ok... Got that an threw it into the mix. I'm still seeing the segfaults on lxc-info. Seems to fault even if I specify the name of a non-existent machine as well. Oh, seems I missed this info :s If you launch lxc-info -n foo, that segfaults ? On what architecture does it fault ? Fedora 15, x86_64. 0.7.4.2 is fine. Just rebuilt a fresh set of packages from GIT sources alone. They're fine and lxc-info from git works. Git patched with Serge's patch. Segfault. Git patched with both patches. Segfault. Here's what gdb has to say about it: [root@forest lxc]# gdb lxc-info core.13546 GNU gdb (GDB) Fedora (7.2.90.20110525-39.fc15) Copyright (C) 2011 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type show copying and show warranty for details. This GDB was configured as x86_64-redhat-linux-gnu. For bug reporting instructions, please see: http://www.gnu.org/software/gdb/bugs/... Reading symbols from /home/mhw/rpmbuild/BUILD/lxc-0.7.4.2git2/src/lxc/lxc-info...done. [New LWP 13546] Core was generated by `./lxc-info -n Alcove'. Program terminated with signal 11, Segmentation fault. #0 lxc_cgroup_path_get (path=0x7fff89b35928, subsystem=0x7fba52a4fdc1 freezer, name=0x7fff89b3987d Alcove) at cgroup.c:326 326*s = '\0'; (gdb) trace Tracepoint 1 at 0x7fba52a41e74: file cgroup.c, line 326. (gdb) backtrace #0 lxc_cgroup_path_get (path=0x7fff89b35928, subsystem=0x7fba52a4fdc1 freezer, name=0x7fff89b3987d Alcove) at cgroup.c:326 #1 0x7fba52a4a337 in freezer_state (name=optimized out) at state.c:74 #2 0x7fba52a4a42d in lxc_getstate (name=0x7fff89b3987d Alcove) at state.c:130 #3 0x0040068d in main (argc=optimized out, argv=optimized out) at lxc_info.c:63 Regards, Mike -- Michael H. Warfield (AI4NB) | (770) 985-6132 | m...@wittsend.com /\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/ NIC whois: MHW9 | An optimist believes we live in the best of all PGP Key: 0x674627FF| possible worlds. A pessimist is sure of it! signature.asc Description: This is a digitally signed message part -- All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
Looking at the sources and Serge's patch... On Sun, 2011-06-26 at 13:33 -0400, Michael H. Warfield wrote: On Sun, 2011-06-26 at 18:27 +0200, Daniel Lezcano wrote: On 06/26/2011 05:52 PM, Michael H. Warfield wrote: On Sun, 2011-06-26 at 17:27 +0200, Daniel Lezcano wrote: On 06/26/2011 05:06 PM, Serge E. Hallyn wrote: Quoting Michael H. Warfield (m...@wittsend.com): cd /sys/fs/cgroup for d in `/bin/ls`; do echo 1 $d/cgroup.clone_children done Doing this step alone broke lxc totally for me, with or without the patch below. This was on Fedora 15 testing with lxc 0.7.4.2 as well as Do you have the ns cgroup mounted? The above is only for without ns cgroup. Yes, the ns_cgroup and clone_children are mutually exclusive. If you try to create a namespace with a ns cgroup mounted and the clone_children flag is set. The 'clone' will return EINVAL. I noticed with the patch applied because there was a bug in it which was setting this flag even if the cgroup was ns. It is possible, you ran lxc with the patch, the clone_children was set, and then you remove the patch and try to run lxc again. As the clone_children is still set, that will make impossible to clone a new namespace. With the patch I sent previously on top of Serge's patch, that should be fixed. Ok... Got that an threw it into the mix. I'm still seeing the segfaults on lxc-info. Seems to fault even if I specify the name of a non-existent machine as well. Oh, seems I missed this info :s If you launch lxc-info -n foo, that segfaults ? On what architecture does it fault ? Fedora 15, x86_64. 0.7.4.2 is fine. Just rebuilt a fresh set of packages from GIT sources alone. They're fine and lxc-info from git works. Git patched with Serge's patch. Segfault. Git patched with both patches. Segfault. Here's what gdb has to say about it: [root@forest lxc]# gdb lxc-info core.13546 GNU gdb (GDB) Fedora (7.2.90.20110525-39.fc15) Copyright (C) 2011 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type show copying and show warranty for details. This GDB was configured as x86_64-redhat-linux-gnu. For bug reporting instructions, please see: http://www.gnu.org/software/gdb/bugs/... Reading symbols from /home/mhw/rpmbuild/BUILD/lxc-0.7.4.2git2/src/lxc/lxc-info...done. [New LWP 13546] Core was generated by `./lxc-info -n Alcove'. Program terminated with signal 11, Segmentation fault. #0 lxc_cgroup_path_get (path=0x7fff89b35928, subsystem=0x7fba52a4fdc1 freezer, name=0x7fff89b3987d Alcove) at cgroup.c:326 326 *s = '\0'; (gdb) trace Tracepoint 1 at 0x7fba52a41e74: file cgroup.c, line 326. (gdb) backtrace #0 lxc_cgroup_path_get (path=0x7fff89b35928, subsystem=0x7fba52a4fdc1 freezer, name=0x7fff89b3987d Alcove) at cgroup.c:326 #1 0x7fba52a4a337 in freezer_state (name=optimized out) at state.c:74 #2 0x7fba52a4a42d in lxc_getstate (name=0x7fff89b3987d Alcove) at state.c:130 #3 0x0040068d in main (argc=optimized out, argv=optimized out) at lxc_info.c:63 Ok... That looks like this piece from that patch: +/* + * lxc_cgroup_path_get: put into *path the pathname for + * %subsystem and cgroup %name. If %subsystem is NULL, then + * the first mounted cgroup will be used (for nr_tasks) + */ +int lxc_cgroup_path_get(char **path, const char *subsystem, const char *name) +{ + static charbuf[MAXPATHLEN]; + static charretbuf[MAXPATHLEN]; + + /* what lxc_cgroup_set calls subsystem is actually the filename, i.e. + 'devices.allow'. So for our purposee we trim it */ + if (subsystem) { + snprintf(retbuf, MAXPATHLEN, %s, subsystem); + char *s = index(retbuf, '.'); + *s = '\0'; + DEBUG(%s: called for subsys %s name %s\n, __func__, retbuf, name); } Now wait a minute. Is that a typo here: char *s = index(retbuf, '.'); If you're doing, in effect, a dirname here should that be this: char *s = index(retbuf, '/'); IAC... That *s = '\0'; should include a NULL check. Adding the NULL check and lxc-info works. Looks like that subsystem name in the call to that routine is not what Serge thought it was. I threw a print above the snprintf about just for giggles to print out the subsystem name being passed to it and this is what I got back... [mhw@forest SPECS]$ sudo lxc-info -n Alcove subsystem name: freezer 'Alcove' is RUNNING No wonder s was null. No dot and no /. Regards, Mike -- Michael H. Warfield (AI4NB) | (770) 985-6132 | m...@wittsend.com /\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/ NIC whois: MHW9 | An optimist believes we
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
On Sun, 2011-06-26 at 13:56 -0400, Michael H. Warfield wrote: Looking at the sources and Serge's patch... On Sun, 2011-06-26 at 13:33 -0400, Michael H. Warfield wrote: On Sun, 2011-06-26 at 18:27 +0200, Daniel Lezcano wrote: On 06/26/2011 05:52 PM, Michael H. Warfield wrote: On Sun, 2011-06-26 at 17:27 +0200, Daniel Lezcano wrote: On 06/26/2011 05:06 PM, Serge E. Hallyn wrote: Quoting Michael H. Warfield (m...@wittsend.com): cd /sys/fs/cgroup for d in `/bin/ls`; do echo 1 $d/cgroup.clone_children done Doing this step alone broke lxc totally for me, with or without the patch below. This was on Fedora 15 testing with lxc 0.7.4.2 as well as Do you have the ns cgroup mounted? The above is only for without ns cgroup. Yes, the ns_cgroup and clone_children are mutually exclusive. If you try to create a namespace with a ns cgroup mounted and the clone_children flag is set. The 'clone' will return EINVAL. I noticed with the patch applied because there was a bug in it which was setting this flag even if the cgroup was ns. It is possible, you ran lxc with the patch, the clone_children was set, and then you remove the patch and try to run lxc again. As the clone_children is still set, that will make impossible to clone a new namespace. With the patch I sent previously on top of Serge's patch, that should be fixed. Ok... Got that an threw it into the mix. I'm still seeing the segfaults on lxc-info. Seems to fault even if I specify the name of a non-existent machine as well. Oh, seems I missed this info :s If you launch lxc-info -n foo, that segfaults ? On what architecture does it fault ? Fedora 15, x86_64. 0.7.4.2 is fine. Just rebuilt a fresh set of packages from GIT sources alone. They're fine and lxc-info from git works. Git patched with Serge's patch. Segfault. Git patched with both patches. Segfault. Here's what gdb has to say about it: [root@forest lxc]# gdb lxc-info core.13546 GNU gdb (GDB) Fedora (7.2.90.20110525-39.fc15) Copyright (C) 2011 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type show copying and show warranty for details. This GDB was configured as x86_64-redhat-linux-gnu. For bug reporting instructions, please see: http://www.gnu.org/software/gdb/bugs/... Reading symbols from /home/mhw/rpmbuild/BUILD/lxc-0.7.4.2git2/src/lxc/lxc-info...done. [New LWP 13546] Core was generated by `./lxc-info -n Alcove'. Program terminated with signal 11, Segmentation fault. #0 lxc_cgroup_path_get (path=0x7fff89b35928, subsystem=0x7fba52a4fdc1 freezer, name=0x7fff89b3987d Alcove) at cgroup.c:326 326*s = '\0'; (gdb) trace Tracepoint 1 at 0x7fba52a41e74: file cgroup.c, line 326. (gdb) backtrace #0 lxc_cgroup_path_get (path=0x7fff89b35928, subsystem=0x7fba52a4fdc1 freezer, name=0x7fff89b3987d Alcove) at cgroup.c:326 #1 0x7fba52a4a337 in freezer_state (name=optimized out) at state.c:74 #2 0x7fba52a4a42d in lxc_getstate (name=0x7fff89b3987d Alcove) at state.c:130 #3 0x0040068d in main (argc=optimized out, argv=optimized out) at lxc_info.c:63 Ok... That looks like this piece from that patch: +/* + * lxc_cgroup_path_get: put into *path the pathname for + * %subsystem and cgroup %name. If %subsystem is NULL, then + * the first mounted cgroup will be used (for nr_tasks) + */ +int lxc_cgroup_path_get(char **path, const char *subsystem, const char *name) +{ + static charbuf[MAXPATHLEN]; + static charretbuf[MAXPATHLEN]; + + /* what lxc_cgroup_set calls subsystem is actually the filename, i.e. + 'devices.allow'. So for our purposee we trim it */ + if (subsystem) { + snprintf(retbuf, MAXPATHLEN, %s, subsystem); + char *s = index(retbuf, '.'); + *s = '\0'; + DEBUG(%s: called for subsys %s name %s\n, __func__, retbuf, name); } Now wait a minute. Is that a typo here: char *s = index(retbuf, '.'); If you're doing, in effect, a dirname here should that be this: char *s = index(retbuf, '/'); IAC... That *s = '\0'; should include a NULL check. Adding the NULL check and lxc-info works. Looks like that subsystem name in the call to that routine is not what Serge thought it was. I threw a print above the snprintf about just for giggles to print out the subsystem name being passed to it and this is what I got back... [mhw@forest SPECS]$ sudo lxc-info -n Alcove subsystem name: freezer 'Alcove' is RUNNING No wonder s was null. No dot and no /. I
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
On Sun, 2011-06-26 at 14:00 -0500, Serge E. Hallyn wrote: Quoting Michael H. Warfield (m...@wittsend.com): Thanks, Michael, good catch. Now wait a minute. Is that a typo here: No it's not, but: char *s = index(retbuf, '.'); If you're doing, in effect, a dirname here should that be this: char *s = index(retbuf, '/'); IAC... That *s = '\0'; should include a NULL check. Adding the NULL check and lxc-info works. Looks like that subsystem name in the call to that routine is not what Serge thought it was. I threw a print above the snprintf about just for giggles to print out the subsystem name being passed to it and this is what I got back... [mhw@forest SPECS]$ sudo lxc-info -n Alcove subsystem name: freezer 'Alcove' is RUNNING No wonder s was null. No dot and no /. I applied this patch and it got lxc-info working. But it was a quick hack just to address the NULL pointer. Is it the correct fix? No, it's not. For the calls to this function that come from cgroup.c itself, '.' is the right thing. The problem is that lxc_cgroup_set() and lxc_cgroup_get() pass in things like 'devices.allow'. I was going to make the index conditional, but all the callers of this function pass in either a filename (with a '.' in it) or NULL. I failed to notice these: src/lxc/freezer.c: ret = lxc_cgroup_path_get(nsgroup, freezer, name); src/lxc/state.c:err = lxc_cgroup_path_get(nsgroup, freezer, name); :) These are what you are running into. So the thing to do is leave it searching for index(s, '.') but do nothing if s is NULL. And that's what I believe results with my little hack. Only truncate if there was a hit and s was non-null. I see now from your comments that the check on '.' was correct. I was uncertain about the inputs and outputs in this routine. Checking for the NILL condition may not be the solution, in this case, but it is still a best common practice to check pointers like that. Never can tell what may crop up in the future. Really it would be cleaner to have lxc_cgroup_{sg}et() do the index, so that lxc_cgorup_path_get() always gets a subsystem or NULL. I'm not doing that patch right now, though, trivial as it ought to be. I hear you. thanks, -serge Regards, Mike -- Michael H. Warfield (AI4NB) | (770) 985-6132 | m...@wittsend.com /\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/ NIC whois: MHW9 | An optimist believes we live in the best of all PGP Key: 0x674627FF| possible worlds. A pessimist is sure of it! signature.asc Description: This is a digitally signed message part -- All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
Quoting Michael H. Warfield (m...@wittsend.com): On Sun, 2011-06-26 at 14:00 -0500, Serge E. Hallyn wrote: Quoting Michael H. Warfield (m...@wittsend.com): Thanks, Michael, good catch. Now wait a minute. Is that a typo here: No it's not, but: char *s = index(retbuf, '.'); If you're doing, in effect, a dirname here should that be this: char *s = index(retbuf, '/'); IAC... That *s = '\0'; should include a NULL check. Adding the NULL check and lxc-info works. Looks like that subsystem name in the call to that routine is not what Serge thought it was. I threw a print above the snprintf about just for giggles to print out the subsystem name being passed to it and this is what I got back... [mhw@forest SPECS]$ sudo lxc-info -n Alcove subsystem name: freezer 'Alcove' is RUNNING No wonder s was null. No dot and no /. I applied this patch and it got lxc-info working. But it was a quick hack just to address the NULL pointer. Is it the correct fix? No, it's not. For the calls to this function that come from cgroup.c itself, '.' is the right thing. The problem is that lxc_cgroup_set() and lxc_cgroup_get() pass in things like 'devices.allow'. I was going to make the index conditional, but all the callers of this function pass in either a filename (with a '.' in it) or NULL. I failed to notice these: src/lxc/freezer.c: ret = lxc_cgroup_path_get(nsgroup, freezer, name); src/lxc/state.c:err = lxc_cgroup_path_get(nsgroup, freezer, name); :) These are what you are running into. So the thing to do is leave it searching for index(s, '.') but do nothing if s is NULL. And that's what I believe results with my little hack. Only truncate if Oops, sorry, I didn't look closely enough and assumed your patch was switching to checking for '/'. there was a hit and s was non-null. I see now from your comments that the check on '.' was correct. I was uncertain about the inputs and outputs in this routine. Checking for the NILL condition may not be the solution, in this case, but it is still a best common practice to check pointers like that. Never can tell what may crop up in the future. Really it would be cleaner to have lxc_cgroup_{sg}et() do the index, so that lxc_cgorup_path_get() always gets a subsystem or NULL. I'm not doing that patch right now, though, trivial as it ought to be. I hear you. So Acked-by: Serge Hallyn serge.hal...@ubuntu.com to your patch to fix my bug, and let's leave it at that for now until it gets more testing? Thanks again for testing and looking into it! -serge -- All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2 ___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
On Sat, 2011-06-25 at 20:05 +0200, Daniel Lezcano wrote: On 06/24/2011 07:54 PM, Serge Hallyn wrote: I.e. with systemd or libcgroup. To do this, instead of looking for one cgroup called 'lxc' or otherwise taking the first cgroup we find, we actually create a container in every mounted cgroup fs. Right now it's done under the root of each fs. We may want to put that under lxc, or, better yet, make that configurable. Note the use of clone_children seems not quite right, but that's not for this patch to fix. In particular, if clone_children is not in the mntopts, we reject it. Yet later we try to set it ourselves. I believe we should simply, if ns cgroup is not composed, always try to set clone_children to 1. As it stands, with libcgroup installed, I had to do cd /sys/fs/cgroup for d in `/bin/ls`; do echo 1 $d/cgroup.clone_children done But after that, 'lxc-start -n l1' worked like a charm. It also continues to work with a single mount of cgroups under /cgroup. Signed-off-by: Serge Hallyn serge.hal...@ubuntu.com --- Ok, I installed f15 on a kvm machine. But WTF are all these daemons !! 1 root 20 0 48232 15m 1772 R 66.1 1.5 4:53.82 systemd 19247 dlezcano 20 0 790m 40m 11m R 62.2 4.0 0:46.93 gnome-settings- 19255 root 20 0 104m 27m 3252 R 35.9 2.8 0:44.83 packagekitd 1173 root 20 0 196m 65m 2416 R 30.6 6.5 2:43.70 udisks-daemon ... eating all the cpu and making impossible to use the fedora ? Don't get me started, yeah... systemd and udisks-daemon should not be consuming that much cpu. That is weird. Gnome-settings and packagekitd, I don't know. Why do we need a bloody daemon to check for packages and updates? I just don't know. OTOH... I'm not experiencing that on Forest (my hard-iron F15 host engine). My load average is staying low. So I'm not sure what's going on for your. systemd is looking all the time at /proc/1/mountinfo and reading the content, nice for an init process ! And tomorrow the init process will be firefox ... Hmmm... Something fishy there for sure. Regards, Mike -- Michael H. Warfield (AI4NB) | (770) 985-6132 | m...@wittsend.com /\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/ NIC whois: MHW9 | An optimist believes we live in the best of all PGP Key: 0x674627FF| possible worlds. A pessimist is sure of it! signature.asc Description: This is a digitally signed message part -- All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense.. http://p.sf.net/sfu/splunk-d2d-c1___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
On 06/25/2011 08:19 PM, Michael H. Warfield wrote: On Sat, 2011-06-25 at 20:05 +0200, Daniel Lezcano wrote: On 06/24/2011 07:54 PM, Serge Hallyn wrote: I.e. with systemd or libcgroup. To do this, instead of looking for one cgroup called 'lxc' or otherwise taking the first cgroup we find, we actually create a container in every mounted cgroup fs. Right now it's done under the root of each fs. We may want to put that under lxc, or, better yet, make that configurable. Note the use of clone_children seems not quite right, but that's not for this patch to fix. In particular, if clone_children is not in the mntopts, we reject it. Yet later we try to set it ourselves. I believe we should simply, if ns cgroup is not composed, always try to set clone_children to 1. As it stands, with libcgroup installed, I had to do cd /sys/fs/cgroup for d in `/bin/ls`; do echo 1 $d/cgroup.clone_children done But after that, 'lxc-start -n l1' worked like a charm. It also continues to work with a single mount of cgroups under /cgroup. Signed-off-by: Serge Hallyn serge.hal...@ubuntu.com --- Ok, I installed f15 on a kvm machine. But WTF are all these daemons !! 1 root 20 0 48232 15m 1772 R 66.1 1.5 4:53.82 systemd 19247 dlezcano 20 0 790m 40m 11m R 62.2 4.0 0:46.93 gnome-settings- 19255 root 20 0 104m 27m 3252 R 35.9 2.8 0:44.83 packagekitd 1173 root 20 0 196m 65m 2416 R 30.6 6.5 2:43.70 udisks-daemon ... eating all the cpu and making impossible to use the fedora ? Don't get me started, yeah... systemd and udisks-daemon should not be consuming that much cpu. That is weird. Gnome-settings and packagekitd, I don't know. Why do we need a bloody daemon to check for packages and updates? I just don't know. yep. OTOH... I'm not experiencing that on Forest (my hard-iron F15 host engine). My load average is staying low. So I'm not sure what's going on for your. Since I switched to multi-user.target (aka init 3), things are working much better. -- All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense.. http://p.sf.net/sfu/splunk-d2d-c1 ___ Lxc-users mailing list Lxc-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-users
Re: [Lxc-users] [PATCH 2/2] cgroups: support cgroups mounted in multiple places
On Fri, 2011-06-24 at 12:54 -0500, Serge Hallyn wrote: I.e. with systemd or libcgroup. To do this, instead of looking for one cgroup called 'lxc' or otherwise taking the first cgroup we find, we actually create a container in every mounted cgroup fs. Right now it's done under the root of each fs. We may want to put that under lxc, or, better yet, make that configurable. Yeah, looking at my system and just looking for the devices.* entries, I see some interesting stuff there that libvirt is doing... [root@forest mhw]# find /sys/fs/cgroup -name devices\* /sys/fs/cgroup/devices /sys/fs/cgroup/devices/libvirt/lxc/devices.list /sys/fs/cgroup/devices/libvirt/lxc/devices.deny /sys/fs/cgroup/devices/libvirt/lxc/devices.allow /sys/fs/cgroup/devices/libvirt/qemu/devices.list /sys/fs/cgroup/devices/libvirt/qemu/devices.deny /sys/fs/cgroup/devices/libvirt/qemu/devices.allow /sys/fs/cgroup/devices/libvirt/devices.list /sys/fs/cgroup/devices/libvirt/devices.deny /sys/fs/cgroup/devices/libvirt/devices.allow /sys/fs/cgroup/devices/devices.list /sys/fs/cgroup/devices/devices.deny /sys/fs/cgroup/devices/devices.allow Granted that the lxc under libvert is not this lxc project and they have their own xml based stuff, still looks like they're stuffing things under another directory. Good question there. Note the use of clone_children seems not quite right, but that's not for this patch to fix. In particular, if clone_children is not in the mntopts, we reject it. Yet later we try to set it ourselves. I believe we should simply, if ns cgroup is not composed, always try to set clone_children to 1. As it stands, with libcgroup installed, I had to do cd /sys/fs/cgroup for d in `/bin/ls`; do echo 1 $d/cgroup.clone_children done But after that, 'lxc-start -n l1' worked like a charm. It also continues to work with a single mount of cgroups under /cgroup. Cool. I'll check this out. Haven't had much time for coding, here the last couple of weeks, but I can at least do some testing. Signed-off-by: Serge Hallyn serge.hal...@ubuntu.com --- src/lxc/cgroup.c | 207 + src/lxc/cgroup.h |2 +- src/lxc/freezer.c |2 +- src/lxc/lxc.h |8 +- src/lxc/state.c |2 +- 5 files changed, 135 insertions(+), 86 deletions(-) diff --git a/src/lxc/cgroup.c b/src/lxc/cgroup.c index a068a01..ecba56e 100644 --- a/src/lxc/cgroup.c +++ b/src/lxc/cgroup.c @@ -52,11 +52,10 @@ enum { CGROUP_CLONE_CHILDREN, }; -static int get_cgroup_mount(const char *mtab, char *mnt) +static int get_cgroup_mount(const char *mtab, const char *subsystem, char *mnt) { struct mntent *mntent; FILE *file = NULL; -int err = -1; file = setmntent(mtab, r); if (!file) { @@ -66,29 +65,24 @@ static int get_cgroup_mount(const char *mtab, char *mnt) while ((mntent = getmntent(file))) { - /* there is a cgroup mounted named lxc */ - if (!strcmp(mntent-mnt_fsname, lxc) - !strcmp(mntent-mnt_type, cgroup)) { - strcpy(mnt, mntent-mnt_dir); - err = 0; - break; - } - - /* fallback to the first non-lxc cgroup found */ -if (!strcmp(mntent-mnt_type, cgroup) err) { +if (strcmp(mntent-mnt_type, cgroup)) + continue; + if (!subsystem || hasmntopt(mntent, subsystem)) { strcpy(mnt, mntent-mnt_dir); - err = 0; + fclose(file); + DEBUG(using cgroup mounted at '%s', mnt); + return 0; } }; - DEBUG(using cgroup mounted at '%s', mnt); + DEBUG(Failed to find cgroup for %s\n, subsystem ? subsystem : (NULL)); fclose(file); -return err; +return -1; } -static int get_cgroup_flags(const char *mtab, int *flags) +static int get_cgroup_flags(const char *mtab, const char *mnt_dir, int *flags) { struct mntent *mntent; FILE *file = NULL; @@ -103,38 +97,24 @@ static int get_cgroup_flags(const char *mtab, int *flags) *flags = 0; while ((mntent = getmntent(file))) { - - /* there is a cgroup mounted named lxc */ - if (!strcmp(mntent-mnt_fsname, lxc) - !strcmp(mntent-mnt_type, cgroup)) { - - if (hasmntopt(mntent, ns)) - *flags |= CGROUP_NS_CGROUP; - - if (hasmntopt(mntent, clone_children)) - *flags |= CGROUP_CLONE_CHILDREN; - + if (strcmp(mntent-mnt_type, cgroup)) + continue; + if (strcmp(mntent-mnt_dir, mnt_dir)) + continue; + if (hasmntopt(mntent, ns)) {