On 2013/5/24 20:49, Serge Hallyn wrote:
> Quoting Qiang Huang (h.huangqi...@huawei.com):
>> Hi,
>>
>> I found a tricky problem in LXC, once I made a mistake in config, set
>>
>> lxc.cgroup.cpuset.cpus = -1
>>
>> ofcourse start would fail, but then "lxc-ls --active" showed the container
>> is active.
>>
>> error message is:
>> # lxc-start -n hq111 -f config_hq -l TRACE
>> lxc-start: Invalid argument - write /cgroup/lxc/hq111/cpuset.cpus : Invalid argument
>> lxc-start: Error setting cpuset.cpus to -1 for lxc/hq111
>>
>> lxc-start: failed to setup the cgroups for 'hq111'
>> lxc-start: failed to spawn 'hq111'
>> lxc-start: Device or resource busy - failed to remove cgroup '/cgroup/lxc/hq111'
>>
>>
>> This is not hard to reproduce, just keep trying, not stable though.
>> Then I read through the code and figured recursive_rmdir() failed, rmdir() return
>> -1 sometimes, any idea how to fix this?
>
> Could you tell us exactly which version this is, and exactly how you
> created the container? When I do it in ubuntu saucy (roughly 0.9.0 lxc),
> the cgroup gets correctly removed.
>
Hi Serge,

I think I have found the reason: when setup_cgroup() fails, the child
process may still exist when the parent tries to destroy the cgroup.
(We have no sync mechanism to ensure the child exits before the parent
when something goes wrong.)

commit 6031a6e5f939bda07d98768d34dafae677a7dfeb
Author: Dwight Engen <dwight.en...@oracle.com>
Date:   Wed May 15 12:27:34 2013 -0400

    set non device cgroup items before the cgroup is entered

    This allows some special cgroup items such as memory.kmem.limit_in_bytes
    to be successfully set, since they must be set before any task is put into
    the cgroup. The devices cgroup is setup later giving the container a chance
    to mount file systems before the device it might want to mount from becomes
    unavailable.

    Signed-off-by: Dwight Engen <dwight.en...@oracle.com>
    Signed-off-by: Serge Hallyn <serge.hal...@ubuntu.com>

This patch moved setup_cgroup() before lxc_cgroup_enter(), so when
setup_cgroup() fails there is no task in the cgroup yet, and removing
the cgroup does not fail.

So my problem no longer exists on the latest code, but there are still
potential problems if we don't ensure the child exits before the parent.
Michael's problem, for example, might also be caused by this.
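
To make the ordering issue concrete, here is a minimal sketch of the teardown order I have in mind. It is not LXC code: start_container() and its setup_ok flag are hypothetical names, and a plain temporary directory stands in for the cgroup (a real cgroup would return EBUSY from rmdir() while a task is still attached, which a plain directory will not). The only point is that the parent waits for the child to exit before it calls rmdir().

```python
import os
import signal
import tempfile


def start_container(cgroup_dir, setup_ok):
    """Fork a child, then tear down in a safe order if setup fails.

    Hypothetical stand-in for the start path: returns True if the
    "container" started, False if setup failed and cgroup_dir was removed.
    """
    # The pipe only keeps the child blocked until the parent decides its fate.
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: wait for the parent's go-ahead (or EOF), then exit.
        os.close(write_fd)
        os.read(read_fd, 1)
        os._exit(0)

    os.close(read_fd)
    if not setup_ok:
        # Setup failed: kill the child and wait for it to exit *before*
        # touching the directory. With a real cgroup, skipping waitpid()
        # leaves a window where the child is still attached and rmdir()
        # can fail with EBUSY.
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
        os.close(write_fd)
        os.rmdir(cgroup_dir)
        return False

    # Setup succeeded: release the child and reap it when it exits.
    os.write(write_fd, b"x")
    os.close(write_fd)
    os.waitpid(pid, 0)
    return True
```

Calling start_container(tempfile.mkdtemp(), setup_ok=False) removes the directory only after the child has been reaped, which is the guarantee the current error path does not give us.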