Re: [Lxc-users] can't remove cgroup

2011-06-17 Thread Serge Hallyn
Quoting Brian K. White (br...@aljex.com):
 On 6/16/2011 3:26 PM, Serge Hallyn wrote:
  Quoting Brian K. White (br...@aljex.com):
  I thought we killed this problem?
  ...
  nj12:~ # rm -rf /sys/fs/cgroup/vps001
 
  rmdir
 
 
 Did that too. no joy.
 
 In fact I did both the main directory and several runs of find|xargs to 
 delete files and directories using rm -f , rm -rf and rmdir.
 I'll have to wait for it to happen again to diagnose what the problem 
 was. I had to reboot the host because I needed that vm back up.
 
 I'm guessing the developer was doing something I didn't expect within 
 the vm, besides the use of the reboot command, to tie up the context 
 group even after all processes went away.

Or maybe, if you don't have a release agent set, he just ran something
like vsftpd which created new cgroups by cloning?

-serge

--
EditLive Enterprise is the world's most technically advanced content
authoring tool. Experience the power of Track Changes, Inline Image
Editing and ensure content is compliant with Accessibility Checking.
http://p.sf.net/sfu/ephox-dev2dev
___
Lxc-users mailing list
Lxc-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-users


Re: [Lxc-users] can't remove cgroup

2011-06-17 Thread Brian K. White
On 6/17/2011 12:06 PM, Serge Hallyn wrote:
 Quoting Brian K. White (br...@aljex.com):
 On 6/16/2011 3:26 PM, Serge Hallyn wrote:
 Quoting Brian K. White (br...@aljex.com):
 I thought we killed this problem?
 ...
 nj12:~ # rm -rf /sys/fs/cgroup/vps001

 rmdir


 Did that too. no joy.

 In fact I did both the main directory and several runs of find|xargs to
 delete files and directories using rm -f , rm -rf and rmdir.
 I'll have to wait for it to happen again to diagnose what the problem
 was. I had to reboot the host because I needed that vm back up.

 I'm guessing the developer was doing something I didn't expect within
 the vm, besides the use of the reboot command, to tie up the context
 group even after all processes went away.

 Or maybe, if you don't have a release agent set, he just ran something
 like vsftpd which created new cgroups by cloning?

 -serge


I do have a release agent, and I usually include the vsftpd config 
options that disable namespace usage as part of my recipe for setting up 
all systems. However, I did not do most of the setup of these particular 
VMs. I'm trying to get one of my people up to speed so they can do it, 
so I intentionally stayed away.

It's entirely possible the special vsftpd config either didn't get done, 
or got lost in an in-place full distribution upgrade that was done from 
within the VM.

... aha, just checked. An old version of my template vsftpd config was 
used which did not yet have the namespace options.

I will add them and test! (as well as update the source of the template 
config obviously)
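
The thread does not name the options, so the following is an assumption: in recent vsftpd releases the namespace-related settings are `isolate` (CLONE_NEWPID and CLONE_NEWIPC) and `isolate_network` (CLONE_NEWNET), and both default to YES. A minimal sketch of a check for a template config follows. The helper name `check_vsftpd_ns` is made up for illustration, and the config path varies by distribution.

```shell
# Hypothetical helper, not part of vsftpd or lxc.
# Succeeds only if both namespace options are explicitly set to NO
# in the given vsftpd config file. Only the literal spelling
# "option=NO" is recognized by this sketch.
check_vsftpd_ns() {
    conf=$1
    rc=0
    for opt in isolate isolate_network; do
        # -x matches whole lines, so "isolate=NO" cannot be
        # satisfied by an "isolate_network=NO" line.
        if grep -qx "${opt}=NO" "$conf"; then
            echo "$opt: disabled"
        else
            echo "$opt: not disabled (assumed default is YES)"
            rc=1
        fi
    done
    return $rc
}
```

Usage would be something like `check_vsftpd_ns /etc/vsftpd.conf` inside each container, or against the template before it is copied.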

Thank you. Even if this doesn't turn out to be the culprit of this 
incident, it's still a hole I missed.

-- 
bkw



Re: [Lxc-users] can't remove cgroup

2011-06-17 Thread Serge Hallyn
Quoting Brian K. White (br...@aljex.com):
 On 6/17/2011 12:06 PM, Serge Hallyn wrote:
  Quoting Brian K. White (br...@aljex.com):
  On 6/16/2011 3:26 PM, Serge Hallyn wrote:
  Quoting Brian K. White (br...@aljex.com):
  I thought we killed this problem?
  ...
  nj12:~ # rm -rf /sys/fs/cgroup/vps001
 
  rmdir
 
 
  Did that too. no joy.
 
  In fact I did both the main directory and several runs of find|xargs to
  delete files and directories using rm -f , rm -rf and rmdir.
  I'll have to wait for it to happen again to diagnose what the problem
  was. I had to reboot the host because I needed that vm back up.
 
  I'm guessing the developer was doing something I didn't expect within
  the vm, besides the use of the reboot command, to tie up the context
  group even after all processes went away.
 
  Or maybe, if you don't have a release agent set, he just ran something
  like vsftpd which created new cgroups by cloning?
 
  -serge
 
 
 I do have a release agent, and I usually have the required vsftpd config 
 options to disable namespace usage as part of my recipe for setting up 
 all systems, but I did not do most of the setup of these particular 
 vm's, I'm trying to get one of my people up to speed so they can do it 
 so I intentionally stayed away.
 
 It's entirely possible the special vsftpd config either didn't get done, 
 or got lost in a full distribution version in-place upgrade that was 
 done from within the vm.
 
 ... aha, just checked. An old version of my template vsftpd config was 
 used which did not yet have the namespace options.
 
 I will add them and test! (as well as update the source of the template 
 config obviously)
 
 Thank you even if this doesn't turn out to be the culprit of this 
 incident, it's still a hole I missed.

Hm, if you have release agents then that shouldn't be the problem,
unless there was a client still connected to one of those vsftpd
servers (which I think you've said was not the case).
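
One way to confirm that a release agent is really in effect for a given container: in a cgroup v1 hierarchy, `release_agent` lives at the root of the mount and holds the path of the program the kernel runs when a cgroup becomes empty, while each cgroup has its own `notify_on_release` flag (1 enables the notification, and new child cgroups inherit the parent's value at creation). A rough sketch, assuming the mount point from the original report. The function name is hypothetical.

```shell
# Hypothetical helper, not part of lxc.
# $1 = mount point of the cgroup hierarchy (e.g. /sys/fs/cgroup)
# $2 = cgroup name relative to it (e.g. vps001)
# Prints both settings and succeeds only if an agent is configured
# and notification is enabled for that cgroup.
cgroup_release_status() {
    root=$1
    cg=$1/$2
    agent=$(cat "$root/release_agent" 2>/dev/null)
    notify=$(cat "$cg/notify_on_release" 2>/dev/null)
    echo "release_agent=${agent:-<unset>}"
    echo "notify_on_release=${notify:-<unreadable>}"
    [ -n "$agent" ] && [ "$notify" = "1" ]
}
```

For example, `cgroup_release_status /sys/fs/cgroup vps001` run on the host while the container is up.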

-serge



[Lxc-users] can't remove cgroup

2011-06-16 Thread Brian K. White
I thought we killed this problem?

nj12:~ # lxc-start -n vps001 -f /etc/lxc/vps001/config
lxc-start: Device or resource busy - failed to remove previous cgroup '/sys/fs/cgroup/vps001'
lxc-start: failed to spawn 'vps001'
lxc-start: Device or resource busy - failed to remove cgroup '/sys/fs/cgroup/vps001'

nj12:~ # lxc-ps auxwww |grep vps001
root      9307  0.0  0.0   7668   808 pts/0    S+   14:06   0:00 grep vps001

nj12:~ # lxc-info -n vps001
'vps001' is STOPPED

nj12:~ # lxc-destroy -n vps001
'vps001' does not exist

nj12:~ # mount |grep cgroup
cgroup on /sys/fs/cgroup type cgroup (rw)

nj12:~ # rm -rf /sys/fs/cgroup/vps001
rm: cannot remove `/sys/fs/cgroup/vps001/30149/cpuset.memory_spread_slab': Operation not permitted
rm: cannot remove `/sys/fs/cgroup/vps001/30149/cpuset.memory_spread_page': Operation not permitted
[...]
rm: cannot remove `/sys/fs/cgroup/vps001/cgroup.procs': Operation not permitted
rm: cannot remove `/sys/fs/cgroup/vps001/tasks': Operation not permitted
nj12:~ #

The directories and files still exist, so the usual "just ignore the 
error" advice doesn't apply here. What happened was that the user issued 
the reboot command from within the container. In my own testing I had 
only ever used shutdown -r now, which worked fine.

This is lxc 0.7.4.2 on kernel 2.6.39

How can I clear this cgroup? How can I even tell if there are really any 
processes holding it open if lxc-ps shows none?
How can I restart this container other than by editing the start script 
to use a different cgroup name or restarting the entire host?
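
On the question of what is holding the cgroup: in a cgroup v1 hierarchy, each directory lists its attached PIDs in its `tasks` file, and the kernel only allows a cgroup to be removed with rmdir once it has no tasks and no child cgroups. The control files themselves can never be unlinked, which is why rm -rf reports "Operation not permitted". The nested `30149` directory in the output above looks like a child cgroup, and it would have to go before `vps001` can. A minimal sketch under those assumptions follows. Both function names are made up for illustration.

```shell
# Hypothetical helpers, not part of lxc.

# Print "<cgroup dir>: <pid>" for every task still attached anywhere
# under the given cgroup directory, child cgroups included.
cgroup_list_tasks() {
    find "$1" -type f -name tasks | while read -r f; do
        while read -r pid; do
            if [ -n "$pid" ]; then
                echo "$(dirname "$f"): $pid"
            fi
        done < "$f"
    done
}

# Remove a cgroup tree bottom-up using rmdir only (never rm).
# -depth makes find visit children before their parent, so nested
# cgroups are removed first. On a real cgroup mount, rmdir fails
# with "Device or resource busy" for any cgroup that still has tasks.
cgroup_remove_tree() {
    find "$1" -depth -type d -exec rmdir {} \;
}
```

Typical use would be `cgroup_list_tasks /sys/fs/cgroup/vps001` to see whether anything is still attached, followed by `cgroup_remove_tree /sys/fs/cgroup/vps001` once the list is empty.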

-- 
bkw



Re: [Lxc-users] can't remove cgroup

2011-06-16 Thread Brian K. White
On 6/16/2011 3:26 PM, Serge Hallyn wrote:
 Quoting Brian K. White (br...@aljex.com):
 I thought we killed this problem?
 ...
 nj12:~ # rm -rf /sys/fs/cgroup/vps001

 rmdir


Did that too. No joy.

In fact I did both the main directory and several runs of find|xargs to 
delete files and directories using rm -f , rm -rf and rmdir.
I'll have to wait for it to happen again to diagnose what the problem 
was. I had to reboot the host because I needed that vm back up.

I'm guessing the developer was doing something I didn't expect within 
the VM, besides the use of the reboot command, to tie up the control 
group even after all processes went away.

-- 
bkw
