Hi,
So, I just created CLOUDSTACK-2893, but Wei Zhou mentioned that there
are some related issues:
* CLOUDSTACK-2729
* CLOUDSTACK-2780
I restarted my Agent and the issue described in 2893 went away, but I'm
wondering how that happened.
Anyway, after digging further I found that I have some "orphaned" storage
pools; by that I mean they are mounted and in use, but not defined
nor active in libvirt:
root@n02:~# lsof |grep "\.iso"|awk '{print $9}'|cut -d '/' -f 3|sort -n|uniq
eb3cd8fd-a462-35b9-882a-f4b9f2f4a84c
f84e51ab-d203-3114-b581-247b81b7d2c1
fd968b03-bd11-3179-a2b3-73def7c66c68
7ceb73e5-5ab1-3862-ad6e-52cb986aff0d
7dc0149e-0281-3353-91eb-4589ef2b1ec1
8e005344-6a65-3802-ab36-31befc95abf3
88ddd8f5-e6c7-3f3d-bef2-eea8f33aa593
765e63d7-e9f9-3203-bf4f-e55f83fe9177
1287a27d-0383-3f5a-84aa-61211621d451
98622150-41b2-3ba3-9c9c-09e3b6a2da03
root@n02:~#
Looking at libvirt:
root@n02:~# virsh pool-list
Name                                   State    Autostart
---------------------------------------------------------
52801816-fe44-3a2b-a147-bb768eeea295   active   no
7ceb73e5-5ab1-3862-ad6e-52cb986aff0d   active   no
88ddd8f5-e6c7-3f3d-bef2-eea8f33aa593   active   no
a83d1100-4ffa-432a-8467-4dc266c4b0c8   active   no
fd968b03-bd11-3179-a2b3-73def7c66c68   active   no
root@n02:~#
What happens here is that the mountpoints are in use (an ISO attached to
an Instance) but there is no corresponding storage pool in libvirt.
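For what it's worth, a quick way to list these orphans (assuming, as on
my hosts, that the pools are mounted under /mnt/<uuid>) is to compare
the mounts against what libvirt knows:

for dir in $(mount | awk '$3 ~ /^\/mnt\// {print $3}'); do
    # the pool name equals the mountpoint's UUID, so basename gives the pool
    virsh pool-info "$(basename $dir)" >/dev/null 2>&1 || echo "orphan: $dir"
done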
This means that when you try to deploy a second VM with the same ISO,
libvirt will error out: the Agent tries to define and start a new
storage pool, which fails because the mountpoint is already in use.
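You can trigger the same failure by hand, roughly like this (the NFS
host and export are made up, the UUID is one of the orphans above):

virsh pool-define-as 1287a27d-0383-3f5a-84aa-61211621d451 netfs \
    --source-host nfs.example.com --source-path /export/primary \
    --target /mnt/1287a27d-0383-3f5a-84aa-61211621d451
virsh pool-start 1287a27d-0383-3f5a-84aa-61211621d451
# pool-start makes libvirt run mount(8) on the target, which fails
# here because the mountpoint is already in use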
The remedy would be to put the hypervisor into maintenance, reboot it
completely and migrate Instances to it again.
In libvirt there is no way to start an NFS storage pool without libvirt
mounting it.
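For reference, the netfs pool definition looks roughly like this (host
and export path made up again); mounting <target> is part of what
"starting" the pool means, so there is no flag to skip it:

<pool type='netfs'>
  <name>1287a27d-0383-3f5a-84aa-61211621d451</name>
  <source>
    <host name='nfs.example.com'/>
    <dir path='/export/primary'/>
  </source>
  <target>
    <path>/mnt/1287a27d-0383-3f5a-84aa-61211621d451</path>
  </target>
</pool>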
Any suggestions on how we can work around this code-wise?
For my issue I'm writing a patch that adds some more debug lines to
show what the Agent is doing, but it's kind of weird that we got into
this "disconnected" state in the first place.
Wido