On 07/16/2013 12:27 AM, Marcus Sorensen wrote:
I'm OK with a symptom fix on our end; if the root cause is in
libvirt, we can't do much about that. This is the sort of patch that
tends to get pulled into the regular update cycle of the
distributions, so unless there's more to it or it turns out not to be
a good fix, I imagine we will see it come through without having to
wait for the next point releases. We still have to support existing
users who might not be running the latest version, though, so the
symptom fix is probably OK as a temporary measure.

I'm ok with not calling storagePoolRefresh every time we want a capacity update, since that's also kind of I/O intensive for larger storage arrays.

However, we should make sure we have a GOOD comment in the code about this "fix", since that's the reason I initially removed the old code which invoked "df".

I'll see if I can get this libvirt patch into Ubuntu when it hits libvirt upstream, since this bug is really annoying.

Wido


On Mon, Jul 15, 2013 at 3:42 PM, Edison Su <edison...@citrix.com> wrote:
There is a serious issue on KVM 
(https://issues.apache.org/jira/browse/CLOUDSTACK-2729): a libvirt storage 
pool can disappear on a KVM host. It is easy to reproduce in our internal QA 
environment.
Wei found the root cause, which is in libvirt:
"
This is a libvirt issue. I created a ticket for it.
https://bugzilla.redhat.com/show_bug.cgi?id=977706
The patch is very simple.
https://www.redhat.com/archives/libvir-list/2013-July/msg00635.html
"
But CloudStack also contributes to it, because CloudStack calls the libvirt 
storage pool refresh method each time it accesses the storage pool. That code 
was added by commit 2ffc9907f7b0d371737e39b7649f7af23026f5cf, less than one 
year ago.

As Wei suggested, we can call storage pool refresh only when needed. That will 
mitigate the issue (it is the behavior I implemented in CloudStack pre-4.0), 
but it only treats the symptom, not the cause.
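
To make the "refresh only when needed" idea concrete, here is a minimal 
sketch. It assumes "needed" means a volume lookup missed; the thread does not 
spell out the exact condition. FakePool, VolumeNotFound and lookup_volume are 
hypothetical names for illustration, not CloudStack or libvirt code.

```python
class VolumeNotFound(Exception):
    """Raised by the hypothetical pool wrapper when a volume is unknown."""


def lookup_volume(pool, name):
    """Look up a volume, refreshing the pool only when the lookup misses."""
    try:
        return pool.lookup(name)
    except VolumeNotFound:
        # The pool's cached view may be stale: refresh once, then retry.
        pool.refresh()
        return pool.lookup(name)


class FakePool:
    """In-memory stand-in for a libvirt storage pool (illustration only)."""

    def __init__(self, on_disk):
        self.on_disk = set(on_disk)  # volumes that really exist on storage
        self.cached = set()          # volumes the pool currently knows about
        self.refresh_count = 0

    def lookup(self, name):
        if name not in self.cached:
            raise VolumeNotFound(name)
        return name

    def refresh(self):
        self.refresh_count += 1
        self.cached = set(self.on_disk)


pool = FakePool(["vol-a", "vol-b"])
lookup_volume(pool, "vol-a")  # cache miss: triggers one refresh
lookup_volume(pool, "vol-b")  # cache hit: no refresh
```

With this shape the refresh cost is paid only on a miss, so routine capacity 
queries and lookups of known volumes never touch the refresh path.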
Alternatively, we could add a cluster-wide lock so that only one host can 
access the storage pool at a time, for example a file lock on the NFS primary 
storage.
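
A rough sketch of the file-lock idea follows. It assumes a Linux host and an 
NFS mount that supports POSIX byte-range locks. The pool_lock helper and the 
lock file name are hypothetical, and a temporary directory stands in for the 
primary storage mount point.

```python
import contextlib
import fcntl
import os
import tempfile


@contextlib.contextmanager
def pool_lock(lock_path):
    """Hold an exclusive POSIX lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until no other process holds it
        try:
            yield
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


# Stand-in for the NFS primary storage mount point (assumption).
mount_point = tempfile.mkdtemp()
lock_path = os.path.join(mount_point, ".pool.lock")

actions = []
with pool_lock(lock_path):
    # The storage pool refresh would go here; only one process at a time
    # gets past the lock.
    actions.append("refresh")
```

POSIX locks are held per process, so this serializes separate processes or 
hosts but not threads inside one agent process; those would need an additional 
in-process lock.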
Any ideas or feedback on how to fix this KVM issue?



