Wido,

You applied the libvirt patch on your production system, and the issue disappeared, right? If so, that is good. I expect the Red Hat community will accept the patch (or the v2: https://www.redhat.com/archives/libvir-list/2013-July/msg00639.html) ASAP.
-Wei

2013/8/2 Wido den Hollander <w...@widodh.nl>

> On 08/02/2013 01:55 AM, Edison Su wrote:
>
>> Hi Wei, regarding bug CLOUDSTACK-2729, I removed storage.refresh
>> during getStoragePool in LibvirtStorageAdaptor, but the issue still
>> happened in BVT.
>> I am thinking of adding a file lock on primary storage. It seems you
>> already have the patch; could you share it with us?
>
> FYI, I fixed this by patching libvirt on our production systems rather
> than fixing the CloudStack agent.
>
> It's just one very small patch:
> https://bugzilla.redhat.com/show_bug.cgi?id=977706
>
> https://www.redhat.com/archives/libvir-list/2013-July/msg00635.html
>
> Wido
>
>>> -----Original Message-----
>>> From: Wei ZHOU [mailto:ustcweiz...@gmail.com]
>>> Sent: Tuesday, July 16, 2013 3:35 AM
>>> To: dev@cloudstack.apache.org
>>> Subject: Re: How to fix libvirt storage pool refresh issue?
>>>
>>> I agree with Wido.
>>>
>>> Moreover, the file lock will degrade the performance of VM deployment.
>>>
>>> -Wei
>>>
>>> 2013/7/16 Wido den Hollander <w...@widodh.nl>
>>>
>>>> On 07/16/2013 12:27 AM, Marcus Sorensen wrote:
>>>>
>>>>> I'm ok with a symptom fix on our end, if the root cause is in
>>>>> Libvirt we can't do much about that. This is the sort of patch that
>>>>> tends to get pulled into the regular update cycle of the
>>>>> distributions, so unless there's more to it and it's not a good fix I
>>>>> imagine we will see it come through without having to wait for the
>>>>> next point releases. We still have to support existing users who
>>>>> might not be running the latest, though, so the symptom fix is
>>>>> probably ok as a temporary measure.
>>>>
>>>> I'm ok with not calling storagePoolRefresh every time we want a
>>>> capacity update, since that's also kind of I/O intensive for larger
>>>> storage arrays.
>>>>
>>>> However, we should make sure we have a GOOD comment in the code about
>>>> this "fix", since that's the reason I initially removed the old code
>>>> which invoked "df".
>>>>
>>>> I'll see if I can get this libvirt patch into Ubuntu when it hits
>>>> libvirt upstream, since this bug is really annoying.
>>>>
>>>> Wido
>>>>
>>>>> On Mon, Jul 15, 2013 at 3:42 PM, Edison Su <edison...@citrix.com> wrote:
>>>>>
>>>>>> There is a serious issue on KVM
>>>>>> (https://issues.apache.org/jira/browse/CLOUDSTACK-2729):
>>>>>> a libvirt storage pool can disappear on a KVM host, and it is easy
>>>>>> to reproduce in our internal QA environment.
>>>>>> Wei found that the root cause is in libvirt:
>>>>>> "
>>>>>> This is a libvirt issue. I created a ticket for it.
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=977706
>>>>>> The patch is very simple.
>>>>>> https://www.redhat.com/archives/libvir-list/2013-July/msg00635.html
>>>>>> "
>>>>>> But it is also triggered by CloudStack, as CloudStack calls the
>>>>>> libvirt storage pool refresh method each time it accesses the
>>>>>> storage pool. The code was added by commit
>>>>>> 2ffc9907f7b0d371737e39b7649f7af23026f5cf,
>>>>>> less than one year ago.
>>>>>>
>>>>>> As Wei suggested, we can call storage pool refresh only if needed.
>>>>>> That will mitigate the issue (it's the behavior I implemented in
>>>>>> CloudStack pre-4.0), but it only treats the symptom, not the cause.
>>>>>> Or we can add a cluster-wide lock so that only one caller can access
>>>>>> the storage pool at a time; we could add a file lock on NFS primary
>>>>>> storage.
>>>>>> Any ideas/feedback on how to fix this KVM issue?
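For reference, here is a minimal sketch of one way to read the "call storage pool refresh only if needed" idea quoted above. It is not the CloudStack agent code. The class name RefreshThrottle, the method shouldRefresh, the minimum interval, and the force flag are all hypothetical names introduced for illustration. The sketch only decides whether a refresh is due; the actual libvirt call is left as a comment.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of CloudStack or libvirt): decides whether
 * a storage pool refresh is due, so callers can skip the refresh on most
 * accesses instead of refreshing on every access.
 */
final class RefreshThrottle {
    private final long minIntervalMillis;
    private final LongSupplier clock;   // injectable clock, e.g. System::currentTimeMillis
    private long lastRefreshMillis;
    private boolean refreshedOnce = false;

    RefreshThrottle(long minIntervalMillis, LongSupplier clock) {
        this.minIntervalMillis = minIntervalMillis;
        this.clock = clock;
    }

    /**
     * Returns true if the caller should refresh the pool now.
     * A refresh is due on the first call, when forced (for example because
     * an expected volume was not found), or once the minimum interval has
     * elapsed since the last refresh.
     */
    synchronized boolean shouldRefresh(boolean force) {
        long now = clock.getAsLong();
        boolean due = !refreshedOnce || (now - lastRefreshMillis) >= minIntervalMillis;
        if (force || due) {
            lastRefreshMillis = now;
            refreshedOnce = true;
            return true;
        }
        return false;
    }
}

// Assumed usage inside an agent (the refresh call itself is not shown here):
//   RefreshThrottle throttle = new RefreshThrottle(60_000L, System::currentTimeMillis);
//   if (throttle.shouldRefresh(false)) { /* refresh the libvirt storage pool */ }
```

As noted in the thread, this kind of change only reduces how often the refresh runs; it treats the symptom and does not replace the libvirt fix.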