Edison,

Please review the patch: https://reviews.apache.org/r/13223/

-Wei


2013/8/2 Wei ZHOU <[email protected]>

> Alex,
>
> Exactly.
>
> We can also use Enable Maintenance -> umount nfs point -> restart
> cloudstack-agent -> Cancel Maintenance to solve this issue.
>
> -Wei
>
>
> 2013/8/2 Alex Huang <[email protected]>
>
>> So I have very limited knowledge on KVM.  But, from my understanding from
>> Edison, we should consider what has to be done to fix this problem once it
>> occurs.
>>
>> - Shut down all VMs on all hosts that are affected.
>> - umount the nfs mount point
>> - Reestablish the storage pool.
>> - Restart the VMs.
>>
>> Given how severe these actions are to the end user, I would vote for the
>> file lock to ensure it never happens, even if it's slower.
>>
>> --Alex
>>
>> > -----Original Message-----
>> > From: Wei ZHOU [mailto:[email protected]]
>> > Sent: Tuesday, July 16, 2013 3:35 AM
>> > To: [email protected]
>> > Subject: Re: How to fix libvirt storage pool refresh issue?
>> >
>> > I agree with Wido.
>> >
>> > Moreover, the file lock will degrade the performance of VM deployment.
>> >
>> > -Wei
>> >
>> >
>> > 2013/7/16 Wido den Hollander <[email protected]>
>> >
>> > > On 07/16/2013 12:27 AM, Marcus Sorensen wrote:
>> > >
>> > >>     I'm ok with a symptom fix on our end, if the root cause is in
>> > >> Libvirt we can't do much about that. This is the sort of patch that
>> > >> tends to get pulled into the regular update cycle of the
>> > >> distributions, so unless there's more to it and it's not a good fix I
>> > >> imagine we will see it come through without having to wait for the
>> > >> next point releases. We still have to support existing users who
>> > >> might not be running the latest, though, so the symptom fix is
>> > >> probably ok as a temporary measure.
>> > >>
>> > >
>> > > I'm ok with not calling storagePoolRefresh every time we want a
>> > > capacity update, since that's also kind of I/O intensive for larger
>> > > storage arrays.
>> > >
>> > > However, we should make sure we have a GOOD comment in the code about
>> > > this "fix", since that's the reason I initially removed the old code
>> > > which invoked "df".
>> > >
>> > > I'll see if I can get this libvirt patch into Ubuntu when it hits
>> > > libvirt upstream, since this bug is really annoying.
>> > >
>> > > Wido
>> > >
>> > >
>> > >
>> > >> On Mon, Jul 15, 2013 at 3:42 PM, Edison Su <[email protected]> wrote:
>> > >>
>> > >>> There is a serious issue on KVM
>> > >>> (https://issues.apache.org/jira/browse/CLOUDSTACK-2729):
>> > >>> a libvirt storage pool can disappear on a KVM host, and it is easy
>> > >>> to reproduce in our internal QA environment.
>> > >>> Wei found that the root cause is in libvirt:
>> > >>> "
>> > >>> This is a libvirt issue. I created a ticket for it.
>> > >>> https://bugzilla.redhat.com/show_bug.cgi?id=977706
>> > >>> The patch is very simple.
>> > >>> https://www.redhat.com/archives/libvir-list/2013-July/msg00635.html
>> > >>> "
>> > >>> But it's also introduced by CloudStack, as CloudStack calls the
>> > >>> libvirt storage pool refresh method each time it accesses the
>> > >>> storage pool. The code was added by commit
>> > >>> 2ffc9907f7b0d371737e39b7649f7af23026f5cf,
>> > >>> less than one year ago.
>> > >>>
>> > >>> As Wei suggested, we can call storage pool refresh only if needed,
>> > >>> which will mitigate the issue (that's the behavior I implemented in
>> > >>> CloudStack pre-4.0), but it only treats the symptom, not the cause.
>> > >>> Or we can add a cluster-wide lock so that only one caller can access
>> > >>> the storage pool at a time; we can add a file lock on NFS primary
>> > >>> storage.
>> > >>> Any ideas/feedback on how to fix this KVM issue?
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> > >
>>
>
>
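
For reference, the two mitigations discussed in this thread (refreshing the
storage pool only when needed, and serializing refreshes with a file lock on
the NFS primary storage) could be combined roughly as below. This is only a
minimal sketch, not the code from the patch under review. The names
pool_lock, get_volume and the lock file name are made up for illustration,
and "pool" is a hypothetical stand-in object with lookup() and refresh()
methods, not the libvirt API. It also assumes that POSIX locks taken through
fcntl.lockf are honoured across hosts on the NFS mount, which depends on the
NFS version and lock manager setup.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def pool_lock(mount_point, lock_name=".pool-refresh.lock"):
    """Hold an exclusive advisory lock on a file inside the shared mount.

    The lock file name is a hypothetical choice for this sketch.
    """
    path = os.path.join(mount_point, lock_name)
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # Blocks until no other process holds the lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        yield
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)


def get_volume(pool, name, mount_point):
    """Look up a volume and refresh the pool only if the lookup fails.

    `pool` is a hypothetical object: lookup(name) returns the volume or
    None, and refresh() rescans the pool.
    """
    vol = pool.lookup(name)
    if vol is None:
        # Refresh is the expensive, race-prone step, so serialize it.
        with pool_lock(mount_point):
            pool.refresh()
        vol = pool.lookup(name)
    return vol
```

With this shape, the common case (the volume is already known) takes no lock
and triggers no refresh, so the deployment slowdown Wei mentions would only
apply to lookups that miss.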
