On Mar 4, 2014, at 3:38 PM, Marcus wrote:

> On Tue, Mar 4, 2014 at 3:34 AM, France <mailingli...@isg.si> wrote:
>> Hi Marcus and others.
>> 
>> There is no need to kill of the entire hypervisor, if one of the primary
>> storages fail.
>> You just need to kill the VMs and probably disable SR on XenServer, because
>> all other SRs and VMs have no problems.
>> if you kill those, then you can safely start them elsewhere. On XenServer
>> 6.2 you call destroy the VMs which lost access to NFS without any problems.
> 
> That's a great idea, but as already mentioned, it doesn't work in
> practice. You can't kill a VM that is hanging in D state, waiting on
> storage. I also mentioned that it causes problems for libvirt and much
> of the other system not using the storage.

You can on XS 6.2 as tried in in real life and reported by others as well.

> 
>> 
>> If you really want to still kill the entire host and it's VMs in one go, I
>> would suggest live migrating the VMs which have had not lost their storage
>> off first, and then kill those VMs on a stale NFS by doing hard reboot.
>> Additional time, while migrating working VMs, would even give some grace
>> time for NFS to maybe recover. :-)
> 
> You won't be able to live migrate a VM that is stuck in D state, or
> use libvirt to do so if one of its storage pools is unresponsive,
> anyway.
> 

I dont want to live migrate VMs in D state, just the working VMs. Those stuck 
can die with hypervisor reboot.


>> 
>> Hard reboot to recover from D state of NFS client can also be avoided by
>> using soft mount options.
> 
> As mentioned, soft and intr very rarely actually work, in my
> experience. I wish they did as I truly have come to loathe NFS for it.
> 
>> 
>> I run a bunch of Pacemaker/Corosync/Cman/Heartbeat/etc clusters and we don't
>> just kill whole nodes but fence services from specific nodes. STONITH is
>> implemented only when the node looses the quorum.
> 
> Sure, but how do you fence a KVM host from an NFS server? I don't
> think we've written a firewall plugin that works to fence hosts from
> any NFS server. Regardless, what CloudStack does is more of a poor
> man's clustering, the mgmt server is the locking in the sense that it
> is managing what's going on, but it's not a real clustering service.
> Heck, it doesn't even STONITH, it tries to clean shutdown, which fails
> as well due to hanging NFS (per the mentioned bug, to fix it they'll
> need IPMI fencing or something like that).

In my case as well as in the case of OP, the hypervisor got rebooted 
successfully.

> 
> I didn't write the code, I'm just saying that I can completely
> understand why it kills nodes when it deems that their storage has
> gone belly-up. It's dangerous to leave that D state VM hanging around,
> and it will until the NFS storage comes back. In a perfect world you'd
> just stop the VMs that were having the issue, or if there were no VMs
> you'd just de-register the storage from libvirt, I agree.

As previously stated on XS 6.2 you can "destroy" VMs with unaccessible NFS 
storage. I do not remember if processes were in the D state or whatever, cause 
i used the GUI, if i remember correctly. I am sure, you can test it yourself 
too.


> 
>> 
>> Regards,
>> F.
>> 
>> 
>> On 3/3/14 5:35 PM, Marcus wrote:
>>> 
>>> It's the standard clustering problem. Any software that does any sort
>>> of avtive clustering is going to fence nodes that have problems, or
>>> should if it cares about your data. If the risk of losing a host due
>>> to a storage pool outage is too great, you could perhaps look at
>>> rearranging your pool-to-host correlations (certain hosts run vms from
>>> certain pools) via clusters. Note that if you register a storage pool
>>> with a cluster, it will register the pool with libvirt when the pool
>>> is not in maintenance, which, when the storage pool goes down will
>>> cause problems for the host even if no VMs from that storage are
>>> running (fetching storage stats for example will cause agent threads
>>> to hang if its NFS), so you'd need to put ceph in its own cluster and
>>> NFS in its own cluster.
>>> 
>>> It's far more dangerous to leave a host in an unknown/bad state. If a
>>> host loses contact with one of your storage nodes, with HA, cloudstack
>>> will want to start the affected VMs elsewhere. If it does so, and your
>>> original host wakes up from it's NFS hang, you suddenly have a VM
>>> running in two locations, corruption ensues. You might think we could
>>> just stop the affected VMs, but NFS tends to make things that touch it
>>> go into D state, even with 'intr' and other parameters, which affects
>>> libvirt and the agent.
>>> 
>>> We could perhaps open a feature request to disable all HA and just
>>> leave things as-is, disallowing operations when there are outages. If
>>> that sounds useful you can create the feature request on
>>> https://issues.apache.org/jira.
>>> 
>>> 
>>> On Mon, Mar 3, 2014 at 5:37 AM, Andrei Mikhailovsky <and...@arhont.com>
>>> wrote:
>>>> 
>>>> Koushik, I understand that and I will put the storage into the
>>>> maintenance mode next time. However, things happen and servers crash from
>>>> time to time, which is not the reason to reboot all host servers, even 
>>>> those
>>>> which do not have any running vms with volumes on the nfs storage. The
>>>> bloody agent just rebooted every single host server regardless if they were
>>>> running vms with volumes on the rebooted nfs server. 95% of my vms are
>>>> running from ceph and those should have never been effected in the first
>>>> place.
>>>> ----- Original Message -----
>>>> 
>>>> From: "Koushik Das" <koushik....@citrix.com>
>>>> To: "<us...@cloudstack.apache.org>" <us...@cloudstack.apache.org>
>>>> Cc: dev@cloudstack.apache.org
>>>> Sent: Monday, 3 March, 2014 5:55:34 AM
>>>> Subject: Re: ALARM - ACS reboots host servers!!!
>>>> 
>>>> The primary storage needs to be put in maintenance before doing any
>>>> upgrade/reboot as mentioned in the previous mails.
>>>> 
>>>> -Koushik
>>>> 
>>>> On 03-Mar-2014, at 6:07 AM, Marcus <shadow...@gmail.com> wrote:
>>>> 
>>>>> Also, please note that in the bug you referenced it doesn't have a
>>>>> problem with the reboot being triggered, but with the fact that reboot
>>>>> never completes due to hanging NFS mount (which is why the reboot
>>>>> occurs, inaccessible primary storage).
>>>>> 
>>>>> On Sun, Mar 2, 2014 at 5:26 PM, Marcus <shadow...@gmail.com> wrote:
>>>>>> 
>>>>>> Or do you mean you have multiple primary storages and this one was not
>>>>>> in use and put into maintenance?
>>>>>> 
>>>>>> On Sun, Mar 2, 2014 at 5:25 PM, Marcus <shadow...@gmail.com> wrote:
>>>>>>> 
>>>>>>> I'm not sure I understand. How do you expect to reboot your primary
>>>>>>> storage while vms are running? It sounds like the host is being
>>>>>>> fenced since it cannot contact the resources it depends on.
>>>>>>> 
>>>>>>> On Sun, Mar 2, 2014 at 3:24 PM, Nux! <n...@li.nux.ro> wrote:
>>>>>>>> 
>>>>>>>> On 02.03.2014 21:17, Andrei Mikhailovsky wrote:
>>>>>>>>> 
>>>>>>>>> Hello guys,
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I've recently came across the bug CLOUDSTACK-5429 which has rebooted
>>>>>>>>> all of my host servers without properly shutting down the guest vms.
>>>>>>>>> I've simply upgraded and rebooted one of the nfs primary storage
>>>>>>>>> servers and a few minutes later, to my horror, i've found out that
>>>>>>>>> all
>>>>>>>>> of my host servers have been rebooted. Is it just me thinking so, or
>>>>>>>>> is this bug should be fixed ASAP and should be a blocker for any new
>>>>>>>>> ACS release. I mean not only does it cause downtime, but also
>>>>>>>>> possible
>>>>>>>>> data loss and server corruption.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Hi Andrei,
>>>>>>>> 
>>>>>>>> Do you have HA enabled and did you put that primary storage in
>>>>>>>> maintenance
>>>>>>>> mode before rebooting it?
>>>>>>>> It's my understanding that ACS relies on the shared storage to
>>>>>>>> perform HA so
>>>>>>>> if the storage goes it's expected to go berserk. I've noticed similar
>>>>>>>> behaviour in Xenserver pools without ACS.
>>>>>>>> I'd imagine a "cure" for this would be to use network distributed
>>>>>>>> "filesystems" like GlusterFS or CEPH.
>>>>>>>> 
>>>>>>>> Lucian
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Sent from the Delta quadrant using Borg technology!
>>>>>>>> 
>>>>>>>> Nux!
>>>>>>>> www.nux.ro
>>>> 
>>>> 
>> 

Reply via email to