[
https://issues.apache.org/jira/browse/CLOUDSTACK-5429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919520#comment-13919520
]
Marcus Sorensen commented on CLOUDSTACK-5429:
---------------------------------------------
You could change from a clean reboot to sysrq triggers, which would be more
like the IPMI/power fencing that would normally occur in a situation like this.
It's really bad to have VM processes still running on this host if we try to
start them elsewhere. Most distributions enable sysrq by default. It would be
nice if the agent could also somehow tell the mgmt server that it's
relinquishing those VMs prior to force-rebooting itself, since I know that under
normal circumstances the HA VMs won't run anywhere else until the agent on this
host is reachable again, which could be a long time if there's actually a
host-specific issue.
http://fedoraproject.org/wiki/QA/Sysrq
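
To make the sysrq suggestion concrete, here is a minimal sketch of what a
forced reboot through the sysrq trigger could look like. It is not the agent's
actual code: the function name sysrq_fence and its parameters are hypothetical,
and it only assumes the standard Linux /proc/sysrq-trigger interface, where
writing 'b' reboots immediately without syncing or unmounting filesystems (so
a hung NFS unmount cannot block it). It must run as root to have any effect.

```python
from pathlib import Path

# Standard Linux procfs path; overridable so the sketch can be tried on a
# scratch file instead of the real trigger.
SYSRQ_TRIGGER = Path("/proc/sysrq-trigger")


def sysrq_fence(commands="b", trigger=SYSRQ_TRIGGER):
    """Write each SysRq command character to the trigger file, in order.

    'b' = immediate reboot with no sync and no unmount.
    's' = emergency sync, 'u' = remount filesystems read-only.
    (Hypothetical helper, not part of CloudStack.)
    """
    for ch in commands:
        # One command character per write: the file is reopened each time.
        with open(trigger, "w") as f:
            f.write(ch)
```

Calling sysrq_fence("b") as root would reset the host at once, much like a
power fence. Passing "sb" would request an emergency sync first, though with
the primary store unreachable the sync may not accomplish anything useful.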
> KVM - Primary store down/Network Failure - Hosts attempt to reboot because of
> primary store being down hangs.
> -------------------------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-5429
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-5429
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the
> default.)
> Components: Management Server
> Affects Versions: 4.3.0
> Environment: Build from 4.3
> Reporter: Sangeetha Hariharan
> Assignee: edison su
> Priority: Critical
> Fix For: 4.4.0
>
> Attachments: kvm-networkshutdown.png, kvmhostreboot.png, psdown.rar
>
>
> KVM - Primary store down - Hosts attempt to reboot because of primary store
> being down hangs.
> Set up:
> Advanced zone with KVM (RHEL 6.3) hosts.
> Steps to reproduce the problem:
> 1. Deploy a few VMs on each of the hosts with a 10 GB ROOT volume size, so we
> start with 10 VMs.
> 2. Create snapshots for the ROOT volumes.
> 3. While the snapshot is still in progress, make the primary storage
> unavailable for 10 minutes.
> This results in the KVM hosts attempting to reboot.
> But the reboot of the KVM host is not successful.
> It is stuck trying to unmount the NFS mount points.