GitHub user rhtyd opened a pull request: https://github.com/apache/cloudstack/pull/1694

CLOUDSTACK-9509: Host Connects Without Storage

A KVM host whose shared storage had failed was accepted by the management server with the host state Up, even though no primary/shared storage was available on it. This patch offers a quick fix by throwing an exception in the storage monitor, which connects the storage pool on the host. The failure is trapped by the agent manager, which disconnects the agent without any investigation.

Based on lab tests, the KVM agent may take up to 2 minutes attempting the NFS mount when the storage is inaccessible (firewalled or shut down) before returning an error. It is safe to assume that the repeated reconnection attempts will not put pressure on the management server, since the KVM agent would retry the connection every 2 minutes. KVM hosts that fail because of storage issues will briefly be put in the Alert state, but will mostly be in the Connecting state, during which the KVM host attempts to mount/reconfigure the NFS storage pool.

/cc @jburwell @karuturi
@blueorangutan package

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shapeblue/cloudstack kvm-no-storage-failfast

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/cloudstack/pull/1694.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1694

----
commit e13e40ee9cc664ec9d326c8b6fae0c76f6adc01a
Author: Rohit Yadav <rohit.ya...@shapeblue.com>
Date: 2016-06-07T06:11:16Z

CLOUDSTACK-9509: Host Connects Without Storage

A KVM host whose shared storage had failed was accepted by the management server with the host state Up, even though no primary/shared storage was available on it. This patch offers a quick fix by throwing an exception in the storage monitor, which connects the storage pool on the host. The failure is trapped by the agent manager, which disconnects the agent without any investigation.

Based on lab tests, the KVM agent may take up to 2 minutes attempting the NFS mount when the storage is inaccessible (firewalled or shut down) before returning an error. It is safe to assume that the repeated reconnection attempts will not put pressure on the management server, since the KVM agent would retry the connection every 2 minutes. KVM hosts that fail because of storage issues will briefly be put in the Alert state, but will mostly be in the Connecting state, during which the KVM host attempts to mount/reconfigure the NFS storage pool.

Signed-off-by: Rohit Yadav <rohit.ya...@shapeblue.com>

----
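
For readers who want a picture of the fail-fast flow described above, here is a minimal, self-contained sketch. It is not the code in this pull request: every class, interface, and method name below (FailFastSketch, StorageMonitor, AgentManager, PoolConnector, HostConnectException) is a hypothetical stand-in, and the mapping of a failed connect to the Alert state is an assumption based only on the description. It shows the general pattern: the storage monitor throws when a pool cannot be connected, and the agent manager catches that and refuses to mark the host Up.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only; these are NOT CloudStack's real classes.
final class FailFastSketch {

    /** Host states named in the description above. */
    enum HostState { CONNECTING, UP, ALERT }

    /** Signals that the connecting host must not be accepted. */
    static final class HostConnectException extends Exception {
        HostConnectException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for asking the agent to mount one storage pool. */
    interface PoolConnector {
        boolean connect(String hostName, String poolName);
    }

    /** Storage monitor: throws as soon as one shared pool cannot be connected. */
    static final class StorageMonitor {
        private final PoolConnector connector;
        private final List<String> pools;

        StorageMonitor(PoolConnector connector, List<String> pools) {
            this.connector = connector;
            this.pools = pools;
        }

        void onHostConnect(String hostName) throws HostConnectException {
            for (String pool : pools) {
                if (!connector.connect(hostName, pool)) {
                    throw new HostConnectException(
                        "Unable to connect host " + hostName + " to pool " + pool);
                }
            }
        }
    }

    /** Agent manager: traps the failure instead of marking the host Up. */
    static final class AgentManager {
        private final StorageMonitor monitor;

        AgentManager(StorageMonitor monitor) {
            this.monitor = monitor;
        }

        HostState handleConnect(String hostName) {
            try {
                monitor.onHostConnect(hostName);
                return HostState.UP;
            } catch (HostConnectException e) {
                // Assumption: the agent is disconnected here without further
                // investigation and is expected to reconnect on its own later.
                return HostState.ALERT;
            }
        }
    }

    public static void main(String[] args) {
        List<String> pools = Arrays.asList("nfs-primary");
        AgentManager healthy = new AgentManager(new StorageMonitor((h, p) -> true, pools));
        AgentManager broken = new AgentManager(new StorageMonitor((h, p) -> false, pools));
        System.out.println(healthy.handleConnect("kvm-host-1")); // prints UP
        System.out.println(broken.handleConnect("kvm-host-2"));  // prints ALERT
    }
}
```

Throwing from the monitor rather than returning a status keeps the decision in one place: in this sketch the agent manager cannot reach the Up transition unless every pool connected.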