GitHub user remibergsma opened a pull request:
https://github.com/apache/cloudstack/pull/211
return a state instead of null in AbstractInvestigatorImpl
When a full cluster is down or unreachable,
CloudStack currently keeps reporting every host
in its last known state, which is usually Up.
When the investigator cannot reach a host and
cannot reach another host in the same cluster
either, it returns null, meaning "I don't know".
As a result, the problem is never reported. With
this change it returns an Alert or Disconnected
state instead, so proper action can be taken.
Logging was also added, so we can tell which
part of the code set the host to Alert or
Disconnected.
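
For readers who have not opened the diff, the idea can be
sketched roughly as follows. This is NOT the code from the
pull request: the class, enum, and method names are
hypothetical, and the mapping of "a neighbour answered" to
Alert and "nobody in the cluster answered" to Disconnected is
an assumption made only for illustration.

```java
import java.util.logging.Logger;

// Hypothetical, simplified sketch. It does not use CloudStack's real
// classes; it only illustrates "return a state instead of null".
public class InvestigatorSketch {

    // Stand-in for the host states named in the description.
    public enum Status { Up, Alert, Disconnected }

    private static final Logger LOG =
            Logger.getLogger(InvestigatorSketch.class.getName());

    // hostReachable:     the host under investigation answered directly.
    // neighborReachable: another host in the same cluster answered.
    public static Status investigate(boolean hostReachable,
                                     boolean neighborReachable) {
        if (hostReachable) {
            return Status.Up;
        }
        if (neighborReachable) {
            // Assumption: a reachable neighbour means only this host is in trouble.
            LOG.warning("host unreachable, neighbour reachable: returning Alert");
            return Status.Alert;
        }
        // This is the case that used to yield null ("I don't know").
        // Assumption: with the whole cluster silent we report Disconnected.
        LOG.warning("host and neighbour unreachable: returning Disconnected");
        return Status.Disconnected;
    }

    public static void main(String[] args) {
        System.out.println(investigate(true, true));
        System.out.println(investigate(false, true));
        System.out.println(investigate(false, false));
    }
}
```

The point of the sketch is that every branch returns a concrete
state and logs which branch decided it, so a caller never has to
fall back to the last known state.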
When the host is available again, it goes
from Alert state back to Up and CloudStack
starts HA work to recover the VMs. I tested
it on 4.6/master and it works fine now.
As this is a nasty bug, we might also want to
fix it in 4.5 and 4.4.
Thanks to @dahn and @snuf for their
help solving this issue.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/remibergsma/cloudstack investigator_null_state_fix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/cloudstack/pull/211.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #211
----
commit 78e095e64b2344a49e96a7939ca6edd3b36d93dd
Author: Remi Bergsma <[email protected]>
Date: 2015-04-29T18:14:14Z
return a state instead of null
When a full cluster is down or unreachable,
CloudStack currently reports everything the
same as the last known state, which is usually
Up. When it cannot reach a host and cannot
reach another host in the same cluster either,
it returns null and says "I don't know". This
prevents it from reporting the problem. Now,
we return an Alert or Disconnected state so
proper action can be taken.
Also logging was added, so we know what part
of the code put it to Alert or Disconnected.
----