Hi Mike,

This doesn't directly answer your question, but you may be interested in 
OPNFV Doctor (https://wiki.opnfv.org/doctor), a fault management and 
maintenance project that extends and uses OpenStack.

The team should deliver the final document today for the two-week project-wide 
review. In the meantime, you can check the latest available draft here: 
http://lists.opnfv.org/pipermail/opnfv-tech-discuss/2015-March/001629.html.

Feedback is welcome!

Thanks,
Carlos

Carlos Gonçalves | NEC Europe Ltd. | Kurfürsten-Anlage 36 | 69115 Heidelberg | 
Germany | +49 6221 4342-217
NEC Europe Ltd | Registered Office: Athene, Odyssey Business Park, West End  
Road, London, HA4 6QE, GB | Registered in England 2832014

From: Mike Dorman [mailto:mdor...@godaddy.com]
Sent: 30 March 2015 05:26
To: OpenStack Operators
Subject: [Openstack-operators] What to do when a compute node dies?

Hi all,

I’m curious about how people deal with failures of compute nodes, as in total 
failure when the box is gone for good.  (I mainly care about KVM hypervisors, 
but I’m interested in more general cases as well.)

The particular situation we’re looking at: how end users could identify or be 
notified of VMs that no longer exist because their hypervisor is dead.  As I 
understand it, Nova will still believe the VMs are running, and really has no 
way to know anything has changed (other than that the nova-compute instance has 
dropped off).

I understand failure detection is a tricky thing.  But it seems like there must 
be something a little better than this.

Thanks,
Mike

_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
