I like the current behavior of not changing the VM state if nova-compute goes down.
The cloud operators can identify the issue in the compute node and try to
fix it without users noticing. Depending on the problem, I can inform users
if instances are affected and change the state if necessary.

What I wouldn't like is to expose any failure in nova-compute to users and
be contacted because the VM state changed.

Belmiro

On Wed, Jun 25, 2014 at 4:49 AM, Ahmed RAHAL <[email protected]> wrote:

> Hi,
>
> On 2014-06-24 20:12, Joe Gordon wrote:
>
>> Finally, assuming the customer had access to this 'unknown' state
>> information, what would he be able to do with it? Usually he has no
>> lever to 'evacuate' or 'recover' the VM. All he could do is spawn
>> another instance to replace the lost one. But only if the VM really
>> is currently unavailable, information he must get from other
>> sources.
>>
>>
>> If I was a user, and my instance went to an 'UNKNOWN' state, I would
>> check if it's still operating, and if not delete it and start another
>> instance.
>>
>
> If I was a user and polled nova list/show on a regular basis just in case
> the management pane indicates a failure, I should have no expectation
> whatsoever. If service availability is my concern, I should monitor the
> service, nothing else. From there, once the service has failed, I can
> imagine checking if VM management is telling me something. However, if my
> service is down and I no longer have access to the VM ... simple case:
> destroy and respawn.
>
> My point is that we should not make the nova state an expected source of
> truth regarding service availability in the VM, as there is no way to tell
> such a thing. If my VM is being DDoSed, nova would still say everything is
> fine, while my service is really down. In that situation, console access
> would help me determine if the VM management is wrong by stating everything
> is ok or if there is another root cause.
> Similarly, should nova show a state change if load in the VM is through
> the roof and the service is not responsive? Or if OOM is killing all my
> processes because of a memory shortage?
>
> As stated before, providing such state information is misleading because
> there are cases where node unavailability is not service disruptive, thus
> it would indicate a false positive, while the opposite (everything is ok) is
> not at all indicative of a healthy status of the service.
>
> Maybe I am overlooking a use case here where you absolutely need the user
> of the service to know about a potential problem with his hosting platform.
>
> Ahmed.
>
> --
> =================================================
> Ahmed Rahal <[email protected]> / iWeb Technologies
> Spécialiste de l'Architecture TI
> / IT Architecture Specialist
> =================================================
>
>
> _______________________________________________
> OpenStack-dev mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
