I don't think we should be flipping states for instances on a potentially 
downed compute node. We definitely should not set an instance to ERROR. I 
think a timestamp for the last power state check would be nice, and probably 
good enough.
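
To illustrate the timestamp idea, here is a minimal sketch in Python. It is not Nova code: `InstanceView`, `power_state_checked_at`, `describe`, and the five-minute staleness threshold are all hypothetical names and values chosen for the example. The recorded state is left untouched, and the age of the last power state check is reported next to it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class InstanceView:
    """Hypothetical view of an instance; not Nova's real data model."""
    name: str
    vm_state: str                      # never flipped here, e.g. "ACTIVE"
    power_state_checked_at: datetime   # last check reported by the compute node


def describe(inst, now, stale_after=timedelta(minutes=5)):
    """Return the recorded state plus the age of the last power state check.

    stale_after is an assumed threshold, not a Nova setting.
    """
    age = now - inst.power_state_checked_at
    freshness = "stale" if age > stale_after else "fresh"
    seconds = int(age.total_seconds())
    return (f"{inst.name}: {inst.vm_state} "
            f"(power state last checked {seconds}s ago, {freshness})")


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    vm = InstanceView("vm1", "ACTIVE", now - timedelta(minutes=10))
    print(describe(vm, now))
```

The instance stays ACTIVE in this sketch; the user only learns that the last check is old and can decide whether to investigate.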

- Chris

> On Jun 24, 2014, at 5:17 PM, Joe Gordon <[email protected]> wrote:
> 
> 
> 
> 
>> On Tue, Jun 24, 2014 at 5:12 PM, Joe Gordon <[email protected]> wrote:
>> 
>> 
>> 
>>> On Tue, Jun 24, 2014 at 4:16 PM, Ahmed RAHAL <[email protected]> wrote:
>>> On 2014-06-24 17:38, Joe Gordon wrote:
>>>> 
>>>> On Jun 24, 2014 2:31 PM, "Russell Bryant" <[email protected]
>>>> <mailto:[email protected]>> wrote:
>>> 
>>>>  > There be dragons here.  Just because Nova doesn't see the node reporting
>>>>  > in, doesn't mean the VMs aren't actually still running.  I think this
>>>>  > needs to be left to logic outside of Nova.
>>>>  >
>>>>  > For example, if your deployment monitoring really does think the host is
>>>>  > down, you want to make sure it's *completely* dead before taking further
>>>>  > action such as evacuating the host.  You certainly don't want to risk
>>>>  > having the VM running on two different hosts.  This is just a business I
>>>>  > don't think Nova should be getting in to.
>>>> 
>>>> I agree nova shouldn't take any actions. But I don't think leaving an
>>>> instance as 'active' is right either.  I was thinking of moving the
>>>> instance to an error state (maybe an unknown state would be more
>>>> accurate) and letting the user deal with it, versus just letting the
>>>> user deal with everything. Since nova knows something *may* be wrong,
>>>> shouldn't we convey that to the user? (I'm not 100% sure we should
>>>> myself.)
>>> 
>>> I have seen compute nodes go down from a management perspective (say, 
>>> nova-compute disappeared) while the VMs were just fine. Reporting on the 
>>> state may be misleading. The 'unknown' state would fit, but nothing lets 
>>> us presume the VMs are non-functional or impacted.
>> 
>> Nothing lets us presume the opposite either. We don't know if the instance 
>> is still up.
>>  
>>> 
>>> As far as an operator is concerned, a compute node not responding is 
>>> reason enough to check the situation.
>>> 
>>> To go further on the other comments about customer feedback: there are 
>>> many reasons a customer may think his VM is down, so showing him 'useful 
>>> information' will in some cases only trigger more anxiety.
>>> Besides, people will start hammering the API to check 'state' instead of 
>>> using proper monitoring.
>>> But state is already reported if the customer shuts down a VM, so ...
>>> 
>>> Currently, compute node state reporting is done by the nova-compute 
>>> process itself, reporting back with a time stamp to the database (through 
>>> conductor, if I recall correctly). It's more like a watchdog than a 
>>> reporting system.
>>> For VMs (assuming we find it useful) the same kind of process could occur: 
>>> nova-compute reporting back all states with time stamps for all VMs it 
>>> hosts. This should then be optional, as I already sense 
>>> scaling/performance issues here (ceilometer, anyone?).
>>> 
>>> Finally, assuming the customer had access to this 'unknown' state 
>>> information, what would he be able to do with it? Usually he has no lever 
>>> to 'evacuate' or 'recover' the VM. All he could do is spawn another 
>>> instance to replace the lost one. But only if the VM really is currently 
>>> unavailable, which is information he must get from other sources.
>> 
>> If I were a user, and my instance went to an 'UNKNOWN' state, I would check 
>> if it's still operating, and if not, delete it and start another instance.
> 
> The alternative is how things work today: if a nova-compute goes down we 
> don't change any instance states, and the user is responsible for making sure 
> their instance is still operating even if the instance is set to ACTIVE.
>  
>>  
>>> 
>>> So, I see how the state reporting could be useful information, but I am 
>>> not sure that nova Status is the right place for it.
>>> 
>>> Ahmed.
>>> 
>>> 
>>> _______________________________________________
>>> OpenStack-dev mailing list
>>> [email protected]
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
