Re: [openstack-dev] [nova] should we have a stale data indication in "nova list/show"?

Joe Gordon Tue, 24 Jun 2014 17:19:08 -0700

On Tue, Jun 24, 2014 at 5:12 PM, Joe Gordon <[email protected]> wrote:


>
>
>
> On Tue, Jun 24, 2014 at 4:16 PM, Ahmed RAHAL <[email protected]> wrote:
>
>> On 2014-06-24 17:38, Joe Gordon wrote:
>>
>>>
>>> On Jun 24, 2014 2:31 PM, "Russell Bryant" <[email protected]> wrote:
>>>
>>> > There be dragons here.  Just because Nova doesn't see the node reporting
>>> > in, doesn't mean the VMs aren't actually still running.  I think this
>>> > needs to be left to logic outside of Nova.
>>> >
>>> > For example, if your deployment monitoring really does think the host is
>>> > down, you want to make sure it's *completely* dead before taking further
>>> > action such as evacuating the host.  You certainly don't want to risk
>>> > having the VM running on two different hosts.  This is just a business I
>>> > don't think Nova should be getting into.
>>>
>>> I agree nova shouldn't take any actions. But I don't think leaving an
>>> instance as 'active' is right either.  I was thinking of moving the
>>> instance to an error state (maybe an unknown state would be more accurate)
>>> and letting the user deal with it, versus just letting the user deal with
>>> everything. Since nova knows something *may* be wrong, shouldn't we convey
>>> that to the user? (I'm not 100% sure we should myself.)
>>>
>>
>> I have seen compute nodes go down from a management perspective (say,
>> nova-compute disappeared) while the VMs were just fine. Reporting on the
>> state may be misleading. The 'unknown' state would fit, but nothing lets us
>> presume the VMs are non-functional or impacted.
>>
>
> Nothing lets us presume the opposite either. We don't know if the
> instance is still up.
>
>
>>
>> As far as an operator is concerned, a compute node not responding is
>> reason enough to check the situation.
>>
>> To go further on the other comments about customer feedback: there are
>> many reasons a customer may think his VM is down, so showing him 'useful
>> information' will in some cases only trigger more anxiety.
>> Besides, people will start hammering the API to check 'state' instead of
>> using proper monitoring.
>> But state is already reported if the customer shuts down a VM, so ...
>>
>> Currently, compute node state reporting is done by the nova-compute
>> process itself, reporting back with a time stamp to the database (through
>> conductor, if I recall correctly). It's more like a watchdog than a
>> reporting system.
>> For VMs (assuming we find it useful) the same kind of process could
>> occur: nova-compute reporting back all states with time stamps for all VMs
>> it hosts. This should then be optional, as I already sense
>> scaling/performance issues here (ceilometer, anyone?).
>>
>> Finally, assuming the customer had access to this 'unknown' state
>> information, what would he be able to do with it? Usually he has no lever
>> to 'evacuate' or 'recover' the VM. All he could do is spawn another
>> instance to replace the lost one. But only if the VM really is currently
>> unavailable, which is information he must get from other sources.
>>
>
> If I were a user, and my instance went to an 'UNKNOWN' state, I would check
> if it's still operating, and if not delete it and start another instance.
>

The alternative is how things work today: if a nova-compute goes down we
don't change any instance states, and the user is responsible for making
sure their instance is still operating even if the instance is set to
ACTIVE.
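
To make the 'UNKNOWN' idea concrete, here is a rough sketch of what I have
in mind. This is not Nova code: the function name, the argument names and
the 60 second threshold are all made up for illustration. It only changes
what is shown to the user when the compute service's last report is too old,
and takes no action on the instance.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: how long a compute service may stay silent
# before its data is treated as stale. Not taken from Nova's code.
STALE_AFTER = timedelta(seconds=60)


def displayed_status(stored_status, last_heartbeat, now=None):
    """Return the status to show the user for one instance.

    stored_status  -- the state recorded in the database, e.g. "ACTIVE"
    last_heartbeat -- when the instance's compute service last reported in
    """
    now = now or datetime.now(timezone.utc)
    if now - last_heartbeat > STALE_AFTER:
        # Nothing is changed in the database and no action is taken;
        # the user is only told that the stored state is unverified.
        return "UNKNOWN"
    return stored_status
```

With this, an instance stored as ACTIVE on a host that last reported five
minutes ago would be listed as UNKNOWN, while one on a host that reported
ten seconds ago would still be listed as ACTIVE.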


>
>
>>
>> So, I see how the state reporting could be useful information, but am
>> not sure that nova Status is the right place for it.
>>
>> Ahmed.
>>
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> [email protected]
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
>

