Re: [opnfv-tech-discuss] [Doctor] Reset Server State and alarms in general

Yujun Zhang Wed, 21 Sep 2016 00:06:07 -0700

After reading the whole message, I could not agree more with the conclusion.
IIUC, we should probably raise a deduced alarm in the inspector instead of
requesting the controller to reset the server state.


On Wed, Sep 21, 2016 at 2:51 PM Juvonen, Tomi (Nokia - FI/Espoo) <
[email protected]> wrote:

> Hi,
>
> I had a lively discussion yesterday with the OpenStack Nova cores about
> reset server state. At first it was about how to do that with one API call
> for all VMs on a host (hypervisor), as discussed in DOCTOR-78. But then it
> came to the question of why we actually want to reset server state in the
> first place. It is not something that needs to be done when forcing down a
> host. If we want a notification about affected VMs, and further an alarm,
> then that is another thing. So if we want that kind of notification, it is
> something we should write a spec for.
>

This sounds like a job for an inspector like Vitrage, i.e. deduce a VM
error from the host error and raise a deduced alarm.

> We should not be resetting the state to error for each VM on a host in the
> first place if the error was not at VM level but at host level. (Yes, before
> you ask, Nova can leave the working VM state unchanged if the host is down.
> You do not touch the VM state unless you want to do something for the VM or
> it was actually the one with the error. Yes, and there are scenarios where
> you do not want to do anything for the VM itself, but are just happy that it
> comes up again on the same host when the host comes back.)
>

Agree


> Again I realize here what I said long ago, before we had anything: it will
> not be possible to make alarms correctly by changing state in Nova and
> other controllers and then triggering the alarm from the notification about
> those state changes. That will never give us what we want for the alarms,
> although we surely need to correct states otherwise. Even for things where
> we get a notification triggered by a state change, we will not have the
> information needed in the alarm, and surely we do not call APIs in vain
> just to have an alarm (like reset server state).
>
> We want tenant/VNFM-specific alarms that tell which of its VMs (virtual
> resources) are affected by a fault, and the cause (and surely alarms about
> physical faults that will not be consumed by the tenant/VNFM, and other
> fields needed by the ETSI spec). The only way of getting this correct for
> each kind of fault that can appear is to form all the alarms (the
> notification to form the alarm) in the Inspector (Congress or Vitrage).
>

I have exactly the same understanding.
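
To make the idea concrete, here is a rough sketch of how an inspector could
deduce tenant-specific alarms from a host fault without touching any VM state
in Nova. This is not the Vitrage or Congress API; the names VM, Alarm and
deduce_tenant_alarms are hypothetical, and the host-to-VM topology is assumed
to be already known to the inspector.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VM:
    # Hypothetical topology record held by the inspector.
    vm_id: str
    tenant_id: str
    host: str


@dataclass
class Alarm:
    # Hypothetical alarm payload: who is affected, by what, and which VMs.
    tenant_id: str
    cause: str
    affected_vms: list = field(default_factory=list)


def deduce_tenant_alarms(host, cause, vms):
    """Deduce one alarm per tenant from a host-level fault.

    VM state is never modified; the alarms are derived from topology only.
    """
    by_tenant = defaultdict(list)
    for vm in vms:
        if vm.host == host:
            by_tenant[vm.tenant_id].append(vm.vm_id)
    # Sort for a stable, predictable order of alarms and VM ids.
    return [
        Alarm(tenant_id=tenant, cause=cause, affected_vms=sorted(ids))
        for tenant, ids in sorted(by_tenant.items())
    ]


if __name__ == "__main__":
    topology = [
        VM("vm-1", "tenant-a", "compute-1"),
        VM("vm-2", "tenant-b", "compute-1"),
        VM("vm-3", "tenant-a", "compute-2"),
    ]
    for alarm in deduce_tenant_alarms("compute-1", "host down", topology):
        print(alarm)
```

Each tenant gets a single alarm listing only its own VMs on the failed host,
together with the cause, which is the information a state-change notification
alone would not carry.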

> It is the only place that has all the information needed in the different
> scenarios, can make this right, and has the minimum delay that is crucial
> in Telco fault management. Also, if we are looking to have OPNFV used in
> production and one needs to be OPNFV compliant, it means we need to make
> things right. I strongly suggest that, while the way we make alarms now is
> a great step we have achieved so far as a proof of concept (changing states
> and having an alarm in under 1 second), we take the next steps towards a
> conceptually correct way to achieve this and have correct alarms.
>
> Br,
> Tomi
>
>
>
> _______________________________________________
> opnfv-tech-discuss mailing list
> [email protected]
> https://lists.opnfv.org/mailman/listinfo/opnfv-tech-discuss
>
