After reading the whole message, I could not agree more with the conclusion. IIUC, we should probably raise a deduced alarm in the inspector instead of asking the controller to reset the server state.
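To make the idea concrete, here is a rough Python sketch of what "deduce VM alarms from a host fault" could look like inside an inspector. Everything in it is an assumption for illustration: the `DeducedAlarm` fields, the function names and the host-to-VM mapping are made up, and this is not the Vitrage or Congress API nor an ETSI alarm schema. The only point is that the alarms are formed from the host fault plus topology, without calling Nova to change any VM state.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DeducedAlarm:
    # Hypothetical alarm record; field names are illustrative only.
    tenant_id: str
    vm_id: str
    host: str
    cause: str


def deduce_vm_alarms(host, cause, vms_by_host):
    """Deduce one alarm per VM on a faulty host.

    No controller API is called here: VM state is left untouched,
    the alarm is formed purely from the host fault and the topology.
    """
    return [
        DeducedAlarm(tenant_id=vm["tenant_id"], vm_id=vm["id"], host=host, cause=cause)
        for vm in vms_by_host.get(host, [])
    ]


def group_by_tenant(alarms):
    """Group alarms so each tenant/VNFM only sees its own affected VMs."""
    grouped = defaultdict(list)
    for alarm in alarms:
        grouped[alarm.tenant_id].append(alarm)
    return dict(grouped)


# Made-up topology: which VMs run on which host (hypervisor).
VMS_BY_HOST = {
    "compute-1": [
        {"id": "vm-a", "tenant_id": "tenant-1"},
        {"id": "vm-b", "tenant_id": "tenant-2"},
    ],
    "compute-2": [
        {"id": "vm-c", "tenant_id": "tenant-1"},
    ],
}

alarms = deduce_vm_alarms("compute-1", "host down", VMS_BY_HOST)
per_tenant = group_by_tenant(alarms)
```

In this sketch a fault on `compute-1` yields one alarm for `vm-a` and one for `vm-b`, each delivered to its own tenant, while `vm-c` on `compute-2` is unaffected.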
On Wed, Sep 21, 2016 at 2:51 PM Juvonen, Tomi (Nokia - FI/Espoo) <[email protected]> wrote:

> Hi,
>
> I had a lively discussion yesterday with OpenStack Nova cores about the
> reset server state. At first it was about how to have that by one API call
> for all VMs on a host (hypervisor), as discussed in DOCTOR-78. But then it
> came to the question of why we actually want the reset server state in the
> first place. It is not something that needs to be done if we force down a
> host. If we want a notification about affected VMs and further an alarm,
> then that is another thing. So if we want that kind of notification, it is
> then something we should make a spec for.
>

This sounds like a job for an inspector like Vitrage, i.e. deduce a VM error from the host error and raise a deduced alarm.

> Not to reset state to error for each VM on a host, which we should not be
> doing in the first place if the error was not on the VM but at host level
> (yes, before you ask, Nova can have the working VM state unchanged if the
> host is down. You do not touch VM state if you do not want to do something
> for the VM or if it was not actually the one having the error. Yes, and you
> do not want to do anything for the VM itself in all scenarios, but just be
> happy it comes up again on the same host when the host comes back.)
>

Agree.

> Again I realize here what I said long ago, before we had anything. It will
> not be possible to make alarms correctly by changing state in Nova and
> other controllers and then triggering an alarm from the notification about
> those state changes. That will never give what we want for the alarms,
> while otherwise we surely need to correct states. Even for things where we
> get a notification triggered by a state change, we will not have the
> information needed in the alarm, and surely we do not call APIs in vain
> just to have an alarm (like reset server state).
>
> We want tenant/VNFM specific alarms that tell which of its VMs (virtual
> resources) are affected by a fault, and the cause (and surely alarms about
> physical faults that will not be consumed by the tenant/VNFM, and other
> fields needed by the ETSI spec). The only way of having this correct for
> each kind of fault that can appear is to form all the alarms (notification
> to form alarm) in the Inspector (Congress or Vitrage).
>

I have exactly the same understanding.

> It is the only place that has all the information needed in different
> scenarios, can make this right, and has the minimum delay that is crucial
> in Telco fault management. Also, if we are looking to have OPNFV used in
> production and one would need to be OPNFV compliant, it means we need to
> make things right. I strongly suggest that, while the way we make alarms
> now is a great step we have achieved so far as a proof of concept (changing
> states and having an alarm in under 1 second), let's make the next steps
> towards a conceptually correct way to achieve this and have correct alarms.
>
> Br,
> Tomi
>
> _______________________________________________
> opnfv-tech-discuss mailing list
> [email protected]
> https://lists.opnfv.org/mailman/listinfo/opnfv-tech-discuss
