Re: [openstack-dev] Reply: Re: [Vitrage] About alarms reported by datasource and the alarms generated by vitrage evaluator

Afek, Ifat (Nokia - IL) Sun, 15 Jan 2017 05:21:13 -0800

Hi Yinliyin,

There are two use cases:
One is yours, where you have a single monitor that generates “real” alarms, and 
Vitrage that generates deduced alarms.
Another is where someone has a few monitors, and there might be a 
collision/equivalence between their alarms.


The solution that you suggested might solve the first use case, but I wouldn’t 
want to ignore the second one, which is also valid.

Regarding some of your specific suggestions:

1.  In templates, we only define the alarm entity for the datasource that 
the alarm is reported by, such as Nagios.
[Ifat] This will only work for a single monitor.

2.  When the evaluator deduces an alarm, it would raise the alarm with the 
type set to the datasource that would report the alarm, not to vitrage.
[Ifat] I don’t think this is right. In the Vitrage Alarm view in the UI, 
displaying the deduced alarm as “Nagios” is misleading, since Nagios did not 
report this alarm.

I can think of a solution that is specific to the deduced alarms case, where we 
will replace a Vitrage alarm with a “real” alarm whenever there is a collision. 
This solution is easier, but we should carefully examine all use cases to make 
sure there is no ambiguity. However, for the more general use case I would 
prefer the option that we discussed in a previous mail, of having two (or more) 
alarms connected with an ‘equivalent’ relationship.
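
To make the ‘equivalent’ option concrete, here is a minimal sketch. It is not 
Vitrage code: AlarmStore, link_equivalent and the alarm ids are hypothetical 
names, and the entity graph is reduced to two dictionaries. The point it 
illustrates is that each monitor's alarm keeps its own source, and deleting one 
alarm does not delete its equivalents.

```python
from collections import defaultdict
from typing import Dict, Set


class AlarmStore:
    """Toy stand-in for the entity graph (hypothetical, not Vitrage's API)."""

    def __init__(self) -> None:
        self.alarms: Dict[str, dict] = {}
        self.equivalent: Dict[str, Set[str]] = defaultdict(set)

    def add_alarm(self, alarm_id: str, source: str,
                  resource: str, name: str) -> None:
        # Every alarm keeps the source that actually raised it.
        self.alarms[alarm_id] = {
            "source": source, "resource": resource, "name": name}

    def link_equivalent(self, a: str, b: str) -> None:
        # The relationship is symmetric, so store it in both directions.
        self.equivalent[a].add(b)
        self.equivalent[b].add(a)

    def delete_alarm(self, alarm_id: str) -> None:
        # Deleting one alarm leaves its equivalents in place.
        self.alarms.pop(alarm_id, None)
        for other in self.equivalent.pop(alarm_id, set()):
            self.equivalent[other].discard(alarm_id)

    def equivalents_of(self, alarm_id: str) -> Set[str]:
        return set(self.equivalent.get(alarm_id, set()))


store = AlarmStore()
store.add_alarm("a1", "nagios", "host-1", "cpu_high")
store.add_alarm("a2", "vitrage", "host-1", "cpu_high")
store.link_equivalent("a1", "a2")
```

With this shape, the UI could show "a1" and "a2" as one group while still 
reporting that only "a1" came from Nagios.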

What do you think?
Ifat.


From: "[email protected]" <[email protected]>
Date: Saturday, 14 January 2017 at 09:57


·         It won’t solve the general problem of two different monitors that 
raise the same alarm

·         [yinliyin] Generally, we would only deploy one monitor for a given 
alarm.

·         It won’t solve possible conflicts of timestamp and severity between 
different monitors

·         [yinliyin] Please see the steps below.

·         It will make the decision of when to delete the alarm more complex 
(delete it when the deduced alarm is deleted? When the Nagios alarm is deleted? 
Both? And how to change the timestamp and severity in these cases?)

·         [yinliyin] Please see the steps below.


   The following is the basic idea for solving the problem in this situation:

       1.  In templates, we only define the alarm entity for the datasource 
that the alarm is reported by, such as Nagios.

       2.  When the evaluator deduces an alarm, it would raise the alarm with 
the type set to the datasource that would report the alarm, not to vitrage.

       3.  When entity_graph gets events from the "evaluator_queue" (all the 
alarms in the "evaluator_queue" are deduced alarms), it queries the graph to 
find out whether the same alarm was already reported by a datasource. If so, 
it would discard the deduced alarm.

       4.  When entity_graph gets events from "queue", it queries the graph 
to find out whether the same alarm was already deduced by the evaluator. If 
so, it would replace the alarm in the graph with the newly arrived alarm 
reported by the datasource.

       5.  When the evaluator deduces that an alarm should be deleted, it 
deletes the alarm regardless of how it was generated (reported by a datasource 
or deduced by the evaluator).

       6.  When a datasource reports a recovery event for an alarm, 
entity_graph would query the graph to find out whether the alarm exists. If 
the alarm does not exist, entity_graph would discard the event.
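
Steps 3 to 6 above can be sketched roughly as follows. This is an illustration 
only, not Vitrage code: Alarm, AlarmGraph and the method names are hypothetical, 
alarms are matched on (resource, name), and the graph is reduced to a dictionary. 
Step 6 only says what happens when the alarm does not exist; deleting the alarm 
when it does exist is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

DATASOURCE = "datasource"  # reported by a monitor such as Nagios
DEDUCED = "deduced"        # raised by the evaluator

Key = Tuple[str, str]      # (resource, alarm name), an assumed identity


@dataclass
class Alarm:
    resource: str
    name: str
    origin: str
    severity: str = "WARNING"


class AlarmGraph:
    """Toy stand-in for entity_graph (hypothetical names throughout)."""

    def __init__(self) -> None:
        self.alarms: Dict[Key, Alarm] = {}

    def on_deduced_alarm(self, alarm: Alarm) -> bool:
        # Step 3: discard a deduced alarm if a datasource already
        # reported the same alarm. Returns True if the alarm was stored.
        key = (alarm.resource, alarm.name)
        existing = self.alarms.get(key)
        if existing is not None and existing.origin == DATASOURCE:
            return False
        self.alarms[key] = alarm
        return True

    def on_datasource_alarm(self, alarm: Alarm) -> None:
        # Step 4: the reported alarm replaces any deduced alarm
        # (or is simply added if nothing was there).
        self.alarms[(alarm.resource, alarm.name)] = alarm

    def on_deduced_delete(self, resource: str, name: str) -> bool:
        # Step 5: delete the alarm regardless of its origin.
        return self.alarms.pop((resource, name), None) is not None

    def on_datasource_recover(self, resource: str, name: str) -> bool:
        # Step 6: discard the recovery event if the alarm is gone.
        # Assumption: if the alarm exists, the recovery deletes it.
        return self.alarms.pop((resource, name), None) is not None
```

Because a single entry is kept per (resource, name), the timestamp and severity 
are whatever the last stored alarm carried; this sketch does not try to merge 
them.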

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
