Re: [openstack-dev] [Vitrage] About alarms reported by datasource and the alarms generated by vitrage evaluator

Afek, Ifat (Nokia - IL) Sat, 07 Jan 2017 23:02:11 -0800

Hi Yujun,

Thanks for the explanation, but I still don’t fully understand.


Let me start with the current state:
1. Introduce a flexible `metadata` dict into the ALARM entity
[Ifat] Already exists. An alarm is represented as a vertex in the entity graph, 
with a dictionary of properties.
2. Allow generating an update event[1] on metadata change
3. Allow using ALARM metadata in a scenario condition
[Ifat] Already exists. You can define properties in the ‘entities’ section in 
Vitrage templates.
4. Allow setting ALARM metadata in a scenario action

If I understand correctly, you are suggesting that one scenario will add 
metadata to an existing alarm, which will trigger an event, and as a result 
another scenario might be executed?
Can you describe a use case where this behavior will help in calculating the 
root cause?

Thanks,
Ifat.


From: Yujun Zhang <[email protected]>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 
<[email protected]>
Date: Saturday, 7 January 2017 at 09:27
To: "OpenStack Development Mailing List (not for usage questions)" 
<[email protected]>
Cc: "[email protected]" <[email protected]>, "[email protected]" 
<[email protected]>, "[email protected]" <[email protected]>, 
"[email protected]" <[email protected]>, "[email protected]" 
<[email protected]>
Subject: Re: [openstack-dev] [Vitrage] About alarms reported by datasource and 
the alarms generated by vitrage evaluator

The two questions raised by YinLiYin are actually one, i.e. how to enrich the 
alarm properties that can be used as a condition in root cause deduction.

Both 'suspect' and 'datasource' are additional information that may be referred 
to as a condition in a general fault model, a.k.a. a scenario in Vitrage.

It seems this could be done as follows:

1. Introduce a flexible `metadata` dict into the ALARM entity
2. Allow generating an update event[1] on metadata change
3. Allow using ALARM metadata in a scenario condition
4. Allow setting ALARM metadata in a scenario action

This leaves room for continued development through more complex scenario 
templates, while keeping the Vitrage evaluator simple and generic.
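
As a rough illustration of the four steps above, here is a minimal Python 
sketch. It is not Vitrage code: `Alarm`, `Scenario` and `evaluate` are 
hypothetical names invented for this example, and the real evaluator and 
template format work differently. It only shows how a metadata change made by 
one scenario action could act as an update event that lets another scenario 
condition match.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Alarm:
    """Hypothetical stand-in for an ALARM vertex; not the Vitrage class."""
    name: str
    metadata: Dict[str, object] = field(default_factory=dict)  # step 1


@dataclass
class Scenario:
    """Hypothetical scenario: a condition and an action on a single alarm."""
    name: str
    condition: Callable[[Alarm], bool]             # step 3: may read metadata
    action: Callable[[Alarm], Dict[str, object]]   # step 4: metadata to set


def evaluate(alarm: Alarm, scenarios: List[Scenario]) -> List[str]:
    """Run scenarios until no action changes the metadata any more.

    Every metadata change is treated as an update event (step 2) that
    triggers another evaluation pass. Each scenario fires at most once,
    so the loop always terminates. Returns the fired scenario names in order.
    """
    fired: List[str] = []
    changed = True
    while changed:
        changed = False
        for scenario in scenarios:
            if scenario.name in fired or not scenario.condition(alarm):
                continue
            fired.append(scenario.name)
            for key, value in scenario.action(alarm).items():
                if alarm.metadata.get(key) != value:
                    alarm.metadata[key] = value
                    changed = True  # update event on metadata change
    return fired


# Usage: the 'suspect' flag set by one scenario enables the other one.
scenarios = [
    Scenario("mark_root_cause",
             condition=lambda a: a.metadata.get("suspect") is True,
             action=lambda a: {"root_cause": True}),
    Scenario("mark_suspect",
             condition=lambda a: a.metadata.get("datasource") == "nagios",
             action=lambda a: {"suspect": True}),
]
alarm = Alarm("host_down", {"datasource": "nagios"})
fired = evaluate(alarm, scenarios)
```

In this sketch the evaluator knows nothing about 'suspect' or 'datasource'; 
those keys live only in the scenario definitions.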

My two cents.

[1]: 
http://docs.openstack.org/developer/vitrage/scenario-evaluator.html#concepts-and-guidelines

On Sat, Jan 7, 2017 at 2:23 AM Afek, Ifat (Nokia - IL) <[email protected]> wrote:
Hi YinLiYin,

This is an interesting question. Let me divide my answer into two parts.

First, the case that you described with Nagios and Vitrage. This problem 
depends on the specific Nagios tests that you configure in your system, as well 
as on the Vitrage templates that you use. For example, you can use 
Nagios/Zabbix to monitor the physical layer, and Vitrage to raise deduced 
alarms on the virtual and application layers. This way you will never have 
duplicated alarms. If you want to use Nagios to monitor the other layers as 
well, you can simply modify Vitrage templates so they don’t raise the deduced 
alarms that Nagios may generate, and use the templates to show RCA between 
different Nagios alarms.

Now let’s talk about the more general case. Vitrage can receive alarms from 
different monitors, including Nagios, Zabbix, collectd and Aodh. If you are 
using more than one monitor, it is possible that the same alarm (maybe with a 
different name) will be raised twice. We need to create a mechanism to identify 
such cases and create a single alarm with the properties of both monitors. This 
has not been designed in detail yet, so if you have any suggestions we will be 
happy to hear them.
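
To make the idea of a single alarm with the properties of both monitors more 
concrete, here is one possible sketch in Python. It is only an assumption 
about how such a mechanism might look, not an existing or planned Vitrage 
design: the `ALIASES` table, the alarm names in it and the `merge_alarms` 
function are all made up for this example.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical table mapping (monitor, monitor-specific alarm name) to a
# common alarm name. The entries are invented for illustration.
ALIASES: Dict[Tuple[str, str], str] = {
    ("nagios", "CPU load"): "high_cpu_load",
    ("zabbix", "Processor load is too high"): "high_cpu_load",
}


def merge_alarms(raw_alarms: Iterable[dict]) -> List[dict]:
    """Merge alarms that refer to the same problem on the same resource.

    Alarms are grouped by (resource, common name); the properties reported
    by each monitor are kept side by side under 'sources'.
    """
    merged: Dict[Tuple[str, str], dict] = {}
    for alarm in raw_alarms:
        common = ALIASES.get((alarm["monitor"], alarm["name"]), alarm["name"])
        key = (alarm["resource"], common)
        entry = merged.setdefault(
            key, {"name": common, "resource": alarm["resource"], "sources": {}}
        )
        entry["sources"][alarm["monitor"]] = alarm["properties"]
    return list(merged.values())


raw = [
    {"monitor": "nagios", "name": "CPU load", "resource": "host-1",
     "properties": {"severity": "WARNING"}},
    {"monitor": "zabbix", "name": "Processor load is too high",
     "resource": "host-1", "properties": {"priority": 3}},
]
alarms = merge_alarms(raw)  # one alarm for host-1 with two sources
```

The hard part in practice is the alias table itself, since someone has to 
decide which alarms from different monitors mean the same thing.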

Best Regards,
Ifat.


From: "[email protected]" <[email protected]>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 
<[email protected]>
Date: Friday, 6 January 2017 at 03:27
To: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>, "[email protected]" 
<[email protected]>, "[email protected]" <[email protected]>, 
"[email protected]" <[email protected]>, "[email protected]" 
<[email protected]>
Subject: [openstack-dev] [Vitrage] About alarms reported by datasource and the 
alarms generated by vitrage evaluator


Hi all,

Vitrage generates alarms according to the templates. All the alarms raised by 
Vitrage have the type "vitrage". Suppose Nagios has an alarm A. If alarm A is 
raised by the Vitrage evaluator according to the action part of a scenario, the 
type of alarm A is "vitrage". If Nagios reports alarm A later, a new alarm A 
with type "Nagios" is generated in the entity graph. There would then be two 
vertices for the same alarm in the graph, and we would have to define two alarm 
entities, two relationships and two scenarios in the template file to make the 
alarm propagation procedure work.

This makes it inconvenient to describe the fault model of a system with a lot 
of alarms. How can we solve this problem?
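
The duplication can be pictured with a small Python sketch. This is a 
simplified model and not the Vitrage entity graph: the `graph` dict and the 
`add_alarm` helper are hypothetical, and the sketch simply assumes that a 
vertex is identified by its type together with its name and resource.

```python
from typing import Dict, Tuple

# Hypothetical entity graph: vertices keyed by (alarm type, name, resource).
graph: Dict[Tuple[str, str, str], dict] = {}


def add_alarm(alarm_type: str, name: str, resource: str) -> None:
    """Add (or overwrite) the vertex identified by type, name and resource."""
    graph[(alarm_type, name, resource)] = {
        "type": alarm_type, "name": name, "resource": resource,
    }


add_alarm("vitrage", "A", "host-1")  # deduced by a scenario action
add_alarm("nagios", "A", "host-1")   # reported later by the monitor

# Both vertices describe the same alarm A on the same resource.
duplicates = [key for key in graph if key[1:] == ("A", "host-1")]
```

Because the type is part of the identity, the two reports never collapse into 
one vertex, and every template has to mention both.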



殷力殷 YinLiYin




上海市浦东新区碧波路889号中兴研发大楼D502
D502, ZTE Corporation R&D Center, 889# Bibo Road,
Zhangjiang Hi-tech Park, Shanghai, P.R.China, 201203
T: +86 21 68896229
M: +86 13641895907
E: [email protected]
www.zte.com.cn



__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
