Hi YinLiYin,

This is an interesting question. Let me divide my answer into two parts.
First, the case that you described with Nagios and Vitrage. Whether this problem occurs depends on the specific Nagios tests that you configure in your system, as well as on the Vitrage templates that you use. For example, you can use Nagios/Zabbix to monitor the physical layer, and Vitrage to raise deduced alarms on the virtual and application layers. This way you will never have duplicated alarms. If you want to use Nagios to monitor the other layers as well, you can simply modify the Vitrage templates so they don’t raise the deduced alarms that Nagios may generate, and use the templates to show RCA between different Nagios alarms.

Now let’s talk about the more general case. Vitrage can receive alarms from different monitors, including Nagios, Zabbix, collectd and Aodh. If you are using more than one monitor, it is possible that the same alarm (maybe with a different name) will be raised twice. We need to create a mechanism to identify such cases and create a single alarm with the properties of both monitors. This has not been designed in detail yet, so if you have any suggestions we will be happy to hear them.

Best Regards,
Ifat.

From: "[email protected]" <[email protected]>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <[email protected]>
Date: Friday, 6 January 2017 at 03:27
To: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>, "[email protected]" <[email protected]>, "[email protected]" <[email protected]>, "[email protected]" <[email protected]>, "[email protected]" <[email protected]>
Subject: [openstack-dev] [Vitrage] About alarms reported by datasource and the alarms generated by vitrage evaluator

Hi all,

Vitrage generates alarms according to the templates. All the alarms raised by Vitrage have the type "vitrage". Suppose Nagios has an alarm A. Alarm A is raised by the Vitrage evaluator according to the action part of a scenario, so the type of alarm A is "vitrage".
If Nagios reports alarm A later, a new alarm A with type "Nagios" will be generated in the entity graph. There will then be two vertices for the same alarm in the graph, and we have to define two alarm entities, two relationships, and two scenarios in the template file to make the alarm propagation procedure work. This makes it inconvenient to describe the fault model of a system with a lot of alarms. How can we solve this problem?

殷力殷 YinLiYin
D502, ZTE Corporation R&D Center, 889# Bibo Road, Zhangjiang Hi-tech Park, Shanghai, P.R.China, 201203
T: +86 21 68896229 M: +86 13641895907 E: [email protected]
www.zte.com.cn<http://www.zte.com.cn/>
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
