Hi Yujun,

See my comments inline.

Ifat.

From: Yujun Zhang <zhangyujun+...@gmail.com>
Date: Wednesday, 11 January 2017 at 12:12


I have just realized that "abstract alarm" is not a good term. What I was 
talking about is the distinction between fault and alarm.

A fault is what actually happens; an alarm is how it is detected (or deduced).


On Wed, Jan 11, 2017 at 5:13 PM Yujun Zhang <zhangyujun+...@gmail.com> wrote:

I think YinLiYin's idea is a reasonable requirement from end users. They care 
more about the real faults in the system than about how they are detected. 
Though it poses significant design and engineering challenges, it creates value 
for customers. I'm quite positive about this evolution.

[Ifat] Of course. I never argued about the need, just tried to figure out how 
we should implement it.

One possible solution would be to introduce a high-level (abstract) template 
from the user's view, then convert it to Vitrage scenario templates (or 
directly to the graph). The more sources (Nagios, Vitrage deduction) that 
report an abstract alarm, the more confident we can be that there is a real 
fault. The confidence of an alarm could then be included in the scenario 
condition.

[Ifat] I understand your idea, but I am not sure yet whether it helps with the 
use case. How would you imagine the ‘confidence’ property? As a Boolean or a 
counter? One option is ‘deduced’ vs. ‘monitored’. Another option is to count 
the number of monitors that reported it. Personally, I don’t think this is 
needed. I think that if Nagios reports an error, that is reliable enough 
without getting it from another monitor.
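
To make the two options concrete, here is a minimal sketch in Python. It is not 
Vitrage code: the Fault class, its fields, and the DEDUCED_SOURCE constant are 
hypothetical names used only for illustration. It shows that both the 
‘deduced’ vs. ‘monitored’ flag and the counter could be derived from the set of 
sources that reported the fault.

```python
from dataclasses import dataclass, field

# Hypothetical label for alarms that Vitrage deduced itself (an assumption).
DEDUCED_SOURCE = "vitrage"


@dataclass
class Fault:
    """Hypothetical record of a fault and the sources that reported it."""
    name: str
    resource: str
    sources: set = field(default_factory=set)

    def report(self, source: str) -> None:
        # Repeated reports from the same source are counted only once.
        self.sources.add(source)

    @property
    def is_monitored(self) -> bool:
        # Option 1: Boolean, true if any real monitor (not a deduction) reported it.
        return any(s != DEDUCED_SOURCE for s in self.sources)

    @property
    def source_count(self) -> int:
        # Option 2: counter, the number of distinct sources that reported it.
        return len(self.sources)


fault = Fault(name="host_down", resource="compute-1")
fault.report(DEDUCED_SOURCE)
fault.report("nagios")
print(fault.is_monitored, fault.source_count)
```

Under these assumptions a scenario condition could test either property, so 
the choice between a Boolean and a counter would not need to change what is 
stored.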


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
