Re: [openstack-dev] [vitrage] Vitrage alarm processing behavior

2018-02-22 Thread Afek, Ifat (Nokia - IL/Kfar Sava)
Hi Paul,

Unfortunately I can’t figure out from the log what went wrong. It seems like 
the ‘up’ alarms are ignored. Two things that I would try next:

· Try sending an event with 'status': 'up' and see if it works. This 
definitely works in my environment.

· I suspect that the problem is somewhere in 
AlarmDriverBase._filter_and_cache_alarms(). Basically, it should look up the old 
alarm in the cache and update it. Try adding debug messages so we can 
see the cache, the new alarm and the old alarm.
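
To illustrate the kind of debug output meant here, this is a minimal standalone sketch. It is not the actual Vitrage code: the cache layout, the alarm_key() helper and the dict fields are assumptions made up for the example. It only shows where the log line would go so that the cache, the new alarm and the old alarm are visible for every lookup.

```python
import logging

LOG = logging.getLogger(__name__)


def alarm_key(alarm):
    # Hypothetical key: alarm name plus affected host (an assumption,
    # the real driver may build its key differently).
    return (alarm["name"], alarm["hostname"])


def filter_and_cache_alarms(cache, alarms):
    """Return the alarms whose status changed, logging each cache lookup."""
    changed = []
    for alarm in alarms:
        key = alarm_key(alarm)
        old_alarm = cache.get(key)
        # The debug line: shows the key, the new alarm, the cached alarm
        # and all keys currently in the cache.
        LOG.debug("key=%s new=%s old=%s cache_keys=%s",
                  key, alarm, old_alarm, sorted(cache))
        if old_alarm is None or old_alarm["status"] != alarm["status"]:
            changed.append(alarm)
        cache[key] = alarm
    return changed
```

If the 'up' alarm and the 'down' alarm end up under different keys in such a log, the 'up' alarm would never find the old one, which is the sort of difference worth looking for.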

Let me know if that helps.
Ifat.

From: Paul Vaduva <paul.vad...@enea.com>
Date: Wednesday, 21 February 2018 at 19:11
To: "Afek, Ifat (Nokia - IL/Kfar Sava)" <ifat.a...@nokia.com>, "OpenStack 
Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org>
Cc: Ciprian Barbu <ciprian.ba...@enea.com>
Subject: RE: [openstack-dev] [vitrage] Vitrage alarm processing behavior

Hi Ifat,

Link to a trimmed version of the log:
https://hastebin.com/upokifinuq.py

The full graph.log is attached, along with driver.py including the logging 
modifications.

Thanks,
Paul


Re: [openstack-dev] [vitrage] Vitrage alarm processing behavior

2018-02-21 Thread Afek, Ifat (Nokia - IL/Kfar Sava)
Hi Paul,

I suggest that you do the following:

· Add a LOG message at the end of _get_alarms to print all alarms that 
are returned by this function

· Restart vitrage-graph and send me its log. I’d like to see if there 
is any difference between the alarm that is raised and the alarm that is 
deleted.
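
A minimal sketch of the first suggestion, assuming a driver whose _get_alarms builds a list of alarm dicts from a polling call. The class name, the poll_hosts callable and the alarm fields are hypothetical and only serve to show where the LOG message goes.

```python
import logging

LOG = logging.getLogger(__name__)


class PollingDriverSketch:
    """Hypothetical stand-in for a datasource driver, for illustration only."""

    def __init__(self, poll_hosts):
        # poll_hosts: assumed callable returning {hostname: "up" or "down"}
        self._poll_hosts = poll_hosts

    def _get_alarms(self):
        alarms = [
            {"name": "compute.host.down", "hostname": host, "status": status}
            for host, status in sorted(self._poll_hosts().items())
        ]
        # Log everything that is about to be returned, right before returning.
        LOG.debug("_get_alarms returning %d alarms: %s", len(alarms), alarms)
        return alarms
```

With debug logging enabled for vitrage-graph, each polling cycle then leaves one line that can be compared with the alarm raised by the event.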

Thanks,
Ifat.


Re: [openstack-dev] [vitrage] Vitrage alarm processing behavior

2018-02-21 Thread Paul Vaduva
Sorry forgot to add you.


Re: [openstack-dev] [vitrage] Vitrage alarm processing behavior

2018-02-21 Thread Paul Vaduva
I attached also the driver.py that I am using.


Re: [openstack-dev] [vitrage] Vitrage alarm processing behavior

2018-02-21 Thread Paul Vaduva
Hi Ifat,

Sorry for the late reply.
To answer your questions:
I started from the doctor datasource as an example (or rather a port of it to the 
1.3.0 version of vitrage), but I will call it something different, so there is no 
need to worry about conflicts with the existing doctor datasource.
I added polling alarms to it, but I have a more particular use case:
* I get a compute host down alarm on an event
* I can't get a host up event, or it would be an intricate solution to implement

I tried to see if I can make the following scenario work.
Let's call it Scenario I:
* Get a compute host down event (raising an alarm)
* Periodically poll for the status of the compute host in the method "def 
_get_alarms(self):" of the Driver object
Both types of interaction seem to work (polling and event-based).
However, now comes the tricky part. I need the alarms (with status up 
/ compute host up) returned by the method "def _get_alarms(self):" of this Driver 
object to cancel/clear the compute host down alarms raised by an event. 
Unfortunately, this does not happen.

Oddly enough, there is a variant of this scenario that works but is not robust 
enough for our needs.
Let's call it Scenario II:
* An event with compute host down arrives (when one of our compute hosts actually 
goes down)
* A polling alarm (also compute host down) is raised and somehow overwrites the 
event-based one (I can see the updated time).
* After a while the actual compute host reboots, and polling for the alarms returns 
an alarm with status up, which in this case clears the previous (I assume polling 
type now) alarm.

Now I can't understand why this second scenario works and the first one does 
not.
It seems that an alarm of the same type (compute host down with status down) 
obtained by polling can overwrite an alarm of identical type and status raised by 
an event, but an alarm with an updated status (i.e. up) obtained by polling cannot 
overwrite / clear an alarm with status down raised by an event.
I am wondering whether there is a reason for this behavior, whether there is a way 
to modify it, or whether it is a bug.

For the event generation I use a modified version of the zabbix_vitrage.py script 
that publishes to the rabbitmq 
vitrage_notifications.info queue. I have attached this python script.
The code is still experimental, but I wanted to know whether it's logically 
possible to create the scenario we need, Scenario I.

Best Regards
Paul


Re: [openstack-dev] [vitrage] Vitrage alarm processing behavior

2018-02-07 Thread Afek, Ifat (Nokia - IL/Kfar Sava)
Hi Paul,

I’m glad that my fix helped.

Regarding the Doctor datasource: the purpose of this datasource was to be used 
by the Doctor test scripts. Do you intend to modify it, or to create a new 
similar datasource that also supports polling? Modifying the existing 
datasource could be problematic, since we need to make sure the existing 
functionality and tests stay the same.

In general, most of our datasources support both polling and notifications. A 
simple example is the Cinder datasource [1]. For an example of an alarm 
datasource, you can look at the Zabbix datasource [2]. You can also go over the 
documentation on how to add a new datasource [3].

As for your question, it is the responsibility of the datasource to clear the 
alarms that it created. For the Doctor datasource, you can send an event with 
"status": "up" in the details and the datasource will clear the alarm.
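
As a rough illustration, the details of a raising event and of the matching clearing event could be built like this. Only the "status" field comes from the explanation above; the other field names and values are assumptions modelled on a Doctor-style host-down event and may differ in your setup.

```python
import json


def build_event_details(hostname, status):
    """Build the 'details' dict for a host event (field names are assumed)."""
    return {
        "hostname": hostname,
        "source": "sample_monitor",    # assumed monitor name
        "cause": "link-down",          # assumed cause
        "severity": "critical",
        "status": status,              # "down" raises, "up" clears
        "monitor_id": "monitor-1",     # assumed identifier
        "monitor_event_id": "123",     # assumed identifier
    }


raise_details = build_event_details("compute-1", "down")
clear_details = build_event_details("compute-1", "up")

# Everything except "status" stays the same, so that the clearing event
# refers to the same alarm as the raising one.
print(json.dumps(clear_details, sort_keys=True))
```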

[1] https://github.com/openstack/vitrage/tree/master/vitrage/datasources/cinder/volume
[2] https://github.com/openstack/vitrage/tree/master/vitrage/datasources/zabbix
[3] https://docs.openstack.org/vitrage/latest/contributor/add-new-datasource.html


Best Regards,
Ifat.



Re: [openstack-dev] [vitrage] Vitrage alarm processing behavior

2018-02-07 Thread Paul Vaduva
Hi Ifat,

Yes, I've checked: 1.3.1 refers to the version of a deb package (python-vitrage) 
built by us; the git tag used to build that deb is 1.3.0.
I also backported the doctor datasource from the vitrage git master branch.

I also noticed that when I configure snapshots_interval=10 I get this 
exception in 
/var/log/vitrage/graph.log around the time the alarms disappear:
https://hastebin.com/ukisajojef.sql

I've cherry-picked the change you mentioned, and the alarm that came from the 
event is now persistent and the exception is gone.
So it was a bug.
I understand that for the doctor datasource I need events both for raising the 
alarm and for clearing it. Is that correct?


Best Regards,
Paul

From: Afek, Ifat (Nokia - IL/Kfar Sava) [mailto:ifat.a...@nokia.com]
Sent: Wednesday, February 7, 2018 1:24 PM
To: OpenStack Development Mailing List (not for usage questions) 
<openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [vitrage] Vitrage alarm processing behavior

Hi Paul,

It sounds like a bug. Alarms created by a datasource are not supposed to be 
deleted later on. It might be a bug that was fixed in Queens [1].

I’m not sure which Vitrage version you are actually using. I failed to find a 
vitrage version 1.3.1. Could it be that you are referring to a version of 
python-vitrageclient or vitrage-dashboard?

In any case, if you are using an older version, I suggest that you try to use 
the fix that I mentioned [1] and see if it helps.


[1] https://review.openstack.org/#/c/524228


Best Regards,
Ifat.


From: Paul Vaduva <paul.vad...@enea.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org>
Date: Wednesday, 7 February 2018 at 11:58
To: "openstack-dev@lists.openstack.org" <openstack-dev@lists.openstack.org>
Subject: [openstack-dev] [vitrage] Vitrage alarm processing behavior

Hi Vitrage developers,

I have a question about Vitrage's inner workings. I ported the doctor 
datasource from the master branch to an earlier version of vitrage (1.3.1).
I noticed some behavior, and I am wondering whether it is OK or a bug of some 
sort.
Here it is:
1. I send an event for raising an alarm to the doctor datasource of vitrage.
2. The event is received, and the alarm is displayed on the vitrage dashboard, 
attached to the affected resource (as expected).
3. If I have configured snapshots_interval=10 in /etc/vitrage/vitrage.conf, the 
alarm disappears after a while.
Fragment from /etc/vitrage/vitrage.conf:
***
[datasources]
types = nova.host,nova.instance,nova.zone,cinder.volume,neutron.network,neutron.port,doctor
snapshots_interval=10
***
On the other hand, if I comment it out, the alarm persists:
**
[datasources]
types = nova.host,nova.instance,nova.zone,cinder.volume,neutron.network,neutron.port,doctor
#snapshots_interval=10
**
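
The difference between the two fragments can be checked with a small sketch 
like the one below. It uses Python's standard configparser only to show what 
is literally set in each fragment; it does not model how Vitrage itself loads 
its options, and the shortened 'types' value and the helper name 
snapshots_interval are assumptions made for the example.

```python
import configparser

# Shortened copies of the two fragments above (the 'types' list is trimmed).
ENABLED = """
[datasources]
types = nova.host,doctor
snapshots_interval=10
"""

DISABLED = """
[datasources]
types = nova.host,doctor
#snapshots_interval=10
"""


def snapshots_interval(text):
    """Return the configured interval, or None if the option is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.getint("datasources", "snapshots_interval", fallback=None)


print(snapshots_interval(ENABLED))
print(snapshots_interval(DISABLED))
```

With the option commented out, nothing is set in the file, so the service 
would presumably fall back to its own built-in default interval.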

I would like to know whether this behavior is correct or a bug.
My intention is to create a hybrid datasource, starting from the doctor one, 
that receives events for raising alarms such as compute.host.down but uses 
polling to clear them.
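
The hybrid idea could be sketched roughly as follows. This is not Vitrage 
code: the class HybridAlarmCache, its methods and the is_still_failing 
callback are all hypothetical names, and the sketch only shows the intended 
rule that events add alarms while a poll removes an alarm only when the 
polled state says the problem is gone.

```python
class HybridAlarmCache:
    """Hypothetical cache: events raise alarms, polling clears them."""

    def __init__(self):
        # Maps (resource, alarm name) to the details received with the event.
        self._active = {}

    def on_event(self, resource, name, details=None):
        """Raise (or refresh) an alarm when an event arrives."""
        self._active[(resource, name)] = details or {}

    def on_poll(self, is_still_failing):
        """Clear only the alarms whose condition has recovered.

        An alarm is never dropped merely because the poll did not report it.
        """
        cleared = [key for key in self._active if not is_still_failing(*key)]
        for key in cleared:
            del self._active[key]
        return cleared

    def active(self):
        return sorted(self._active)


# Usage: the host state stands in for whatever the poller would query.
host_state = {"host-1": "down"}


def check(resource, name):
    return host_state[resource] == "down"


cache = HybridAlarmCache()
cache.on_event("host-1", "compute.host.down")
first = cache.on_poll(check)    # host still down, nothing is cleared
host_state["host-1"] = "up"
second = cache.on_poll(check)   # host recovered, the alarm is cleared
print(first, second, cache.active())
```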

Best Regards,
Paul Vaduva
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [vitrage] Vitrage alarm processing behavior

2018-02-07 Thread Afek, Ifat (Nokia - IL/Kfar Sava)
Hi Paul,

It sounds like a bug. Alarms created by a datasource are not supposed to be 
deleted later on. It might be a bug that was fixed in Queens [1].

I’m not sure which Vitrage version you are actually using. I failed to find a 
vitrage version 1.3.1. Could it be that you are referring to a version of 
python-vitrageclient or vitrage-dashboard?

In any case, if you are using an older version, I suggest that you try to use 
the fix that I mentioned [1] and see if it helps.


[1] https://review.openstack.org/#/c/524228


Best Regards,
Ifat.


From: Paul Vaduva 
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 

Date: Wednesday, 7 February 2018 at 11:58
To: "openstack-dev@lists.openstack.org" 
Subject: [openstack-dev] [vitrage] Vitrage alarm processing behavior

Hi Vitrage developers,

I have a question about Vitrage's inner workings. I ported the doctor 
datasource from the master branch to an earlier version of vitrage (1.3.1).
I noticed some behavior, and I am wondering whether it is OK or a bug of some 
sort.
Here it is:
1. I send an event for raising an alarm to the doctor datasource of vitrage.
2. The event is received, and the alarm is displayed on the vitrage dashboard, 
attached to the affected resource (as expected).
3. If I have configured snapshots_interval=10 in /etc/vitrage/vitrage.conf, the 
alarm disappears after a while.
Fragment from /etc/vitrage/vitrage.conf:
***
[datasources]
types = nova.host,nova.instance,nova.zone,cinder.volume,neutron.network,neutron.port,doctor
snapshots_interval=10
***
On the other hand, if I comment it out, the alarm persists:
**
[datasources]
types = nova.host,nova.instance,nova.zone,cinder.volume,neutron.network,neutron.port,doctor
#snapshots_interval=10
**

I would like to know whether this behavior is correct or a bug.
My intention is to create a hybrid datasource, starting from the doctor one, 
that receives events for raising alarms such as compute.host.down but uses 
polling to clear them.

Best Regards,
Paul Vaduva