Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-20 Thread AFEK, Ifat (Ifat)
> -----Original Message-----
> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
> Sent: Tuesday, December 08, 2015 11:17 AM
>
> Hi Ifat,
> 
> In short, 'event' is generated in OpenStack, 'alarm' is defined by a
> user. 'event' is a container of data passed from other OpenStack
> services through OpenStack notification bus. 'event' and contained data
> will be stored in ceilometer DB and exposed via event api [1]. 'alarm'
> is pre-configured alerting rule defined by a user via alarm API [2].
> 'Alarm' also has state like 'ok' and 'alarm', and history as well.
> 
> [1]
> http://docs.openstack.org/developer/ceilometer/webapi/v2.html#events-
> and-traits
> [2] http://docs.openstack.org/developer/aodh/webapi/v2.html#alarms
> 
> 
> The point is whether we should use 'event' or 'alarm' for all failure
> representation. Maybe we can use 'event' for all raw error/fault
> notification, and use 'alarm' for exposing deduced/wrapped failure.
> This is my view, so might be wrong.
> 

Hi,

Let me summarize the issue. 

What we need in Vitrage is:

- custom alarms, where we can set metadata like: {"resource_type":"switch", 
"resource_name":"switch-2"} or {"resource_type":"nova.instance", 
"resource_id":} or {"nagios_test_name":"check_ovs_vswitchd", 
"nagios_test_status":"warning"}

- the ability to define an alarm once, and instantiate it multiple times for 
every instance

- the ability to define an alarm on-the-fly (since we can't predict all alarm 
types)

- an option to trigger the alarm from vitrage
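
To make the "define once, instantiate many times" requirement concrete, here is a minimal Python sketch. `AlarmTemplate` and `instantiate` are hypothetical names used only for illustration; they are not part of Aodh or Vitrage, and the resource ids are made up.

```python
from dataclasses import dataclass, field
from string import Template


@dataclass
class AlarmTemplate:
    """Hypothetical alarm template: defined once, instantiated per resource."""
    name_pattern: str                               # uses $placeholders
    metadata: dict = field(default_factory=dict)    # shared by all instances

    def instantiate(self, **resource):
        """Build one concrete alarm definition for a single resource."""
        return {
            "name": Template(self.name_pattern).substitute(resource),
            # Per-resource values are merged on top of the shared metadata.
            "metadata": {**self.metadata, **resource},
            "state": "ok",
        }


# One definition, instantiated for two (made-up) instances.
at_risk = AlarmTemplate(
    name_pattern="Instance $resource_id is at risk due to public switch problem",
    metadata={"resource_type": "nova.instance"},
)
alarms = [at_risk.instantiate(resource_id=i) for i in ("vm-1", "vm-2")]
```

With something like this, a concrete alarm would only need to exist once it is actually raised for a resource.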


The optimal solution for us would be to have alarm templates and alarm 
metadata. Or, we can have a workaround... The current workarounds that I see 
are:

1. Create an event-alarm on the fly for every alarm on every instance and set 
its state immediately using Aodh API. The alarm will be stored in the database, 
but this will not trigger a notification or a call to alarm-actions. The alarm 
name will have to include the resource name/id, like "Instance  is at 
risk due to public switch problem" to make it unique. This might work for 
Vitrage horizon use cases in Mitaka, but not for future use cases that will 
require alarm-actions.

2. Send notifications in order to trigger event alarms "by the book". Vitrage 
notification "Alarm: Instance is at risk due to public switch problem" with 
metadata {"switch_name":"switch-2", "instance_id":} will be converted to 
a corresponding event, then to an alarm. We will still need to create a 
different alarm for every instance. And we will have to wait until the cache is 
refreshed. 
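
The two workarounds can be sketched as request payloads, in Python. This only builds the data and sends nothing. The `event_rule` layout and the `/v2/alarms/{id}/state` path follow the Aodh v2 API as I understand it, so please treat them as assumptions to verify; the event type `vitrage.alarm.instance_at_risk` and the ids are hypothetical.

```python
import json

# Hypothetical event type that Vitrage would emit; not an existing one.
EVENT_TYPE = "vitrage.alarm.instance_at_risk"


def event_alarm_body(instance_id, switch_name):
    """Body for creating one event alarm per instance (POST /v2/alarms)."""
    return {
        # The resource id is embedded in the name to keep it unique.
        "name": f"Instance {instance_id} is at risk due to public switch problem",
        "description": f"Deduced from a failure of {switch_name}",
        "type": "event",
        "event_rule": {
            "event_type": EVENT_TYPE,
            "query": [{"field": "traits.instance_id", "type": "string",
                       "op": "eq", "value": instance_id}],
        },
        "alarm_actions": [],
    }


def set_state_request(alarm_id, state="alarm"):
    """Workaround 1: (method, path, body) that sets the alarm state directly."""
    return ("PUT", f"/v2/alarms/{alarm_id}/state", json.dumps(state))


def notification(instance_id, switch_name):
    """Workaround 2: notification that should become an event, then an alarm."""
    return {
        "event_type": EVENT_TYPE,
        "payload": {"switch_name": switch_name, "instance_id": instance_id},
    }


body = event_alarm_body("vm-1", "switch-2")
request = set_state_request("alarm-id-1")
message = notification("vm-1", "switch-2")
```

In both cases one alarm definition per instance is still needed; the difference is only in who changes the state.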


I will be happy to hear your thoughts about it.

Thanks,
Ifat.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-10 Thread AFEK, Ifat (Ifat)
Hi Ryota,

> -----Original Message-----
> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
> Sent: Tuesday, December 08, 2015 11:17 AM
>
> In short, 'event' is generated in OpenStack, 'alarm' is defined by a
> user. 'event' is a container of data passed from other OpenStack
> services through OpenStack notification bus. 'event' and contained data
> will be stored in ceilometer DB and exposed via event api [1]. 'alarm'
> is pre-configured alerting rule defined by a user via alarm API [2].
> 'Alarm' also has state like 'ok' and 'alarm', and history as well.
> 
> [1]
> http://docs.openstack.org/developer/ceilometer/webapi/v2.html#events-
> and-traits
> [2] http://docs.openstack.org/developer/aodh/webapi/v2.html#alarms
> 
> 
> The point is whether we should use 'event' or 'alarm' for all failure
> representation. Maybe we can use 'event' for all raw error/fault
> notification, and use 'alarm' for exposing deduced/wrapped failure.
> This is my view, so might be wrong.
> 

I believe Vitrage should define alarms, as we want the alarm to have
a state and history (that can be queried in the Horizon UI). Moreover, 
in the future I can imagine that some other OpenStack services might 
want to add their alarm actions to the alarms that Vitrage generated. 
I think this applies both to Vitrage deduced alarms and to alarms
that Vitrage generated as a result of, for example, Nagios test failures.
Does that make sense?

Best Regards,
Ifat.





Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-08 Thread Ryota Mibu
Hi Ifat,



> > Can we clarify use case again in terms of service role definition?
> 
> Our use cases focus on giving value to the cloud admin, who will be able to:
> 
> - view the topology of his environment, the relations between the physical, 
> virtual and applicative layer and the
> statuses of all resources
> - view the alarms history
> - view alarms about problems that Vitrage deduced could happen, even if no 
> other OpenStack component reported these
> problems (yet)
> - view RCA information about the alarms

OK, thanks.

> > Aodh provides alarming mechanism to *notify* events and situations
> > calculated from various data sources. But, original/master information
> > of resource including latest resource state is owned by other services
> > such as nova.
> >
> > So, user who wants to know current resource state to find out dead
> > resources (instances), can simply query instances via nova api. And,
> > user who wants to know when/what failure occurred can query events via
> > ceilometer api. Aodh has alarm state and history though.
> 
> I'm not sure I fully understand the difference between Aodh events and 
> alarms. If the user wants to know what failure
> occurred, is it part of Aodh events, alarms, or both?

In short, an 'event' is generated in OpenStack, while an 'alarm' is defined by a 
user. An 'event' is a container of data passed from other OpenStack services 
through the OpenStack notification bus. The 'event' and the data it contains are 
stored in the Ceilometer DB and exposed via the event API [1]. An 'alarm' is a 
pre-configured alerting rule defined by a user via the alarm API [2]. An 'alarm' 
also has a state, like 'ok' or 'alarm', and a history as well.

[1] 
http://docs.openstack.org/developer/ceilometer/webapi/v2.html#events-and-traits
[2] http://docs.openstack.org/developer/aodh/webapi/v2.html#alarms


The point is whether we should use 'event' or 'alarm' to represent all 
failures. Maybe we can use 'event' for all raw error/fault notifications, 
and use 'alarm' for exposing deduced/wrapped failures. This is my view, so I 
might be wrong.
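
To illustrate the distinction, here is a toy Python model. It is only a sketch of the concepts, not the Ceilometer or Aodh data model; the class names, fields and the `switch.failure` event type are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Generated by an OpenStack service: a record of something that happened."""
    event_type: str
    traits: dict


@dataclass
class Alarm:
    """Defined by a user: a rule, plus a current state and a history."""
    name: str
    event_type: str                      # the rule: which events it matches
    state: str = "ok"
    history: list = field(default_factory=list)

    def evaluate(self, event):
        """Switch to 'alarm' on a matching event and record the transition."""
        if event.event_type == self.event_type and self.state != "alarm":
            self.history.append((self.state, "alarm"))
            self.state = "alarm"
        return self.state


alarm = Alarm(name="switch-2 is down", event_type="switch.failure")
first = alarm.evaluate(Event("compute.instance.update", {}))
second = alarm.evaluate(Event("switch.failure", {"switch_name": "switch-2"}))
```

The event carries no state of its own; only the alarm remembers that something changed.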


Best regards,
Ryota



Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-07 Thread AFEK, Ifat (Ifat)
> -----Original Message-----
> From: Julien Danjou [mailto:jul...@danjou.info]
> Sent: Monday, December 07, 2015 12:00 PM
>
> I find it odd to have UI use cases first, as they're terribly large for an
> MVP. Unless Vitrage already exists and you have all the code figured
> out. :)

We have most of it figured out.
We have an RCA engine written in Java as proprietary CloudBand code, 
with a UI for showing the topology and RCA, and it is already working 
in production environments. 
We have decided to write a similar project in Python as part of 
the OpenStack project. Obviously, writing it in OpenStack brings up new 
challenges, which we are now trying to solve.

> > In case you haven't seen it yet, our high level architecture is on
> > Vitrage main page[2], and in the coming days we plan to document also
> > the lower level design.
> 
> I just looked at it, and it's very interesting. All the high level
> functionalities make sense and provide value. But if you try to solve
> all 5 of them at once, I'm afraid you're going to either build a monster
> (with a lot of overlap with other projects, hard to maintain, etc) or
> just crash because you'll be blocked by all other OpenStack projects.
> That's the big issue when starting to build a project on top of other
> OpenStack bricks.
> 
> Overall I'm just saying that because it's still not clear to me which
> part you're trying to solve in this thread and how we can help you.
> What can we provide in our projects, that you miss, that could help
> you, concretely? What feature do we need to work on next?
> 
> It would be awesome to have _one_ use-case described end-to-end that
> you would like to solve with Vitrage, leveraging various OpenStack
> projects, that you cannot solve right now because of missing pieces.
> Then we could start identifying these missing pieces and implement/fix
> them. :-)

We are not going to implement 5 use cases at once :-)
We will start with the physical-to-virtual mapping + a UI for visualizing 
this topology. This is the basic functionality for our next use cases.
Next, we will move to the RCA and the deduced alarms use cases. Alarm 
aggregation probably won't be implemented for Mitaka.


Let me describe the deduced alarms use case in detail.

1. Vitrage gets an alarm from Nagios about a public switch failure

2. Vitrage evaluator decides (based on its templates) that an "Instance is 
at risk due to public switch problem" alarm should be triggered for every 
instance on every host attached to this public switch

3. Vitrage notifier creates corresponding alarm definitions in Aodh 

4. Aodh stores these alarms in its database 

5. Vitrage triggers the alarms (sets their states)

6. Aodh updates the alarm states and sends notifications about the change 

7. A Horizon user queries Aodh for a list of all alarms. Aodh returns a list 
that includes the alarms that were triggered by Vitrage.

The added value of this use case is that the Cloud Admin can see that
some instances are at risk, even though their Nova statuses are ok.

For the integration with Aodh, we need the ability to create alarm
definitions that are not based on metrics, and to trigger them ourselves.
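
Step 2 of the flow above (one alarm per instance on every host attached to the failed switch) can be sketched in Python as follows. The `topology` dictionary layout and the function name are hypothetical and only stand in for whatever the Vitrage evaluator would really use.

```python
ALARM_NAME = "Instance {instance_id} is at risk due to public switch problem"


def deduce_instance_alarms(switch, topology):
    """Return one alarm definition per instance behind the failed switch.

    topology is a hypothetical mapping: {switch: {host: [instance_id, ...]}}.
    """
    alarms = []
    for host in sorted(topology.get(switch, {})):
        for instance_id in topology[switch][host]:
            alarms.append({
                "name": ALARM_NAME.format(instance_id=instance_id),
                "state": "alarm",   # set by Vitrage, not evaluated from metrics
                "metadata": {"switch_name": switch, "host": host,
                             "instance_id": instance_id},
            })
    return alarms


topology = {"switch-2": {"host-1": ["vm-1", "vm-2"], "host-2": ["vm-3"]}}
raised = deduce_instance_alarms("switch-2", topology)
```

Each returned dictionary corresponds to one alarm definition that the notifier would create in Aodh (step 3) and then trigger (step 5).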

What do you think?


Thanks for your feedback, it is very helpful! 

Ifat and Alexey.


Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-07 Thread Julien Danjou
On Mon, Dec 07 2015, AFEK, Ifat (Ifat) wrote:

> Our goal is to get as much information as we can from various data 
> sources. If you connect Nagios to telemetry project, and we can get 
> nagios alarms directly from Aodh, it would be great. Is it something 
> that you planned on doing for Mitaka?

Unfortunately nobody planned to work on a Nagios -> Ceilometer/Gnocchi
connector. That may be a good idea, and the fact that it is not planned is
not necessarily a blocker. If someone wants to jump in…

> Our current use cases focus on giving value to the cloud admin. These 
> are mostly UI use cases; the admin will be able to:
>
> - view the topology of his environment, the relations between the 
> physical, virtual and applicative layer and the statuses of all resources
> - view the alarms history (there is an existing blueprint for it[1])
> - view alarms about problems that Vitrage deduced could happen, even
> if no other OpenStack component reported these problems (yet)
> - view RCA information about the alarms

I find it odd to have UI use cases first, as they're terribly large for an
MVP. Unless Vitrage already exists and you have all the code figured
out. :)

The way I see the big picture, Vitrage should be done as some sort of
engine on top of Ceilometer/Gnocchi/Aodh and leverage them to do RCA.
So whatever is missing in those projects to make that happen should
be done, and Vitrage should start as an MVP; and then we can iterate,
both on the Vitrage side and on the telemetry projects.

I have the feeling that you're trying to bite off too large a portion at
once and that you may crash because of that.

> In order to support these use cases, we will get input from various 
> data sources, process and evaluate it based on configurable templates, 
> trigger new alarms in Aodh and calculate RCA information. 
> On top of it, we will have Vitrage API to query the information and
> show it in horizon. 
> In case you haven't seen it yet, our high level architecture is on 
> Vitrage main page[2], and in the coming days we plan to document also 
> the lower level design.

I just looked at it, and it's very interesting. All the high level
functionalities make sense and provide value. But if you try to solve
all 5 of them at once, I'm afraid you're going to either build a monster
(with a lot of overlap with other projects, hard to maintain, etc) or
just crash because you'll be blocked by all other OpenStack projects.
That's the big issue when starting to build a project on top of other
OpenStack bricks.

Overall I'm just saying that because it's still not clear to me which
part you're trying to solve in this thread and how we can help you. What
can we provide in our projects, that you miss, that could help you,
concretely? What feature do we need to work on next?

It would be awesome to have _one_ use-case described end-to-end that you
would like to solve with Vitrage, leveraging various OpenStack projects,
that you cannot solve right now because of missing pieces. Then we could
start identifying these missing pieces and implement/fix them. :-)

-- 
Julien Danjou
;; Free Software hacker
;; https://julien.danjou.info




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-06 Thread AFEK, Ifat (Ifat)
Hi Julien, 

> -----Original Message-----
> From: Julien Danjou [mailto:jul...@danjou.info]
> Sent: Thursday, December 03, 2015 4:27 PM
>
> I think that I would be more interested by connecting Nagios to
> Ceilometer/Gnocchi/Aodh with maybe the long-term goal of replacing it
> by that stack, which should be more scalable and dynamic.
> 
> That would make Vitrage only needing to build on top of telemetry
> projects. It would also bring Nagios & co to telemetry not only for
> Vitrage, but for the whole stack.
> 
> Maybe there are some good reasons you're going the way you do; I don't
> have the pretension to have thought about that as long as you probably
> did. :-)

Our goal is to get as much information as we can from various data 
sources. If you connect Nagios to telemetry project, and we can get 
nagios alarms directly from Aodh, it would be great. Is it something 
that you planned on doing for Mitaka?

> Do you have something like a MVP based on Telemetry you target? I saw
> you were already talking about Horizon, which to me is something that
> (sh|c)ould be way further into the pipeline, so I'm worried. ;)

Our current use cases focus on giving value to the cloud admin. These 
are mostly UI use cases; the admin will be able to:

- view the topology of his environment, the relations between the 
physical, virtual and applicative layer and the statuses of all resources
- view the alarms history (there is an existing blueprint for it[1])
- view alarms about problems that Vitrage deduced could happen, even
if no other OpenStack component reported these problems (yet)
- view RCA information about the alarms

In order to support these use cases, we will get input from various 
data sources, process and evaluate it based on configurable templates, 
trigger new alarms in Aodh and calculate RCA information. 
On top of it, we will have a Vitrage API to query the information and
show it in Horizon. 
In case you haven't seen it yet, our high level architecture is on the 
Vitrage main page[2], and in the coming days we plan to document 
the lower level design as well.

Best Regards,
Ifat.


[1] 
https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page
[2] https://wiki.openstack.org/wiki/Vitrage



Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-06 Thread AFEK, Ifat (Ifat)
Hi Ryota,

> -----Original Message-----
> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
> Sent: Friday, December 04, 2015 9:42 AM
> 
> > The next step can happen if and when Aodh supports alarm templates.
> > If Vitrage can handle about 30 alarm types, and there are 100
> > instances, we don't want to pre-configure 3000 alarms, which most
> likely will never be triggered.
> 
> I understand your concern. Aodh is user facing service, so having lots
> of alarms doesn't make sense.
> 
> Can we clarify use case again in terms of service role definition?

Our use cases focus on giving value to the cloud admin, who will be 
able to:

- view the topology of his environment, the relations between the 
physical, virtual and applicative layer and the statuses of all resources
- view the alarms history
- view alarms about problems that Vitrage deduced could happen, even
if no other OpenStack component reported these problems (yet)
- view RCA information about the alarms

> 
> Aodh provides alarming mechanism to *notify* events and situations
> calculated from various data sources. But, original/master information
> of resource including latest resource state is owned by other services
> such as nova.
> 
> So, user who wants to know current resource state to find out dead
> resources (instances), can simply query instances via nova api. And,
> user who wants to know when/what failure occurred can query events via
> ceilometer api. Aodh has alarm state and history though.

I'm not sure I fully understand the difference between Aodh events and 
alarms. If the user wants to know what failure occurred, is it part of 
Aodh events, alarms, or both?

> > > OK. The 'combination' type alarm enables you to aggregate multiple
> > > alarm to one alarm. This can be used when you want to receive alarm
> > > when the both of physical NIC ports are downed to recognize logical
> > > connection unavailability if the ports are teamed for redundancy.
> > > Now, the combination alarms are evaluated periodically that means
> > > you can receive combination alarm not on-the-fly while you are
> using
> > > event alarms as source of combination alarm though.
> >
> > I think I understand your point. It means that certain alarms will
> > arrive to Vitrage in delay, due to your evaluation policy. I think we
> will have to address this issue at some point, but it won't change our
> overall design.
> 
> Yes, I'm just curious if there is any user can get benefit from this
> improvement to set priority.

I don't see a need for that improvement in our current use cases. I'm not so
sure about future use cases, so I will keep this limitation in mind.

Best Regards,
Ifat.





Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-03 Thread Ryota Mibu
Hi Ifat,


> > > Let me see if I got this right: are you suggesting that we create
> > > on-the-fly alarm definitions with no alarm_actions, for every
> > > deduced
> > alarm that we want to raise? And this will spare us the extra alarm
> > evaluation in AODH?
> >
> > Yes. But, please note that could be the first step. The next step
> > would be make vitrage to send out alarm event to ceilometer/aodh the
> > pre- configured event alarm will recognize the alarm and fire the
> > alarm notification to another service or an end user. Eventually, we
> > should have relevant alarm type and evaluator to proxy evaluation in
> > vitrage, I think.
> 
> The next step can happen if and when Aodh supports alarm templates.
> If Vitrage can handle about 30 alarm types, and there are 100 instances, we 
> don't want to pre-configure 3000 alarms,
> which most likely will never be triggered.


I understand your concern. Aodh is a user-facing service, so having lots of 
alarms doesn't make sense.

Can we clarify the use case again in terms of service role definition?

Aodh provides an alarming mechanism to *notify* about events and situations 
calculated from various data sources. But the original/master information about 
a resource, including the latest resource state, is owned by other services such 
as Nova.

So, a user who wants to know the current resource state to find dead resources 
(instances) can simply query instances via the Nova API. And a user who wants to 
know when a failure occurred, and what it was, can query events via the 
Ceilometer API. Aodh has alarm state and history, though.



> > > Another question is our need to get alarms from other sources, like
> > > Nagios, zabbix, ganglia, etc. We thought that Vitrage would query
> > > these Alarms from each source directly, and then create alarms in
> > AODH in the same way as our deduced alarms: for example create
> > nagios_ovs_vswitchd alarm if nagios check_ovs_vswitchd test failed.
> > > An alternative could be to integrate nagios directly with AODH.
> > > What do you think?
> >
> > Hmm, I don't have clear view on this. If the source can includes
> > OpenStack IDs and can be generate relevant meter/sample, it should be
> > useful to integrate with ceilometer. But if you want to do some
> > operations (like correlation), then it is reasonable to integrate with
> > vitrage.
> 
> The source may include alarms on resources that are not defined in OpenStack, 
> like switches or ports. And the alarms
> are not necessarily related to meters; they can be Nagios test failures, for 
> example.


Yes, so it depends on the type of resource and its parameters.



> > > > BTW, is it useful to have on-the-fly evaluation of combination
> > alarm
> > > > with event alarms for alarm aggregation or other cases?
> > >
> > > I'm not sure I understand. Can you give a detailed example?
> >
> > OK. The 'combination' type alarm enables you to aggregate multiple
> > alarm to one alarm. This can be used when you want to receive alarm
> > when the both of physical NIC ports are downed to recognize logical
> > connection unavailability if the ports are teamed for redundancy. Now,
> > the combination alarms are evaluated periodically that means you can
> > receive combination alarm not on-the-fly while you are using event
> > alarms as source of combination alarm though.
> 
> I think I understand your point. It means that certain alarms will arrive to 
> Vitrage in delay, due to your evaluation
> policy. I think we will have to address this issue at some point, but it 
> won't change our overall design.

Yes, I'm just curious whether any user would benefit from this 
improvement, so that we can set its priority.



> > > In addition, in Vitrage we plan to handle alarm aggregation by
> > > creating aggregation rule templates, for example based on the RCA
> > information.
> > > The user will be able to see only the root cause alarms, and then
> > > drill down to all specific alarms. But I doubt if this will be done
> > for Mitaka.
> >
> > I think 'the RCA information' means information for RCA. I mean
> > vitrage will use the resource topologies or relationship in
> > aggregation, rather than result of RCA. Am I right?
> 
> The term "aggregation" is used in different contexts, which may be confusing. 
> Our plan is to examine the already-computed
> RCA information, and see, for example, that a switch failure alarm caused 
> alarms on 100 related instances. In horizon,
> the result will be 101 alarms shown to the user in a flat list.
> By "alarm aggregation based on RCA" we mean that we will have an API to get 
> root cause alarms, which will return only
> the switch alarm. The horizon user will see one alarm, and may then ask to 
> expand the view and see all the other alarms
> that were caused by it.

I see. I used the term "aggregation" for the aggregation process in alarm 
evaluation.
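
The RCA-based aggregation described in the quoted text (101 alarms in the flat list, one root cause alarm, expandable to the other 100) can be sketched like this in Python. The `caused_by` mapping and both function names are hypothetical; the real Vitrage API may look quite different.

```python
def root_cause_alarms(alarms, caused_by):
    """Return only the alarms that were not caused by another alarm."""
    return [alarm for alarm in alarms if alarm not in caused_by]


def expand(root, caused_by):
    """Return the alarms directly caused by the given root cause alarm."""
    return sorted(a for a, cause in caused_by.items() if cause == root)


# A switch failure that caused alarms on 100 related instances.
switch_alarm = "switch-2 failure"
instance_alarms = [f"instance-{n:03d} at risk" for n in range(100)]
flat_list = [switch_alarm] + instance_alarms
caused_by = {alarm: switch_alarm for alarm in instance_alarms}

roots = root_cause_alarms(flat_list, caused_by)
details = expand(switch_alarm, caused_by)
```

Here `flat_list` is what Horizon would show without aggregation, `roots` is the aggregated view, and `details` is what the user gets when expanding the switch alarm.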



Thanks,
Ryota



Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-03 Thread Ryota Mibu
Hi Ifat,


> > One approach we can take, is that you configure aodh to pass each raw
> > event (e.g. each VM downed) wrapped in alarm notification to vitrage,
> > then do some operation (e.g. deducing, aggregating) and store
> > resource- level alarm without any alarm_actions, so that users can see
> > the alarms in horizon view. This may not require alarm evaluation, so
> > we can forget the problem I raised (cache refresh interval).
> 
> Let me see if I got this right: are you suggesting that we create on-the-fly 
> alarm definitions with no alarm_actions,
> for every deduced alarm that we want to raise? And this will spare us the 
> extra alarm evaluation in AODH?

Yes. But please note that this could be just the first step. The next step would 
be to make Vitrage send out an alarm event to Ceilometer/Aodh, so that a 
pre-configured event alarm recognizes it and fires the alarm notification to 
another service or an end user. Eventually, we should have a relevant alarm type 
and evaluator to proxy evaluation in Vitrage, I think.


> My next question is how exactly we should create these resource-level alarms. 
> Can we create an alarm definition with
> no rule, no actions, and initial state set to "alarm"? (I'm not sure it can 
> be done in the current AODH API)

You can. This is not the proper way of using Aodh, though. But it is an easy way 
to create an alarm entry and show it in Horizon.


> Another question is our need to get alarms from other sources, like Nagios, 
> zabbix, ganglia, etc. We thought that
> Vitrage would query these Alarms from each source directly, and then create 
> alarms in AODH in the same way as our
> deduced alarms: for example create nagios_ovs_vswitchd alarm if nagios 
> check_ovs_vswitchd test failed.
> An alternative could be to integrate nagios directly with AODH.
> What do you think?

Hmm, I don't have a clear view on this. If the source can include OpenStack IDs 
and can generate relevant meters/samples, it should be useful to integrate it 
with Ceilometer. But if you want to do some operations (like correlation), then 
it is reasonable to integrate it with Vitrage.


> > BTW, is it useful to have on-the-fly evaluation of combination alarm
> > with event alarms for alarm aggregation or other cases?
> 
> I'm not sure I understand. Can you give a detailed example?

OK. The 'combination' type alarm enables you to aggregate multiple alarms into 
one alarm. This can be used when you want to receive an alarm only when both 
physical NIC ports are down, to recognize that the logical connection is 
unavailable when the ports are teamed for redundancy. Currently, combination 
alarms are evaluated periodically, which means you do not receive a combination 
alarm on-the-fly, even though you are using event alarms as its sources.
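
The teamed NIC example can be sketched in Python as follows. This is a simplified model of combining alarm states with an 'and' or 'or' operator, not Aodh's evaluator; the function name and the handling of an empty source list are assumptions.

```python
def evaluate_combination(operator, source_states):
    """Combine the states of the source alarms into one state."""
    if not source_states:
        return "insufficient data"
    firing = [state == "alarm" for state in source_states]
    if operator == "and":
        combined = all(firing)
    elif operator == "or":
        combined = any(firing)
    else:
        raise ValueError(f"unsupported operator: {operator}")
    return "alarm" if combined else "ok"


# Two teamed NIC ports: the logical link is lost only when both are down.
one_port_down = evaluate_combination("and", ["alarm", "ok"])
both_ports_down = evaluate_combination("and", ["alarm", "alarm"])
```

Because this evaluation runs on a periodic interval, the combined state changes only at the next evaluation after the second port alarm fires, not at the moment its event arrives.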

> > Horizon view is the different topic. Maybe we can reduce the number of
> > alarms listed in user view by creating raw alarms in admin space that
> > is not visible from end user, or using relevant severity or tag so
> > that user can filter out uninterested alarms.
> 
> Referring to this[1] blueprint, do you have specific concerns regarding the 
> usability/performance of Horizon view
> when there are many alarms?
> I think that your ideas make sense, and we can implement them if there is a 
> need.

Sorry, I'm not familiar with Horizon these days... But if you need a change on 
the Aodh side, I can help you.


> In addition, in Vitrage we plan to handle alarm aggregation by creating 
> aggregation rule templates, for example based
> on the RCA information.
> The user will be able to see only the root cause alarms, and then drill down 
> to all specific alarms. But I doubt if
> this will be done for Mitaka.

I think 'the RCA information' means information for RCA. I mean, Vitrage will 
use the resource topologies or relationships in aggregation, rather than the 
result of RCA. Am I right?


Best regards,
Ryota



Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-03 Thread Julien Danjou
On Thu, Dec 03 2015, AFEK, Ifat (Ifat) wrote:

> Another question is our need to get alarms from other sources, like 
> Nagios, zabbix, ganglia, etc. We thought that Vitrage would query these 
> Alarms from each source directly, and then create alarms in AODH in the 
> same way as our deduced alarms: for example create nagios_ovs_vswitchd 
> alarm if nagios check_ovs_vswitchd test failed. 
> An alternative could be to integrate nagios directly with AODH. 
> What do you think?

I think I'd like to be able to answer this question, but I kind of lack
the bigger picture of what you need these alarms for, and what you would
like to do with them.

I think we don't have everything right now in Ceilometer/Gnocchi/Aodh to
replace something like Nagios _but_ we have a base framework that should
be more powerful and way more scalable. That could be leveraged to build
something better than Nagios, while staying compatible.

What Nagios does is polling, storing state, and taking action based on
that state. Which is more or less what Ceilometer does (polling),
Gnocchi does (storing things) and Aodh does (triggering action based on
the state). Obviously there's more to it (e.g. dependencies) that is
not handled currently, and that could be added later – maybe in some
parts of the current telemetry projects, or maybe in Vitrage.

So how to fit such tools (Nagios, Zabbix, whatever) into those projects
is an interesting problem. But I'm not clear on the first steps and
how/why you want to leverage alarms first. :)

-- 
Julien Danjou
# Free Software hacker
# https://julien.danjou.info




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-03 Thread Julien Danjou
On Thu, Dec 03 2015, AFEK, Ifat (Ifat) wrote:

> One of Vitrage's goals is to gather information from different layers - 
> Physical, virtual and applicative - create a topology tree with the 
> Relations between the different entities in all layers, and perform 
> alarm analysis based on this topology.
>
> Currently, we can get alarms on the virtual layer from Ceilometer, and 
> alarms on the physical layer from Nagios for example. We can then try
> to correlate all these alarms, compute RCA, and optionally trigger other
> alarms, for example that an application might be running in suboptimal 
> state due to cpu threshold alarm on the instance.  

You can't really say that Nagios is for hardware and Ceilometer is for
virtual. This may be the way you view or deploy things, but this is not
the reality. We have plugins to check hardware (SNMP, IPMI…) in
Ceilometer, and I'm sure you can configure Nagios to check OpenStack
resources.

My point is that there is no hard line between the tools. They both
exist, and it's OK to use both of them – they do different things and
things differently – but how you make them work together isn't clear.

> We didn't suggest that Ceilometer will replace Nagios, rather that 
> Ceilometer might get Nagios test results as input/events, and trigger
> Corresponding alarms. Since right now Nagios and Ceilometer are not 
> connected, we thought that at the first stage we will query alarms 
> separately from Ceilometer and from Nagios. 
>
> Is it more clear?

Yes it is, thanks!

I think that I would be more interested in connecting Nagios to
Ceilometer/Gnocchi/Aodh, with maybe the long-term goal of replacing it
with that stack, which should be more scalable and dynamic.

That would mean Vitrage only needs to build on top of the telemetry
projects. It would also bring Nagios & co to telemetry not only for
Vitrage, but for the whole stack.

Maybe there are good reasons you're going the way you are; I don't
pretend to have thought about this as long as you probably have. :-)

Though I think there's value in what you're trying to do, so it'd be
cool to be able to move you forward. That's why I keep insisting that
the current telemetry stuff should be able to solve as many of your
problems as it can!

Do you have something like an MVP based on Telemetry that you are
targeting? I saw you were already talking about Horizon, which to me is
something that (sh|c)ould be way further down the pipeline, so I'm
worried. ;)

-- 
Julien Danjou
# Free Software hacker
# https://julien.danjou.info




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-03 Thread AFEK, Ifat (Ifat)
Hi Julien,

> From: Julien Danjou [mailto:jul...@danjou.info]
> Sent: Thursday, December 03, 2015 10:53 AM
> 
> On Thu, Dec 03 2015, AFEK, Ifat (Ifat) wrote:
> 
> > Another question is our need to get alarms from other sources, like
> > Nagios, zabbix, ganglia, etc. We thought that Vitrage would query
> > these Alarms from each source directly, and then create alarms in
> AODH
> > in the same way as our deduced alarms: for example create
> > nagios_ovs_vswitchd alarm if nagios check_ovs_vswitchd test failed.
> > An alternative could be to integrate nagios directly with AODH.
> > What do you think?
> 
> I think I'd like to be able to answer this question, but I kind of lack
> the bigger picture of what you need these alarms for, and what you
> would like them to do with?
> 
> I think we don't have everything right now in Ceilometer/Gnocchi/Aodh
> to replace something like Nagios _but_ we have a base framework that
> should be more powerful and way more scalable. That could be leveraged
> to built something better that Nagios, while staying compatible.
> 
> What Nagios does is polling, storing state, and doing action based on
> that state. Which is more or less what Ceilometer does (polling),
> Gnocchi does (storing things) and Aodh does (triggering action based on
> the state). Obviously there's more to that (e.g. dependencies) that are
> not handled currently, and that could be added later – maybe in some
> parts of the current telemetry projects, or maybe in Vitrage.
> 
> So how fitting such tools (Nagios, Zabbix, whatever) in those projects
> is an interesting problem. But I'm not clear on the first steps and
> how/why you want to leverage alarms first. :)

One of Vitrage's goals is to gather information from different layers
(physical, virtual and application), create a topology tree with the
relations between the different entities in all layers, and perform
alarm analysis based on this topology.

Currently, we can get alarms on the virtual layer from Ceilometer, and
alarms on the physical layer from Nagios, for example. We can then try
to correlate all these alarms, compute RCA (root cause analysis), and
optionally trigger other alarms, for example that an application might
be running in a suboptimal state due to a CPU threshold alarm on the
instance.
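
As a rough illustration of what we mean by analysis based on the
topology, here is a small Python sketch. It is not Vitrage code: the
TOPOLOGY dictionary, the resource names and the alarm fields are made
up, and a real topology would be a richer graph than a plain dict.

```python
# Hypothetical topology: each resource maps to the resources it hosts.
TOPOLOGY = {
    "switch-2": ["host-1", "host-2"],
    "host-1": ["instance-1", "instance-2"],
    "host-2": ["instance-8"],
}


def descendants(topology, root):
    """Return every resource below `root` in the topology, sorted by name."""
    found, stack = [], list(topology.get(root, []))
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(topology.get(node, []))
    return sorted(found)


def deduce_alarms(topology, source_alarm):
    """Derive one alarm per dependent resource from a single source alarm."""
    return [
        {
            "resource": resource,
            "name": "suboptimal_due_to_" + source_alarm["name"],
            "caused_by": source_alarm["resource"],
        }
        for resource in descendants(topology, source_alarm["resource"])
    ]


cpu_alarm = {"resource": "host-1", "name": "high_cpu_load"}
deduced = deduce_alarms(TOPOLOGY, cpu_alarm)
```

Here a CPU alarm on host-1 yields one deduced alarm for each of the two
instances running on it, each pointing back at the host as its cause.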

We didn't suggest that Ceilometer will replace Nagios, but rather that
Ceilometer might get Nagios test results as input/events and trigger
corresponding alarms. Since Nagios and Ceilometer are not connected
right now, we thought that in the first stage we would query alarms
separately from Ceilometer and from Nagios.

Is it clearer now?

Best Regards,
Ifat.




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-03 Thread AFEK, Ifat (Ifat)
Hi Ryota,

> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
> >
> > Let me see if I got this right: are you suggesting that we create
> > on-the-fly alarm definitions with no alarm_actions, for every deduced
> alarm that we want to raise? And this will spare us the extra alarm
> evaluation in AODH?
> 
> Yes. But, please note that could be the first step. The next step would
> be make vitrage to send out alarm event to ceilometer/aodh the pre-
> configured event alarm will recognize the alarm and fire the alarm
> notification to another service or an end user. Eventually, we should
> have relevant alarm type and evaluator to proxy evaluation in vitrage,
> I think.

The next step can happen if and when Aodh supports alarm templates. 
If Vitrage can handle about 30 alarm types, and there are 100 instances, 
we don't want to pre-configure 3000 alarms, which most likely will never 
be triggered.
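
To put numbers on this, here is a small Python sketch comparing
pre-configuration with creating an alarm only when it first fires. The
LazyAlarms class is purely hypothetical; Aodh has no alarm template
feature today, which is exactly the gap we are describing.

```python
ALARM_TYPES = [f"alarm_type_{i}" for i in range(30)]   # hypothetical alarm types
INSTANCES = [f"instance-{i}" for i in range(100)]      # hypothetical instance IDs


def preconfigured_count(alarm_types, instances):
    """Number of alarms needed if every (type, instance) pair is created up front."""
    return len(alarm_types) * len(instances)


class LazyAlarms:
    """Hypothetical template store: an alarm exists only once it has been raised."""

    def __init__(self, alarm_types):
        self.templates = set(alarm_types)
        self.active = {}

    def raise_alarm(self, alarm_type, instance):
        if alarm_type not in self.templates:
            raise KeyError(alarm_type)
        self.active[(alarm_type, instance)] = "alarm"
        return len(self.active)


lazy = LazyAlarms(ALARM_TYPES)
lazy.raise_alarm("alarm_type_3", "instance-7")
lazy.raise_alarm("alarm_type_3", "instance-42")
```

With 30 types and 100 instances, pre-configuration needs 3000 alarm
definitions, while the lazy variant holds only the two that were
actually raised.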

> > Another question is our need to get alarms from other sources, like
> > Nagios, zabbix, ganglia, etc. We thought that Vitrage would query
> > these Alarms from each source directly, and then create alarms in
> AODH in the same way as our deduced alarms: for example create
> nagios_ovs_vswitchd alarm if nagios check_ovs_vswitchd test failed.
> > An alternative could be to integrate nagios directly with AODH.
> > What do you think?
> 
> Hmm, I don't have clear view on this. If the source can includes
> OpenStack IDs and can be generate relevant meter/sample, it should be
> useful to integrate with ceilometer. But if you want to do some
> operations (like correlation), then it is reasonable to integrate with
> vitrage.

The source may include alarms on resources that are not defined in
OpenStack, like switches or ports. And the alarms are not necessarily
related to meters; they can be Nagios test failures, for example.

> > > BTW, is it useful to have on-the-fly evaluation of combination
> alarm
> > > with event alarms for alarm aggregation or other cases?
> >
> > I'm not sure I understand. Can you give a detailed example?
> 
> OK. The 'combination' type alarm enables you to aggregate multiple
> alarm to one alarm. This can be used when you want to receive alarm
> when the both of physical NIC ports are downed to recognize logical
> connection unavailability if the ports are teamed for redundancy. Now,
> the combination alarms are evaluated periodically that means you can
> receive combination alarm not on-the-fly while you are using event
> alarms as source of combination alarm though.

I think I understand your point. It means that certain alarms will
arrive at Vitrage with a delay, due to your evaluation policy. I think
we will have to address this issue at some point, but it won't change
our overall design.

> > In addition, in Vitrage we plan to handle alarm aggregation by
> > creating aggregation rule templates, for example based on the RCA
> information.
> > The user will be able to see only the root cause alarms, and then
> > drill down to all specific alarms. But I doubt if this will be done
> for Mitaka.
> 
> I think 'the RCA information' means information for RCA. I mean vitrage
> will use the resource topologies or relationship in aggregation, rather
> than result of RCA. Am I right?

The term "aggregation" is used in different contexts, which may be
confusing. Our plan is to examine the already-computed RCA information
and see, for example, that a switch failure alarm caused alarms on 100
related instances. In Horizon, without aggregation, the result would be
101 alarms shown to the user in a flat list.
By "alarm aggregation based on RCA" we mean that we will have an API to
get root cause alarms, which will return only the switch alarm. The
Horizon user will see one alarm, and may then ask to expand the view
and see all the other alarms that were caused by it.
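
A minimal Python sketch of that behaviour, assuming the RCA result is
available as a simple mapping from each alarm to the alarm that caused
it. The function names and alarm IDs are invented; this is not the
planned Vitrage API.

```python
def build_causes():
    """Hypothetical RCA result: alarm id -> id of its cause (None for a root)."""
    causes = {"switch-2/down": None}
    for i in range(100):
        causes[f"instance-{i}/unreachable"] = "switch-2/down"
    return causes


def root_cause_alarms(causes):
    """Aggregated view: only alarms that were not caused by another alarm."""
    return sorted(alarm for alarm, parent in causes.items() if parent is None)


def expand(causes, root):
    """Drill-down view: alarms directly caused by `root`."""
    return sorted(alarm for alarm, parent in causes.items() if parent == root)


causes = build_causes()
roots = root_cause_alarms(causes)
children = expand(causes, roots[0])
```

The flat list has 101 entries, the aggregated view has one (the switch
alarm), and expanding it gives the 100 instance alarms.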

Best Regards,
Ifat.







Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-02 Thread Julien Danjou
On Wed, Dec 02 2015, AFEK, Ifat (Ifat) wrote:

> As we understand it, if we take the first approach you describe, then we can
> have an alarm refer to all the VMs in the system, but then if the alarm is
> triggered by one VM or by five VMs, the result will be the same - only one
> alarm will be active. What we want is to be able to distinguish between the
> different VMs - to know which alarms were triggered on each specific VM.
>
> One of the motivations for this is that in Horizon we would like to display
> all the alarms, where we would like to be able to see that a problem occurred
> on instance1, instance2 and instance8, not just that there was a problem on
> some VMs out of a group.

Ok, that's clearer.

> Can this be supported without defining an alarm for every VM separately?

No, it's not possible. You'd have to create the alarm for each instance
for now.

Honestly, I'd say start with this as a first step, and if it starts
becoming a problem, we can envision a better way to define some sort of
alarm template, for example, in Aodh. I wouldn't put the cart before the
horse.

> This is what Ryota Mibu wrote us:
>
>> The reason is that aodh evaluator may not be aware of new alarm
>> definitions and won't send notification until its alarm definition
>> cache is refreshed in less than 60 sec (default value).
>
> Did we misunderstand?

Oh no, but I thought you were referring to alarm creation being slow.
This is a cache; you can lower its refresh interval to 1s if you want,
with the potential performance impact that may have. :)
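
To show the trade-off, here is a toy Python simulation of a definition
cache with a refresh interval. It is not the Aodh evaluator; the
AlarmDefinitionCache class, the one-tick-per-second loop and the numbers
are assumptions made only for this sketch.

```python
class AlarmDefinitionCache:
    """Toy cache of alarm definitions, reloaded at most once every `ttl` seconds."""

    def __init__(self, load, ttl):
        self._load = load          # callable returning current definitions (the "database")
        self._ttl = ttl
        self._loaded_at = None
        self._defs = []
        self.reloads = 0

    def get(self, now):
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._defs = list(self._load())   # snapshot, so later DB changes stay invisible
            self._loaded_at = now
            self.reloads += 1
        return self._defs


def simulate(ttl, created_at=5, horizon=120):
    """Return (seconds until a new definition is visible, number of DB reloads)."""
    db = []
    cache = AlarmDefinitionCache(lambda: db, ttl)
    seen_at = None
    for now in range(horizon):
        if now == created_at:
            db.append("new-alarm")
        if "new-alarm" in cache.get(now) and seen_at is None:
            seen_at = now
    return seen_at - created_at, cache.reloads
```

In this model, simulate(60) returns (55, 2): the new definition is seen
55 seconds late after only 2 reloads. simulate(1) returns (0, 120): no
visible delay, but the database is read on every tick.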

-- 
Julien Danjou
;; Free Software hacker
;; https://julien.danjou.info




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-02 Thread AFEK, Ifat (Ifat)
Hi Julien,

Please see our questions below.

Ifat and Elisha.

> -Original Message-
> From: Julien Danjou [mailto:jul...@danjou.info]
> 
> On Wed, Dec 02 2015, ROSENSWEIG, ELISHA (ELISHA) wrote:
> > Regarding the second point: Say we have 30 different types of alarms
> > we might want to raise on an OpenStack instance (VM). What I
> > understand from your explanation is that when we create a new
> > instance, we need to create 30 new alarms in Aodh that can be
> > triggered some time in the future. If we have 100 instances, we will
> > effectively have 3,000 alarms created in Aodh, and so on with more
> instances.
> 
> Not necessarily. You can create one alarm that has conditions large
> enough to match e.g. all your VMs, and an alarm action that can be
> generic enough so that it will do the right thing for each VM.
> 

As we understand it, if we take the first approach you describe, then we can 
have an alarm refer to all the VMs in the system, but then if the alarm is 
triggered by one VM or by five VMs, the result will be the same - only one 
alarm will be active. What we want is to be able to distinguish between the 
different VMs - to know which alarms were triggered on each specific VM.

One of the motivations for this is that in Horizon we would like to display all 
the alarms, where we would like to be able to see that a problem occurred on 
instance1, instance2 and instance8, not just that there was a problem on some 
VMs out of a group. 

Can this be supported without defining an alarm for every VM separately?
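
To illustrate the difference we are asking about, here is a small
Python sketch. It does not use Aodh at all; both functions are
hypothetical and only model what state the user can observe in each
case.

```python
def broad_alarm_state(failing_vms):
    """One alarm covering all VMs: a single state, however many VMs fail."""
    return "alarm" if failing_vms else "ok"


def per_vm_alarm_states(all_vms, failing_vms):
    """One alarm per VM: the states identify each affected VM."""
    failing = set(failing_vms)
    return {vm: ("alarm" if vm in failing else "ok") for vm in all_vms}


ALL_VMS = ["instance1", "instance2", "instance3", "instance8"]
one_failure = broad_alarm_state(["instance1"])
three_failures = broad_alarm_state(["instance1", "instance2", "instance8"])
detailed = per_vm_alarm_states(ALL_VMS, ["instance1", "instance2", "instance8"])
```

With the broad alarm, one failing VM and three failing VMs look exactly
the same. With per-VM alarms we can show that instance1, instance2 and
instance8 are affected and instance3 is not.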

> The alarm system provided by Aodh is really a simple event -> trigger
> system in this area. How precise or large is your event really depends
> on the granularity that your trigger (which is usually a Web hook) can
> handle.
> 
> > A different approach might be to create a new alarm in Aodh on-the-
> fly.
> > However, we are under the impression that the creation time can be up
> > to one minute, which will cause a large delay. Is there any way to
> shorten this?
> 
> Creation time of an alarm of one minute? That's not normal. It should
> consist of just a record in the database so it should be pretty fast.
> 

This is what Ryota Mibu wrote to us:

> The reason is that aodh evaluator may not be aware of new alarm definitions
> and won't send notification until its alarm definition cache is refreshed
> in less than 60 sec (default value).

Did we misunderstand?




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-02 Thread Julien Danjou
On Wed, Dec 02 2015, ROSENSWEIG, ELISHA (ELISHA) wrote:

> Regarding the second point: Say we have 30 different types of alarms we might
> want to raise on an OpenStack instance (VM). What I understand from your
> explanation is that when we create a new instance, we need to create 30 new
> alarms in Aodh that can be triggered some time in the future. If we have 100
> instances, we will effectively have 3,000 alarms created in Aodh, and so on
> with more instances.

Not necessarily. You can create one alarm that has conditions large
enough to match e.g. all your VMs, and an alarm action that can be
generic enough so that it will do the right thing for each VM.

The alarm system provided by Aodh is really a simple event -> trigger
system in this area. How precise or large your event is really depends
on the granularity that your trigger (which is usually a webhook) can
handle.

> A different approach might be to create a new alarm in Aodh on-the-fly.
> However, we are under the impression that the creation time can be up to one
> minute, which will cause a large delay. Is there any way to shorten this?

Creation time of an alarm of one minute? That's not normal. It should
consist of just a record in the database so it should be pretty fast.

-- 
Julien Danjou
// Free Software hacker
// https://julien.danjou.info




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-02 Thread AFEK, Ifat (Ifat)
Hi Ryota,

Thanks for your response, please see my comments below.

Ifat.

> -Original Message-
> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
> 
> Hi,
> 
> 
> Sorry for my late response...
> 
> It seems like a fundamental question whether we should have rich
> function or intelligence in on-the-fly event alarm evaluation. I think
> we can add simple operations (like aggregating alarm) in aodh
> evaluator, and other operations (like deducing with referring some
> external DB) should be done outside of the evaluation process to reduce
> impact on other evaluations. But, if we separate too much, then there
> will be many interactions between two services that makes slow to
> finish sequence of alarm handling.
> 
> One approach we can take, is that you configure aodh to pass each row
> event (e.g. each VM downed) wrapped in alarm notification to vitrage,
> then do some operation (e.g. deducing, aggregating) and store resource-
> level alarm without any alarm_actions, so that users can see the alarms
> in horizon view. This may not require alarm evaluation, so we can
> forget the problem I raised (cache refresh interval).

Let me see if I got this right: are you suggesting that we create 
on-the-fly alarm definitions with no alarm_actions, for every deduced 
alarm that we want to raise? And this will spare us the extra alarm 
evaluation in AODH?

It does make sense. 

My next question is how exactly we should create these resource-level 
alarms. Can we create an alarm definition with no rule, no actions, 
and initial state set to "alarm"? (I'm not sure it can be done in the 
current AODH API)
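
For discussion, here is a Python sketch of the two calls we have in
mind: create an event alarm with no alarm_actions, then set its state.
It only builds the HTTP requests and does not send them. The
"PUT /v2/alarms/(alarm_id)/state" path is the one mentioned earlier in
this thread; the endpoint URL, the token, the "vitrage.deduced" event
type and the exact body fields are assumptions that would need to be
checked against the Aodh API documentation.

```python
import json
import urllib.request

AODH_URL = "http://aodh.example.org:8042"   # assumption: placeholder endpoint
TOKEN = "keystone-token-placeholder"        # assumption: placeholder auth token
HEADERS = {"Content-Type": "application/json", "X-Auth-Token": TOKEN}


def create_alarm_request(name, resource_id):
    """Build (not send) a request that creates an event alarm with no actions."""
    body = {
        "name": name,
        "type": "event",
        # Hypothetical rule: the event type and trait name are made up.
        "event_rule": {
            "event_type": "vitrage.deduced",
            "query": [{"field": "traits.resource_id", "op": "eq",
                       "type": "string", "value": resource_id}],
        },
        "alarm_actions": [],
    }
    return urllib.request.Request(
        AODH_URL + "/v2/alarms", data=json.dumps(body).encode("utf-8"),
        headers=HEADERS, method="POST")


def set_state_request(alarm_id, state="alarm"):
    """Build (not send) a request that sets the state of an existing alarm."""
    return urllib.request.Request(
        f"{AODH_URL}/v2/alarms/{alarm_id}/state",
        data=json.dumps(state).encode("utf-8"),
        headers=HEADERS, method="PUT")
```

Whether an alarm can be created with no rule at all, or directly in the
"alarm" state, is still the open question; the sketch sidesteps it by
using a placeholder event rule and a separate state update.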

Another question is our need to get alarms from other sources, like
Nagios, Zabbix, Ganglia, etc. We thought that Vitrage would query these
alarms from each source directly, and then create alarms in AODH in the
same way as our deduced alarms: for example, create a nagios_ovs_vswitchd
alarm if the Nagios check_ovs_vswitchd test failed.
An alternative could be to integrate Nagios directly with AODH.
What do you think?

> BTW, is it useful to have on-the-fly evaluation of combination alarm
> with event alarms for alarm aggregation or other cases?

I'm not sure I understand. Can you give a detailed example?

> Horizon view is the different topic. Maybe we can reduce the number of
> alarms listed in user view by creating raw alarms in admin space that
> is not visible from end user, or using relevant severity or tag so that
> user can filter out uninterested alarms.

Referring to this blueprint [1], do you have specific concerns regarding
the usability/performance of the Horizon view when there are many alarms?
I think that your ideas make sense, and we can implement them if there
is a need.

In addition, in Vitrage we plan to handle alarm aggregation by creating
aggregation rule templates, for example based on the RCA information.
The user will be able to see only the root cause alarms, and then drill
down to all specific alarms. But I doubt this will be done for Mitaka.


[1] 
https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page

Thanks,
Ifat.




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-02 Thread Ryota Mibu
Hi,


Sorry for my late response...

It seems like a fundamental question whether we should have rich
functionality or intelligence in on-the-fly event alarm evaluation. I think
we can add simple operations (like aggregating alarms) in the aodh evaluator,
while other operations (like deducing alarms by referring to some external DB)
should be done outside of the evaluation process, to reduce the impact on
other evaluations. But if we separate too much, there will be many
interactions between the two services, which will slow down the whole alarm
handling sequence.

One approach we can take is that you configure aodh to pass each raw event
(e.g. each VM going down), wrapped in an alarm notification, to vitrage;
vitrage then does some operation (e.g. deducing, aggregating) and stores a
resource-level alarm without any alarm_actions, so that users can see the
alarms in the Horizon view. This may not require alarm evaluation, so we can
forget the problem I raised (cache refresh interval).
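
The flow above could look roughly like the following Python sketch. The
notification layout, the handle_notification function and the in-memory
alarm_store are all hypothetical stand-ins; the real aodh notification
payload and the real alarm storage would differ.

```python
def handle_notification(notification, instances_by_host, alarm_store):
    """Turn one raw 'host down' notification into stored per-instance alarms.

    The alarms are stored already in the "alarm" state and with no
    alarm_actions, so nothing needs to be evaluated or triggered later.
    """
    host = notification["event"]["host"]   # assumed payload layout
    created = []
    for instance in instances_by_host.get(host, []):
        alarm = {
            "name": f"host_down_affects_{instance}",
            "resource_id": instance,
            "state": "alarm",
            "alarm_actions": [],
        }
        alarm_store.append(alarm)
        created.append(alarm["name"])
    return created


store = []
names = handle_notification(
    {"alarm_name": "host_down", "event": {"host": "host-1"}},
    {"host-1": ["instance-1", "instance-2"], "host-2": ["instance-8"]},
    store,
)
```

One incoming notification for host-1 results in two stored
resource-level alarms, one per instance on that host.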

BTW, is it useful to have on-the-fly evaluation of combination alarms with
event alarms, for alarm aggregation or other cases?

The Horizon view is a different topic. Maybe we can reduce the number of
alarms listed in the user view by creating raw alarms in an admin space that
is not visible to the end user, or by using a relevant severity or tag so that
the user can filter out uninteresting alarms.


Best regards,
Ryota


---
"Ryota Mibu" 
NEC Corporation




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-02 Thread AFEK, Ifat (Ifat)
> -Original Message-
> From: Julien Danjou [mailto:jul...@danjou.info]
> 
> On Wed, Dec 02 2015, AFEK, Ifat (Ifat) wrote:
> 
> > Can this be supported without defining an alarm for every VM
> separately?
> 
> No, it's not possible. You'd have to create the alarm for each instance
> for now.
> 
> Honestly, I'd say start with this at a first step, and if it starts
> becoming a problem, we can envision a better way to define some sort of
> alarm template for example in Aodh. I wouldn't put the cart before the
> horse.

Ok, makes sense.
So we can start by creating on-the-fly alarm definitions for every resource,
and optionally request an Aodh enhancement in the future for alarm templates
creation.

Also, please see my response to Ryota Mibu, regarding the other alarms that
we need to define.

Thanks,
Ifat.





Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-02 Thread Julien Danjou
On Tue, Dec 01 2015, ROSENSWEIG, ELISHA (ELISHA) wrote:

> 1. Does AODH currently support raising alarms on resources not modeled
> in OpenStack? For example, raising an alarm on a Switch? Or does each
> alarm have to relate to a resource ID (or IDs)?

Yes, Aodh does not really care, especially with the Gnocchi backend. It
can evaluate any metric on any resource type and just trigger the alarm.

> 2. What we feel is missing is some way to raise an alarm on-the-fly. In
> Vitrage, we have this concept of "deduced alarms", where based on some
> analysis Vitrage determines we need to raise an alarm on some resource. As we
> understand it, currently to raise an alarm in AODH we need to register the
> alarm in advance, wait for it to be registered and only then we can trigger
> the event. This will delay our response time to events.

What you need to do is create the alarm that you want to trigger in
Aodh. Let's say Vitrage knows that if switch A is going down, it needs
to send an email to an admin.

You create an alarm in Aodh that says on event "switch A is down -> send
a mail to admin". Then Vitrage runs, and just has to emit an event
"switch A is down". You can do that via oslo.messaging or I guess via
the REST API (not sure the mechanism is there, but it could be, I guess).
Then Aodh will trigger the alarm actions for you.
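
In toy form, the event -> trigger idea looks like the Python sketch
below. This is not Aodh code and does not use oslo.messaging; the
make_event_alarm helper, the "switch.down" event type and the trait
names are invented just to show the matching step.

```python
def make_event_alarm(event_type, match, action):
    """Return a handler that runs `action` for matching events.

    An event matches when its type equals `event_type` and every
    key/value pair in `match` is present in the event's traits.
    """
    def handle(event):
        if event.get("event_type") != event_type:
            return False
        traits = event.get("traits", {})
        if any(traits.get(key) != value for key, value in match.items()):
            return False
        action(event)
        return True
    return handle


def demo_event_alarm():
    mails = []
    alarm = make_event_alarm(
        "switch.down",
        {"resource_name": "switch-A"},
        lambda e: mails.append("mail admin: %s is down" % e["traits"]["resource_name"]),
    )
    results = [
        alarm({"event_type": "switch.down", "traits": {"resource_name": "switch-A"}}),
        alarm({"event_type": "switch.down", "traits": {"resource_name": "switch-B"}}),
    ]
    return results, mails
```

Only the event for switch A matches the pre-created alarm, so only one
mail is queued; the event for switch B is ignored.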

The next step could be to give more work to Aodh, such as determining
that "switch A is down" by doing some evaluation (unless the switch
sends an event when it's going down, but I imagine that's not the
point). ;-)

Does that help?

-- 
Julien Danjou
# Free Software hacker
# https://julien.danjou.info




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-02 Thread ROSENSWEIG, ELISHA (ELISHA)
Thanks for your response. It definitely helped clarify things.

Regarding the second point: Say we have 30 different types of alarms we might 
want to raise on an OpenStack instance (VM). What I understand from your 
explanation is that when we create a new instance, we need to create 30 new 
alarms in Aodh that can be triggered some time in the future. If we have 100 
instances, we will effectively have 3,000 alarms created in Aodh, and so on 
with more instances. 

Is this a correct depiction of the situation? If so, do you think it will scale?

A different approach might be to create a new alarm in Aodh on-the-fly. 
However, we are under the impression that the creation time can be up to one 
minute, which will cause a large delay. Is there any way to shorten this?

Thanks

Elisha

> -Original Message-
> From: Julien Danjou [mailto:jul...@danjou.info]
> Sent: Wednesday, December 02, 2015 12:26 PM
> To: ROSENSWEIG, ELISHA (ELISHA)
> Cc: OpenStack Development Mailing List (not for usage questions); AFEK,
> Ifat (Ifat)
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom
> alarms in AODH
> 
> On Tue, Dec 01 2015, ROSENSWEIG, ELISHA (ELISHA) wrote:
> 
> > 1. Does AODH currently support raising alarms on resources not
> modeled
> > in OpenStack? For example, raising an alarm on a Switch? Or does each
> > alarm have to relate to a resource ID (or IDs)?
> 
> Yes, Aodh does not really care, especially with the Gnocchi backend. It
> can evaluate any metric on any resource type and just trigger the
> alarm.
> 
> > 2. What we feel is missing is some way to raise an alarm on-the-fly.
> > In Vitrage, we have this concept of "deduced alarms", where based on
> > some analysis Vitrage determines we need to raise an alarm on some
> > resource. As we understand it, currently to raise an alarm in AODH we
> > need to register the alarm in advance, wait for it to be registered
> and only then we can trigger the event.
> > This will delay our response time to events.
> 
> What you need to do, is create the alarm that you want to trigger in
> Aodh. Let's say Vitrage knows that if switch A is going down, it needs
> to send an email to an admin.
> 
> You create an alarm in Aodh that says on event "switch A is down ->
> send a mail to admin". Then Vitrage runs, and just have to emit an
> event "switch A is down". You can do that via oslo.messaging or I guess
> via the REST API (not sure the mechanism is here but it could be I
> guess).
> Then Aodh will trigger the alarm actions for you.
> 
> Next step could be give more job to Aodh, such as determining that
> "switch A is down" by doing some evaluation – unless the switch sends
> an event when it's going down, but I imagine it's not the point. ;-)
> 
> Does that help?
> 
> --
> Julien Danjou
> # Free Software hacker
> # https://julien.danjou.info


Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-01 Thread AFEK, Ifat (Ifat)
Hi,

After some further discussions with Vitrage team, let me go one step back and 
ask a more basic question:

In Vitrage, we would like to evaluate and correlate different kinds of alarms: 
AODH threshold alarms, event alarms, Nagios alarms, Ganglia alarms, Zabbix 
alarms, etc. This includes alarms on physical resources that are not part of 
OpenStack, like switches or ports, in order to understand their effect on 
OpenStack resources.

Our question is: do you envision AODH as a "general OpenStack alarm engine",
which serves as a database for alarms of all kinds? Or does AODH focus on
metric-related alarms?

Thanks,
Ifat.


> -Original Message-
> From: AFEK, Ifat (Ifat) [mailto:ifat.a...@alcatel-lucent.com]
> Sent: Monday, November 30, 2015 2:47 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom
> alarms in AODH
> 
> Hi,
> 
> A few days ago I sent you this email (see below). Resending in case you
> didn't see it.
> If you could get back to me soon it would be most appreciated, as we
> are quite blocked with our AODH integration right now.
> 
> Thanks,
> Ifat.
> 
> 
> -Original Message-
> From: AFEK, Ifat (Ifat) [mailto:ifat.a...@alcatel-lucent.com]
> Sent: Tuesday, November 24, 2015 7:37 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom
> alarms in AODH
> 
> Hi Gord, Hi Ryota,
> 
> (I sent the same mail again in a more readable format)
> 
> Thanks for your detailed responses.
> Hope you don't mind that I'm sending one reply to both of your emails.
> I think it would be easier to have one thread for this discussion.
> 
> 
> Let me explain our use case in more details.
> Here is an example of how we would like to integrate with AODH. Let me
> know what you think about it.
> 
> 1. Vitrage gets an alarm from Nagios about high cpu load on one of the
> hosts
> 
> 2. Vitrage evaluator decides (based on its templates) that an "instance
> might be suffering due to high cpu load on the host" alarm should be
> triggered for every instance on this host
> 
> 3. Vitrage notifier creates corresponding alarm definitions in AODH
> 
> 4. AODH stores these alarms in its database
> 
> 5. Vitrage triggers the alarms
> 
> 6. AODH updates the alarms states and notifies about it
> 
> 7. Horizon user queries AODH for a list of all alarms (we are currently
> checking the status of a blueprint that should implement it[2]). AODH
> returns a list that includes the alarms that were triggered by Vitrage.
> 
> 8. Horizon user selects one of the alarms that Vitrage generated, and
> asks to see its root cause (we will create a new blueprint for that).
> Vitrage API returns the RCA information for this alarm.
> 
> 
> Our current discussion is on steps 3-6 (as far as we understand, and
> please correct me if I'm wrong, nothing blocks the implementation of
> the blueprint for step 7).
> 
> 
> 
> Looking at AODH API again, here is what I think we need to do:
> 
> 1. Define an alarm with an external_trigger_rule or something like
> that. This alarm has no metric data. We just want to be able to trigger
> it and query its state.
> 
> 2. Use AODH API for triggering this alarm. Will "PUT
> /v2/alarms/(alarm_id)/state" do the job?
> 
> 
> Please see also my comments below.
> 
> Thanks,
> Ifat.
> 
> 
> [2] https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-
> management-page
> 
> 
> 
> 
> > -Original Message-
> > From: gord chung [mailto:g...@live.ca]
> > Sent: Monday, November 23, 2015 9:45 PM
> > To: openstack-dev@lists.openstack.org
> > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising
> > custom alarms in AODH
> >
> >
> >
> > On 23/11/2015 11:14 AM, AFEK, Ifat (Ifat) wrote:
> > > I guess I would like to do both: create a new alarm definition,
> then
> > > trigger it (call alarm_actions), and possibly later on set its
> state
> > > back to OK (call ok_action).
> > > I understood that currently all alarm triggering is internal in
> > > AODH, according to threshold/events/combination alarm rules. Would
> > > it be possible to add a new kind of rule, that will allow
> triggering
> > > the alarm externally?
> > what type of rule?
> >
> > i have https://review.openstack.org/#/c/247211 which would
> > theoretically allow you to push an action into queue which would then
> > trigger appropriate REST call. not sure if it helps you plug into
> Aodh
> > eas

Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-01 Thread ROSENSWEIG, ELISHA (ELISHA)
Thanks for the quick reply. We have a few more questions, for clarification:

1. Does AODH currently support raising alarms on resources not modeled in 
OpenStack? For example, raising an alarm on a Switch? Or does each alarm have 
to relate to a resource ID (or IDs)?

2.  What we feel is missing is some way to raise an alarm on-the-fly. In 
Vitrage, we have this concept of "deduced alarms", where based on some analysis 
Vitrage determines we need to raise an alarm on some resource. As we understand 
it, currently to raise an alarm in AODH we need to register the alarm in 
advance, wait for it to be registered and only then we can trigger the event. 
This will delay our response time to events. 
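
To make the delay concrete, here is a toy model (not Aodh code) of an evaluator that only learns about new alarm definitions when it periodically reloads a cache. The 60-second interval and every class and method name below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEvaluator:
    """Toy stand-in for an alarm evaluator with a periodically refreshed cache."""

    refresh_interval: float = 60.0             # assumed refresh period, in seconds
    last_refresh: float = 0.0                  # time of the previous cache reload
    stored: set = field(default_factory=set)   # definitions saved in the database
    cached: set = field(default_factory=set)   # definitions the evaluator knows about

    def create_alarm(self, alarm_id: str) -> None:
        # Creating a definition only writes to storage; the cache is untouched.
        self.stored.add(alarm_id)

    def tick(self, now: float) -> None:
        # Reload the cache once the refresh interval has elapsed.
        if now - self.last_refresh >= self.refresh_interval:
            self.cached = set(self.stored)
            self.last_refresh = now

    def can_evaluate(self, alarm_id: str, now: float) -> bool:
        self.tick(now)
        return alarm_id in self.cached


evaluator = ToyEvaluator()
evaluator.create_alarm("deduced-cpu-alarm")                     # registered at t = 0 s
print(evaluator.can_evaluate("deduced-cpu-alarm", now=5.0))     # False: cache is stale
print(evaluator.can_evaluate("deduced-cpu-alarm", now=60.0))    # True: cache reloaded
```

In this model an alarm registered just after a refresh cannot be evaluated until the next refresh, which is the response-time concern described above.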

Could you clarify these two points, or correct any misconceptions on our part? 

Thanks,

Elisha Rosensweig, PhD
CloudBand, Alcatel-Lucent

> -Original Message-
> From: Julien Danjou [mailto:jul...@danjou.info]
> Sent: Tuesday, December 01, 2015 3:25 PM
> To: AFEK, Ifat (Ifat)
> Cc: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom
> alarms in AODH
> 
> On Tue, Dec 01 2015, AFEK, Ifat (Ifat) wrote:
> 
> > In Vitrage, we would like to evaluate and correlate different kinds
> > of alarms:
> > AODH threshold alarms, event alarms, Nagios alarms, Ganglia alarms,
> > Zabbix alarms, etc. This includes alarms on physical resources that
> > are not part of OpenStack, like switches or ports, in order to
> > understand their effect on OpenStack resources.
> >
> > Our question is: do you envision AODH as a "general OpenStack alarm
> > engine", which serves as a database for alarms of all kinds? Or does
> > AODH focus on metric-related alarms?
> 
> I think we would be happy to have any kind of alarm supported in Aodh.
> Though currently I'm not really seeing what is missing since we have
> evaluation based alarms and event based alarms.
> 
> Aodh is also meant to be generic enough to be consumed outside of
> OpenStack itself – it works pretty well in standalone with Gnocchi for
> example.
> 
> --
> Julien Danjou
> // Free Software hacker
> // https://julien.danjou.info
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-12-01 Thread Julien Danjou
On Tue, Dec 01 2015, AFEK, Ifat (Ifat) wrote:

> In Vitrage, we would like to evaluate and correlate different kinds of alarms:
> AODH threshold alarms, event alarms, Nagios alarms, Ganglia alarms, Zabbix
> alarms, etc. This includes alarms on physical resources that are not part of
> OpenStack, like switches or ports, in order to understand their effect on
> OpenStack resources.
>
> Our question is: do you envision AODH as a "general OpenStack alarm engine",
> which serves as a database for alarms of all kinds? Or does AODH focus on
> metric-related alarms?

I think we would be happy to have any kind of alarm supported in Aodh.
Though currently I'm not really seeing what is missing since we have
evaluation based alarms and event based alarms.

Aodh is also meant to be generic enough to be consumed outside of
OpenStack itself – it works pretty well in standalone with Gnocchi for
example.

-- 
Julien Danjou
// Free Software hacker
// https://julien.danjou.info




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-11-30 Thread AFEK, Ifat (Ifat)
Hi,

A few days ago I sent you this email (see below). Resending in case you didn't 
see it. 
If you could get back to me soon it would be most appreciated, as we are quite 
blocked with our AODH integration right now.

Thanks,
Ifat.


-Original Message-
From: AFEK, Ifat (Ifat) [mailto:ifat.a...@alcatel-lucent.com] 
Sent: Tuesday, November 24, 2015 7:37 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms 
in AODH

Hi Gord, Hi Ryota,

(I sent the same mail again in a more readable format)

Thanks for your detailed responses.
Hope you don't mind that I'm sending one reply to both of your emails. I think 
it would be easier to have one thread for this discussion.


Let me explain our use case in more detail. 
Here is an example of how we would like to integrate with AODH. Let me know 
what you think about it. 

1. Vitrage gets an alarm from Nagios about high cpu load on one of the hosts 

2. Vitrage evaluator decides (based on its templates) that an "instance might 
be suffering due to high cpu load on the host" alarm should be triggered for 
every instance on this host 

3. Vitrage notifier creates corresponding alarm definitions in AODH 

4. AODH stores these alarms in its database 

5. Vitrage triggers the alarms 

6. AODH updates the alarm states and notifies about the change 

7. Horizon user queries AODH for a list of all alarms (we are currently 
checking the status of a blueprint that should implement it[2]). AODH returns a 
list that includes the alarms that were triggered by Vitrage.

8. Horizon user selects one of the alarms that Vitrage generated, and asks to 
see its root cause (we will create a new blueprint for that). Vitrage API 
returns the RCA information for this alarm.


Our current discussion is on steps 3-6 (as far as we understand, and please 
correct me if I'm wrong, nothing blocks the implementation of the blueprint for 
step 7).
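
Steps 3-6 can be sketched roughly as follows. This is only an in-memory stand-in to show the intended order of operations; `InMemoryAlarmService`, `raise_deduced_alarms` and the alarm naming scheme are hypothetical and are not part of the AODH or Vitrage APIs.

```python
from typing import Callable, Dict, List


class InMemoryAlarmService:
    """Hypothetical stand-in for the alarm service in steps 3-6; not the AODH API."""

    def __init__(self, notify: Callable[[str, str], None]) -> None:
        self._states: Dict[str, str] = {}
        self._notify = notify

    def create_alarm(self, name: str) -> None:
        # Steps 3-4: the definition is created and stored, with no data yet.
        self._states[name] = "insufficient data"

    def set_state(self, name: str, state: str) -> None:
        # Steps 5-6: the state changes and a notification goes out.
        if self._states[name] != state:
            self._states[name] = state
            self._notify(name, state)

    def list_alarms(self) -> Dict[str, str]:
        # Step 7: the stored alarms and their states can be queried later.
        return dict(self._states)


def raise_deduced_alarms(service: InMemoryAlarmService, host: str,
                         instances: List[str]) -> None:
    """Create and trigger one deduced alarm per instance on the affected host."""
    for instance in instances:
        name = f"high-cpu-on-{host}:{instance}"
        service.create_alarm(name)
        service.set_state(name, "alarm")


notifications = []
service = InMemoryAlarmService(lambda name, state: notifications.append((name, state)))
raise_deduced_alarms(service, "host-1", ["vm-a", "vm-b"])
print(service.list_alarms())
print(notifications)
```

The point of the sketch is that the definition and its state live in the alarm service, so a later query (step 7) sees the alarms that Vitrage raised.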



Looking at AODH API again, here is what I think we need to do:

1. Define an alarm with an external_trigger_rule or something like that. This 
alarm has no metric data. We just want to be able to trigger it and query its 
state.

2. Use AODH API for triggering this alarm. Will "PUT 
/v2/alarms/(alarm_id)/state" do the job? 
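
For reference, such a call might look like the sketch below. It only builds the request and does not send it. The host, port, token and alarm ID are placeholders, and sending the new state as a bare JSON string in the body is an assumption about the v2 API rather than something confirmed in this thread. Whether a state change made this way also fires alarm_actions is exactly the open question, so the sketch makes no claim about it.

```python
import json
import urllib.request


def build_set_state_request(base_url: str, alarm_id: str, state: str,
                            token: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that sets an alarm's state."""
    if state not in ("ok", "alarm", "insufficient data"):
        raise ValueError(f"unknown alarm state: {state!r}")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v2/alarms/{alarm_id}/state",
        data=json.dumps(state).encode("utf-8"),   # assumed body: a bare JSON string
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="PUT",
    )


# Placeholder endpoint, alarm ID and token; replace with real values before use.
request = build_set_state_request(
    "http://aodh.example.org:8042", "ALARM_ID", "alarm", token="TOKEN"
)
print(request.get_method(), request.full_url)
print(request.data)
```

Passing the result to `urllib.request.urlopen(request)` would actually issue the call.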


Please see also my comments below.

Thanks,
Ifat.


[2] 
https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page 




> -Original Message-
> From: gord chung [mailto:g...@live.ca]
> Sent: Monday, November 23, 2015 9:45 PM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising 
> custom alarms in AODH
> 
> 
> 
> On 23/11/2015 11:14 AM, AFEK, Ifat (Ifat) wrote:
> > I guess I would like to do both: create a new alarm definition, then 
> > trigger it (call alarm_actions), and possibly later on set its state 
> > back to OK (call ok_action).
> > I understood that currently all alarm triggering is internal in 
> > AODH, according to threshold/events/combination alarm rules. Would 
> > it be possible to add a new kind of rule, that will allow triggering 
> > the alarm externally?
> what type of rule?
> 
> i have https://review.openstack.org/#/c/247211 which would 
> theoretically allow you to push an action into queue which would then 
> trigger appropriate REST call. not sure if it helps you plug into Aodh 
> easier or not?

We need to add an alarm definition with an "external_rule", and then trigger 
it. It is important for us that the alarm definition is stored in the AODH 
database for future queries. As far as I understand, the queue would help only 
with the triggering?

> 
> --
> gord


> -Original Message-
> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
> Sent: Tuesday, November 24, 2015 10:00 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising 
> custom alarms in AODH
> 
> Hi Ifat,
> 
> 
> Thank you for starting discussion how AODH can be integrated with 
> Vitrage that would be a good example of AODH integration with other 
> OpenStack components.
> 
> The key role of creating an alarm definition is to set the endpoint
> (alarm_actions) which can receive alarm notifications from AODH. How
> can the endpoints be set in your use case? Are those endpoints
> configured via the Vitrage API and stored in its DB?

We have a graph database that will include resources and alarms imported from 
a few sources of information (including Ceilometer), as well as alarms generated 
by Vitrage. However, we would like our alarms to be stored in AODH as well. If 
I understood you correctly, we will need the endpoints in order to be notified 
about Ceilometer alarms.

> 
> I agree with Gordon, you can

Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-11-24 Thread AFEK, Ifat (Ifat)
Hi Gord, Hi Ryota,

Thanks for your detailed responses.
Hope you don't mind that I'm sending one reply to both of your emails. I think 
it would be easier to have one thread for this discussion.

Let me explain our use case in more detail. 
Here is an example of how we would like to integrate with AODH. Let me know 
what you think about it. 

1. Vitrage gets an alarm from Nagios about high cpu load on one of the hosts
2. Vitrage evaluator decides (based on its templates) that an "instance might 
be suffering due to high cpu load on the host" alarm should be triggered for 
every instance on this host
3. Vitrage notifier creates corresponding alarm definitions in AODH
4. AODH stores these alarms in its database
5. Vitrage triggers the alarms
6. AODH updates the alarm states and notifies about the change
7. Horizon user queries AODH for a list of all alarms (we are currently 
checking the status of a blueprint that should implement it[2]). AODH returns a 
list that includes the alarms that were triggered by Vitrage.
8. Horizon user selects one of the alarms that Vitrage generated, and asks to 
see its root cause (we will create a new blueprint for that). Vitrage API 
returns the RCA information for this alarm.

Our current discussion is on steps 3-6 (as far as we understand, and please 
correct me if I'm wrong, nothing blocks the implementation of the blueprint for 
step 7).

Looking at AODH API again, here is what I think we need to do:

1. Define an alarm with an external_trigger_rule or something like that. This 
alarm has no metric data. We just want to be able to trigger it and query its 
state.
2. Use AODH API for triggering this alarm. Will "PUT 
/v2/alarms/(alarm_id)/state" do the job? 


Please see also my comments below.

Thanks,
Ifat.


[2] 
https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page 




> -Original Message-
> From: gord chung [mailto:g...@live.ca]
> Sent: Monday, November 23, 2015 9:45 PM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom
> alarms in AODH
> 
> 
> 
> On 23/11/2015 11:14 AM, AFEK, Ifat (Ifat) wrote:
> > I guess I would like to do both: create a new alarm definition, then
> > trigger it (call alarm_actions), and possibly later on set its state
> > back to OK (call ok_action).
> > I understood that currently all alarm triggering is internal in AODH,
> > according to threshold/events/combination alarm rules. Would it be
> > possible to add a new kind of rule, that will allow triggering the
> > alarm externally?
> what type of rule?
> 
> i have https://review.openstack.org/#/c/247211 which would
> theoretically allow you to push an action into queue which would then
> trigger appropriate REST call. not sure if it helps you plug into Aodh
> easier or not?

We need to add an alarm definition with an "external_rule", and then trigger 
it. It is important for us that the alarm definition is stored in the AODH 
database for future queries. As far as I understand, the queue would help only 
with the triggering?

> 
> --
> gord


> -Original Message-
> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
> Sent: Tuesday, November 24, 2015 10:00 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom
> alarms in AODH
> 
> Hi Ifat,
> 
> 
> Thank you for starting discussion how AODH can be integrated with
> Vitrage that would be a good example of AODH integration with other
> OpenStack components.
> 
> The key role of creating an alarm definition is to set the endpoint
> (alarm_actions) which can receive alarm notifications from AODH. How
> can the endpoints be set in your use case? Are those endpoints
> configured via the Vitrage API and stored in its DB?

We have a graph database that will include resources and alarms imported from 
a few sources of information (including Ceilometer), as well as alarms generated 
by Vitrage. However, we would like our alarms to be stored in AODH as well. If 
I understood you correctly, we will need the endpoints in order to be notified 
about Ceilometer alarms.

> 
> I agree with Gordon: you can use an event-alarm by generating an "event"
> containing the alarming message, which can be captured in aodh if vitrage
> relays the alarm definition to aodh. That is a more feasible way than
> creating an alarm definition right before triggering the alarm
> notification. The reason is that the aodh evaluator may not be aware of new
> alarm definitions and won't send a notification until its alarm
> definition cache is refreshed, which happens within 60 sec (default value).

Logically speaking, we would like to create alarms and not events. Our goal is 
to alert when something is wrong. Creating events mi

Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-11-24 Thread AFEK, Ifat (Ifat)
Hi Gord, Hi Ryota,

(I sent the same mail again in a more readable format)

Thanks for your detailed responses.
Hope you don't mind that I'm sending one reply to both of your emails. I think 
it would be easier to have one thread for this discussion.


Let me explain our use case in more detail. 
Here is an example of how we would like to integrate with AODH. Let me know 
what you think about it. 

1. Vitrage gets an alarm from Nagios about high cpu load on one of the hosts 

2. Vitrage evaluator decides (based on its templates) that an "instance might 
be suffering due to high cpu load on the host" alarm should be triggered for 
every instance on this host 

3. Vitrage notifier creates corresponding alarm definitions in AODH 

4. AODH stores these alarms in its database 

5. Vitrage triggers the alarms 

6. AODH updates the alarm states and notifies about the change 

7. Horizon user queries AODH for a list of all alarms (we are currently 
checking the status of a blueprint that should implement it[2]). AODH returns a 
list that includes the alarms that were triggered by Vitrage.

8. Horizon user selects one of the alarms that Vitrage generated, and asks to 
see its root cause (we will create a new blueprint for that). Vitrage API 
returns the RCA information for this alarm.


Our current discussion is on steps 3-6 (as far as we understand, and please 
correct me if I'm wrong, nothing blocks the implementation of the blueprint for 
step 7).



Looking at AODH API again, here is what I think we need to do:

1. Define an alarm with an external_trigger_rule or something like that. This 
alarm has no metric data. We just want to be able to trigger it and query its 
state.

2. Use AODH API for triggering this alarm. Will "PUT 
/v2/alarms/(alarm_id)/state" do the job? 


Please see also my comments below.

Thanks,
Ifat.


[2] 
https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page 




> -Original Message-
> From: gord chung [mailto:g...@live.ca]
> Sent: Monday, November 23, 2015 9:45 PM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising 
> custom alarms in AODH
> 
> 
> 
> On 23/11/2015 11:14 AM, AFEK, Ifat (Ifat) wrote:
> > I guess I would like to do both: create a new alarm definition, then 
> > trigger it (call alarm_actions), and possibly later on set its state 
> > back to OK (call ok_action).
> > I understood that currently all alarm triggering is internal in 
> > AODH, according to threshold/events/combination alarm rules. Would 
> > it be possible to add a new kind of rule, that will allow triggering 
> > the alarm externally?
> what type of rule?
> 
> i have https://review.openstack.org/#/c/247211 which would 
> theoretically allow you to push an action into queue which would then 
> trigger appropriate REST call. not sure if it helps you plug into Aodh 
> easier or not?

We need to add an alarm definition with an "external_rule", and then trigger 
it. It is important for us that the alarm definition is stored in the AODH 
database for future queries. As far as I understand, the queue would help only 
with the triggering?

> 
> --
> gord


> -Original Message-
> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
> Sent: Tuesday, November 24, 2015 10:00 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising 
> custom alarms in AODH
> 
> Hi Ifat,
> 
> 
> Thank you for starting discussion how AODH can be integrated with 
> Vitrage that would be a good example of AODH integration with other 
> OpenStack components.
> 
> The key role of creating an alarm definition is to set the endpoint
> (alarm_actions) which can receive alarm notifications from AODH. How
> can the endpoints be set in your use case? Are those endpoints
> configured via the Vitrage API and stored in its DB?

We have a graph database that will include resources and alarms imported from 
a few sources of information (including Ceilometer), as well as alarms generated 
by Vitrage. However, we would like our alarms to be stored in AODH as well. If 
I understood you correctly, we will need the endpoints in order to be notified 
about Ceilometer alarms.

> 
> I agree with Gordon: you can use an event-alarm by generating an "event"
> containing the alarming message, which can be captured in aodh if vitrage
> relays the alarm definition to aodh. That is a more feasible way than
> creating an alarm definition right before triggering the alarm
> notification. The reason is that the aodh evaluator may not be aware of new
> alarm definitions and won't send a notification until its alarm
> definition cache is refreshed, which happens within 60 sec (default value).

Logically speaking, we would like to 

Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-11-24 Thread Ryota Mibu
Hi Ifat,


Thank you for starting the discussion on how AODH can be integrated with Vitrage; 
that would be a good example of AODH integration with other OpenStack components.

The key role of creating an alarm definition is to set the endpoint (alarm_actions) 
which can receive alarm notifications from AODH. How can the endpoints be set 
in your use case? Are those endpoints configured via the Vitrage API and stored in 
its DB?

I agree with Gordon: you can use an event-alarm by generating an "event" containing 
the alarming message, which can be captured in aodh if vitrage relays the alarm 
definition to aodh. That is a more feasible way than creating an alarm 
definition right before triggering the alarm notification. The reason is that the aodh 
evaluator may not be aware of new alarm definitions and won't send a notification 
until its alarm definition cache is refreshed, which happens within 60 sec (default 
value).
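
As a rough illustration of the event-alarm approach, a definition registered ahead of time might be shaped like the dict built below. The field names follow my understanding of the alarm API and should be checked against the documentation; the event type `vitrage.deduced.high_cpu`, the trait name and the webhook URL are made up for the example.

```python
import json
from typing import Any, Dict


def event_alarm_definition(name: str, event_type: str, instance_id: str,
                           webhook_url: str) -> Dict[str, Any]:
    """Return a dict shaped like an event-alarm definition (field names assumed)."""
    return {
        "name": name,
        "type": "event",
        "event_rule": {
            # Only events of this type are considered by the rule.
            "event_type": event_type,
            # Narrow the match to events about one instance.
            "query": [
                {
                    "field": "traits.instance_id",
                    "op": "eq",
                    "type": "string",
                    "value": instance_id,
                }
            ],
        },
        # Endpoint that receives the notification when the alarm fires.
        "alarm_actions": [webhook_url],
    }


definition = event_alarm_definition(
    name="instance-might-suffer-high-cpu",
    event_type="vitrage.deduced.high_cpu",       # hypothetical event type
    instance_id="INSTANCE_ID",                   # placeholder
    webhook_url="http://vitrage.example.org/alarm-hook",  # placeholder
)
print(json.dumps(definition, indent=2))
```

With a definition like this in place beforehand, vitrage would only need to emit a matching event at alarm time instead of creating a definition on the spot.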

Having a special rule and an external evaluator would be an alternative, but it 
would be difficult to keep up with the latest aodh, since it will change faster 
with a small code base as a result of the split from ceilometer.


BR,
Ryota

> -Original Message-
> From: AFEK, Ifat (Ifat) [mailto:ifat.a...@alcatel-lucent.com]
> Sent: Tuesday, November 24, 2015 1:15 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom 
> alarms in AODH
> 
> Hi Gord,
> 
> Please see my answers below.
> 
> Ifat.
> 
> 
> > -Original Message-
> > From: gord chung [mailto:g...@live.ca]
> > Sent: Monday, November 23, 2015 4:57 PM
> > To: openstack-dev@lists.openstack.org
> > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising
> > custom alarms in AODH
> >
> > hi Ifat,
> >
> > i added some questions below.
> >
> > On 23/11/2015 7:16 AM, AFEK, Ifat (Ifat) wrote:
> > > Hi,
> > >
> > > We have a couple of questions regarding AODH alarms.
> > >
> > > In Vitrage[1] project we have two use cases that involve Ceilometer:
> > >
> > > 1. Import Ceilometer alarms, as well as alarms and resources from
> > other sources (Nagios, Zabbix, Nova, Heat, etc.), and produce RCA
> > insights about the connection between different alarms.
> > to clarify, Ceilometer alarms is deprecated for Aodh and will be
> > removed very, very soon.
> 
> Right, I meant Aodh alarms.
> 
> >
> > > 2. Raise "deduced alarms". For example, in case we detect a high
> > memory consumption on a host, we would like to raise deduced alarms
> > saying "instance might be suffering due to high memory consumption on
> > the host" on all related instances. Then, we can further deduce that
> > applications running on these instances might also be affected, and
> > raise alarms on them as well.
> > >
> > > Initially we planned to raise these deduced alarms in AODH, so other
> > Openstack components may consume them as well. Then, when we looked at
> > AODH alarms documentation, we noticed that there is currently no way
> > of raising custom alarms. We saw only three types of alarms: threshold
> > alarms, combination alarms and event alarms.
> > >
> > > So, our questions are:
> > >
> > > * Is there an alternative way of raising alarms in AODH?
> > what do we mean by raising alarms? do you want to create a new alarm
> > definition for Aodh or do you want to trigger an action? do you want
> > to have a new non-REST action?
> 
> I guess I would like to do both: create a new alarm definition, then trigger 
> it (call alarm_actions), and possibly
> later on set its state back to OK (call ok_action).
> I understood that currently all alarm triggering is internal in AODH, 
> according to threshold/events/combination alarm
> rules. Would it be possible to add a new kind of rule, that will allow 
> triggering the alarm externally?
> 
> >
> > > * Do you think custom alarms belong in AODH? Are you interested in
> > adding this capability to AODH?
> > >
> > > We would be happy to hear your vision and thoughts about it.
> > >
> > >
> > > Thanks,
> > > Ifat and Alexey.
> > >
> > >
> > > [1] https://wiki.openstack.org/wiki/Vitrage
> > >
> > >
> > >
> > >
> > >

Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-11-23 Thread gord chung



On 23/11/2015 11:14 AM, AFEK, Ifat (Ifat) wrote:

> I guess I would like to do both: create a new alarm definition, then
> trigger it (call alarm_actions), and possibly later on set its state
> back to OK (call ok_action).
> I understood that currently all alarm triggering is internal in AODH,
> according to threshold/events/combination alarm rules. Would it be
> possible to add a new kind of rule, that will allow triggering the
> alarm externally?

what type of rule?

i have https://review.openstack.org/#/c/247211 which would theoretically 
allow you to push an action into queue which would then trigger 
appropriate REST call. not sure if it helps you plug into Aodh easier or 
not?


--
gord




[openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-11-23 Thread AFEK, Ifat (Ifat)
Hi,

We have a couple of questions regarding AODH alarms.

In Vitrage[1] project we have two use cases that involve Ceilometer: 

1. Import Ceilometer alarms, as well as alarms and resources from other sources 
(Nagios, Zabbix, Nova, Heat, etc.), and produce RCA insights about the 
connection between different alarms.
2. Raise "deduced alarms". For example, in case we detect a high memory 
consumption on a host, we would like to raise deduced alarms saying "instance 
might be suffering due to high memory consumption on the host" on all related 
instances. Then, we can further deduce that applications running on these 
instances might also be affected, and raise alarms on them as well.
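
The second use case can be pictured as a walk over a dependency map: start from the host and visit everything that runs on it, directly or indirectly. The sketch below is only an illustration of that idea; the `topology` dict, the resource names and `deduce_affected` are hypothetical and do not describe how Vitrage is implemented.

```python
from collections import deque
from typing import Dict, List


def deduce_affected(topology: Dict[str, List[str]], source: str) -> List[str]:
    """Return every resource reachable from `source`, in breadth-first order."""
    affected: List[str] = []
    seen = {source}
    queue = deque([source])
    while queue:
        resource = queue.popleft()
        for dependent in topology.get(resource, []):
            if dependent not in seen:
                seen.add(dependent)
                affected.append(dependent)
                queue.append(dependent)
    return affected


# Hypothetical map: a host runs instances, and instances run applications.
topology = {
    "host-1": ["instance-a", "instance-b"],
    "instance-a": ["app-1"],
    "instance-b": ["app-2"],
}

for resource in deduce_affected(topology, "host-1"):
    print(f"deduced alarm on {resource}: "
          "might be suffering due to high memory consumption on host-1")
```

Instances are reported before the applications that run on them, matching the order of deduction described above.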

Initially we planned to raise these deduced alarms in AODH, so other OpenStack 
components may consume them as well. Then, when we looked at AODH alarms 
documentation, we noticed that there is currently no way of raising custom 
alarms. We saw only three types of alarms: threshold alarms, combination alarms 
and event alarms.

So, our questions are: 

* Is there an alternative way of raising alarms in AODH?
* Do you think custom alarms belong in AODH? Are you interested in adding this 
capability to AODH? 

We would be happy to hear your vision and thoughts about it.


Thanks,
Ifat and Alexey.


[1] https://wiki.openstack.org/wiki/Vitrage 






Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-11-23 Thread gord chung

hi Ifat,

i added some questions below.

On 23/11/2015 7:16 AM, AFEK, Ifat (Ifat) wrote:

> Hi,
>
> We have a couple of questions regarding AODH alarms.
>
> In Vitrage[1] project we have two use cases that involve Ceilometer:
>
> 1. Import Ceilometer alarms, as well as alarms and resources from other
> sources (Nagios, Zabbix, Nova, Heat, etc.), and produce RCA insights about
> the connection between different alarms.

to clarify, Ceilometer alarms is deprecated for Aodh and will be removed 
very, very soon.

> 2. Raise "deduced alarms". For example, in case we detect a high memory
> consumption on a host, we would like to raise deduced alarms saying
> "instance might be suffering due to high memory consumption on the host"
> on all related instances. Then, we can further deduce that applications
> running on these instances might also be affected, and raise alarms on
> them as well.
>
> Initially we planned to raise these deduced alarms in AODH, so other
> Openstack components may consume them as well. Then, when we looked at
> AODH alarms documentation, we noticed that there is currently no way of
> raising custom alarms. We saw only three types of alarms: threshold
> alarms, combination alarms and event alarms.
>
> So, our questions are:
>
> * Is there an alternative way of raising alarms in AODH?

what do we mean by raising alarms? do you want to create a new alarm 
definition for Aodh or do you want to trigger an action? do you want to 
have a new non-REST action?

> * Do you think custom alarms belong in AODH? Are you interested in adding
> this capability to AODH?
>
> We would be happy to hear your vision and thoughts about it.
>
>
> Thanks,
> Ifat and Alexey.
>
>
> [1] https://wiki.openstack.org/wiki/Vitrage






--
gord




Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH

2015-11-23 Thread AFEK, Ifat (Ifat)
Hi Gord,

Please see my answers below.

Ifat.


> -Original Message-
> From: gord chung [mailto:g...@live.ca]
> Sent: Monday, November 23, 2015 4:57 PM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom
> alarms in AODH
> 
> hi Ifat,
> 
> i added some questions below.
> 
> On 23/11/2015 7:16 AM, AFEK, Ifat (Ifat) wrote:
> > Hi,
> >
> > We have a couple of questions regarding AODH alarms.
> >
> > In Vitrage[1] project we have two use cases that involve Ceilometer:
> >
> > 1. Import Ceilometer alarms, as well as alarms and resources from
> other sources (Nagios, Zabbix, Nova, Heat, etc.), and produce RCA
> insights about the connection between different alarms.
> to clarify, Ceilometer alarms is deprecated for Aodh and will be
> removed very, very soon.

Right, I meant Aodh alarms.

> 
> > 2. Raise "deduced alarms". For example, in case we detect a high
> memory consumption on a host, we would like to raise deduced alarms
> saying "instance might be suffering due to high memory consumption on
> the host" on all related instances. Then, we can further deduce that
> applications running on these instances might also be affected, and
> raise alarms on them as well.
> >
> > Initially we planned to raise these deduced alarms in AODH, so other
> Openstack components may consume them as well. Then, when we looked at
> AODH alarms documentation, we noticed that there is currently no way of
> raising custom alarms. We saw only three types of alarms: threshold
> alarms, combination alarms and event alarms.
> >
> > So, our questions are:
> >
> > * Is there an alternative way of raising alarms in AODH?
> what do we mean by raising alarms? do you want to create a new alarm
> definition for Aodh or do you want to trigger an action? do you want to
> have a new non-REST action?

I guess I would like to do both: create a new alarm definition, then 
trigger it (call alarm_actions), and possibly later on set its state 
back to OK (call ok_action).
I understood that currently all alarm triggering is internal in AODH, 
according to threshold/events/combination alarm rules. Would it be 
possible to add a new kind of rule, that will allow triggering the 
alarm externally? 

> 
> > * Do you think custom alarms belong in AODH? Are you interested in
> adding this capability to AODH?
> >
> > We would be happy to hear your vision and thoughts about it.
> >
> >
> > Thanks,
> > Ifat and Alexey.
> >
> >
> > [1] https://wiki.openstack.org/wiki/Vitrage
> >
> >
> >
> >
> >
> 
> --
> gord
> 
> 
