Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
> -----Original Message-----
> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
> Sent: Tuesday, December 08, 2015 11:17 AM
>
> Hi Ifat,
>
> In short, 'event' is generated in OpenStack, 'alarm' is defined by a
> user. 'event' is a container of data passed from other OpenStack
> services through the OpenStack notification bus. 'event' and the
> contained data will be stored in the ceilometer DB and exposed via the
> event API [1]. 'alarm' is a pre-configured alerting rule defined by a
> user via the alarm API [2]. 'Alarm' also has a state like 'ok' and
> 'alarm', and a history as well.
>
> [1] http://docs.openstack.org/developer/ceilometer/webapi/v2.html#events-and-traits
> [2] http://docs.openstack.org/developer/aodh/webapi/v2.html#alarms
>
> The point is whether we should use 'event' or 'alarm' for all failure
> representation. Maybe we can use 'event' for all raw error/fault
> notifications, and use 'alarm' for exposing deduced/wrapped failures.
> This is my view, so I might be wrong.

Hi,

Let me summarize the issue.

What we need in Vitrage is:

- custom alarms, where we can set metadata like:
  {"resource_type":"switch", "resource_name":"switch-2"} or
  {"resource_type":"nova.instance", "resource_id":} or
  {"nagios_test_name":"check_ovs_vswitchd", "nagios_test_status":"warning"}
- the ability to define an alarm once, and instantiate it multiple times
  for every instance
- the ability to define an alarm on-the-fly (since we can't predict all
  alarm types)
- an option to trigger the alarm from Vitrage

The optimal solution for us would be to have alarm templates and alarm
metadata. Or, we can have a workaround...

The current workarounds that I see are:

1. Create an event-alarm on the fly for every alarm on every instance and
   set its state immediately using the Aodh API. The alarm will be stored
   in the database, but this will not trigger a notification or a call to
   alarm-actions. The alarm name will have to include the resource
   name/id, like "Instance is at risk due to public switch problem" to
   make it unique. This might work for Vitrage horizon use cases in
   Mitaka, but not for future use cases that will require alarm-actions.

2. Send notifications in order to trigger event alarms "by the book".
   Vitrage notification "Alarm: Instance is at risk due to public switch
   problem" with metadata {"switch_name":"switch-2", "instance_id":} will
   be converted to a corresponding event, then to an alarm. We will still
   need to create a different alarm for every instance. And we will have
   to wait until the cache is refreshed.

I will be happy to hear your thoughts about it.

Thanks,
Ifat.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
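As a rough illustration of workaround 1 above, the sketch below only builds the two request bodies; it does not contact Aodh. It assumes the Aodh v2 alarm API shape (an alarm of type "event" with an "event_rule", and a separate state resource under /v2/alarms/{alarm_id}/state). The event type "vitrage.deduced_alarm", the helper names and the sample ids are hypothetical and were not defined anywhere in this thread.

```python
import json


def build_event_alarm(alarm_text, resource_type, resource_id):
    """Build a body for an on-the-fly event alarm (assumed Aodh v2 shape)."""
    return {
        # The name has to be unique, so the resource id is appended to it.
        "name": "%s [%s %s]" % (alarm_text, resource_type, resource_id),
        "type": "event",
        "event_rule": {
            # Hypothetical event type, used here for illustration only.
            "event_type": "vitrage.deduced_alarm",
            "query": [
                {"field": "traits.resource_id", "op": "eq",
                 "type": "string", "value": resource_id},
            ],
        },
        # Left empty: setting the state directly does not call alarm actions.
        "alarm_actions": [],
    }


def build_state_request(alarm_id, state="alarm"):
    """Return (method, path, body) for setting the alarm state directly."""
    return ("PUT", "/v2/alarms/%s/state" % alarm_id, json.dumps(state))


body = build_event_alarm(
    "Instance is at risk due to public switch problem",
    "nova.instance", "instance-1")
print(json.dumps(body, indent=2))
print(build_state_request("ALARM_ID"))
```

One such body would be needed per alarm per instance, which is exactly the cost the workaround description points out.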
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Ryota,

> -----Original Message-----
> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
> Sent: Tuesday, December 08, 2015 11:17 AM
>
> In short, 'event' is generated in OpenStack, 'alarm' is defined by a
> user. 'event' is a container of data passed from other OpenStack
> services through the OpenStack notification bus. 'event' and the
> contained data will be stored in the ceilometer DB and exposed via the
> event API [1]. 'alarm' is a pre-configured alerting rule defined by a
> user via the alarm API [2]. 'Alarm' also has a state like 'ok' and
> 'alarm', and a history as well.
>
> [1] http://docs.openstack.org/developer/ceilometer/webapi/v2.html#events-and-traits
> [2] http://docs.openstack.org/developer/aodh/webapi/v2.html#alarms
>
> The point is whether we should use 'event' or 'alarm' for all failure
> representation. Maybe we can use 'event' for all raw error/fault
> notifications, and use 'alarm' for exposing deduced/wrapped failures.
> This is my view, so I might be wrong.

I believe Vitrage should define alarms, as we want the alarm to have a
state and history (that can be queried in the horizon UI). Moreover, in
the future I can imagine that some other OpenStack services might want to
add their alarm actions to the alarms that Vitrage generated.

I think this applies both to Vitrage deduced alarms, and to alarms that
Vitrage generated as a result of Nagios test failures, for example.

Does that make sense?

Best Regards,
Ifat.
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Ifat,

> > Can we clarify the use case again in terms of service role definition?
>
> Our use cases focus on giving value to the cloud admin, who will be able to:
>
> - view the topology of his environment, the relations between the
>   physical, virtual and applicative layers and the statuses of all
>   resources
> - view the alarms history
> - view alarms about problems that Vitrage deduced could happen, even if
>   no other OpenStack component reported these problems (yet)
> - view RCA information about the alarms

OK, thanks.

> > Aodh provides an alarming mechanism to *notify* about events and
> > situations calculated from various data sources. But the
> > original/master information of a resource, including the latest
> > resource state, is owned by other services such as nova.
> >
> > So, a user who wants to know the current resource state to find dead
> > resources (instances) can simply query instances via the nova API.
> > And a user who wants to know when/what failure occurred can query
> > events via the ceilometer API. Aodh has alarm state and history,
> > though.
>
> I'm not sure I fully understand the difference between Aodh events and
> alarms. If the user wants to know what failure occurred, is it part of
> Aodh events, alarms, or both?

In short, 'event' is generated in OpenStack, 'alarm' is defined by a
user. 'event' is a container of data passed from other OpenStack services
through the OpenStack notification bus. 'event' and the contained data
will be stored in the ceilometer DB and exposed via the event API [1].
'alarm' is a pre-configured alerting rule defined by a user via the alarm
API [2]. 'Alarm' also has a state like 'ok' and 'alarm', and a history as
well.

[1] http://docs.openstack.org/developer/ceilometer/webapi/v2.html#events-and-traits
[2] http://docs.openstack.org/developer/aodh/webapi/v2.html#alarms

The point is whether we should use 'event' or 'alarm' for all failure
representation. Maybe we can use 'event' for all raw error/fault
notifications, and use 'alarm' for exposing deduced/wrapped failures.
This is my view, so I might be wrong.

Best regards,
Ryota
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
> -----Original Message-----
> From: Julien Danjou [mailto:jul...@danjou.info]
> Sent: Monday, December 07, 2015 12:00 PM
>
> I find it odd to have UI use cases first, as they're terribly large for
> an MVP. Unless Vitrage already exists and you have all the code figured
> out. :)

We have most of it figured out. We have an RCA engine written in Java as
proprietary CloudBand code, with a UI for showing the topology and RCA,
and it is already working in production environments. We have decided to
write a similar project in Python as an OpenStack project. Obviously,
writing in OpenStack brings up new challenges which we are now trying to
solve.

> > In case you haven't seen it yet, our high level architecture is on
> > the Vitrage main page[2], and in the coming days we plan to also
> > document the lower level design.
>
> I just looked at it, and it's very interesting. All the high level
> functionalities make sense and provide value. But if you try to solve
> all 5 of them at once, I'm afraid you're going to either build a monster
> (with a lot of overlap with other projects, hard to maintain, etc) or
> just crash because you'll be blocked by all the other OpenStack
> projects. That's the big issue when starting to build a project on top
> of other OpenStack bricks.
>
> Overall I'm just saying that because it's still not clear to me which
> part you're trying to solve in this thread and how we can help you.
> What can we provide in our projects, that you miss, that could help
> you, concretely? What feature do we need to work on next?
>
> It would be awesome to have _one_ use-case described end-to-end that
> you would like to solve with Vitrage, leveraging various OpenStack
> projects, that you cannot solve right now because of missing pieces.
> Then we could start identifying these missing pieces and implement/fix
> them. :-)

We are not going to implement 5 use cases at once :-)

We will start with the physical-to-virtual mapping + a UI for visualizing
this topology. This is the basic functionality for our next use cases.
Next, we will move to the RCA and the deduced alarms use cases. Alarm
aggregation probably won't be implemented for Mitaka.

Let me describe in detail the deduced alarms use case.

1. Vitrage gets an alarm from Nagios about a public switch failure
2. Vitrage evaluator decides (based on its templates) that an "Instance
   is at risk due to public switch problem" alarm should be triggered for
   every instance on every host attached to this public switch
3. Vitrage notifier creates corresponding alarm definitions in Aodh
4. Aodh stores these alarms in its database
5. Vitrage triggers the alarms (sets their states)
6. Aodh updates the alarm states and notifies about it
7. Horizon user queries Aodh for a list of all alarms. Aodh returns a
   list that includes the alarms that were triggered by Vitrage.

The added value of this use case is that the Cloud Admin can see that
some instances are at risk, even though their Nova statuses are ok.

For the integration with Aodh, we need the ability to create alarm
definitions that are not based on metrics, and to trigger them ourselves.

What do you think?

Thanks for your feedback, it is very helpful!

Ifat and Alexey.
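The seven steps of the deduced alarms use case can be walked through with a small in-memory sketch. Nothing here is Vitrage or Aodh code: FakeAodh, on_switch_failure and the sample topology are hypothetical stand-ins, and the alarm name format (text plus instance id) is an assumption made only to keep names unique.

```python
ALARM_TEXT = "Instance is at risk due to public switch problem"

# Hypothetical topology: switch -> hosts -> instances.
TOPOLOGY = {
    "switch-2": {
        "host-1": ["instance-a", "instance-b"],
        "host-2": ["instance-c"],
    },
}


class FakeAodh:
    """In-memory stand-in for Aodh (steps 3-7); not the real client."""

    def __init__(self):
        self.alarms = {}

    def create_alarm(self, name):
        # Steps 3-4: the alarm definition is created and stored.
        self.alarms[name] = "insufficient data"

    def set_state(self, name, state):
        # Steps 5-6: the state is set; a real Aodh would also notify.
        self.alarms[name] = state

    def list_alarms(self, state=None):
        # Step 7: Horizon queries the list of alarms.
        return sorted(n for n, s in self.alarms.items()
                      if state is None or s == state)


def on_switch_failure(switch, topology, aodh):
    """Step 2: one alarm per instance on every host behind the switch."""
    for host, instances in topology.get(switch, {}).items():
        for instance in instances:
            name = "%s (%s)" % (ALARM_TEXT, instance)
            aodh.create_alarm(name)
            aodh.set_state(name, "alarm")


aodh = FakeAodh()
on_switch_failure("switch-2", TOPOLOGY, aodh)  # step 1: Nagios alarm arrives
for name in aodh.list_alarms(state="alarm"):
    print(name)
```

The sketch makes the integration requirement visible: the only operations needed from Aodh are creating an alarm definition that is not tied to a metric, and setting its state from outside.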
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
On Mon, Dec 07 2015, AFEK, Ifat (Ifat) wrote:

> Our goal is to get as much information as we can from various data
> sources. If you connect Nagios to the telemetry project, and we can get
> Nagios alarms directly from Aodh, it would be great. Is it something
> that you planned on doing for Mitaka?

Unfortunately nobody planned to work on a Nagios -> Ceilometer/Gnocchi
connector. That may be a good idea, and the fact that it is not planned
is not necessarily a blocker. If someone wants to jump in…

> Our current use cases focus on giving value to the cloud admin. These
> are mostly UI use cases; the admin will be able to:
>
> - view the topology of his environment, the relations between the
>   physical, virtual and applicative layers and the statuses of all
>   resources
> - view the alarms history (there is an existing blueprint for it[1])
> - view alarms about problems that Vitrage deduced could happen, even
>   if no other OpenStack component reported these problems (yet)
> - view RCA information about the alarms

I find it odd to have UI use cases first, as they're terribly large for
an MVP. Unless Vitrage already exists and you have all the code figured
out. :)

The way I see the big picture, Vitrage should be done as some sort of an
engine on top of Ceilometer/Gnocchi/Aodh and leverage them to do RCA
analysis. So what's missing in those projects to make that happen should
be done, and Vitrage should start as an MVP; and then we can iterate,
both on the Vitrage side and on the telemetry projects.

I have the feeling that you're trying to bite off too large a portion at
once and that you may crash because of that.

> In order to support these use cases, we will get input from various
> data sources, process and evaluate it based on configurable templates,
> trigger new alarms in Aodh and calculate RCA information.
> On top of it, we will have a Vitrage API to query the information and
> show it in Horizon.
> In case you haven't seen it yet, our high level architecture is on
> the Vitrage main page[2], and in the coming days we plan to also
> document the lower level design.

I just looked at it, and it's very interesting. All the high level
functionalities make sense and provide value. But if you try to solve
all 5 of them at once, I'm afraid you're going to either build a monster
(with a lot of overlap with other projects, hard to maintain, etc) or
just crash because you'll be blocked by all the other OpenStack projects.
That's the big issue when starting to build a project on top of other
OpenStack bricks.

Overall I'm just saying that because it's still not clear to me which
part you're trying to solve in this thread and how we can help you. What
can we provide in our projects, that you miss, that could help you,
concretely? What feature do we need to work on next?

It would be awesome to have _one_ use-case described end-to-end that you
would like to solve with Vitrage, leveraging various OpenStack projects,
that you cannot solve right now because of missing pieces. Then we could
start identifying these missing pieces and implement/fix them. :-)

-- 
Julien Danjou
;; Free Software hacker
;; https://julien.danjou.info
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Ryota,

> -----Original Message-----
> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
> Sent: Friday, December 04, 2015 9:42 AM
>
> > The next step can happen if and when Aodh supports alarm templates.
> > If Vitrage can handle about 30 alarm types, and there are 100
> > instances, we don't want to pre-configure 3000 alarms, which most
> > likely will never be triggered.
>
> I understand your concern. Aodh is a user-facing service, so having
> lots of alarms doesn't make sense.
>
> Can we clarify the use case again in terms of service role definition?

Our use cases focus on giving value to the cloud admin, who will be able
to:

- view the topology of his environment, the relations between the
  physical, virtual and applicative layers and the statuses of all
  resources
- view the alarms history
- view alarms about problems that Vitrage deduced could happen, even if
  no other OpenStack component reported these problems (yet)
- view RCA information about the alarms

> Aodh provides an alarming mechanism to *notify* about events and
> situations calculated from various data sources. But the
> original/master information of a resource, including the latest
> resource state, is owned by other services such as nova.
>
> So, a user who wants to know the current resource state to find dead
> resources (instances) can simply query instances via the nova API. And
> a user who wants to know when/what failure occurred can query events
> via the ceilometer API. Aodh has alarm state and history, though.

I'm not sure I fully understand the difference between Aodh events and
alarms. If the user wants to know what failure occurred, is it part of
Aodh events, alarms, or both?

> > > OK. The 'combination' type alarm enables you to aggregate multiple
> > > alarms into one alarm. This can be used when you want to receive an
> > > alarm when both physical NIC ports are down, to recognize logical
> > > connection unavailability if the ports are teamed for redundancy.
> > > Now, the combination alarms are evaluated periodically, which means
> > > you receive the combination alarm not on-the-fly, even though you
> > > are using event alarms as its source.
> >
> > I think I understand your point. It means that certain alarms will
> > arrive at Vitrage with a delay, due to your evaluation policy. I
> > think we will have to address this issue at some point, but it won't
> > change our overall design.
>
> Yes, I'm just curious whether any user could benefit from this
> improvement, in order to set its priority.

I don't see a need for that improvement in our current use cases. I am
not so sure about the future use cases; I will keep this limitation in
mind.

Best Regards,
Ifat.
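The teamed-NIC example quoted above can be made concrete with a small sketch of how an "and" combination behaves. This is not the Aodh evaluator: evaluate_combination and the port names are hypothetical, and the sketch ignores the periodic scheduling that causes the delay discussed in the message.

```python
def evaluate_combination(states, operator="and"):
    """Combine member alarm states ('ok' / 'alarm') into a single state."""
    fired = [state == "alarm" for state in states]
    if not fired:
        # No member alarms: nothing to conclude.
        return "insufficient data"
    if operator == "and":
        combined = all(fired)
    elif operator == "or":
        combined = any(fired)
    else:
        raise ValueError("unknown operator: %r" % operator)
    return "alarm" if combined else "ok"


# Two teamed NIC ports: the logical link is down only when both are down.
port_alarms = {"nic-port-0": "alarm", "nic-port-1": "ok"}
print(evaluate_combination(port_alarms.values()))  # one port is still up

port_alarms["nic-port-1"] = "alarm"
print(evaluate_combination(port_alarms.values()))  # both ports are down
```

With periodic evaluation, the second result only becomes visible at the next evaluation cycle, even if the member event alarms changed state immediately.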
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Julien,

> -----Original Message-----
> From: Julien Danjou [mailto:jul...@danjou.info]
> Sent: Thursday, December 03, 2015 4:27 PM
>
> I think that I would be more interested in connecting Nagios to
> Ceilometer/Gnocchi/Aodh with maybe the long-term goal of replacing it
> by that stack, which should be more scalable and dynamic.
>
> That would make Vitrage only need to build on top of telemetry
> projects. It would also bring Nagios & co to telemetry not only for
> Vitrage, but for the whole stack.
>
> Maybe there are some good reasons you're going the way you do, I don't
> have the pretension to have thought about that as long as you probably
> did. :-)

Our goal is to get as much information as we can from various data
sources. If you connect Nagios to the telemetry project, and we can get
Nagios alarms directly from Aodh, it would be great. Is it something that
you planned on doing for Mitaka?

> Do you have something like an MVP based on Telemetry you target? I saw
> you were already talking about Horizon, which to me is something that
> (sh|c)ould be way further into the pipeline, so I'm worried. ;)

Our current use cases focus on giving value to the cloud admin. These are
mostly UI use cases; the admin will be able to:

- view the topology of his environment, the relations between the
  physical, virtual and applicative layers and the statuses of all
  resources
- view the alarms history (there is an existing blueprint for it[1])
- view alarms about problems that Vitrage deduced could happen, even if
  no other OpenStack component reported these problems (yet)
- view RCA information about the alarms

In order to support these use cases, we will get input from various data
sources, process and evaluate it based on configurable templates, trigger
new alarms in Aodh and calculate RCA information. On top of it, we will
have a Vitrage API to query the information and show it in Horizon.

In case you haven't seen it yet, our high level architecture is on the
Vitrage main page[2], and in the coming days we plan to also document the
lower level design.

Best Regards,
Ifat.

[1] https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page
[2] https://wiki.openstack.org/wiki/Vitrage
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Ifat,

> > > Let me see if I got this right: are you suggesting that we create
> > > on-the-fly alarm definitions with no alarm_actions, for every
> > > deduced alarm that we want to raise? And this will spare us the
> > > extra alarm evaluation in AODH?
> >
> > Yes. But please note that this could be the first step. The next step
> > would be to make vitrage send out an alarm event to ceilometer/aodh;
> > the pre-configured event alarm will recognize the alarm and fire the
> > alarm notification to another service or an end user. Eventually, we
> > should have a relevant alarm type and evaluator to proxy evaluation
> > in vitrage, I think.
>
> The next step can happen if and when Aodh supports alarm templates.
> If Vitrage can handle about 30 alarm types, and there are 100
> instances, we don't want to pre-configure 3000 alarms, which most
> likely will never be triggered.

I understand your concern. Aodh is a user-facing service, so having lots
of alarms doesn't make sense.

Can we clarify the use case again in terms of service role definition?

Aodh provides an alarming mechanism to *notify* about events and
situations calculated from various data sources. But the original/master
information of a resource, including the latest resource state, is owned
by other services such as nova.

So, a user who wants to know the current resource state to find dead
resources (instances) can simply query instances via the nova API. And a
user who wants to know when/what failure occurred can query events via
the ceilometer API. Aodh has alarm state and history, though.

> > > Another question is our need to get alarms from other sources, like
> > > Nagios, zabbix, ganglia, etc. We thought that Vitrage would query
> > > these alarms from each source directly, and then create alarms in
> > > AODH in the same way as our deduced alarms: for example create a
> > > nagios_ovs_vswitchd alarm if the nagios check_ovs_vswitchd test
> > > failed.
> > > An alternative could be to integrate nagios directly with AODH.
> > > What do you think?
> >
> > Hmm, I don't have a clear view on this. If the source can include
> > OpenStack IDs and can generate a relevant meter/sample, it should be
> > useful to integrate with ceilometer. But if you want to do some
> > operations (like correlation), then it is reasonable to integrate
> > with vitrage.
>
> The source may include alarms on resources that are not defined in
> OpenStack, like switches or ports. And the alarms are not necessarily
> related to meters; they can be nagios test failures, for example.

Yes, so it depends on the type of resource and its parameters.

> > > > BTW, is it useful to have on-the-fly evaluation of combination
> > > > alarms with event alarms for alarm aggregation or other cases?
> > >
> > > I'm not sure I understand. Can you give a detailed example?
> >
> > OK. The 'combination' type alarm enables you to aggregate multiple
> > alarms into one alarm. This can be used when you want to receive an
> > alarm when both physical NIC ports are down, to recognize logical
> > connection unavailability if the ports are teamed for redundancy.
> > Now, the combination alarms are evaluated periodically, which means
> > you receive the combination alarm not on-the-fly, even though you
> > are using event alarms as its source.
>
> I think I understand your point. It means that certain alarms will
> arrive at Vitrage with a delay, due to your evaluation policy. I think
> we will have to address this issue at some point, but it won't change
> our overall design.

Yes, I'm just curious whether any user could benefit from this
improvement, in order to set its priority.

> > > In addition, in Vitrage we plan to handle alarm aggregation by
> > > creating aggregation rule templates, for example based on the RCA
> > > information.
> > > The user will be able to see only the root cause alarms, and then
> > > drill down to all specific alarms. But I doubt if this will be done
> > > for Mitaka.
> >
> > I think 'the RCA information' means information for RCA. I mean
> > vitrage will use the resource topologies or relationships in
> > aggregation, rather than the result of RCA. Am I right?
>
> The term "aggregation" is used in different contexts, which may be
> confusing. Our plan is to examine the already-computed RCA information,
> and see, for example, that a switch failure alarm caused alarms on 100
> related instances. In horizon, the result will be 101 alarms shown to
> the user in a flat list.
> By "alarm aggregation based on RCA" we mean that we will have an API to
> get root cause alarms, which will return only the switch alarm. The
> horizon user will see one alarm, and may then ask to expand the view
> and see all the other alarms that were caused by it.

I see. I used the term "aggregation" for the aggregation process in alarm
evaluation.

Thanks,
Ryota
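The concern about pre-configuring alarms, quoted at the top of this message, comes down to simple arithmetic, and the "alarm template" idea can be sketched next to it. Aodh did not offer templates at the time of this thread, so AlarmTemplate, its fields and the sample instance ids below are purely hypothetical.

```python
ALARM_TYPES = 30   # figures taken from the thread
INSTANCES = 100

# Pre-configuring every (alarm type, instance) pair up front:
print(ALARM_TYPES * INSTANCES)


class AlarmTemplate:
    """Hypothetical template: defined once, instantiated per resource."""

    def __init__(self, name_format, severity):
        self.name_format = name_format
        self.severity = severity

    def instantiate(self, **metadata):
        # Unused metadata keys are ignored by str.format and kept as metadata.
        return {
            "name": self.name_format.format(**metadata),
            "severity": self.severity,
            "metadata": metadata,
        }


template = AlarmTemplate(
    "Instance {resource_id} is at risk due to public switch problem",
    severity="critical")

# Only the instances that are actually affected get an alarm.
affected = ["instance-7", "instance-42"]
alarms = [template.instantiate(resource_type="nova.instance", resource_id=i)
          for i in affected]
print(len(alarms), alarms[0]["name"])
```

With a template, the number of stored alarms follows the number of real problems rather than the product of alarm types and instances.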
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
On Thu, Dec 03 2015, AFEK, Ifat (Ifat) wrote:

> One of Vitrage's goals is to gather information from different layers -
> physical, virtual and applicative - create a topology tree with the
> relations between the different entities in all layers, and perform
> alarm analysis based on this topology.
>
> Currently, we can get alarms on the virtual layer from Ceilometer, and
> alarms on the physical layer from Nagios for example. We can then try
> to correlate all these alarms, compute RCA, and optionally trigger other
> alarms, for example that an application might be running in a suboptimal
> state due to a CPU threshold alarm on the instance.

You can't really say that Nagios is for hardware and Ceilometer is for
virtual. This may be the way you view or deploy things, but this is not a
reality. We have plugins to check hardware (SNMP, IPMI…) in Ceilometer,
and I'm sure you can configure Nagios to check OpenStack resources.

My point is that there is no hard line between the tools. They both
exist, and it's OK to use both of them – they do different things and
things differently – but how you make them work together isn't clear.

> We didn't suggest that Ceilometer will replace Nagios, rather that
> Ceilometer might get Nagios test results as input/events, and trigger
> corresponding alarms. Since right now Nagios and Ceilometer are not
> connected, we thought that at the first stage we will query alarms
> separately from Ceilometer and from Nagios.
>
> Is it more clear?

Yes it is, thanks!

I think that I would be more interested in connecting Nagios to
Ceilometer/Gnocchi/Aodh with maybe the long-term goal of replacing it by
that stack, which should be more scalable and dynamic.

That would make Vitrage only need to build on top of telemetry projects.
It would also bring Nagios & co to telemetry not only for Vitrage, but
for the whole stack.

Maybe there are some good reasons you're going the way you do, I don't
have the pretension to have thought about that as long as you probably
did. :-)

Though I think there's value in what you're trying to do, so it'd be cool
to be able to move you forward. That's why I'm trying to insist that the
current telemetry stuff should be able to solve as many of your problems
as we can!

Do you have something like an MVP based on Telemetry you target? I saw
you were already talking about Horizon, which to me is something that
(sh|c)ould be way further into the pipeline, so I'm worried. ;)

-- 
Julien Danjou
# Free Software hacker
# https://julien.danjou.info
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Julien,

> From: Julien Danjou [mailto:jul...@danjou.info]
> Sent: Thursday, December 03, 2015 10:53 AM
>
> On Thu, Dec 03 2015, AFEK, Ifat (Ifat) wrote:
>
> > Another question is our need to get alarms from other sources, like
> > Nagios, zabbix, ganglia, etc. We thought that Vitrage would query
> > these alarms from each source directly, and then create alarms in
> > AODH in the same way as our deduced alarms: for example create a
> > nagios_ovs_vswitchd alarm if the nagios check_ovs_vswitchd test
> > failed.
> > An alternative could be to integrate nagios directly with AODH.
> > What do you think?
>
> I think I'd like to be able to answer this question, but I kind of lack
> the bigger picture of what you need these alarms for, and what you
> would like to do with them?
>
> I think we don't have everything right now in Ceilometer/Gnocchi/Aodh
> to replace something like Nagios _but_ we have a base framework that
> should be more powerful and way more scalable. That could be leveraged
> to build something better than Nagios, while staying compatible.
>
> What Nagios does is polling, storing state, and doing action based on
> that state. Which is more or less what Ceilometer does (polling),
> Gnocchi does (storing things) and Aodh does (triggering action based on
> the state). Obviously there's more to that (e.g. dependencies) that is
> not handled currently, and that could be added later – maybe in some
> parts of the current telemetry projects, or maybe in Vitrage.
>
> So how to fit such tools (Nagios, Zabbix, whatever) into those projects
> is an interesting problem. But I'm not clear on the first steps and
> how/why you want to leverage alarms first. :)

One of Vitrage's goals is to gather information from different layers -
physical, virtual and applicative - create a topology tree with the
relations between the different entities in all layers, and perform alarm
analysis based on this topology.

Currently, we can get alarms on the virtual layer from Ceilometer, and
alarms on the physical layer from Nagios for example. We can then try to
correlate all these alarms, compute RCA, and optionally trigger other
alarms, for example that an application might be running in a suboptimal
state due to a CPU threshold alarm on the instance.

We didn't suggest that Ceilometer will replace Nagios, rather that
Ceilometer might get Nagios test results as input/events, and trigger
corresponding alarms. Since right now Nagios and Ceilometer are not
connected, we thought that at the first stage we will query alarms
separately from Ceilometer and from Nagios.

Is it more clear?

Best Regards,
Ifat.
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Ryota,

> From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com]
>
> > Let me see if I got this right: are you suggesting that we create
> > on-the-fly alarm definitions with no alarm_actions, for every deduced
> > alarm that we want to raise? And this will spare us the extra alarm
> > evaluation in AODH?
>
> Yes. But please note that this could be the first step. The next step
> would be to make vitrage send out an alarm event to ceilometer/aodh;
> the pre-configured event alarm will recognize the alarm and fire the
> alarm notification to another service or an end user. Eventually, we
> should have a relevant alarm type and evaluator to proxy evaluation in
> vitrage, I think.

The next step can happen if and when Aodh supports alarm templates.
If Vitrage can handle about 30 alarm types, and there are 100 instances,
we don't want to pre-configure 3000 alarms, which most likely will never
be triggered.

> > Another question is our need to get alarms from other sources, like
> > Nagios, zabbix, ganglia, etc. We thought that Vitrage would query
> > these alarms from each source directly, and then create alarms in
> > AODH in the same way as our deduced alarms: for example create a
> > nagios_ovs_vswitchd alarm if the nagios check_ovs_vswitchd test
> > failed.
> > An alternative could be to integrate nagios directly with AODH.
> > What do you think?
>
> Hmm, I don't have a clear view on this. If the source can include
> OpenStack IDs and can generate a relevant meter/sample, it should be
> useful to integrate with ceilometer. But if you want to do some
> operations (like correlation), then it is reasonable to integrate with
> vitrage.

The source may include alarms on resources that are not defined in
OpenStack, like switches or ports. And the alarms are not necessarily
related to meters; they can be nagios test failures, for example.

> > > BTW, is it useful to have on-the-fly evaluation of combination
> > > alarms with event alarms for alarm aggregation or other cases?
> >
> > I'm not sure I understand. Can you give a detailed example?
>
> OK. The 'combination' type alarm enables you to aggregate multiple
> alarms into one alarm. This can be used when you want to receive an
> alarm when both physical NIC ports are down, to recognize logical
> connection unavailability if the ports are teamed for redundancy. Now,
> the combination alarms are evaluated periodically, which means you
> receive the combination alarm not on-the-fly, even though you are using
> event alarms as its source.

I think I understand your point. It means that certain alarms will arrive
at Vitrage with a delay, due to your evaluation policy. I think we will
have to address this issue at some point, but it won't change our overall
design.

> > In addition, in Vitrage we plan to handle alarm aggregation by
> > creating aggregation rule templates, for example based on the RCA
> > information.
> > The user will be able to see only the root cause alarms, and then
> > drill down to all specific alarms. But I doubt if this will be done
> > for Mitaka.
>
> I think 'the RCA information' means information for RCA. I mean vitrage
> will use the resource topologies or relationships in aggregation,
> rather than the result of RCA. Am I right?

The term "aggregation" is used in different contexts, which may be
confusing. Our plan is to examine the already-computed RCA information,
and see, for example, that a switch failure alarm caused alarms on 100
related instances. In horizon, the result will be 101 alarms shown to the
user in a flat list.

By "alarm aggregation based on RCA" we mean that we will have an API to
get root cause alarms, which will return only the switch alarm. The
horizon user will see one alarm, and may then ask to expand the view and
see all the other alarms that were caused by it.

Best Regards,
Ifat.
__ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
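To make the "alarm aggregation based on RCA" idea above concrete, here is a minimal sketch of the 101-alarm example. It is not Vitrage code: the function names, the alarm strings and the `caused_by` mapping are all hypothetical, and it assumes the RCA result is already available as a simple "alarm -> direct cause" dictionary.

```python
def root_cause_alarms(alarms, caused_by):
    """Return only the alarms that were not caused by another alarm."""
    return [alarm for alarm in alarms if alarm not in caused_by]


def expand(root, caused_by):
    """Return the alarms whose direct cause is the given root alarm."""
    return [alarm for alarm, cause in caused_by.items() if cause == root]


# Hypothetical data: one switch failure that caused alarms on 100 instances.
switch_alarm = "switch-2: switch failure"
instance_alarms = [
    f"instance-{i}: at risk due to switch problem" for i in range(1, 101)
]
caused_by = {alarm: switch_alarm for alarm in instance_alarms}
all_alarms = [switch_alarm] + instance_alarms

roots = root_cause_alarms(all_alarms, caused_by)
print(len(all_alarms))                    # size of the flat list
print(roots)                              # what a "root cause alarms" call would return
print(len(expand(roots[0], caused_by)))   # alarms shown when the user drills down
```

The sketch only follows direct causes; a real implementation would presumably have to walk a deeper causal graph.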
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Ifat, > > One approach we can take, is that you configure aodh to pass each row > > event (e.g. each VM downed) wrapped in alarm notification to vitrage, > > then do some operation (e.g. deducing, aggregating) and store > > resource- level alarm without any alarm_actions, so that users can see > > the alarms in horizon view. This may not require alarm evaluation, so > > we can forget the problem I raised (cache refresh interval). > > Let me see if I got this right: are you suggesting that we create on-the-fly > alarm definitions with no alarm_actions, > for every deduced alarm that we want to raise? And this will spare us the > extra alarm evaluation in AODH? Yes. But, please note that could be the first step. The next step would be make vitrage to send out alarm event to ceilometer/aodh the pre-configured event alarm will recognize the alarm and fire the alarm notification to another service or an end user. Eventually, we should have relevant alarm type and evaluator to proxy evaluation in vitrage, I think. > My next question is how exactly we should create these resource-level alarms. > Can we create an alarm definition with > no rule, no actions, and initial state set to "alarm"? (I'm not sure it can > be done in the current AODH API) You can. This is not proper way of using aodh though. But, this is easy to create an alarm entry to show it in horizon. > Another question is our need to get alarms from other sources, like Nagios, > zabbix, ganglia, etc. We thought that > Vitrage would query these Alarms from each source directly, and then create > alarms in AODH in the same way as our > deduced alarms: for example create nagios_ovs_vswitchd alarm if nagios > check_ovs_vswitchd test failed. > An alternative could be to integrate nagios directly with AODH. > What do you think? Hmm, I don't have clear view on this. If the source can includes OpenStack IDs and can be generate relevant meter/sample, it should be useful to integrate with ceilometer. 
But if you want to do some operations (like correlation), then it is reasonable to integrate with vitrage. > > BTW, is it useful to have on-the-fly evaluation of combination alarm > > with event alarms for alarm aggregation or other cases? > > I'm not sure I understand. Can you give a detailed example? OK. The 'combination' type alarm enables you to aggregate multiple alarm to one alarm. This can be used when you want to receive alarm when the both of physical NIC ports are downed to recognize logical connection unavailability if the ports are teamed for redundancy. Now, the combination alarms are evaluated periodically that means you can receive combination alarm not on-the-fly while you are using event alarms as source of combination alarm though. > > Horizon view is the different topic. Maybe we can reduce the number of > > alarms listed in user view by creating raw alarms in admin space that > > is not visible from end user, or using relevant severity or tag so > > that user can filter out uninterested alarms. > > Referring to this[1] blueprint, do you have specific concerns regarding the > usability/performance of Horizon view > when there are many alarms? > I think that your ideas make sense, and we can implement them if there is a > need. Sorry, I'm not familiar with horizon these days... But, if you need change in aodh side, I can help you. > In addition, in Vitrage we plan to handle alarm aggregation by creating > aggregation rule templates, for example based > on the RCA information. > The user will be able to see only the root cause alarms, and then drill down > to all specific alarms. But I doubt if > this will be done for Mitaka. I think 'the RCA information' means information for RCA. I mean vitrage will use the resource topologies or relationship in aggregation, rather than result of RCA. Am I right? 
Best regards, Ryota __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
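The teamed-NIC example above can be sketched as follows. This is only an illustration of the 'and' aggregation that a combination alarm performs, not Aodh's evaluator: the function name is hypothetical, and the sketch ignores the 'insufficient data' state and the periodic evaluation discussed in the message.

```python
def combination_state(member_states, operator="and"):
    """Aggregate member alarm states ('ok' / 'alarm') into one state."""
    in_alarm = [state == "alarm" for state in member_states]
    fired = all(in_alarm) if operator == "and" else any(in_alarm)
    return "alarm" if fired else "ok"


# Two physical NIC ports teamed for redundancy: the logical connection is
# unavailable only when both port alarms are in the 'alarm' state.
one_port_down = combination_state(["alarm", "ok"])
both_ports_down = combination_state(["alarm", "alarm"])
print(one_port_down, both_ports_down)
```

With "or" as the operator, the same function would report the logical link as degraded as soon as a single port goes down.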
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
On Thu, Dec 03 2015, AFEK, Ifat (Ifat) wrote:

> Another question is our need to get alarms from other sources, like
> Nagios, zabbix, ganglia, etc. We thought that Vitrage would query these
> Alarms from each source directly, and then create alarms in AODH in the
> same way as our deduced alarms: for example create nagios_ovs_vswitchd
> alarm if nagios check_ovs_vswitchd test failed.
> An alternative could be to integrate nagios directly with AODH.
> What do you think?

I think I'd like to be able to answer this question, but I kind of lack the bigger picture of what you need these alarms for, and what you would like to do with them.

I think we don't have everything right now in Ceilometer/Gnocchi/Aodh to replace something like Nagios, _but_ we have a base framework that should be more powerful and way more scalable. That could be leveraged to build something better than Nagios, while staying compatible.

What Nagios does is polling, storing state, and acting on that state. Which is more or less what Ceilometer does (polling), Gnocchi does (storing things) and Aodh does (triggering actions based on the state). Obviously there's more to it (e.g. dependencies) that is not handled currently, and that could be added later – maybe in some parts of the current telemetry projects, or maybe in Vitrage.

So how to fit such tools (Nagios, Zabbix, whatever) into those projects is an interesting problem. But I'm not clear on the first steps and how/why you want to leverage alarms first. :)

-- 
Julien Danjou
# Free Software hacker
# https://julien.danjou.info
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
> -----Original Message-----
> From: Julien Danjou [mailto:jul...@danjou.info]
>
> On Wed, Dec 02 2015, AFEK, Ifat (Ifat) wrote:
>
> > Can this be supported without defining an alarm for every VM
> > separately?
>
> No, it's not possible. You'd have to create the alarm for each instance
> for now.
>
> Honestly, I'd say start with this at a first step, and if it starts
> becoming a problem, we can envision a better way to define some sort of
> alarm template for example in Aodh. I wouldn't put the cart before the
> horse.

Ok, makes sense. So we can start by creating on-the-fly alarm definitions for every resource, and optionally request an Aodh enhancement in the future for alarm templates creation.

Also, please see my response to Ryota Mibu, regarding the other alarms that we need to define.

Thanks,
Ifat.
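To put numbers on the trade-off discussed in this thread (30 alarm types and 100 instances), here is a small sketch comparing pre-configured alarms with on-the-fly ones. The naming scheme, the alarm type names and the instance names are hypothetical; the only point is that each alarm name has to embed the resource so that it stays unique.

```python
from itertools import product


def alarm_name(alarm_type, resource_id):
    """Build a unique alarm name by embedding the resource id."""
    return f"{alarm_type}:{resource_id}"


# Hypothetical inventory matching the figures used in the thread.
alarm_types = [f"alarm_type_{i}" for i in range(1, 31)]
instances = [f"instance{i}" for i in range(1, 101)]

# Pre-configuring every alarm type for every instance.
preconfigured = [alarm_name(t, r) for t, r in product(alarm_types, instances)]

# Creating definitions on the fly, only for alarms that were actually deduced.
deduced = [("alarm_type_3", "instance1"),
           ("alarm_type_3", "instance2"),
           ("alarm_type_3", "instance8")]
on_the_fly = [alarm_name(t, r) for t, r in deduced]

print(len(preconfigured), len(on_the_fly))
```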
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Ryota, Thanks for your response, please see my comments below. Ifat. > -Original Message- > From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com] > > Hi, > > > Sorry for my late response... > > It seems like a fundamental question whether we should have rich > function or intelligence in on-the-fly event alarm evaluation. I think > we can add simple operations (like aggregating alarm) in aodh > evaluator, and other operations (like deducing with referring some > external DB) should be done outside of the evaluation process to reduce > impact on other evaluations. But, if we separate too much, then there > will be many interactions between two services that makes slow to > finish sequence of alarm handling. > > One approach we can take, is that you configure aodh to pass each row > event (e.g. each VM downed) wrapped in alarm notification to vitrage, > then do some operation (e.g. deducing, aggregating) and store resource- > level alarm without any alarm_actions, so that users can see the alarms > in horizon view. This may not require alarm evaluation, so we can > forget the problem I raised (cache refresh interval). Let me see if I got this right: are you suggesting that we create on-the-fly alarm definitions with no alarm_actions, for every deduced alarm that we want to raise? And this will spare us the extra alarm evaluation in AODH? It does make sense. My next question is how exactly we should create these resource-level alarms. Can we create an alarm definition with no rule, no actions, and initial state set to "alarm"? (I'm not sure it can be done in the current AODH API) Another question is our need to get alarms from other sources, like Nagios, zabbix, ganglia, etc. We thought that Vitrage would query these Alarms from each source directly, and then create alarms in AODH in the same way as our deduced alarms: for example create nagios_ovs_vswitchd alarm if nagios check_ovs_vswitchd test failed. An alternative could be to integrate nagios directly with AODH. 
What do you think? > BTW, is it useful to have on-the-fly evaluation of combination alarm > with event alarms for alarm aggregation or other cases? I'm not sure I understand. Can you give a detailed example? > Horizon view is the different topic. Maybe we can reduce the number of > alarms listed in user view by creating raw alarms in admin space that > is not visible from end user, or using relevant severity or tag so that > user can filter out uninterested alarms. Referring to this[1] blueprint, do you have specific concerns regarding the usability/performance of Horizon view when there are many alarms? I think that your ideas make sense, and we can implement them if there is a need. In addition, in Vitrage we plan to handle alarm aggregation by creating aggregation rule templates, for example based on the RCA information. The user will be able to see only the root cause alarms, and then drill down to all specific alarms. But I doubt if this will be done for Mitaka. [1] https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page Thanks, Ifat. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi,

Sorry for my late response...

It seems like a fundamental question whether we should have rich functions or intelligence in on-the-fly event alarm evaluation. I think we can add simple operations (like aggregating alarms) in the aodh evaluator, while other operations (like deducing by referring to some external DB) should be done outside of the evaluation process to reduce the impact on other evaluations. But if we separate too much, there will be many interactions between the two services, which will slow down the alarm handling sequence.

One approach we can take is that you configure aodh to pass each raw event (e.g. each VM downed) wrapped in an alarm notification to vitrage, which then does some operation (e.g. deducing, aggregating) and stores a resource-level alarm without any alarm_actions, so that users can see the alarms in the horizon view. This may not require alarm evaluation, so we can forget the problem I raised (cache refresh interval).

BTW, is it useful to have on-the-fly evaluation of combination alarms with event alarms for alarm aggregation or other cases?

The Horizon view is a different topic. Maybe we can reduce the number of alarms listed in the user view by creating raw alarms in an admin space that is not visible to the end user, or by using a relevant severity or tag so that users can filter out alarms they are not interested in.

Best regards,
Ryota

---
"Ryota Mibu"
NEC Corporation
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
On Wed, Dec 02 2015, AFEK, Ifat (Ifat) wrote:

> As we understand it, if we take the first approach you describe, then we can
> have an alarm refer to all the VMs in the system, but then if the alarm is
> triggered by one VM or by five VMs, the result will be the same - only one
> alarm will be active. What we want is to be able to distinguish between the
> different VMs - to know which alarms were triggered on each specific VM.
>
> One of the motivations for this is that in Horizon we would like to display
> all the alarms, where we would like to be able to see that a problem occurred
> on instance1, instance2 and instance8, not just that there was a problem on
> some VMs out of a group.

Ok, that's clearer.

> Can this be supported without defining an alarm for every VM separately?

No, it's not possible. You'd have to create the alarm for each instance for now.

Honestly, I'd say start with this as a first step, and if it starts becoming a problem, we can envision a better way to define some sort of alarm template, for example in Aodh. I wouldn't put the cart before the horse.

> This is what Ryota Mibu wrote us:
>
>> The reason is that aodh evaluator may not be aware of new alarm
>> definitions and won't send notification until its alarm definition
>> cache is refreshed in less than 60 sec (default value).
>
> Did we misunderstand?

Oh no, but I thought you were referring to it being slow. This is a cache; you can lower it to 1s if you want, with the potential performance impact it may have. :)

-- 
Julien Danjou
;; Free Software hacker
;; https://julien.danjou.info
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Julien, Please see our questions below. Ifat and Elisha. > -Original Message- > From: Julien Danjou [mailto:jul...@danjou.info] > > On Wed, Dec 02 2015, ROSENSWEIG, ELISHA (ELISHA) wrote: > > Regarding the second point: Say we have 30 different types of alarms > > we might want to raise on an OpenStack instance (VM). What I > > understand from your explanation is that when we create a new > > instance, we need to create 30 new alarms in Aodh that can be > > triggered some time in the future. If we have 100 instances, we will > > effectively have 3,000 alarms created in Aodh, and so on with more > instances. > > Not necessarily. You can create one alarm that has conditions large > enough to match e.g. all your VMs, and an alarm action that can be > generic enough so that it will do the right thing for each VM. > As we understand it, if we take the first approach you describe, then we can have an alarm refer to all the VMs in the system, but then if the alarm is triggered by one VM or by five VMs, the result will be the same - only one alarm will be active. What we want is to be able to distinguish between the different VMs - to know which alarms were triggered on each specific VM. One of the motivations for this is that in Horizon we would like to display all the alarms, where we would like to be able to see that a problem occurred on instance1, instance2 and instance8, not just that there was a problem on some VMs out of a group. Can this be supported without defining an alarm for every VM separately? > The alarm system provided by Aodh is really a simple event -> trigger > system in this area. How precise or large is your event really depends > on the granularity that your trigger (which is usually a Web hook) can > handle. > > > A different approach might be to create a new alarm in Aodh on-the- > fly. > > However, we are under the impression that the creation time can be up > > to one minute, which will cause a large delay. Is there any way to > shorten this? 
> > Creation time of an alarm of one minute? That's not normal. It should > consist of just a record in the database so it should be pretty fast. > This is what Ryota Mibu wrote us: > The reason is that aodh evaluator may not be aware of new alarm definitions > and won't send notification until its alarm definition > cache is refreshed > in less than 60 sec (default value). Did we misunderstand? __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
On Wed, Dec 02 2015, ROSENSWEIG, ELISHA (ELISHA) wrote:

> Regarding the second point: Say we have 30 different types of alarms we might
> want to raise on an OpenStack instance (VM). What I understand from your
> explanation is that when we create a new instance, we need to create 30 new
> alarms in Aodh that can be triggered some time in the future. If we have 100
> instances, we will effectively have 3,000 alarms created in Aodh, and so on
> with more instances.

Not necessarily. You can create one alarm that has conditions large enough to match e.g. all your VMs, and an alarm action that can be generic enough so that it will do the right thing for each VM.

The alarm system provided by Aodh is really a simple event -> trigger system in this area. How precise or large your event is really depends on the granularity that your trigger (which is usually a Web hook) can handle.

> A different approach might be to create a new alarm in Aodh on-the-fly.
> However, we are under the impression that the creation time can be up to one
> minute, which will cause a large delay. Is there any way to shorten this?

Creation time of an alarm of one minute? That's not normal. It should consist of just a record in the database so it should be pretty fast.

-- 
Julien Danjou
// Free Software hacker
// https://julien.danjou.info
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Thanks for your response. It definitely helped clarify things. Regarding the second point: Say we have 30 different types of alarms we might want to raise on an OpenStack instance (VM). What I understand from your explanation is that when we create a new instance, we need to create 30 new alarms in Aodh that can be triggered some time in the future. If we have 100 instances, we will effectively have 3,000 alarms created in Aodh, and so on with more instances. Is this a correct depiction of the situation? If so, do you think it will scale? A different approach might be to create a new alarm in Aodh on-the-fly. However, we are under the impression that the creation time can be up to one minute, which will cause a large delay. Is there any way to shorten this? Thanks Elisha > -Original Message- > From: Julien Danjou [mailto:jul...@danjou.info] > Sent: Wednesday, December 02, 2015 12:26 PM > To: ROSENSWEIG, ELISHA (ELISHA) > Cc: OpenStack Development Mailing List (not for usage questions); AFEK, > Ifat (Ifat) > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom > alarms in AODH > > On Tue, Dec 01 2015, ROSENSWEIG, ELISHA (ELISHA) wrote: > > > 1. Does AODH currently support raising alarms on resources not > modeled > > in OpenStack? For example, raising an alarm on a Switch? Or does each > > alarm have to relate to a resource ID (or IDs)( > > Yes, Aodh does not really care, especially with the Gnocchi backend. It > can evaluate any metric on any resource type and just trigger the > alarm. > > > 2. What we feel is missing is some way to raise an alarm on-the-fly. > > In Vitrage, we have this concept of "deduced alarms", where based on > > some analysis Vitrage determines we need to raise an alarm on some > > resource. As we understand it, currently to raise an alarm in AODH we > > need to register the alarm in advance, wait for it to be registered > and only then we can trigger the event. > > This will delay our response time to events. 
> > What you need to do, is create the alarm that you want to trigger in > Aodh. Let's say Vitrage knows that if switch A is going down, it needs > to send an email to an admin. > > You create an alarm in Aodh that says on event "switch A is down -> > send a mail to admin". Then Vitrage runs, and just have to emit an > event "switch A is down". You can do that via oslo.messaging or I guess > via the REST API (not sure the mechanism is here but it could be I > guess). > Then Aodh will trigger the alarm actions for you. > > Next step could be give more job to Aodh, such as determining that > "switch A is down" by doing some evaluation – unless the switch sends > an event when it's going down, but I imagine it's not the point. ;-) > > Does that help? > > -- > Julien Danjou > # Free Software hacker > # https://julien.danjou.info __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
On Tue, Dec 01 2015, ROSENSWEIG, ELISHA (ELISHA) wrote:

> 1. Does AODH currently support raising alarms on resources not modeled
> in OpenStack? For example, raising an alarm on a Switch? Or does each
> alarm have to relate to a resource ID (or IDs)?

Yes, Aodh does not really care, especially with the Gnocchi backend. It can evaluate any metric on any resource type and just trigger the alarm.

> 2. What we feel is missing is some way to raise an alarm on-the-fly. In
> Vitrage, we have this concept of "deduced alarms", where based on some
> analysis Vitrage determines we need to raise an alarm on some resource.
> As we understand it, currently to raise an alarm in AODH we need to
> register the alarm in advance, wait for it to be registered and only
> then we can trigger the event.
> This will delay our response time to events.

What you need to do is create the alarm that you want to trigger in Aodh. Let's say Vitrage knows that if switch A is going down, it needs to send an email to an admin.

You create an alarm in Aodh that says on event "switch A is down -> send a mail to admin". Then Vitrage runs, and just has to emit an event "switch A is down". You can do that via oslo.messaging or I guess via the REST API (not sure the mechanism is there, but it could be I guess).
Then Aodh will trigger the alarm actions for you.

The next step could be to give more work to Aodh, such as determining that "switch A is down" by doing some evaluation – unless the switch sends an event when it's going down, but I imagine it's not the point. ;-)

Does that help?

-- 
Julien Danjou
# Free Software hacker
# https://julien.danjou.info
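As a rough illustration of the flow described above (pre-create an event alarm, then have Vitrage emit the matching event), here is a minimal sketch of what such an alarm definition body might look like. The top-level layout (`type` set to `event`, an `event_rule` with `event_type` and `query`, and `alarm_actions`) follows my understanding of the Aodh v2 alarm API and should be checked against the API reference. The event type name, the trait name and the webhook URL are hypothetical placeholders, not anything defined in this thread, and nothing is sent over the network.

```python
import json


def build_event_alarm(name, event_type, resource_id, action_url):
    """Build a request body for an event-type alarm (sketch only)."""
    return {
        "name": name,
        "type": "event",
        "event_rule": {
            # Hypothetical event type that Vitrage would emit.
            "event_type": event_type,
            # Match only events about one specific resource (assumed trait name).
            "query": [
                {
                    "field": "traits.resource_id",
                    "op": "eq",
                    "type": "string",
                    "value": resource_id,
                }
            ],
        },
        # Webhook that would e.g. send the mail to the admin (placeholder URL).
        "alarm_actions": [action_url],
    }


body = build_event_alarm(
    name="switch A is down",
    event_type="vitrage.switch.down",
    resource_id="switch-A",
    action_url="http://example.invalid/notify-admin",
)
print(json.dumps(body, indent=2))
```

The body would be POSTed once when the alarm is defined; after that, Vitrage would only need to emit an event whose type and traits match the rule.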
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Thanks for the quick reply. We have a few more questions, for clarification: 1. Does AODH currently support raising alarms on resources not modeled in OpenStack? For example, raising an alarm on a Switch? Or does each alarm have to relate to a resource ID (or IDs)( 2. What we feel is missing is some way to raise an alarm on-the-fly. In Vitrage, we have this concept of "deduced alarms", where based on some analysis Vitrage determines we need to raise an alarm on some resource. As we understand it, currently to raise an alarm in AODH we need to register the alarm in advance, wait for it to be registered and only then we can trigger the event. This will delay our response time to events. Could you clarify these two points, or correct any misconceptions on our part? Thanks, Elisha Rosensweig, PhD CloudBand, Alcatel-Lucent > -Original Message- > From: Julien Danjou [mailto:jul...@danjou.info] > Sent: Tuesday, December 01, 2015 3:25 PM > To: AFEK, Ifat (Ifat) > Cc: OpenStack Development Mailing List (not for usage questions) > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom > alarms in AODH > > On Tue, Dec 01 2015, AFEK, Ifat (Ifat) wrote: > > > In Vitrage, we would like to evaluate and correlate different kinds > of alarms: > > AODH threshold alarms, event alarms, Nagios alarms, Ganglia alarms, > > Zabbix alarms, etc. This includes alarms on physical resources that > > are not part of OpenStack, like switches or ports, in order to > > understand their effect on OpenStack resources. > > > > Our question is: do you vision AODH as a "general OpenStack alarm > > engine", which serves as a database for alarms of all kinds? Or does > > AODH focus on metric-related alarms? > > I think we would be happy to have any kind of alarm supported in Aodh. > Though currently I'm not really seeing what is missing since we have > evaluation based alarms and event based alarms. 
> > Aodh is also meant to be generic enough to be consumed outside of > OpenStack itself – it works pretty well in standalone with Gnocchi for > example. > > -- > Julien Danjou > // Free Software hacker > // https://julien.danjou.info __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
On Tue, Dec 01 2015, AFEK, Ifat (Ifat) wrote:

> In Vitrage, we would like to evaluate and correlate different kinds of alarms:
> AODH threshold alarms, event alarms, Nagios alarms, Ganglia alarms, Zabbix
> alarms, etc. This includes alarms on physical resources that are not part of
> OpenStack, like switches or ports, in order to understand their effect on
> OpenStack resources.
>
> Our question is: do you vision AODH as a "general OpenStack alarm engine",
> which serves as a database for alarms of all kinds? Or does AODH focus on
> metric-related alarms?

I think we would be happy to have any kind of alarm supported in Aodh. Though currently I'm not really seeing what is missing since we have evaluation based alarms and event based alarms.

Aodh is also meant to be generic enough to be consumed outside of OpenStack itself – it works pretty well in standalone with Gnocchi for example.

-- 
Julien Danjou
// Free Software hacker
// https://julien.danjou.info
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi, After some further discussions with Vitrage team, let me go one step back and ask a more basic question: In Vitrage, we would like to evaluate and correlate different kinds of alarms: AODH threshold alarms, event alarms, Nagios alarms, Ganglia alarms, Zabbix alarms, etc. This includes alarms on physical resources that are not part of OpenStack, like switches or ports, in order to understand their effect on OpenStack resources. Our question is: do you vision AODH as a "general OpenStack alarm engine", which serves as a database for alarms of all kinds? Or does AODH focus on metric-related alarms? Thanks, Ifat. > -Original Message- > From: AFEK, Ifat (Ifat) [mailto:ifat.a...@alcatel-lucent.com] > Sent: Monday, November 30, 2015 2:47 PM > To: OpenStack Development Mailing List (not for usage questions) > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom > alarms in AODH > > Hi, > > A few days ago I sent you this email (see below). Resending in case you > didn't see it. > If you could get back to me soon it would be most appreciated, as we > are quite blocked with our AODH integration right now. > > Thanks, > Ifat. > > > -Original Message- > From: AFEK, Ifat (Ifat) [mailto:ifat.a...@alcatel-lucent.com] > Sent: Tuesday, November 24, 2015 7:37 PM > To: OpenStack Development Mailing List (not for usage questions) > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom > alarms in AODH > > Hi Gord, Hi Ryota, > > (I sent the same mail again in a more readable format) > > Thanks for your detailed responses. > Hope you don't mind that I'm sending one reply to both of your emails. > I think it would be easier to have one thread for this discussion. > > > Let me explain our use case in more details. > Here is an example of how we would like to integrate with AODH. Let me > know what you think about it. > > 1. Vitrage gets an alarm from Nagios about high cpu load on one of the > hosts > > 2. 
Vitrage evaluator decides (based on its templates) that an "instance > might be suffering due to high cpu load on the host" alarm should be > triggered for every instance on this host > > 3. Vitrage notifier creates corresponding alarm definitions in AODH > > 4. AODH stores these alarms in its database > > 5. Vitrage triggers the alarms > > 6. AODH updates the alarms states and notifies about it > > 7. Horizon user queries AODH for a list of all alarms (we are currently > checking the status of a blueprint that should implement it[2]). AODH > returns a list that includes the alarms that were triggered by Vitrage. > > 8. Horizon user selects one of the alarms that Vitrage generated, and > asks to see its root cause (we will create a new blueprint for that). > Vitrage API returns the RCA information for this alarm. > > > Our current discussion is on steps 3-6 (as far as we understand, and > please correct me if I'm wrong, nothing blocks the implementation of > the blueprint for step 7). > > > > Looking at AODH API again, here is what I think we need to do: > > 1. Define an alarm with an external_trigger_rule or something like > that. This alarm has no metric data. We just want to be able to trigger > it and query its state. > > 2. Use AODH API for triggering this alarm. Will "PUT > /v2/alarms/(alarm_id)/state" do the job? > > > Please see also my comments below. > > Thanks, > Ifat. 
> > > [2] https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm- > management-page > > > > > > -Original Message- > > From: gord chung [mailto:g...@live.ca] > > Sent: Monday, November 23, 2015 9:45 PM > > To: openstack-dev@lists.openstack.org > > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising > > custom alarms in AODH > > > > > > > > On 23/11/2015 11:14 AM, AFEK, Ifat (Ifat) wrote: > > > I guess I would like to do both: create a new alarm definition, > then > > > trigger it (call alarm_actions), and possibly later on set its > state > > > back to OK (call ok_action). > > > I understood that currently all alarm triggering is internal in > > > AODH, according to threshold/events/combination alarm rules. Would > > > it be possible to add a new kind of rule, that will allow > triggering > > > the alarm externally? > > what type of rule? > > > > i have https://review.openstack.org/#/c/247211 which would > > theoretically allow you to push an action into queue which would then > > trigger appropriate REST call. not sure if it helps you plug into >
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi, A few days ago I sent you this email (see below). Resending in case you didn't see it. If you could get back to me soon it would be most appreciated, as we are quite blocked with our AODH integration right now. Thanks, Ifat. -Original Message- From: AFEK, Ifat (Ifat) [mailto:ifat.a...@alcatel-lucent.com] Sent: Tuesday, November 24, 2015 7:37 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH Hi Gord, Hi Ryota, (I sent the same mail again in a more readable format) Thanks for your detailed responses. Hope you don't mind that I'm sending one reply to both of your emails. I think it would be easier to have one thread for this discussion. Let me explain our use case in more details. Here is an example of how we would like to integrate with AODH. Let me know what you think about it. 1. Vitrage gets an alarm from Nagios about high cpu load on one of the hosts 2. Vitrage evaluator decides (based on its templates) that an "instance might be suffering due to high cpu load on the host" alarm should be triggered for every instance on this host 3. Vitrage notifier creates corresponding alarm definitions in AODH 4. AODH stores these alarms in its database 5. Vitrage triggers the alarms 6. AODH updates the alarms states and notifies about it 7. Horizon user queries AODH for a list of all alarms (we are currently checking the status of a blueprint that should implement it[2]). AODH returns a list that includes the alarms that were triggered by Vitrage. 8. Horizon user selects one of the alarms that Vitrage generated, and asks to see its root cause (we will create a new blueprint for that). Vitrage API returns the RCA information for this alarm. Our current discussion is on steps 3-6 (as far as we understand, and please correct me if I'm wrong, nothing blocks the implementation of the blueprint for step 7). Looking at AODH API again, here is what I think we need to do: 1. 
Define an alarm with an external_trigger_rule or something like that. This alarm has no metric data. We just want to be able to trigger it and query its state. 2. Use AODH API for triggering this alarm. Will "PUT /v2/alarms/(alarm_id)/state" do the job? Please see also my comments below. Thanks, Ifat. [2] https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page > -Original Message- > From: gord chung [mailto:g...@live.ca] > Sent: Monday, November 23, 2015 9:45 PM > To: openstack-dev@lists.openstack.org > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising > custom alarms in AODH > > > > On 23/11/2015 11:14 AM, AFEK, Ifat (Ifat) wrote: > > I guess I would like to do both: create a new alarm definition, then > > trigger it (call alarm_actions), and possibly later on set its state > > back to OK (call ok_action). > > I understood that currently all alarm triggering is internal in > > AODH, according to threshold/events/combination alarm rules. Would > > it be possible to add a new kind of rule, that will allow triggering > > the alarm externally? > what type of rule? > > i have https://review.openstack.org/#/c/247211 which would > theoretically allow you to push an action into queue which would then > trigger appropriate REST call. not sure if it helps you plug into Aodh > easier or not? We need to add an alarm definition with an "external_rule", and then trigger it. It is important for us that the alarm definition will be stored in AODH database for future queries. As far as I understand, the queue should help only with the triggering? 
> > -- > gord > -Original Message- > From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com] > Sent: Tuesday, November 24, 2015 10:00 AM > To: OpenStack Development Mailing List (not for usage questions) > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising > custom alarms in AODH > > Hi Ifat, > > > Thank you for starting discussion how AODH can be integrated with > Vitrage that would be a good example of AODH integration with other > OpenStack components. > > The key role of creating alarm definition is to set endpoint > (alarm_actions) which can receive alarm notification from AODH. How > the endpoints can be set in your use case? Those endpoints are > configured via Vitrage API and stored in its DB? We have a graph database that will include resources and alarms imported from a few sources of information (including Ceilometer), as well as alarms generated by Vitrage. However, we would like our alarms to be stored in AODH as well. If I understood you correctly, we will need the endpoints in order to be notified about Ceilometer alarms. > > I agree
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Gord, Hi Ryota, Thanks for your detailed responses. Hope you don't mind that I'm sending one reply to both of your emails. I think it would be easier to have one thread for this discussion. Let me explain our use case in more details. Here is an example of how we would like to integrate with AODH. Let me know what you think about it. 1. Vitrage gets an alarm from Nagios about high cpu load on one of the hosts 2. Vitrage evaluator decides (based on its templates) that an "instance might be suffering due to high cpu load on the host" alarm should be triggered for every instance on this host 3. Vitrage notifier creates corresponding alarm definitions in AODH 4. AODH stores these alarms in its database 5. Vitrage triggers the alarms 6. AODH updates the alarms states and notifies about it 7. Horizon user queries AODH for a list of all alarms (we are currently checking the status of a blueprint that should implement it[2]). AODH returns a list that includes the alarms that were triggered by Vitrage. 8. Horizon user selects one of the alarms that Vitrage generated, and asks to see its root cause (we will create a new blueprint for that). Vitrage API returns the RCA information for this alarm. Our current discussion is on steps 3-6 (as far as we understand, and please correct me if I'm wrong, nothing blocks the implementation of the blueprint for step 7). Looking at AODH API again, here is what I think we need to do: 1. Define an alarm with an external_trigger_rule or something like that. This alarm has no metric data. We just want to be able to trigger it and query its state. 2. Use AODH API for triggering this alarm. Will "PUT /v2/alarms/(alarm_id)/state" do the job? Please see also my comments below. Thanks, Ifat. 
[2] https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page > -Original Message- > From: gord chung [mailto:g...@live.ca] > Sent: Monday, November 23, 2015 9:45 PM > To: openstack-dev@lists.openstack.org > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom > alarms in AODH > > > > On 23/11/2015 11:14 AM, AFEK, Ifat (Ifat) wrote: > > I guess I would like to do both: create a new alarm definition, then > > trigger it (call alarm_actions), and possibly later on set its state > > back to OK (call ok_action). > > I understood that currently all alarm triggering is internal in AODH, > > according to threshold/events/combination alarm rules. Would it be > > possible to add a new kind of rule, that will allow triggering the > > alarm externally? > what type of rule? > > i have https://review.openstack.org/#/c/247211 which would > theoretically allow you to push an action into queue which would then > trigger appropriate REST call. not sure if it helps you plug into Aodh > easier or not? We need to add an alarm definition with an "external_rule", and then trigger it. It is important for us that the alarm definition will be stored in AODH database for future queries. As far as I understand, the queue should help only with the triggering? > > -- > gord > -Original Message- > From: Ryota Mibu [mailto:r-m...@cq.jp.nec.com] > Sent: Tuesday, November 24, 2015 10:00 AM > To: OpenStack Development Mailing List (not for usage questions) > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom > alarms in AODH > > Hi Ifat, > > > Thank you for starting discussion how AODH can be integrated with > Vitrage that would be a good example of AODH integration with other > OpenStack components. > > The key role of creating alarm definition is to set endpoint > (alarm_actins) which can be receive alarm notification from AODH. How > the endpoints can be set in your use case? 
Those endpoints are > configured via Vitrage API and stored in its DB? We have a graph database that will include resources and alarms imported from a few sources of information (including Ceilometer), as well as alarms generated by Vitrage. However, we would like our alarms to be stored in AODH as well. If I understood you correctly, we will need the endpoints in order to be notified about Ceilometer alarms. > > I agree with Gordon, you can use event-alarm with generating "event" > containing alarming message that can be captured in aodh if vitrage > relay the alarm definition to aodh. That is more feasible way rather > than creating alarm definition right before triggering alarm > notification. The reason is that aodh evaluator may not be aware of new > alarm definitions and won't send notification until its alarm > definition cache is refreshed in less than 60 sec (default value). Logically speaking, we would like to create alarms and not events. Our goal is to alert when something is wrong
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Ifat,

Thank you for starting the discussion on how AODH can be integrated with Vitrage; that would be a good example of AODH integration with other OpenStack components.

The key role of creating an alarm definition is to set the endpoints (alarm_actions) which can receive alarm notifications from AODH. How can the endpoints be set in your use case? Are those endpoints configured via the Vitrage API and stored in its DB?

I agree with Gordon: you can use an event-alarm by generating an "event" containing the alarming message, which can be captured in aodh if vitrage relays the alarm definition to aodh. That is a more feasible way than creating an alarm definition right before triggering the alarm notification. The reason is that the aodh evaluator may not be aware of new alarm definitions and won't send a notification until its alarm definition cache is refreshed, which can take up to 60 sec (default value).

Having a special rule and an external evaluator would be an alternative, but it would be difficult to keep up with the latest aodh, since it will change faster with a small code base as a result of the split from ceilometer.

BR,
Ryota

> -Original Message- > From: AFEK, Ifat (Ifat) [mailto:ifat.a...@alcatel-lucent.com] > Sent: Tuesday, November 24, 2015 1:15 AM > To: OpenStack Development Mailing List (not for usage questions) > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom > alarms in AODH > > Hi Gord, > > Please see my answers below. > > Ifat. > > > > -Original Message- > > From: gord chung [mailto:g...@live.ca] > > Sent: Monday, November 23, 2015 4:57 PM > > To: openstack-dev@lists.openstack.org > > Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising > > custom alarms in AODH > > > > hi Ifat, > > > > i added some questions below. > > > > On 23/11/2015 7:16 AM, AFEK, Ifat (Ifat) wrote: > > > Hi, > > > > > > We have a couple of questions regarding AODH alarms. > > > > > > In Vitrage[1] project we have two use cases that involve Ceilometer: > > > > > > 1. 
Import Ceilometer alarms, as well as alarms and resources from > > other sources (Nagios, Zabbix, Nova, Heat, etc.), and produce RCA > > insights about the connection between different alarms. > > to clarify, Ceilometer alarms is deprecated for Aodh and will be > > removed very, very soon. > > Right, I meant Aodh alarms. > > > > > > 2. Raise "deduced alarms". For example, in case we detect a high > > memory consumption on a host, we would like to raise deduced alarms > > saying "instance might be suffering due to high memory consumption on > > the host" on all related instances. Then, we can further deduce that > > applications running on these instances might also be affected, and > > raise alarms on them as well. > > > > > > Initially we planned to raise these deduced alarms in AODH, so other > > Openstack components may consume them as well. Then, when we looked at > > AODH alarms documentation, we noticed that there is currently no way > > of raising custom alarms. We saw only three types of alarms: threshold > > alarms, combination alarms and event alarms. > > > > > > So, our questions are: > > > > > > * Is there an alternative way of raising alarms in AODH? > > what do we mean by raising alarms? do you want to create a new alarm > > definition for Aodh or do you want to trigger an action? do you want > > to have a new non-REST action? > > I guess I would like to do both: create a new alarm definition, then trigger > it (call alarm_actions), and possibly > later on set its state back to OK (call ok_action). > I understood that currently all alarm triggering is internal in AODH, > according to threshold/events/combination alarm > rules. Would it be possible to add a new kind of rule, that will allow > triggering the alarm externally? > > > > > > * Do you think custom alarms belong in AODH? Are you interested in > > adding this capability to AODH? > > > > > > We would be happy to hear your vision and thoughts about it. 
> > > > > > > > > Thanks, > > > Ifat and Alexey. > > > > > > > > > [1] https://wiki.openstack.org/wiki/Vitrage > > > > > > > > > > > > > > > > > __ > > > OpenStack Development Mailing List (not for usage questions) > > > Unsubscribe: > > > openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > > > http://lists.openstack.org
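Editorial sketch of the event-alarm route suggested in the message above: the alarm definition is created ahead of time, and it fires when a matching event arrives on the notification bus. This is only an illustration under stated assumptions. The event type `vitrage.deduced_alarm`, the trait name `resource_id`, and the webhook URL are hypothetical, and the body layout (`type`, `event_rule`, `event_type`, `query`) should be checked against the Aodh v2 API reference before use.

```python
import json


def build_event_alarm(name, event_type, resource_id, webhook_url):
    """Build a request body for an event alarm (to be sent as POST /v2/alarms).

    Assumption: the event emitted by Vitrage carries a 'resource_id' trait;
    neither the event type nor the trait name comes from the thread.
    """
    return {
        "name": name,
        "type": "event",
        "event_rule": {
            # Hypothetical event type that Vitrage would publish.
            "event_type": event_type,
            # Only fire for events that concern this particular resource.
            "query": [
                {
                    "field": "traits.resource_id",
                    "type": "string",
                    "op": "eq",
                    "value": resource_id,
                }
            ],
        },
        # Endpoint that receives the notification when the alarm fires.
        "alarm_actions": [webhook_url],
    }


if __name__ == "__main__":
    body = build_event_alarm(
        "instance-at-risk-vm-1",
        "vitrage.deduced_alarm",
        "vm-1",
        "http://vitrage.example.org/hook",
    )
    print(json.dumps(body, indent=2))
```

Because the definition exists before the event is emitted, this route avoids the evaluator cache delay described above, at the cost of one definition per resource.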
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
On 23/11/2015 11:14 AM, AFEK, Ifat (Ifat) wrote:
> I guess I would like to do both: create a new alarm definition, then trigger it (call alarm_actions), and possibly later on set its state back to OK (call ok_action).
> I understood that currently all alarm triggering is internal in AODH, according to threshold/events/combination alarm rules. Would it be possible to add a new kind of rule, that will allow triggering the alarm externally?

what type of rule?

i have https://review.openstack.org/#/c/247211 which would theoretically allow you to push an action into queue which would then trigger appropriate REST call. not sure if it helps you plug into Aodh easier or not?

--
gord

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
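Editorial sketch of the "trigger it, then later set it back to OK" idea from the quoted text, using the alarm state endpoint (PUT /v2/alarms/(alarm_id)/state). It only builds the HTTP request and does not send it. The base URL, alarm id and token are placeholders, and the assumption that the body is the JSON-encoded state string should be verified against the Aodh API reference. Whether setting the state this way also invokes alarm_actions is the open question of this thread, so the sketch makes no claim about it.

```python
import json
import urllib.request

# States accepted by the alarm state API, as far as the editor is aware.
VALID_STATES = ("ok", "alarm", "insufficient data")


def build_set_state_request(aodh_url, alarm_id, state, token):
    """Return an unsent PUT request that sets the state of one alarm."""
    if state not in VALID_STATES:
        raise ValueError("unknown alarm state: {!r}".format(state))
    url = "{}/v2/alarms/{}/state".format(aodh_url.rstrip("/"), alarm_id)
    return urllib.request.Request(
        url,
        # Assumption: the request body is just the state as a JSON string.
        data=json.dumps(state).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,
        },
    )


if __name__ == "__main__":
    # Placeholder endpoint and identifiers, for illustration only.
    req = build_set_state_request(
        "http://aodh.example.org:8042", "alarm-123", "alarm", "TOKEN"
    )
    print(req.get_method(), req.full_url, req.data)
    # To actually send it: urllib.request.urlopen(req)
```

Setting the state back with `"ok"` uses the same call, so a caller such as Vitrage would need only this one helper for both directions.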
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi Gord,

Please see my answers below.

Ifat.

> -Original Message-
> From: gord chung [mailto:g...@live.ca]
> Sent: Monday, November 23, 2015 4:57 PM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
>
> hi Ifat,
>
> i added some questions below.
>
> On 23/11/2015 7:16 AM, AFEK, Ifat (Ifat) wrote:
> > Hi,
> >
> > We have a couple of questions regarding AODH alarms.
> >
> > In Vitrage[1] project we have two use cases that involve Ceilometer:
> >
> > 1. Import Ceilometer alarms, as well as alarms and resources from other sources (Nagios, Zabbix, Nova, Heat, etc.), and produce RCA insights about the connection between different alarms.
> to clarify, Ceilometer alarms is deprecated for Aodh and will be removed very, very soon.

Right, I meant Aodh alarms.

> > 2. Raise "deduced alarms". For example, in case we detect a high memory consumption on a host, we would like to raise deduced alarms saying "instance might be suffering due to high memory consumption on the host" on all related instances. Then, we can further deduce that applications running on these instances might also be affected, and raise alarms on them as well.
> >
> > Initially we planned to raise these deduced alarms in AODH, so other Openstack components may consume them as well. Then, when we looked at AODH alarms documentation, we noticed that there is currently no way of raising custom alarms. We saw only three types of alarms: threshold alarms, combination alarms and event alarms.
> >
> > So, our questions are:
> >
> > * Is there an alternative way of raising alarms in AODH?
> what do we mean by raising alarms? do you want to create a new alarm definition for Aodh or do you want to trigger an action? do you want to have a new non-REST action?

I guess I would like to do both: create a new alarm definition, then trigger it (call alarm_actions), and possibly later on set its state back to OK (call ok_action).
I understood that currently all alarm triggering is internal in AODH, according to threshold/events/combination alarm rules. Would it be possible to add a new kind of rule, that will allow triggering the alarm externally?

> > * Do you think custom alarms belong in AODH? Are you interested in adding this capability to AODH?
> >
> > We would be happy to hear your vision and thoughts about it.
> >
> > Thanks,
> > Ifat and Alexey.
> >
> > [1] https://wiki.openstack.org/wiki/Vitrage
>
> --
> gord
Re: [openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
hi Ifat,

i added some questions below.

On 23/11/2015 7:16 AM, AFEK, Ifat (Ifat) wrote:
> Hi,
>
> We have a couple of questions regarding AODH alarms.
>
> In Vitrage[1] project we have two use cases that involve Ceilometer:
>
> 1. Import Ceilometer alarms, as well as alarms and resources from other sources (Nagios, Zabbix, Nova, Heat, etc.), and produce RCA insights about the connection between different alarms.

to clarify, Ceilometer alarms is deprecated for Aodh and will be removed very, very soon.

> 2. Raise "deduced alarms". For example, in case we detect a high memory consumption on a host, we would like to raise deduced alarms saying "instance might be suffering due to high memory consumption on the host" on all related instances. Then, we can further deduce that applications running on these instances might also be affected, and raise alarms on them as well.
>
> Initially we planned to raise these deduced alarms in AODH, so other Openstack components may consume them as well. Then, when we looked at AODH alarms documentation, we noticed that there is currently no way of raising custom alarms. We saw only three types of alarms: threshold alarms, combination alarms and event alarms.
>
> So, our questions are:
>
> * Is there an alternative way of raising alarms in AODH?

what do we mean by raising alarms? do you want to create a new alarm definition for Aodh or do you want to trigger an action? do you want to have a new non-REST action?

> * Do you think custom alarms belong in AODH? Are you interested in adding this capability to AODH?
>
> We would be happy to hear your vision and thoughts about it.
>
> Thanks,
> Ifat and Alexey.
>
> [1] https://wiki.openstack.org/wiki/Vitrage

--
gord
[openstack-dev] [ceilometer][aodh][vitrage] Raising custom alarms in AODH
Hi,

We have a couple of questions regarding AODH alarms.

In the Vitrage[1] project we have two use cases that involve Ceilometer:

1. Import Ceilometer alarms, as well as alarms and resources from other sources (Nagios, Zabbix, Nova, Heat, etc.), and produce RCA insights about the connection between different alarms.

2. Raise "deduced alarms". For example, in case we detect high memory consumption on a host, we would like to raise deduced alarms saying "instance might be suffering due to high memory consumption on the host" on all related instances. Then, we can further deduce that applications running on these instances might also be affected, and raise alarms on them as well.

Initially we planned to raise these deduced alarms in AODH, so other OpenStack components may consume them as well. Then, when we looked at the AODH alarms documentation, we noticed that there is currently no way of raising custom alarms. We saw only three types of alarms: threshold alarms, combination alarms and event alarms.

So, our questions are:

* Is there an alternative way of raising alarms in AODH?
* Do you think custom alarms belong in AODH? Are you interested in adding this capability to AODH?

We would be happy to hear your vision and thoughts about it.

Thanks,
Ifat and Alexey.

[1] https://wiki.openstack.org/wiki/Vitrage
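Editorial sketch of the "deduced alarms" use case (2) above: given a problem on a host, derive one alarm per instance running on that host. This is not Vitrage code. The function name, the dictionary layout and the `caused_by` field are hypothetical, and the host-to-instances mapping is assumed to come from some inventory such as Nova.

```python
def deduce_instance_alarms(host, host_problem, instances_by_host):
    """Return one deduced alarm per instance on the affected host.

    All field names are illustrative; only 'resource_type' and
    'resource_id' mirror the metadata examples given in this thread.
    """
    alarms = []
    for instance_id in instances_by_host.get(host, []):
        alarms.append({
            # The resource id is part of the name to keep names unique.
            "name": "Instance {} might be suffering due to {} on host {}".format(
                instance_id, host_problem, host),
            "resource_type": "nova.instance",
            "resource_id": instance_id,
            # Hypothetical link back to the alarm this one was deduced from.
            "caused_by": {"resource_type": "nova.host", "resource_name": host},
        })
    return alarms


if __name__ == "__main__":
    inventory = {"host-1": ["vm-a", "vm-b"], "host-2": ["vm-c"]}
    for alarm in deduce_instance_alarms(
            "host-1", "high memory consumption", inventory):
        print(alarm["name"])
```

The same step could be repeated one level down (instance to application) to cover the second deduction mentioned in the message.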