Re: [openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

2014-06-16 Thread Eoghan Glynn

Apologies for the top-posting, but just wanted to call out some
potential confusion that arose on the #os-ceilometer channel earlier
today.

TL;DR: the UI shouldn't assume a 1:1 mapping between alarms and
   resources, since this mapping does not exist in general

Background: See ML post[1]

Discussion: See IRC log [2]
Ctrl+F: Let's see what the UI guys think about it

Cheers,
Eoghan

[1] http://lists.openstack.org/pipermail/openstack-dev/2014-June/037788.html
[2] http://eavesdrop.openstack.org/irclogs/%23openstack-ceilometer/%23openstack-ceilometer.2014-06-16.log


- Original Message -
 Hi all,
 
 Thanks again for the great comments on the initial cut of wireframes. I’ve
 updated them a fair amount based on feedback in this e-mail thread along
 with the feedback written up here:
 https://etherpad.openstack.org/p/alarm-management-page-design-discussion
 
 Here is a link to the new version:
 http://people.redhat.com/~lsurette/OpenStack/Alarm%20Management%20-%202014-06-05.pdf
 
 And a quick explanation of the updates that I made from the last version:
 
 1) Removed severity.
 
 2) Added Status column. I also added details around the fact that users can
 enable/disable alerts.
 
 3) Updated Alarm creation workflow to include choosing the project and user
 (optionally for filtering the resource list), choosing the resource, and
 choosing the amount of time to monitor for alarming.
  -Perhaps we could be even more sophisticated in how we let users filter
  down to find the right resources that they want to monitor for alarms?
 
 4) As for notifying users…I’ve updated the “Alarms” section to be “Alarms
 History”. The point here is to show any Alarms that have occurred to notify
 the user. Other notification ideas could be to allow users to get notified
 of alerts via e-mail (perhaps a user setting?). I’ve added a wireframe for
 this update in User Settings. Then the Alarms Management section would just
 be where the user creates, deletes, enables, and disables alarms. Do you
 still think we don’t need the “alarms” tab? Perhaps this just becomes
 iteration 2 and is left out for now as you mention in your etherpad.
 
 5) Question about combined alarms…currently I’ve designed it so that a user
 could create multiple levels in the “Alarm When…” section. They could
 combine these with AND/ORs. Is this going far enough? Or do we actually need
 to allow users to combine Alarms that might watch different resources?
 
 6) I updated the Actions column to have the “More” drop down which is
 consistent with other tables in Horizon.
 
 7) Added in a section in the “Add Alarm” workflow for “Actions after Alarm”.
 I’m thinking we could have some sort of If State is X, do X type selections,
 but I’m looking to understand more details about how the backend works for
 this feature. Eoghan gave examples of logging and potentially scaling out
 via Heat. Would simple drop downs support these events?
 
 8) I can definitely add in a “scheduling” feature with respect to Alarms. I
 haven’t added it in yet, but I could see this being very useful in future
 revisions of this feature.
 
 9) Another thought is that we could add in some padding for outlier data as
 Eoghan mentioned. Perhaps a setting for “This has happened 3 times over the
 last minute, so now send an alarm.”?
 
 A new round of feedback is of course welcome :)
 
 Best,
 Liz
 
 On Jun 4, 2014, at 1:27 PM, Liz Blanchard lsure...@redhat.com wrote:
 
  Thanks for the excellent feedback on these, guys! I’ll be working on making
  updates over the next week and will send a fresh link out when done.
  Anyone else with feedback, please feel free to fire away.
  
  Best,
  Liz
  On Jun 4, 2014, at 12:33 PM, Eoghan Glynn egl...@redhat.com wrote:
  
  
  Hi Liz,
  
  Two further thoughts occurred to me after hitting send on
  my previous mail.
  
  The first is the concept of alarm dimensioning; see my RDO Ceilometer
  getting started guide[1] for an explanation of that notion.
  
  A key associated concept is the notion of dimensioning which defines the
  set of matching meters that feed into an alarm evaluation. Recall that
  meters are per-resource-instance, so in the simplest case an alarm might
  be defined over a particular meter applied to all resources visible to a
  particular user. More useful, however, would be the option to explicitly
  select which specific resources we're interested in alarming on. On one
  extreme we would have narrowly dimensioned alarms where this selection
  would have only a single target (identified by resource ID). On the other
  extreme, we'd have widely dimensioned alarms where this selection
  identifies many resources over which the statistic is aggregated, for
  example all instances booted from a particular image or all instances
  with matching user metadata (the latter is how Heat identifies
  autoscaling groups).
  
  We'd have to think about how that concept is captured in the
  UX for alarm 

Re: [openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

2014-06-16 Thread Liz Blanchard

On Jun 16, 2014, at 10:56 AM, Eoghan Glynn egl...@redhat.com wrote:

 
 Apologies for the top-posting, but just wanted to call out some
 potential confusion that arose on the #os-ceilometer channel earlier
 today.
 
 TL;DR: the UI shouldn't assume a 1:1 mapping between alarms and
   resources, since this mapping does not exist in general

Thanks for the clarification on this, Eoghan. After reading the IRC chat and
e-mail thread, I now understand that alarms can be created for things like
“Alarm me when a new instance is created” that have nothing to do with
monitoring instances. Am I correct? Are there other cases we should
consider here? I’ve updated the latest version of the wireframes to reflect an
example of an alarm like this (see Alarm 4 in the tables). Also, I got rid of
the required mark on Resource in the Add Alarm modal. I will be sending a link
to these updated wireframes, along with feedback on Christian’s latest
comments, in the next few minutes...

Best,
Liz

 

Re: [openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

2014-06-16 Thread Eoghan Glynn


 On Jun 16, 2014, at 10:56 AM, Eoghan Glynn egl...@redhat.com wrote:
 
  
  Apologies for the top-posting, but just wanted to call out some
  potential confusion that arose on the #os-ceilometer channel earlier
  today.
  
  TL;DR: the UI shouldn't assume a 1:1 mapping between alarms and
resources, since this mapping does not exist in general
 
 Thanks for the clarification on this Eoghan. After reading the IRC chat and
 e-mail thread I’m now understanding that there are alarms that can be
 created for things like “Alarm me when a new instance is created” that have
 nothing to do with monitoring instances. Am I correct?

More something like:

 Alarm me when the average CPU util throughout all instances in an
  autoscaling group suggests that the group is under-scaled

In that case, the alarm may map onto zero resources initially, then
N actually-existing resources at any given point in time (where N
lies between some high and low water marks, but is not constant in
time).

That's an example of a 1:N mapping between alarm and resource names,
but where the set of N resource names is potentially constantly varying
(or apparently static, if the load on the autoscaling group is relatively
constant).
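
To make the varying-N point concrete, here is a minimal sketch of a single
alarm definition evaluated against whatever group members exist in each
cycle. The function name, the resource IDs and the 80% threshold are all
made up for illustration; this is not the ceilometer evaluator, just the
shape of the idea.

```python
from statistics import mean


def evaluate_group_alarm(samples_by_resource, threshold):
    """Return the alarm state for one evaluation cycle.

    samples_by_resource maps resource ID -> latest cpu_util reading (%)
    for whichever group members exist right now (hypothetical structure).
    """
    if not samples_by_resource:
        # The alarm currently maps onto zero resources.
        return "insufficient data"
    average = mean(samples_by_resource.values())
    return "alarm" if average > threshold else "ok"


# The same alarm, three cycles, three different memberships.
cycles = [
    {},                                          # group not yet populated
    {"vm-1": 91.0, "vm-2": 88.0},                # two busy members
    {"vm-1": 60.0, "vm-2": 55.0, "vm-3": 40.0},  # scaled out, load spread
]
states = [evaluate_group_alarm(c, threshold=80.0) for c in cycles]
print(states)
```

A table row in the UI that shows "the resource" for such an alarm has
nothing stable to show, which is the reason for avoiding the 1:1 assumption.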

 Are there other cases we should consider here? 

Another example would be:

 Alarm me when the number of instances owned by a particular tenant
  exceeds some threshold

(... actually, that would require an update to the alarm API to
 accommodate the new selectable cardinality aggregate, but would
 be easy to do) 

Well, I'd recommend removing the concept of alarms and resources being
*directly* tied to each other.

Cheers,
Eoghan


Re: [openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

2014-06-10 Thread Martinez, Christian
Here is my feedback regarding the designs:

Page 2:

* I think the admin would probably want to filter alarms by user,
project, name, meter_name, and current_alarm_state (ok = alarm ready;
insufficient data = alarm not ready; alarm = alarm triggered), but we
don't have all those columns in the table. Maybe it would be better just to
add columns for those fields, or to have other tables or tabs that let the
admin see the alarms based on those parameters.

* I would add a delete alarm button as a table action.

* Nice to have: if we are thinking about combining alarms, maybe have
a combine alarm button as a table action that is enabled when the
admin selects two or more alarms.

o   When the button is clicked, it should show something like the Add Alarm
dialog, allowing the user to create a new combined alarm based on their
previous alarm selection.

Page 3-5:

* Love the workflow!

* A couple of things related to the Alarm When setup:

o   Depending on the resource that is selected (from page 2), you would have a
list of the possible meters to be considered. For example, if your resource is
an instance, you would have the following list of meters: number of instances,
CPU time used, average CPU utilization, memory, etc. This also affects the
threshold unit to be used. In the design, there is a textbox that has a
percentage label (%) right next to it. The thing is that this threshold
could be a percentage (for example, CPU utilization), but it could be a flat
number as well (for example, number of instances in the project).

o   (Related to your point 5) There are two things related to combined alarms
that we need to consider:

-  1) The combination can be between any type of alarm: you could combine
alarms associated with different resources, meters, or even users (a
Ceilometer expert will know). You could even combine combined alarms with
other alarms. The AND and OR operations can be used to combine the alarms,
for instance combining two alarms with an OR operator.

-  2) Adding two rules to match to a single alarm is not supported by
Ceilometer. For that, you use combined alarms :). The idea of adding triggering
rules to the alarm creation dialog is great for me, but I'm not sure if
Ceilometer supports that.
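
On the threshold-unit point above, a rough sketch of how the dialog could
pick the label next to the textbox from the selected meter instead of
hard-coding "%". The meter-to-unit table here is an assumption for
illustration only; the real units should come from whatever the meter
listing reports.

```python
# Hypothetical meter -> unit table; not read from any real deployment.
METER_UNITS = {
    "cpu_util": "%",
    "cpu": "ns",
    "memory": "MB",
    "instance": "instance",
}


def threshold_label(meter_name, threshold):
    """Format a threshold with the unit that belongs to the chosen meter."""
    unit = METER_UNITS.get(meter_name, "")
    if unit == "%":
        return f"{threshold:g}%"
    # Flat numbers get the unit as a suffix, or nothing if it is unknown.
    return f"{threshold:g} {unit}".strip()


print(threshold_label("cpu_util", 80))   # percentage-style threshold
print(threshold_label("instance", 10))   # flat count
```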

Page 6:

* I really liked the way that actions and states can be set, but we
should see how the notifications will be handled. Maybe these actions could be
set by default in our first version, and we could start thinking about
custom actions for alarm states in a later version (same for the e-mail add-on
in the user settings).

Page 7: Viewing Alarm History, a.k.a. the alarms that have occurred.

* Same as page 2: I think the admin would probably want to filter
alarms by user, project, name, meter_name, etc. (for instance, to see which
alarms have been triggered on project X), but we don't have those columns
in the table. Maybe it would be better just to add columns for those fields, or
to have other tables or tabs that let the admin see the alarms based
on those parameters.

* Does the alarm date column refer to the date on which the alarm was
created or the date on which the alarm was triggered?

* Is the alarm name a link or simple text? What would happen
when the admin selects an alarm? Is it going to show the update alarm dialog?
Are there any actions associated with the rows?

* Maybe change the name of the tab to Activated alarms, or something
that clearly reads as: here you can see the alarms that have occurred.

Hope it helps

Cheers,
H


Re: [openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

2014-06-09 Thread Liz Blanchard
Hi all,

Thanks again for the great comments on the initial cut of wireframes. I’ve 
updated them a fair amount based on feedback in this e-mail thread along with 
the feedback written up here:
https://etherpad.openstack.org/p/alarm-management-page-design-discussion

Here is a link to the new version:
http://people.redhat.com/~lsurette/OpenStack/Alarm%20Management%20-%202014-06-05.pdf

And a quick explanation of the updates that I made from the last version:

1) Removed severity.

2) Added Status column. I also added details around the fact that users can 
enable/disable alerts.

3) Updated Alarm creation workflow to include choosing the project and user 
(optionally for filtering the resource list), choosing the resource, and 
choosing the amount of time to monitor for alarming.
 -Perhaps we could be even more sophisticated in how we let users filter 
down to find the right resources that they want to monitor for alarms?

4) As for notifying users…I’ve updated the “Alarms” section to be “Alarms 
History”. The point here is to show any Alarms that have occurred to notify the 
user. Other notification ideas could be to allow users to get notified of 
alerts via e-mail (perhaps a user setting?). I’ve added a wireframe for this 
update in User Settings. Then the Alarms Management section would just be where 
the user creates, deletes, enables, and disables alarms. Do you still think we 
don’t need the “alarms” tab? Perhaps this just becomes iteration 2 and is left 
out for now as you mention in your etherpad.

5) Question about combined alarms…currently I’ve designed it so that a user 
could create multiple levels in the “Alarm When…” section. They could combine 
these with AND/ORs. Is this going far enough? Or do we actually need to allow 
users to combine Alarms that might watch different resources?

6) I updated the Actions column to have the “More” drop down which is 
consistent with other tables in Horizon.

7) Added in a section in the “Add Alarm” workflow for “Actions after Alarm”. 
I’m thinking we could have some sort of If State is X, do X type selections, 
but I’m looking to understand more details about how the backend works for this 
feature. Eoghan gave examples of logging and potentially scaling out via Heat. 
Would simple drop downs support these events?

8) I can definitely add in a “scheduling” feature with respect to Alarms. I 
haven’t added it in yet, but I could see this being very useful in future 
revisions of this feature.

9) Another thought is that we could add in some padding for outlier data as 
Eoghan mentioned. Perhaps a setting for “This has happened 3 times over the 
last minute, so now send an alarm.”?  
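
Point 9 reads a lot like requiring the threshold to be breached for several
consecutive periods before the alarm fires. Here is a small sketch of that
idea, assuming one averaged value per period; the function and parameter
names are illustrative rather than taken from the backend, and a real
evaluator may well treat the mixed case differently.

```python
def evaluate_threshold(period_averages, threshold, evaluation_periods):
    """Fire only if the last `evaluation_periods` values all exceed threshold."""
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    recent = period_averages[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "insufficient data"
    if all(value > threshold for value in recent):
        return "alarm"
    # Simplification: anything short of a full run of breaches counts as ok.
    return "ok"


# One spike does not fire the alarm; three breaches in a row do.
print(evaluate_threshold([40.0, 95.0, 42.0], 80.0, 3))
print(evaluate_threshold([85.0, 91.0, 88.0], 80.0, 3))
```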

A new round of feedback is of course welcome :)

Best,
Liz


Re: [openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

2014-06-04 Thread Eoghan Glynn

Comments inline ...

 Hi Liz,
 
 The designs look really cool and I think that we should consider a couple of
 things (more related to the alarm’s implementation made at Ceilometer):
 
 · There are combined alarms, which are a combination of two or more alarms.
 We need to see how they work and how we can show/modify them (or even if we
 want to show them)

+1

Combined alarms allow a meta-alarm to be layered over several under-pinning
alarms, with their state combined using logical AND or OR.
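
As a rough illustration of that layering, the sketch below combines the
states of underlying alarms with AND or OR, and shows that a combined alarm
can itself feed another combination. How "insufficient data" propagates is
my assumption for the sketch, not a statement of what the evaluator does.

```python
def combine(operator, states):
    """Combine underlying alarm states into one meta-alarm state."""
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    if not states:
        raise ValueError("at least one underlying alarm is required")
    if "insufficient data" in states:
        # Assumption: an unknown input makes the combined state unknown.
        return "insufficient data"
    fired = [state == "alarm" for state in states]
    triggered = all(fired) if operator == "and" else any(fired)
    return "alarm" if triggered else "ok"


cpu_and_memory = combine("and", ["alarm", "ok"])       # both must fire
either_layer = combine("or", [cpu_and_memory, "alarm"])  # nested combination
print(cpu_and_memory, either_layer)
```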
 
 · Currently, the alarms don’t have a severity field. What is the
 intention of having this? Is it to be able to filter by “alarm severity”? Is
 it to have a way to distinguish the “not-so-critical” alarms from the ones
 that are critical?

No such concept currently.

 · The alarms have a “list of actions” to be executed based on their current
 state. I think that the intention of that feature was to create alarms that
 could manage and trigger different actions based on their “alarm state”. For
 instance, if an alarm is created but doesn’t have enough data to be
 evaluated, the state is “insufficient data”, and you can add actions to be
 triggered when this happens, for instance writing a LOG file or calling a
 URL. Maybe we could use this functionality to notify the user whenever
 an alarm is triggered, and we should also consider that when creating or
 updating the alarms.

Alarm actions are currently either:

1. log the event

2. POST out to a webhook with a notification of the state change and related
   data (e.g. the recent datapoints).

In reality, all non-toy alarms would have action of form #2.

This is the form used by Heat for example when autoscaling is driven by
ceilometer alarms.

Re. the authorization of such actions in the alarm notification consumer,
one of two approaches is generally used:

1. pre-sign the webhook URL with the EC2 signer (this depends on the
   physical security of the URL being maintained, i.e. the URL not being
   leaked by ceilometer, or in this case horizon)

2. use the new-fangled keystone trusts

Heat originally used approach #1, but is changing over to approach #2 for Juno.

Actions are then associated with a target state (alarm, ok, insufficient_data)
with most alarm actions in practice being associated with the transition into
the alarm state. Multiple actions can be associated with a single target state.

By default, actions are only executed when the alarm state transition fires.

However, a continuous notification mode can be enabled on the alarm (such
that the actions are repeated on each alarm evaluation cycle as long as the
alarm *remains* in the target state).
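
A small sketch of those dispatch rules, in case it helps with the "Actions
after Alarm" design: actions are keyed by target state, run on a transition
by default, and run on every cycle when the continuous mode is on. The
class, the field names and the example URLs are placeholders for
illustration, not the actual alarm representation.

```python
from dataclasses import dataclass, field


@dataclass
class Alarm:
    name: str
    state: str = "insufficient data"
    repeat_actions: bool = False  # continuous notification mode
    # target state -> list of action URLs (hypothetical layout)
    actions: dict = field(default_factory=dict)


def actions_to_run(alarm, new_state):
    """Record the new state and return the actions due for this cycle."""
    transitioned = new_state != alarm.state
    alarm.state = new_state
    if transitioned or alarm.repeat_actions:
        return list(alarm.actions.get(new_state, []))
    return []


alarm = Alarm(
    name="cpu_high",
    actions={"alarm": ["log://", "http://example.com/scale-up"]},
)
print(actions_to_run(alarm, "alarm"))  # transition: both actions run
print(actions_to_run(alarm, "alarm"))  # still in alarm: nothing by default
```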

 
 
 More related to Alarms in general :
 
 · What are the ideas around the alarm notifications? I saw that your
 intention is to have some sort of “g+ notifications” but what about other
 solutions/options, like email (using Mistral, perhaps?) or logs. What do you
 guys think about that?

Currently only webhook notifications are supported.

But the idea for the last couple of cycles has been to leverage Marconi
SNS-style user-consumable notifications (email, SMS, tweets etc.) if and
when these become available.

 · The alarms could be created by the users as well. I would add that CRUD
 functionality to the alarms tab in the overview section too.

+1

Cheers,
Eoghan

 
 
 
 Hope it helps
 
 Regards,
 H
 
 From: Liz Blanchard [mailto:lsure...@redhat.com]
 Sent: Tuesday, June 3, 2014 3:41 PM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: [openstack-dev] [Horizon] [UX] Design for Alarming and Alarm
 Management
 
 Hi All,
 
 I’ve recently put together a set of wireframes[1] around Alarm Management
 that would support the following blueprint:
 
 https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page
 
 If you have a chance it would be great to hear any feedback that folks have
 on this direction moving forward with Alarms.
 
 Best,
 Liz
 
 [1]
 http://people.redhat.com/~lsurette/OpenStack/Alarm%20Management%20-%202014-05-30.pdf
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 



Re: [openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

2014-06-04 Thread Eoghan Glynn

Hi Liz,

Two further thoughts occurred to me after hitting send on
my previous mail.

The first is the concept of alarm dimensioning; see my RDO Ceilometer
getting started guide[1] for an explanation of that notion.

A key associated concept is the notion of dimensioning which defines the set 
of matching meters that feed into an alarm evaluation. Recall that meters are 
per-resource-instance, so in the simplest case an alarm might be defined over a 
particular meter applied to all resources visible to a particular user. More 
useful, however, would be the option to explicitly select which specific resources 
we're interested in alarming on. On one extreme we would have narrowly 
dimensioned alarms where this selection would have only a single target 
(identified by resource ID). On the other extreme, we'd have widely dimensioned 
alarms where this selection identifies many resources over which the statistic 
is aggregated, for example all instances booted from a particular image or all 
instances with matching user metadata (the latter is how Heat identifies 
autoscaling groups).
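
To illustrate the narrow and wide extremes, here is a sketch that treats a
dimension as a set of field/value constraints matched against resources.
The resource records and the "group" key are invented for the example; real
metadata field names will differ.

```python
def select_resources(resources, query):
    """Return IDs of resources whose fields equal every key/value in query."""
    return [
        resource["resource_id"]
        for resource in resources
        if all(resource.get(key) == value for key, value in query.items())
    ]


# Hypothetical inventory, for illustration only.
resources = [
    {"resource_id": "vm-1", "image_id": "img-a", "group": "web"},
    {"resource_id": "vm-2", "image_id": "img-a", "group": "web"},
    {"resource_id": "vm-3", "image_id": "img-b", "group": "db"},
]

narrow = select_resources(resources, {"resource_id": "vm-1"})  # single target
by_image = select_resources(resources, {"image_id": "img-a"})  # same image
by_metadata = select_resources(resources, {"group": "web"})    # user metadata
print(narrow, by_image, by_metadata)
```

An empty query matches every resource, which corresponds to the simplest
case of a meter applied to everything visible to the user.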

We'd have to think about how that concept is captured in the
UX for alarm creation/update.

Second, there are a couple of more advanced alarming features 
that were added in Icehouse:

1. The ability to constrain alarms on time ranges, such that they
   would only fire say during 9-to-5 on a weekday. This would
   allow for example different autoscaling policies to be applied
   out-of-hours, when resource usage is likely to be cheaper and
   manual remediation less straight-forward.

2. The ability to exclude low-quality datapoints with anomalously
   low sample counts. This allows the leading edge of the trend of
   widely dimensioned alarms not to be skewed by eagerly-reporting
   outliers.

Perhaps not in a first iteration, but at some point it may make sense
to expose these more advanced features in the UI.
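
For a feel of what those two features amount to, here is a sketch under
stated assumptions: a fixed 9-to-5 weekday window, and an outlier rule that
drops datapoints whose sample count falls more than two standard deviations
below the mean count. Both rules and all names are illustrative choices for
this example, not a description of the Icehouse implementation.

```python
from datetime import datetime
from statistics import mean, pstdev


def within_business_hours(moment):
    """True on Monday to Friday between 09:00 and 17:00 (assumed window)."""
    return moment.weekday() < 5 and 9 <= moment.hour < 17


def exclude_low_count_datapoints(datapoints):
    """Drop datapoints whose sample count is anomalously low.

    Each datapoint is a dict with a "count" key (hypothetical structure).
    The two-standard-deviation cutoff is an assumption for this sketch.
    """
    counts = [point["count"] for point in datapoints]
    if len(counts) < 2:
        return list(datapoints)
    cutoff = mean(counts) - 2 * pstdev(counts)
    return [point for point in datapoints if point["count"] >= cutoff]


# Nine well-populated periods plus one eagerly reported leading edge.
points = [{"avg": 50.0, "count": 10} for _ in range(9)]
points.append({"avg": 99.0, "count": 1})
kept = exclude_low_count_datapoints(points)
print(len(kept), within_business_hours(datetime(2014, 6, 16, 10, 0)))
```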

Cheers,
Eoghan

[1] http://openstack.redhat.com/CeilometerQuickStart



- Original Message -
 
 Hi Liz,
 
 Looks great!
 
 Some thoughts on the wireframe doc:
 
 * The description of form:
 
     If CPU Utilization exceeds 80%, send alarm.
 
   misses the time-window aspect of the alarm definition.
 
   Whereas the boilerplate default descriptions generated by
   ceilometer itself:
 
     cpu_util > 70.0 during 3 x 600s
 
   capture this important info.
 
 * The metric names, e.g. CPU Utilization, are not an exact
   match for the meter names used by ceilometer, e.g. cpu_util.
 
 * Non-admin users can create alarms in ceilometer:
 
     This is where admins can come in and
     define and edit any alarms they want
     the environment to use.
 
   (though these alarms will only have visibility onto the stats
   that would be accessible to the user on behalf of whom the
   alarm is being evaluated)
 
 * There's no concept currently of alarm severity.
 
 * Should users be able to enable/disable alarms?
 
   Yes, the API allows for disabled (i.e. non-evaluated) alarms.
 
 * Should users be able to own/assign alarms?
 
   Only admin users can create an alarm on behalf of another
   user/tenant.
 
 * Should users be able to acknowledge, close alarms?
 
   No, we have no concept of ACKing an alarm.
 
 * Admins can also see a full list of all Alarms that have
taken place in the past.
 
   In ceilometer terminology, we refer to this as alarm history
   or alarm change events.
 
 * CPU Utilization exceeded 80%.
 
   Again good to capture the duration in that description of the
   event.
 
 * Within the Overview section, there should be a new tab that allows the
user to click and view all Alarms that have occurred in their
environment.
 
   Not sure really what environment means here. Non-admin tenants only
   have visibility to their own alarms, whereas admins have visibility to
   all alarms.
 
 * This list would keep the latest  alarms.
 
   Presumably this would be based on querying the alarm-history API,
   as opposed to an assumption that Horizon is consuming the actual
   alarm notifications?
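 
 A description that keeps the time window could be produced along these
 lines. This is only a sketch: the function name and the operator table
 are hypothetical, and the layout simply mirrors the condition format
 quoted earlier in this mail, with the comparison operator written out.

```python
# Hypothetical mapping from comparison names to display symbols.
OPERATORS = {"gt": ">", "ge": ">=", "lt": "<", "le": "<=", "eq": "==", "ne": "!="}


def condition_summary(meter, comparison, threshold, periods, period_seconds):
    """Render a one-line alarm condition that includes the time window."""
    return "%s %s %s during %d x %ds" % (
        meter, OPERATORS[comparison], threshold, periods, period_seconds)


summary = condition_summary("cpu_util", "gt", 70.0, 3, 600)
```

 Using the meter name rather than a display label also sidesteps the
 naming mismatch mentioned above.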
 
 Cheers,
 Eoghan
 
 - Original Message -
  Hi All,
  
  I’ve recently put together a set of wireframes[1] around Alarm Management
  that would support the following blueprint:
  https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page
  
  If you have a chance it would be great to hear any feedback that folks have
  on this direction moving forward with Alarms.
  
  Best,
  Liz
  
  [1]
  http://people.redhat.com/~lsurette/OpenStack/Alarm%20Management%20-%202014-05-30.pdf
  
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 

Re: [openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

2014-06-04 Thread Liz Blanchard
Thanks for the excellent feedback on these, guys! I’ll be working on making 
updates over the next week and will send a fresh link out when done. Anyone 
else with feedback, please feel free to fire away.

Best,
Liz
On Jun 4, 2014, at 12:33 PM, Eoghan Glynn egl...@redhat.com wrote:

 
 Hi Liz,
 
 Two further thoughts occurred to me after hitting send on
 my previous mail.
 
 First, there is the concept of alarm dimensioning; see my RDO Ceilometer
 getting started guide[1] for an explanation of that notion.
 
 A key associated concept is the notion of dimensioning which defines the set 
 of matching meters that feed into an alarm evaluation. Recall that meters are 
 per-resource-instance, so in the simplest case an alarm might be defined over 
 a particular meter applied to all resources visible to a particular user. 
 More useful, however, would be the option to explicitly select which specific 
 resources we're interested in alarming on. On one extreme we would have 
 narrowly dimensioned alarms where this selection would have only a single 
 target (identified by resource ID). On the other extreme, we'd have widely 
 dimensioned alarms where this selection identifies many resources over which 
 the statistic is aggregated, for example all instances booted from a 
 particular image or all instances with matching user metadata (the latter is 
 how Heat identifies autoscaling groups).
 
 We'd have to think about how that concept is captured in the
 UX for alarm creation/update.
 
 Second, there are a couple of more advanced alarming features 
 that were added in Icehouse:
 
 1. The ability to constrain alarms on time ranges, such that they
    would only fire, say, during 9-to-5 on a weekday. This would
    allow, for example, different autoscaling policies to be applied
    out-of-hours, when resource usage is likely to be cheaper and
    manual remediation less straightforward.
 
 2. The ability to exclude low-quality datapoints with anomalously
    low sample counts. This allows the leading edge of the trend of
    widely dimensioned alarms not to be skewed by eagerly-reporting
    outliers.
 
 Perhaps not in a first iteration, but at some point it may make sense
 to expose these more advanced features in the UI.
 
 Cheers,
 Eoghan
 
 [1] http://openstack.redhat.com/CeilometerQuickStart
 
 
 
 - Original Message -
 
 Hi Liz,
 
 Looks great!
 
 Some thoughts on the wireframe doc:
 
 * The description of form:
 
     If CPU Utilization exceeds 80%, send alarm.
 
   misses the time-window aspect of the alarm definition.
 
   Whereas the boilerplate default descriptions generated by
   ceilometer itself:
 
     cpu_util > 70.0 during 3 x 600s
 
   capture this important info.
 
 * The metric names, e.g. CPU Utilization, are not an exact
  match for the meter names used by ceilometer, e.g. cpu_util.
 
 * Non-admin users can create alarms in ceilometer:
 
     This is where admins can come in and
     define and edit any alarms they want
     the environment to use.
 
   (though these alarms will only have visibility onto the stats
   that would be accessible to the user on behalf of whom the
   alarm is being evaluated)
 
 * There's no concept currently of alarm severity.
 
 * Should users be able to enable/disable alarms?
 
  Yes, the API allows for disabled (i.e. non-evaluated) alarms.
 
 * Should users be able to own/assign alarms?
 
  Only admin users can create an alarm on behalf of another
  user/tenant.
 
 * Should users be able to acknowledge, close alarms?
 
  No, we have no concept of ACKing an alarm.
 
 * Admins can also see a full list of all Alarms that have
   taken place in the past.
 
  In ceilometer terminology, we refer to this as alarm history
  or alarm change events.
 
 * CPU Utilization exceeded 80%.
 
  Again good to capture the duration in that description of the
  event.
 
 * Within the Overview section, there should be a new tab that allows the
   user to click and view all Alarms that have occurred in their
   environment.
 
  Not sure really what environment means here. Non-admin tenants only
   have visibility to their own alarms, whereas admins have visibility to
  all alarms.
 
 * This list would keep the latest  alarms.
 
  Presumably this would be based on querying the alarm-history API,
  as opposed to an assumption that Horizon is consuming the actual
  alarm notifications?
 
 Cheers,
 Eoghan
 
 - Original Message -
 Hi All,
 
 I’ve recently put together a set of wireframes[1] around Alarm Management
 that would support the following blueprint:
 https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page
 
 If you have a chance it would be great to hear any feedback that folks have
 on this direction moving forward with Alarms.
 
 Best,
 Liz
 
 [1]
 http://people.redhat.com/~lsurette/OpenStack/Alarm%20Management%20-%202014-05-30.pdf
 
 

[openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

2014-06-03 Thread Liz Blanchard
Hi All,

I’ve recently put together a set of wireframes[1] around Alarm Management that 
would support the following blueprint:
https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page

If you have a chance it would be great to hear any feedback that folks have on 
this direction moving forward with Alarms.

Best,
Liz

[1] 
http://people.redhat.com/~lsurette/OpenStack/Alarm%20Management%20-%202014-05-30.pdf


Re: [openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

2014-06-03 Thread Martinez, Christian
Hi Liz,
The designs look really cool and I think that we should consider a couple of 
things (more related to the alarm's implementation made at Ceilometer):

* There are combined alarms, which are a combination of two or more 
alarms. We need to see how they work and how we can show/modify them (or even 
if we want to show them)

* Currently, the alarms don't have a severity field. What would be 
the intention behind having one? Is it to be able to filter by alarm severity? 
Is it to have a way to distinguish the not-so-critical alarms from the ones 
that are critical?

* The alarms have a list of actions to be executed based on their 
current state. I think that the intention of that feature was to create alarms 
that could manage and trigger different actions based on their alarm state. 
For instance, if an alarm is created but doesn't have enough data to be 
evaluated, the state is insufficient data, and you can add actions to be 
triggered when this happens, for instance writing to a log file or calling a 
URL. Maybe we could use this functionality to notify the user whenever an 
alarm is triggered, and we should also consider that when creating or updating 
the alarms.
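
A small sketch of that per-state idea in plain Python follows. The three
action-list field names are the ones I understand the ceilometer alarm
API to use; the alarm itself, the URLs and the helper function are
hypothetical and only for illustration.

```python
# Hypothetical alarm with one action list per state.
ALARM = {
    "name": "cpu_high",
    "alarm_actions": ["http://example.com/notify"],  # placeholder URL
    "ok_actions": [],
    "insufficient_data_actions": ["log://"],
}

# Which action list applies in which alarm state.
STATE_TO_FIELD = {
    "ok": "ok_actions",
    "alarm": "alarm_actions",
    "insufficient data": "insufficient_data_actions",
}


def actions_for(alarm, state):
    """Return the actions to run when the alarm enters the given state."""
    return list(alarm.get(STATE_TO_FIELD[state], []))


to_run = actions_for(ALARM, "insufficient data")
```

So the create/update form would need three action lists, one per state,
rather than a single "send alarm" setting.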

More related to Alarms in general :

* What are the ideas around the alarm notifications? I saw that your 
intention is to have some sort of g+ notifications, but what about other 
solutions/options, like email (using Mistral, perhaps?) or logs? What do you 
guys think about that?

* The alarms could be created by the users as well. I would add that 
CRUD functionality on the alarms tab on the overview section as well.

Hope it helps

Regards,
H
From: Liz Blanchard [mailto:lsure...@redhat.com]
Sent: Tuesday, June 3, 2014 3:41 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [Horizon] [UX] Design for Alarming and Alarm Management

Hi All,

I've recently put together a set of wireframes[1] around Alarm Management that 
would support the following blueprint:
https://blueprints.launchpad.net/horizon/+spec/ceilometer-alarm-management-page

If you have a chance it would be great to hear any feedback that folks have on 
this direction moving forward with Alarms.

Best,
Liz

[1] 
http://people.redhat.com/~lsurette/OpenStack/Alarm%20Management%20-%202014-05-30.pdf