Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer
On 10/20/2014 6:53 AM, Chris Dent wrote:
On Fri, 17 Oct 2014, Jim Mankovich wrote:

See answers inline. I don't have any concrete answers as to how to deal with some of the questions you brought up, but I do have some more detail that may be useful to further the discussion.

That seems like progress to me.

Personally, I would like to see the _(0x##) removed from the Sensor ID string (by the ipmitool driver) before it returns sensors to the Ironic conductor. I just don't see any value in this extra info. This 0x## addition only helps if a vendor used the exact same Sensor ID string for multiple sensors of the same sensor type, i.e. multiple sensors of type Temperature, each with the exact same Sensor ID string of CPU instead of giving each Sensor ID string a unique name like CPU 1, CPU 2, ...

Is it worthwhile metadata to save, even if it isn't in the meter name?

Removing the _(0x##) from the sensor name and keeping the _(0x##) in the metadata Sensor ID string works for me.

In a heterogeneous platform environment, the Sensor ID string is likely going to be different per vendor, so your question "If temperature...on any system board...on any hardware, notify the authorities" is going to be tough because each vendor may name their system board differently. But I bet that vendors use similar strings, so worst case, your alarm creation could require 1 alarm definition per vendor.

The alarm definition I want to make is (as an operator not as a dev): My puter's too hot, hlp! Making that easy is the proper (to me) endpoint of a conversation about how to name meters.

I understand your operator example, but it could be the case that every vendor's puter has a different definition of its "too hot" temperature. If you are going to act on puters that are too hot, you might believe there is a heat problem with a puter if you lump everything together, but I guess that's an operator's choice.
It's not really clear to me that this query makes practical sense even though it seems like a logical query to want to make.

Note: I'm trying to provide puter health information so an operator can easily query platform.health.overall to determine whether or not a puter is healthy, and if you really care why, you can dig deeper into individual standard puter components like platform.health.fan, platform.health.temperature, ... I think this would enable the kind of generic query across platforms that you are thinking about. Health is generated in a vendor and platform specific way by interpretation of all the different sensors. Vendors other than HP could provide these meters, and then the query you are proposing would make both logical and practical sense.

I see generic naming as somewhat problematic. If you lump all the temperature sensors for a platform under hardware.temperature, the consumer will always need to query for a specific temperature sensor that it is interested in, like system board. The notion of having different samples from multiple sensors under a single generic name seems harder to deal with to me. If you have multiple temperature samples under the same generic meter name, how do you figure out which temperature samples actually exist?

I'm not suggesting all temperature sensors under one name (hardware.temperature), but all sensors which identify as the same thing (e.g. hardware.temperature.system_board) under the same name.

Good.

I'm not very informed about IPMI or hardware sensors, but I do have some experience in using names and identifiers (don't we all!) and I find that far too often we name things based on where they come from rather than how we wish to address them after genesis.

I understand wanting to name sensors based on how you want to address them, but interpretation of them once you've addressed them is going to be vendor dependent.
Throughout ceilometer I think there are tons of opportunities to improve the naming of meters and as a result improve the UI for people who want to do things with the data. So from my perspective, with regard to naming IPMI (and other hardware sensor) related samples, I think we need to make a better list of the use cases which the samples need to satisfy and use that to drive a naming scheme.

We have 2 use cases:

* Get all the sensors within a given platform (based on ironic node id)
* Get all the sensors of a given type/name, independent of platform

Others?

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
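For illustration, the two tasks listed above can be sketched as plain filters over a list of samples. This is only a sketch under assumptions: the dictionary keys, the function names, and the sample values are invented placeholders, the meter names follow the generic form discussed in the thread, and nothing here calls the real Ceilometer API.

```python
# Placeholder samples; node IDs and volumes are made up for illustration.
SAMPLES = [
    {"name": "hardware.current.power_meter", "resource_id": "node-a", "volume": 120.0},
    {"name": "hardware.temperature.system_board", "resource_id": "node-a", "volume": 41.0},
    {"name": "hardware.temperature.system_board", "resource_id": "node-b", "volume": 78.0},
]


def sensors_for_node(samples, node_id):
    """Task 1: every sensor sample reported for one Ironic node."""
    return [s for s in samples if s["resource_id"] == node_id]


def sensors_by_name(samples, name):
    """Task 2: every sample of one meter name, regardless of node."""
    return [s for s in samples if s["name"] == name]


if __name__ == "__main__":
    print(sensors_for_node(SAMPLES, "node-a"))
    print(sensors_by_name(SAMPLES, "hardware.temperature.system_board"))
```

Both filters are cheap only if the Resource ID is the bare node ID and the meter name identifies the sensor, which is the shape the naming proposal in this thread aims for.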
Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer
Chris,

Use case point well taken :-) I'll propose something via a spec to ceilometer for sensor naming which will include the ability to support the new health sensor information.

From a use case perspective, I want to provide the health of every platform so an administrator can be notified when a platform's health drops below 100%. I also want to give an administrator the ability to investigate exactly which components in the platform are not working correctly if health is reported at less than 100%.

With the current sensor information, the use case I was interested in was the graphical display of individual platform sensor information. Do you happen to know what some of the use cases are for the current reporting of sensor information?

Thanks,
Jim

On 10/20/2014 11:14 AM, Chris Dent wrote:
On Mon, 20 Oct 2014, Jim Mankovich wrote:
On 10/20/2014 6:53 AM, Chris Dent wrote:
On Fri, 17 Oct 2014, Jim Mankovich wrote:

See answers inline. I don't have any concrete answers as to how to deal with some of the questions you brought up, but I do have some more detail that may be useful to further the discussion.

That seems like progress to me.

And thanks for keeping it going some more. I'm going to skip your other (very useful) comments and go (almost) straight (below) to one thing which goes to the root of the queries I've been making. Most of the rest of what you said makes sense and we seem to be mostly in agreement. I suppose the next step would be to propose a spec? https://github.com/openstack/ceilometer-specs

We have 2 use cases:

* Get all the sensors within a given platform (based on ironic node id)
* Get all the sensors of a given type/name, independent of platform

Others?

These are not use cases, these are tasks. That's because these say nothing about the thing you are actually trying to achieve. "Get all the sensors with a given platform" is a task without a purpose. You're not just going to stop there, are you? If so, why did you get the information in the first place?
A use case could be:

* I want to get all the sensors of a given platform so I can do something.

Or even better something like:

* I want to do something.

And the way to do that would just so happen to be getting all the sensors.

I realize this is perhaps pedantic hair-splitting, but I think it can be useful at least some of the time. I know that from my own experience I am very rarely able to get the Ceilometer API to give me the information that I actually want (e.g. "How many vcpus are currently in action"). This feels like the result of data availability driving the query engine rather than vice versa.
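The health use case Jim describes at the top of this message (be notified when overall health drops below 100%, then find out which components are responsible) could look roughly like the sketch below. The meter names platform.health.overall, platform.health.fan and platform.health.temperature are the ones mentioned earlier in the thread; treating each value as a percentage, and the function name itself, are assumptions made for illustration.

```python
OVERALL = "platform.health.overall"


def unhealthy_components(meters):
    """Return component health meters below 100 when overall health is degraded.

    `meters` is a hypothetical mapping of meter name to a health percentage.
    """
    overall = meters.get(OVERALL)
    if overall is None or overall >= 100:
        # Nothing reported, or the platform is fully healthy: nothing to dig into.
        return []
    return sorted(
        name
        for name, value in meters.items()
        if name != OVERALL and name.startswith("platform.health.") and value < 100
    )


if __name__ == "__main__":
    readings = {
        "platform.health.overall": 80,
        "platform.health.fan": 50,
        "platform.health.temperature": 100,
    }
    print(unhealthy_components(readings))
```

Because the operator starts from a single overall meter, the query does not depend on how each vendor names its individual sensors.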
Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer
Chris,

See answers inline. I don't have any concrete answers as to how to deal with some of the questions you brought up, but I do have some more detail that may be useful to further the discussion.

On 10/17/2014 11:03 AM, Chris Dent wrote:
On Thu, 16 Oct 2014, Jim Mankovich wrote:

What I would like to propose is dropping the ipmi string from the name altogether and appending the Sensor ID to the name instead of to the Resource ID. So, transforming the above to the new naming would result in the following:

| Name                                     | Type  | Unit | Resource ID                          |
| hardware.current.power_meter_(0x16)      | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0 |
| hardware.temperature.system_board_(0x15) | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0 |

[plus sensor_provider in resource_metadata]

If this makes sense for the kinds of queries that need to happen then we may as well do it, but I'm not sure it is. When I was writing the consumer code for the notifications, the names of the meters were a big open question that was hard to resolve because of insufficient data and input on what people really need to do with the samples.

The scenario you've listed is getting all sensors on a given single platform. What about the scenario where you want to create an alarm that says "If temperature gets over X on any system board on any of my hardware, notify the authorities"? Will having the _(0x##) qualifier allow that to work? I don't actually know: are those qualifiers standard in some way or are they specific to different equipment? If they are different, having them in the meter name makes creating a useful alarm in a heterogeneous environment a bit more of a struggle, doesn't it?

The _(0x##) is an ipmitool display artifact that is tacked onto the end of the Sensor ID in order to provide more information beyond what Sensor ID has in it. The ## is the sensor record ID, which is specific to IPMI.
Whether or not a Sensor ID (sans _(0x##)) is unique is up to the vendor, but in general I believe all vendors will likely name their sensors uniquely; otherwise, how can a person differentiate textually what component in a platform the sensor represents?

Personally, I would like to see the _(0x##) removed from the Sensor ID string (by the ipmitool driver) before it returns sensors to the Ironic conductor. I just don't see any value in this extra info. This 0x## addition only helps if a vendor used the exact same Sensor ID string for multiple sensors of the same sensor type, i.e. multiple sensors of type Temperature, each with the exact same Sensor ID string of CPU instead of giving each Sensor ID string a unique name like CPU 1, CPU 2, ...

Now if you want to get deeper into the IPMI realm (which I don't really want to advocate), the Entity ID Code actually tells you the component. From the IPMI spec, section 43.14, Entity IDs:

The Entity ID field is used for identifying the physical entity that a sensor or device is associated with. If multiple sensors refer to the same entity, they will have the same Entity ID field value. For example, if a voltage sensor and a temperature sensor are both for a 'Power Supply 1' entity the Entity ID in their sensor data records would both be 10 (0Ah), per the Entity ID table.

FYI: Entity 10 (0Ah) means power supply.

In a heterogeneous platform environment, the Sensor ID string is likely going to be different per vendor, so your question "If temperature...on any system board...on any hardware, notify the authorities" is going to be tough because each vendor may name their system board differently. But I bet that vendors use similar strings, so worst case, your alarm creation could require 1 alarm definition per vendor.

Perhaps (if they are not standard) this would work:

| hardware.current.power_meter | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0 |

with both sensor_provider and whatever that qualifier is called in the metadata?
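As a rough illustration of the idea above, keeping the generic name for the meter while preserving the _(0x##) qualifier as metadata, the suffix can be split off with a regular expression. This is a sketch, not the ipmitool driver's code: the function name is hypothetical, and the suffix format is assumed to be exactly the `_(0x##)` form shown in the examples in this thread.

```python
import re

# Matches a trailing "_(0x16)"-style suffix; the hex digits are the sensor record ID.
RECORD_ID_SUFFIX = re.compile(r"_\(0x([0-9A-Fa-f]+)\)$")


def split_sensor_id(sensor_id):
    """Return (sensor_id_without_suffix, record_id or None)."""
    match = RECORD_ID_SUFFIX.search(sensor_id)
    if match is None:
        # No suffix present: leave the Sensor ID untouched.
        return sensor_id, None
    return sensor_id[:match.start()], int(match.group(1), 16)


if __name__ == "__main__":
    print(split_sensor_id("power_meter_(0x16)"))
    print(split_sensor_id("CPU 1"))
```

The stripped string would go into the meter name and the record ID into resource_metadata, so nothing is lost for vendors that reuse the same Sensor ID string.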
I see generic naming as somewhat problematic. If you lump all the temperature sensors for a platform under hardware.temperature, the consumer will always need to query for a specific temperature sensor that it is interested in, like system board. The notion of having different samples from multiple sensors under a single generic name seems harder to deal with to me. If you have multiple temperature samples under the same generic meter name, how do you figure out which temperature samples actually exist?

Then the name remains sufficiently generic to allow aggregates across multiple systems, while still having the necessary info to narrow to different sensors of the same type.

I understand that this proposed change is not backward compatible with the existing naming, but I don't really see a good solution that would retain backward compatibility.

I think we should strive to worry less about such things, especially when it's just names in data fields. Not always possible, or even a good idea, but sometimes it's a win.

I'm always good
[openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer
All,

I would like to get some feedback on a proposal to change the current sensor naming implemented in ironic and ceilometer.

I would like to provide vendor specific sensors within the current structure for IPMI sensors in ironic and ceilometer, but I have found that the current implementation of sensor meters in ironic and ceilometer is IPMI specific (from a meter naming perspective). As it currently stands, this is not suitable for supporting sensor information from a provider other than IPMI. Also, the current Resource ID naming makes it difficult for a consumer of sensors to quickly find all the sensors for a given Ironic Node ID, so I would like to propose changing the Resource ID naming as well.

Currently, sensors sent by ironic to ceilometer get named by ceilometer as hardware.ipmi.SensorType, and the Resource ID is the Ironic Node ID with a post-fix containing the Sensor ID. For details pertaining to the issue with the Resource ID naming, see https://bugs.launchpad.net/ironic/+bug/1377157, "ipmi sensor naming in ceilometer is not consumer friendly".

Here is an example of what meters look like for sensors in ceilometer with the current implementation:

| Name                      | Type  | Unit | Resource ID                                                  |
| hardware.ipmi.current     | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0-power_meter_(0x16)      |
| hardware.ipmi.temperature | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0-16-system_board_(0x15)  |

What I would like to propose is dropping the ipmi string from the name altogether and appending the Sensor ID to the name instead of to the Resource ID.
So, transforming the above to the new naming would result in the following:

| Name                                     | Type  | Unit | Resource ID                          |
| hardware.current.power_meter_(0x16)      | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0 |
| hardware.temperature.system_board_(0x15) | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0 |

This structure would let a consumer do a ceilometer resource list using the Ironic Node ID as the Resource ID to get all the sensors in a given platform. The consumer would then iterate over each of the sensors to get the samples it wanted.

In order to retain the information as to who provided the sensors, I would like to propose that a standard sensor_provider field be added to the resource_metadata for every sensor, where the sensor_provider field would have a string value indicating the driver that provided the sensor information. This is where the string ipmi, or a vendor specific string, would be specified.

I understand that this proposed change is not backward compatible with the existing naming, but I don't really see a good solution that would retain backward compatibility.

Any/All Feedback will be appreciated,
Jim

--
Jim Mankovich | jm...@hp.com (US Mountain Time)
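To make the proposed mapping concrete, here is a minimal sketch that converts one current-style meter (name plus Resource ID) into the proposed form. The function name and the returned dictionary layout are hypothetical; only the name pattern, the Resource ID pattern, and the sensor_provider field come from the proposal, and the sketch assumes the Resource ID is a node UUID followed by "-" and the Sensor ID.

```python
import re

# An Ironic node ID is assumed to be a lowercase UUID at the start of the Resource ID.
NODE_UUID = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def convert_meter(name, resource_id, provider="ipmi"):
    """Map hardware.<provider>.<type> + <node>-<sensor> to the proposed naming."""
    prefix = "hardware.%s." % provider
    if not name.startswith(prefix):
        raise ValueError("unexpected meter name: %r" % name)
    match = NODE_UUID.match(resource_id)
    if match is None:
        raise ValueError("resource ID does not start with a node UUID")
    sensor_type = name[len(prefix):]
    sensor_id = resource_id[match.end():].lstrip("-")
    return {
        "name": "hardware.%s.%s" % (sensor_type, sensor_id),
        "resource_id": match.group(0),
        "resource_metadata": {"sensor_provider": provider},
    }


if __name__ == "__main__":
    print(convert_meter(
        "hardware.ipmi.current",
        "edafe6f4-5996-4df8-bc84-7d92439e15c0-power_meter_(0x16)",
    ))
```

With the Resource ID reduced to the node ID, listing all sensors for a platform becomes a single lookup by Resource ID, and the provider string survives in the metadata.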
[openstack-dev] [Ironic][Ceilometer] IPMI Sensor naming in Ceilometer
Hello All,

In a nutshell, I've found the current IPMI Ceilometer sensor naming structure difficult to deal with from a programmatic perspective if a consumer wants to simply read all the sensors for a given Ironic Node. The current naming scheme just doesn't seem to provide a simple, fast way to get all the sensors for a given Ironic Node from Ceilometer.

I detailed what I learned and proposed a potential alternative naming structure in the Ironic/Ceilometer bug at https://bugs.launchpad.net/ironic/+bug/1377157

I am curious as to what the intended use model was for the current naming scheme?

Regards,
Jim

--
Jim Mankovich | jm...@hp.com (US Mountain Time)