Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
Doesn't InfluxDB do the same?

InfluxDB stores timeseries data primarily. Gnocchi is intended to store strongly-typed OpenStack resource representations (instances, images, etc.) in addition to providing a means to access timeseries data associated with those resources. So to answer your question: no, IIUC, it doesn't do the same thing.

Ok, I think I'm getting closer on this.

Great!

Thanks for the clarification. Sadly, I have more questions :)

Any time, Sandy :)

Is this closer? a metadata repo for resources (instances, images, etc) + an abstraction to some TSDB(s)?

Somewhat closer (more clarification below on the metadata repository aspect, and the completeness/authority of same).

Hmm, thinking out loud ... if it's a metadata repo for resources, who is the authoritative source for what the resource is? Ceilometer/Gnocchi or the source service?

The source service is authoritative.

For example, if I want to query instance power state do I ask ceilometer or Nova?

In that scenario, you'd ask nova. If, on the other hand, you wanted to average out the CPU utilization over all instances with a certain metadata attribute set (e.g. some user metadata set by Heat that indicated membership of an autoscaling group), then you'd ask ceilometer.

Or is it metadata about the time-series data collected for that resource?

Both. But the focus of my preceding remarks was on the latter.

In which case, I think most tsdb's have some sort of series description facilities.

Sure, and those should be used for metadata related directly to the timeseries (granularity, retention etc.)

I guess my question is, what makes this metadata unique and how would it differ from the metadata ceilometer already collects?

The primary difference from the way ceilometer currently stores metadata is the avoidance of per-sample snapshots of resource metadata (as stated in the initial mail on this thread).

Will it be using Glance, now that Glance is becoming a pure metadata repo?
No, we have no plans to use glance for this. By becoming a pure metadata repo, presumably you mean this spec:

https://github.com/openstack/glance-specs/blob/master/specs/juno/metadata-schema-catalog.rst

I don't see this on the glance roadmap for Juno:

https://blueprints.launchpad.net/glance/juno

so presumably the integration of graffiti and glance is still more of a longer term intent, than a present-tense becoming. I'm totally open to correction on this by markwash and others, but my reading of the debate around the recent change in glance's mission statement was that the primary focus in the immediate term was to expand into providing an artifact repository (for artifacts such as Heat templates), while not *precluding* any future expansion into also providing a metadata repository. The fossil-record of that discussion is here:

https://review.openstack.org/98002

Though of course these things are not a million miles from each other, one is just a step up in the abstraction stack, having a wider and more OpenStack-specific scope.

Could it be a generic timeseries service? Is it openstack specific because it uses stackforge/python/oslo?

No, I meant OpenStack-specific in terms of it understanding something of the nature of OpenStack resources and their ownership (e.g. instances, with some metadata, each being associated with a user, tenant, etc.), not OpenStack-specific in the sense that it takes dependencies from oslo or stackforge. As for using python: yes, gnocchi is implemented in python, like much of the rest of OpenStack. However, no, I don't think that choice of implementation language makes it OpenStack-specific.

I assume the rules and schemas will be data-driven (vs. hard-coded)?

Well, one of the ideas was to move away from loosely typed representations of resources in ceilometer, in the form of a dict of metadata containing whatever it contains, and instead decide upfront what specific minimal information we need to store per resource type.

...
and since the ceilometer collectors already do the bridge work, is it a pre-packaging of definitions that target openstack specifically?

I'm not entirely sure of what you mean by the bridge work in this context. The ceilometer collector effectively acts as a concentrator, by persisting the metering messages emitted by the other ceilometer agents (i.e. the compute, central, notification agents) to the metering store. These samples are stored by the collector pretty much as-is, so there's no real bridging going on currently in the collector (in the sense of mapping or transforming). However, the collector is indeed the obvious hook point for ceilometer to emit data to gnocchi.

(not sure about wider and more specific)

I presume you're thinking oxymoron with wider and more specific? I meant:

* wider in the sense that it covers more ground than generic timeseries data storage
* more specific in the sense that some of
Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
Hi Eoghan,

Thanks for the note below. However, one thing the overview below does not cover is why InfluxDB (http://influxdb.com/) is not being leveraged. Many folks feel that this technology is a viable solution for the problem space discussed below.

Thanks,
Brad

Brad Topol, Ph.D.
IBM Distinguished Engineer
OpenStack
(919) 543-0646
Internet: bto...@us.ibm.com
Assistant: Kendra Witherspoon (919) 254-0680

From: Eoghan Glynn egl...@redhat.com
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Date: 08/06/2014 11:17 AM
Subject: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project

Folks,

It's come to our attention that some key individuals are not fully up-to-date on gnocchi activities, so it being a good and healthy thing to ensure we're as communicative as possible about our roadmap, I've provided a high-level overview here of our thinking. This is intended as a precursor to further discussion with the TC.

Cheers,
Eoghan

What gnocchi is:
================

Gnocchi is a separate, but related, project spun up on stackforge by Julien Danjou, with the objective of providing efficient storage and retrieval of timeseries-oriented data and resource representations.

The goal is to experiment with a potential approach to addressing an architectural misstep made in the very earliest days of ceilometer, specifically the decision to store snapshots of some resource metadata alongside each metric datapoint. The core idea is to move to storing datapoints shorn of metadata, and instead allow the resource-state timeline to be reconstructed more cheaply from much less frequently occurring events (e.g. instance resizes or migrations).

What gnocchi isn't:
===================

Gnocchi is not a large-scale under-the-radar rewrite of a core OpenStack component along the lines of keystone-lite.
The change is concentrated on the final data-storage phase of the ceilometer pipeline, so will have little initial impact on the data-acquiring agents, or on the transformation phase.

We've been totally open at the Atlanta summit and other forums about this approach being a multi-cycle effort.

Why we decided to do it this way:
=================================

The intent behind spinning up a separate project on stackforge was to allow the work to progress at arm's length from ceilometer, allowing normalcy to be maintained on the core project and a rapid rate of innovation on gnocchi.

Note that the developers primarily contributing to gnocchi represent a cross-section of the core team, and there's a regular feedback loop in the form of a recurring agenda item at the weekly team meeting to avoid the effort becoming silo'd.

But isn't re-architecting frowned upon?
=======================================

Well, the architecture of other OpenStack projects has also undergone change as the community understanding of the implications of prior design decisions has evolved.

Take for example the move towards nova no-db-compute and the unified-object-model in order to address issues in the nova architecture that made progress towards rolling upgrades unnecessarily difficult.

The point, in my understanding, is not to avoid doing the course-correction where it's deemed necessary. Rather, the principle is more that these corrections happen in an open and planned way.

The path forward:
=================

A subset of the ceilometer community will continue to work on gnocchi in parallel with the ceilometer core over the remainder of the Juno cycle and into the Kilo timeframe. The goal is to have an initial implementation of gnocchi ready for tech preview by the end of Juno, and to have the integration/migration/co-existence questions addressed in Kilo.

Moving the ceilometer core to using gnocchi will be contingent on it demonstrating the required performance characteristics and providing the semantics needed to support a v3 ceilometer API that's fit-for-purpose.
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
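The core idea in the overview above (datapoints stored without metadata, with resource state rebuilt from infrequent events) can be illustrated with a rough sketch. This is not gnocchi code: `ResourceTimeline`, `record_event` and `state_at` are hypothetical names invented for the illustration, and the sample values are made up.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class ResourceTimeline:
    """Sparse attribute history for one resource (hypothetical structure)."""
    timestamps: list = field(default_factory=list)  # sorted event times
    states: list = field(default_factory=list)      # attributes after each event

    def record_event(self, timestamp, changes):
        """Apply an infrequent event (e.g. a resize) on top of the latest state."""
        if self.timestamps and timestamp < self.timestamps[-1]:
            raise ValueError("events must be recorded in time order")
        latest = dict(self.states[-1]) if self.states else {}
        latest.update(changes)
        self.timestamps.append(timestamp)
        self.states.append(latest)

    def state_at(self, timestamp):
        """Return the attributes in effect at `timestamp` (None before the first event)."""
        index = bisect_right(self.timestamps, timestamp)
        return self.states[index - 1] if index else None


# Datapoints carry only (timestamp, value); there is no metadata snapshot per sample.
cpu_util = [(10, 12.5), (20, 40.0), (30, 38.0), (40, 9.0)]

timeline = ResourceTimeline()
timeline.record_event(0, {"flavor": "m1.small", "host": "node-1"})
timeline.record_event(25, {"flavor": "m1.large"})  # a resize event

# Join each datapoint with the resource state that applied when it was taken.
annotated = [(t, v, timeline.state_at(t)["flavor"]) for t, v in cpu_util]
```

Two events are enough to annotate four datapoints here, whereas a per-sample snapshot scheme would store the metadata four times.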
Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
Hi Eoghan, Thanks for the note below. However, one thing the overview below does not cover is why InfluxDB ( http://influxdb.com/ ) is not being leveraged. Many folks feel that this technology is a viable solution for the problem space discussed below.

Great question Brad!

As it happens we've been working closely with Paul Dix (lead developer of InfluxDB) to ensure that this metrics store would be usable as a backend driver. That conversation actually kicked off at the Juno summit in Atlanta, but it really got off the ground at our mid-cycle meet-up in Paris in early July.

I wrote a rough strawman version of an InfluxDB driver in advance of the mid-cycle to frame the discussion, and Paul Dix traveled to the meet-up so we could have the discussion face-to-face. The conclusion was that InfluxDB would indeed potentially be a great fit, modulo some requirements that we identified during the detailed discussions:

* shard-space-based retention and backgrounded deletion
* capability to merge individual timeseries for cross-aggregation
* backfill-aware downsampling

The InfluxDB folks have committed to implementing those features over July and August, and have made concrete progress on that score.

I hope that provides enough detail to answer your question?

Cheers,
Eoghan
Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
On 8/11/2014 4:22 PM, Eoghan Glynn wrote:

Great question Brad! As it happens we've been working closely with Paul Dix (lead developer of InfluxDB) to ensure that this metrics store would be usable as a backend driver. That conversation actually kicked off at the Juno summit in Atlanta, but it really got off the ground at our mid-cycle meet-up in Paris in early July. ... The InfluxDB folks have committed to implementing those features over July and August, and have made concrete progress on that score. I hope that provides enough detail to answer your question?

I guess it begs the question, if influxdb will do what you want and it's open source (MIT) as well as commercially supported, how does gnocchi differentiate?
Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
I guess it begs the question, if influxdb will do what you want and it's open source (MIT) as well as commercially supported, how does gnocchi differentiate?

Hi Sandy,

One of the ideas behind gnocchi is to combine resource representation and timeseries-oriented storage of metric data, providing an efficient and convenient way to query for metric data associated with individual resources.

Also, having an API layered above the storage driver avoids locking in directly with a particular metrics-oriented DB, allowing for the potential to support multiple storage driver options (e.g. to choose between a canonical implementation based on Swift, an InfluxDB driver, and an OpenTSDB driver, say).

A less compelling reason would be to provide a well-defined hook point to innovate with aggregation/analytic logic not supported natively in the underlying drivers (e.g. period-spanning statistics such as exponentially-weighted moving average or even Holt-Winters).

Cheers,
Eoghan
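As a rough illustration of the period-spanning statistics mentioned in this message, the sketch below computes an exponentially-weighted moving average over per-period means. It assumes the per-period aggregation happens above the storage driver; `period_means` and `ewma` are hypothetical helper names, not part of gnocchi or of any driver.

```python
def period_means(datapoints, period):
    """Group (timestamp, value) pairs into fixed-width periods and average each one."""
    buckets = {}
    for timestamp, value in datapoints:
        start = timestamp - (timestamp % period)  # start of the containing period
        buckets.setdefault(start, []).append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


def ewma(values, alpha):
    """Exponentially-weighted moving average, seeded with the first value."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for value in values:
        if not smoothed:
            smoothed.append(value)
        else:
            # Blend the new value with the previous smoothed value.
            smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


datapoints = [(0, 10.0), (30, 30.0), (60, 40.0), (90, 20.0), (120, 60.0)]
means = period_means(datapoints, period=60)
smoothed = ewma([value for _, value in means], alpha=0.5)
```

Because each smoothed value depends on the one before it, the statistic spans periods and cannot be computed from a single period's aggregate alone.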
Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
On 8/11/2014 5:29 PM, Eoghan Glynn wrote:

Hi Sandy, One of the ideas behind gnocchi is to combine resource representation and timeseries-oriented storage of metric data, providing an efficient and convenient way to query for metric data associated with individual resources.

Doesn't InfluxDB do the same?

Also, having an API layered above the storage driver avoids locking in directly with a particular metrics-oriented DB, allowing for the potential to support multiple storage driver options (e.g. to choose between a canonical implementation based on Swift, an InfluxDB driver, and an OpenTSDB driver, say).

Right, I'm not suggesting to remove the storage abstraction layer. I'm just curious what gnocchi does better/different than InfluxDB?

Or, am I missing the objective here and gnocchi is the abstraction layer and not an influxdb alternative? If so, my apologies for the confusion.

A less compelling reason would be to provide a well-defined hook point to innovate with aggregation/analytic logic not supported natively in the underlying drivers (e.g. period-spanning statistics such as exponentially-weighted moving average or even Holt-Winters).
Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
On 2014-08-11 5:13 PM, Sandy Walsh wrote:

Right, I'm not suggesting to remove the storage abstraction layer. I'm just curious what gnocchi does better/different than InfluxDB?

I was at the OpenStack Design Summit when Gnocchi was presented. Soon after the basic goals and technical details of Gnocchi were presented, people wondered why InfluxDB wasn't used. AFAIK, the people presenting Gnocchi didn't know about InfluxDB so they weren't able to answer the question. I don't really blame them. At that time, I didn't know anything about Gnocchi, even less about InfluxDB, but rapidly learned that both are DataSeries databases/services.

What I would have answered to that question is (IMO): Gnocchi is a new project tackling the need for a DataSeries database/storage as a service. Pandas/Swift is used as an implementation reference. Some people love Swift and will use it everywhere they can, nothing wrong with it. (or let's not go down that path)

Or, am I missing the objective here and gnocchi is the abstraction layer and not an influxdb alternative? If so, my apologies for the confusion.

InfluxDB can't be used as-is by OpenStack services. There needs to be an abstraction layer somewhere. As Gnocchi is (or will be) well written, people will be free to drop the Swift implementation and replace it with whatever they want: InfluxDB, Blueflood, RRD, Whisper, plain text files, in-memory, /dev/null, etc. But we first need to start somewhere with one implementation, and Pandas/Swift was chosen. I'm confident people will soon start proposing alternative storage backends/implementations better fitting their needs and tastes.

-- Mathieu
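To make the swappable-backend point concrete, here is a minimal sketch of what a storage abstraction could look like, with an in-memory driver and a /dev/null-style driver behind the same interface. The interface and all names (`StorageDriver`, `add_measures`, `get_measures`, `mean_value`) are assumptions made up for this illustration; they are not taken from the gnocchi source.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Hypothetical interface that every timeseries backend would implement."""

    @abstractmethod
    def add_measures(self, metric_id, measures):
        """Store an iterable of (timestamp, value) pairs for a metric."""

    @abstractmethod
    def get_measures(self, metric_id, start=None, stop=None):
        """Return stored (timestamp, value) pairs with start <= timestamp < stop."""


class InMemoryDriver(StorageDriver):
    def __init__(self):
        self._series = {}

    def add_measures(self, metric_id, measures):
        series = self._series.setdefault(metric_id, [])
        series.extend(measures)
        series.sort()  # keep datapoints ordered by timestamp

    def get_measures(self, metric_id, start=None, stop=None):
        return [
            (t, v) for t, v in self._series.get(metric_id, [])
            if (start is None or t >= start) and (stop is None or t < stop)
        ]


class NullDriver(StorageDriver):
    """Discards everything, in the spirit of a /dev/null backend."""

    def add_measures(self, metric_id, measures):
        pass

    def get_measures(self, metric_id, start=None, stop=None):
        return []


def mean_value(driver, metric_id, start=None, stop=None):
    """Caller-side logic depends only on the interface, not on the backend."""
    measures = driver.get_measures(metric_id, start, stop)
    if not measures:
        return None
    return sum(value for _, value in measures) / len(measures)


memory = InMemoryDriver()
memory.add_measures("cpu_util", [(20, 4.0), (10, 2.0)])
```

Here `mean_value(memory, "cpu_util")` and `mean_value(NullDriver(), "cpu_util")` run through the same code path; only the driver differs.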
Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
On 8/11/2014 4:22 PM, Eoghan Glynn wrote:

Hi Eoghan, Thanks for the note below. However, one thing the overview below does not cover is why InfluxDB ( http://influxdb.com/ ) is not being leveraged. Many folks feel that this technology is a viable solution for the problem space discussed below.

Great question Brad! As it happens we've been working closely with Paul Dix (lead developer of InfluxDB) to ensure that this metrics store would be usable as a backend driver. That conversation actually kicked off at the Juno summit in Atlanta, but it really got off the ground at our mid-cycle meet-up in Paris in early July. ... The InfluxDB folks have committed to implementing those features over July and August, and have made concrete progress on that score. I hope that provides enough detail to answer your question?

I guess it begs the question, if influxdb will do what you want and it's open source (MIT) as well as commercially supported, how does gnocchi differentiate?

Hi Sandy, One of the ideas behind gnocchi is to combine resource representation and timeseries-oriented storage of metric data, providing an efficient and convenient way to query for metric data associated with individual resources.

Doesn't InfluxDB do the same?

InfluxDB stores timeseries data primarily. Gnocchi is intended to store strongly-typed OpenStack resource representations (instances, images, etc.) in addition to providing a means to access timeseries data associated with those resources. So to answer your question: no, IIUC, it doesn't do the same thing.

Though of course these things are not a million miles from each other, one is just a step up in the abstraction stack, having a wider and more OpenStack-specific scope.

Also, having an API layered above the storage driver avoids locking in directly with a particular metrics-oriented DB, allowing for the potential to support multiple storage driver options (e.g. to choose between a canonical implementation based on Swift, an InfluxDB driver, and an OpenTSDB driver, say).

Right, I'm not suggesting to remove the storage abstraction layer. I'm just curious what gnocchi does better/different than InfluxDB? Or, am I missing the objective here and gnocchi is the abstraction layer and not an influxdb alternative? If so, my apologies for the confusion.

No worries :) The intention is for gnocchi to provide an abstraction over timeseries, aggregation, downsampling and archiving/retention policies, with a number of drivers mapping onto real timeseries storage options. One of those drivers is based on Swift, another is in the works based on InfluxDB, and a third based on OpenTSDB has also been proposed.

Cheers, Eoghan
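As a rough illustration of what "downsampling" and "archiving/retention policies" mean in the reply above, the following Python sketch aggregates raw datapoints into fixed-width buckets and keeps only the newest few. The function name, the choice of the mean as the aggregate, and expressing retention as a number of points are all assumptions made for the example; none of them is taken from gnocchi itself.

```python
def downsample(measures, granularity, retention_points=None):
    """Aggregate (timestamp, value) pairs into one mean value per bucket.

    granularity: bucket width, in the same unit as the timestamps (assumed seconds).
    retention_points: keep only the newest N buckets; None keeps all of them.
    Both parameters are hypothetical, chosen for illustration only.
    """
    buckets = {}
    for ts, value in measures:
        # Align each timestamp to the start of its bucket.
        bucket_start = ts - (ts % granularity)
        total, count = buckets.get(bucket_start, (0.0, 0))
        buckets[bucket_start] = (total + value, count + 1)

    points = [
        (start, total / count)
        for start, (total, count) in sorted(buckets.items())
    ]

    if retention_points is not None:
        # Drop the oldest buckets once the retention limit is exceeded.
        points = points[max(len(points) - retention_points, 0):]
    return points
```

With samples every 30 seconds, `downsample(samples, granularity=60, retention_points=2)` would return at most the two most recent one-minute averages; a storage driver would be responsible for persisting those points in whichever backend it wraps.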
Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
On 8/11/2014 6:49 PM, Eoghan Glynn wrote:

On 8/11/2014 4:22 PM, Eoghan Glynn wrote:

Hi Eoghan, Thanks for the note below. However, one thing the overview below does not cover is why InfluxDB ( http://influxdb.com/ ) is not being leveraged. Many folks feel that this technology is a viable solution for the problem space discussed below.

Great question Brad! As it happens we've been working closely with Paul Dix (lead developer of InfluxDB) to ensure that this metrics store would be usable as a backend driver. That conversation actually kicked off at the Juno summit in Atlanta, but it really got off the ground at our mid-cycle meet-up in Paris in early July. ... The InfluxDB folks have committed to implementing those features over July and August, and have made concrete progress on that score. I hope that provides enough detail to answer your question?

I guess it begs the question, if influxdb will do what you want and it's open source (MIT) as well as commercially supported, how does gnocchi differentiate?

Hi Sandy, One of the ideas behind gnocchi is to combine resource representation and timeseries-oriented storage of metric data, providing an efficient and convenient way to query for metric data associated with individual resources.

Doesn't InfluxDB do the same?

InfluxDB stores timeseries data primarily. Gnocchi is intended to store strongly-typed OpenStack resource representations (instances, images, etc.) in addition to providing a means to access timeseries data associated with those resources. So to answer your question: no, IIUC, it doesn't do the same thing.

Ok, I think I'm getting closer on this. Thanks for the clarification. Sadly, I have more questions :)

Is this closer? a metadata repo for resources (instances, images, etc) + an abstraction to some TSDB(s)?

Hmm, thinking out loud ... if it's a metadata repo for resources, who is the authoritative source for what the resource is? Ceilometer/Gnocchi or the source service? For example, if I want to query instance power state do I ask ceilometer or Nova?

Or is it metadata about the time-series data collected for that resource? In which case, I think most TSDBs have some sort of series description facilities.

I guess my question is, what makes this metadata unique and how would it differ from the metadata ceilometer already collects? Will it be using Glance, now that Glance is becoming a pure metadata repo?

Though of course these things are not a million miles from each other, one is just a step up in the abstraction stack, having a wider and more OpenStack-specific scope.

Could it be a generic timeseries service? Is it openstack specific because it uses stackforge/python/oslo? I assume the rules and schemas will be data-driven (vs. hard-coded)? ... and since the ceilometer collectors already do the bridge work, is it a pre-packaging of definitions that target openstack specifically? (not sure about wider and more specific)

Sorry if this was already hashed out in Atlanta.

Also, having an API layered above the storage driver avoids locking in directly with a particular metrics-oriented DB, allowing for the potential to support multiple storage driver options (e.g. to choose between a canonical implementation based on Swift, an InfluxDB driver, and an OpenTSDB driver, say).

Right, I'm not suggesting to remove the storage abstraction layer. I'm just curious what gnocchi does better/different than InfluxDB? Or, am I missing the objective here and gnocchi is the abstraction layer and not an influxdb alternative? If so, my apologies for the confusion.

No worries :) The intention is for gnocchi to provide an abstraction over timeseries, aggregation, downsampling and archiving/retention policies, with a number of drivers mapping onto real timeseries storage options. One of those drivers is based on Swift, another is in the works based on InfluxDB, and a third based on OpenTSDB has also been proposed.
Cheers, Eoghan
[openstack-dev] [tc][ceilometer] Some background on the gnocchi project
Folks,

It's come to our attention that some key individuals are not fully up-to-date on gnocchi activities, so it being a good and healthy thing to ensure we're as communicative as possible about our roadmap, I've provided a high-level overview here of our thinking. This is intended as a precursor to further discussion with the TC.

Cheers, Eoghan

What gnocchi is:
================

Gnocchi is a separate, but related, project spun up on stackforge by Julien Danjou, with the objective of providing efficient storage and retrieval of timeseries-oriented data and resource representations.

The goal is to experiment with a potential approach to addressing an architectural misstep made in the very earliest days of ceilometer, specifically the decision to store snapshots of some resource metadata alongside each metric datapoint. The core idea is to move to storing datapoints shorn of metadata, and instead allow the resource-state timeline to be reconstructed more cheaply from much less frequently occurring events (e.g. instance resizes or migrations).

What gnocchi isn't:
===================

Gnocchi is not a large-scale under-the-radar rewrite of a core OpenStack component along the lines of keystone-lite. The change is concentrated on the final data-storage phase of the ceilometer pipeline, so will have little initial impact on the data-acquiring agents, or on the transformation phase.

We've been totally open at the Atlanta summit and other forums about this approach being a multi-cycle effort.

Why we decided to do it this way:
=================================

The intent behind spinning up a separate project on stackforge was to allow the work to progress at arm's length from ceilometer, allowing normalcy to be maintained on the core project and a rapid rate of innovation on gnocchi.

Note that the developers primarily contributing to gnocchi represent a cross-section of the core team, and there's a regular feedback loop in the form of a recurring agenda item at the weekly team meeting to avoid the effort becoming silo'd.
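The "datapoints shorn of metadata" idea from the "What gnocchi is" section can be sketched in a few lines of Python. This is an illustration under assumed names (`ResourceTimeline`, `record_event`, `state_at`, `annotate` are all hypothetical), not gnocchi's implementation: datapoints are stored as bare (timestamp, value) pairs, a short list of infrequent events records when resource attributes changed, and the metadata in force at any datapoint is looked up from that list on demand.

```python
import bisect


class ResourceTimeline:
    """Hypothetical sketch: resource state rebuilt from infrequent events."""

    def __init__(self):
        self._times = []   # event timestamps, kept in ascending order
        self._states = []  # full attribute dict in force from each timestamp

    def record_event(self, timestamp, changes):
        """Apply attribute changes (e.g. a resize) from timestamp onward."""
        if self._times and timestamp < self._times[-1]:
            raise ValueError("events must be recorded in time order")
        previous = self._states[-1] if self._states else {}
        self._times.append(timestamp)
        self._states.append({**previous, **changes})

    def state_at(self, timestamp):
        """Return the attributes in force at timestamp, or None if unknown."""
        index = bisect.bisect_right(self._times, timestamp) - 1
        return dict(self._states[index]) if index >= 0 else None


def annotate(datapoints, timeline):
    """Join bare (timestamp, value) datapoints with reconstructed metadata."""
    return [(ts, value, timeline.state_at(ts)) for ts, value in datapoints]
```

For an instance sampled every few seconds but resized once, the timeline holds two entries while the datapoint store holds thousands of bare values, which is the saving over snapshotting metadata alongside every sample.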
But isn't re-architecting frowned upon?
=======================================

Well, the architectures of other OpenStack projects have also undergone change as the community understanding of the implications of prior design decisions has evolved.

Take for example the move towards nova no-db-compute and the unified-object-model in order to address issues in the nova architecture that made progress towards rolling upgrades unnecessarily difficult.

The point, in my understanding, is not to avoid doing the course-correction where it's deemed necessary. Rather, the principle is more that these corrections happen in an open and planned way.

The path forward:
=================

A subset of the ceilometer community will continue to work on gnocchi in parallel with the ceilometer core over the remainder of the Juno cycle and into the Kilo timeframe. The goal is to have an initial implementation of gnocchi ready for tech preview by the end of Juno, and to have the integration/migration/co-existence questions addressed in Kilo.

Moving the ceilometer core to using gnocchi will be contingent on it demonstrating the required performance characteristics and providing the semantics needed to support a v3 ceilometer API that's fit-for-purpose.