Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project

2014-08-12 Thread Eoghan Glynn


  Doesn't InfluxDB do the same?
  InfluxDB stores timeseries data primarily.
 
  Gnocchi is intended to store strongly-typed OpenStack resource
  representations (instance, images, etc.) in addition to providing
  a means to access timeseries data associated with those resources.
 
  So to answer your question: no, IIUC, it doesn't do the same thing.
 
 Ok, I think I'm getting closer on this.

Great!

 Thanks for the clarification. Sadly, I have more questions :)

Any time, Sandy :)
 
 Is this closer? a metadata repo for resources (instances, images, etc)
 + an abstraction to some TSDB(s)?

Somewhat closer (more clarification below on the metadata repository
aspect, and the completeness/authority of same).

 Hmm, thinking out loud ... if it's a metadata repo for resources, who is
 the authoritative source for what the resource is? Ceilometer/Gnocchi or
 the source service?

The source service is authoritative.

 For example, if I want to query instance power state do I ask ceilometer
 or Nova?

In that scenario, you'd ask nova.

If, on the other hand, you wanted to average out the CPU utilization
over all instances with a certain metadata attribute set (e.g. some
user metadata set by Heat that indicated membership of an autoscaling
group), then you'd ask ceilometer.
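
To make that kind of query concrete, here is a minimal sketch in plain
Python. The resource records, the "autoscaling_group" key and the
average_cpu_util() helper are all made up for illustration and are not
the ceilometer or gnocchi API; the point is only that metadata is
looked up once per resource and the datapoints are then averaged.

```python
from statistics import mean

# Hypothetical resource records: metadata is stored once per instance.
resources = {
    "inst-1": {"user_metadata": {"autoscaling_group": "web"}},
    "inst-2": {"user_metadata": {"autoscaling_group": "web"}},
    "inst-3": {"user_metadata": {"autoscaling_group": "db"}},
}

# Hypothetical cpu_util datapoints per instance: (timestamp, percent).
cpu_util = {
    "inst-1": [(0, 10.0), (60, 30.0)],
    "inst-2": [(0, 50.0), (60, 70.0)],
    "inst-3": [(0, 90.0), (60, 90.0)],
}


def average_cpu_util(key, value):
    """Average cpu_util over instances whose user metadata has key == value."""
    matching = [
        rid for rid, res in resources.items()
        if res["user_metadata"].get(key) == value
    ]
    values = [v for rid in matching for _, v in cpu_util.get(rid, [])]
    # No matching instances (or no datapoints) means there is nothing to average.
    return mean(values) if values else None
```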

 Or is it metadata about the time-series data collected for that
 resource?

Both. But the focus of my preceding remarks was on the latter.

 In which case, I think most tsdb's have some sort of series
 description facilities.

Sure, and those should be used for metadata related directly to
the timeseries (granularity, retention etc.)

 I guess my question is, what makes this metadata unique and how
 would it differ from the metadata ceilometer already collects?

The primary difference from the way ceilometer currently stores
metadata is the avoidance of per-sample snapshots of resource
metadata (as stated in the initial mail on this thread).
 
 Will it be using Glance, now that Glance is becoming a pure metadata repo?

No, we have no plans to use glance for this.

By becoming a pure metadata repo, presumably you mean this spec:

  
https://github.com/openstack/glance-specs/blob/master/specs/juno/metadata-schema-catalog.rst

I don't see this on the glance roadmap for Juno:

  https://blueprints.launchpad.net/glance/juno 

so presumably the integration of graffiti and glance is still more
of a longer-term intent than a present-tense becoming.

I'm totally open to correction on this by markwash and others,
but my reading of the debate around the recent change in glance's
mission statement was that the primary focus in the immediate
term was to expand into providing an artifact repository (for
artifacts such as Heat templates), while not *precluding* any
future expansion into also providing a metadata repository.

The fossil-record of that discussion is here:

  https://review.openstack.org/98002

  Though of course these things are not a million miles from each
  other, one is just a step up in the abstraction stack, having a
  wider and more OpenStack-specific scope.
 
 Could it be a generic timeseries service? Is it openstack specific
 because it uses stackforge/python/oslo?

No, I meant OpenStack-specific in terms of it understanding
something of the nature of OpenStack resources and their ownership
(e.g. instances, with some metadata, each being associated with a
user & tenant etc.)

Not OpenStack-specific in the sense that it takes dependencies from
oslo or stackforge.

As for using python: yes, gnocchi is implemented in python, like
much of the rest of OpenStack.  However, no, I don't think that
choice of implementation language makes it OpenStack-specific.

 I assume the rules and schemas will be data-driven (vs. hard-coded)?

Well one of the ideas was to move away from loosely typed
representations of resources in ceilometer, in the form of a dict
of metadata containing whatever it contains, and instead decide
upfront what was the specific minimal information per resource
type that we need to store.
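
As a rough sketch of the difference, the loosely typed form is a dict
that may contain anything, whereas the strongly-typed form fixes a
minimal set of fields per resource type. The class and field names
below are assumptions chosen for illustration, not gnocchi's actual
schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Fields assumed common to every resource type (hypothetical)."""
    id: str
    user_id: str
    project_id: str


@dataclass(frozen=True)
class Instance(Resource):
    """Minimal per-type information for an instance (hypothetical fields)."""
    flavor_id: str
    image_ref: str
    host: str


# Every declared field must be supplied; there is no catch-all metadata dict.
inst = Instance(
    id="inst-1",
    user_id="user-a",
    project_id="tenant-x",
    flavor_id="m1.small",
    image_ref="image-123",
    host="compute-1",
)
```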

 ... and since the ceilometer collectors already do the bridge work, is
 it a pre-packaging of definitions that target openstack specifically?

I'm not entirely sure of what you mean by the bridge work in
this context.

The ceilometer collector effectively acts as a concentrator, by
persisting the metering messages emitted by the other ceilometer
agents (i.e. the compute, central & notification agents) to the
metering store.

These samples are stored by the collector pretty much as-is, so
there's no real bridging going on currently in the collector (in
the sense of mapping or transforming).

However, the collector is indeed the obvious hook point for
ceilometer to emit data to gnocchi.

 (not sure about wider and more specific)

I presume you're thinking oxymoron with wider and more specific?

I meant:

 * wider in the sense that it covers more ground than generic
   timeseries data storage

 * more specific in the sense that some of 

Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project

2014-08-11 Thread Brad Topol
Hi Eoghan,

Thanks for the note below.  However, one thing the overview below does not
cover is why InfluxDB (http://influxdb.com/) is not being leveraged.
Many folks feel that this technology is a viable solution for the problem
space discussed below.

Thanks,

Brad


Brad Topol, Ph.D.
IBM Distinguished Engineer
OpenStack
(919) 543-0646
Internet:  bto...@us.ibm.com
Assistant: Kendra Witherspoon (919) 254-0680



From:    Eoghan Glynn egl...@redhat.com
To:      OpenStack Development Mailing List (not for usage questions)
         openstack-dev@lists.openstack.org
Date:    08/06/2014 11:17 AM
Subject: [openstack-dev] [tc][ceilometer] Some background on the
         gnocchi project




Folks,

It's come to our attention that some key individuals are not
fully up-to-date on gnocchi activities, so it being a good and
healthy thing to ensure we're as communicative as possible about
our roadmap, I've provided a high-level overview here of our
thinking. This is intended as a precursor to further discussion
with the TC.

Cheers,
Eoghan


What gnocchi is:
================

Gnocchi is a separate, but related, project spun up on stackforge
by Julien Danjou, with the objective of providing efficient
storage and retrieval of timeseries-oriented data and resource
representations.

The goal is to experiment with a potential approach to addressing
an architectural misstep made in the very earliest days of
ceilometer, specifically the decision to store snapshots of some
resource metadata alongside each metric datapoint. The core idea
is to move to storing datapoints shorn of metadata, and instead
allow the resource-state timeline to be reconstructed more cheaply
from much less frequently occurring events (e.g. instance resizes
or migrations).
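
A minimal sketch of that reconstruction, assuming a made-up event
format of (timestamp, changed attributes): datapoints carry no
metadata, and the resource state in effect at any datapoint is
rebuilt by replaying the sparse events up to that time. The
state_at() helper and the sample data are hypothetical, not gnocchi
code.

```python
from bisect import bisect_right

# Hypothetical, infrequent resource events, sorted by timestamp.
events = [
    (100, {"flavor": "m1.small", "host": "compute-1"}),  # creation
    (500, {"flavor": "m1.large"}),                        # resize
    (900, {"host": "compute-2"}),                         # migration
]

# Datapoints stored without any metadata snapshot: (timestamp, value).
datapoints = [(200, 12.5), (600, 48.0), (950, 33.0)]


def state_at(events, timestamp):
    """Replay all events at or before timestamp to rebuild resource state."""
    times = [t for t, _ in events]
    upto = bisect_right(times, timestamp)
    state = {}
    for _, change in events[:upto]:
        state.update(change)
    return state


# Attach the reconstructed state to each datapoint only when it is needed.
annotated = [(t, v, state_at(events, t)) for t, v in datapoints]
```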


What gnocchi isn't:
===================

Gnocchi is not a large-scale under-the-radar rewrite of a core
OpenStack component along the lines of keystone-lite.

The change is concentrated on the final data-storage phase of
the ceilometer pipeline, so will have little initial impact on the
data-acquiring agents, or on the transformation phase.

We've been totally open at the Atlanta summit and other forums
about this approach being a multi-cycle effort.


Why we decided to do it this way:
=================================

The intent behind spinning up a separate project on stackforge
was to allow the work to progress at arm's length from ceilometer,
allowing normalcy to be maintained on the core project and a
rapid rate of innovation on gnocchi.

Note that the developers primarily contributing to gnocchi
represent a cross-section of the core team, and there's a regular
feedback loop in the form of a recurring agenda item at the
weekly team meeting to avoid the effort becoming silo'd.


But isn't re-architecting frowned upon?
=======================================

Well, the architecture of other OpenStack projects has also
undergone change as the community understanding of the
implications of prior design decisions has evolved.

Take for example the move towards nova no-db-compute & the
unified-object-model in order to address issues in the nova
architecture that made progress towards rolling upgrades
unnecessarily difficult.

The point, in my understanding, is not to avoid doing the
course-correction where it's deemed necessary. Rather, the
principle is more that these corrections happen in an open
and planned way.


The path forward:
=================

A subset of the ceilometer community will continue to work on
gnocchi in parallel with the ceilometer core over the remainder
of the Juno cycle and into the Kilo timeframe. The goal is to
have an initial implementation of gnocchi ready for tech preview
by the end of Juno, and to have the integration/migration/
co-existence questions addressed in Kilo.

Moving the ceilometer core to using gnocchi will be contingent
on it demonstrating the required performance characteristics and
providing the semantics needed to support a v3 ceilometer API
that's fit-for-purpose.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project

2014-08-11 Thread Eoghan Glynn


 Hi Eoghan,
 
 Thanks for the note below. However, one thing the overview below does not
 cover is why InfluxDB ( http://influxdb.com/ ) is not being leveraged. Many
 folks feel that this technology is a viable solution for the problem space
 discussed below.

Great question Brad!

As it happens we've been working closely with Paul Dix (lead
developer of InfluxDB) to ensure that this metrics store would be
usable as a backend driver. That conversation actually kicked off
at the Juno summit in Atlanta, but it really got off the ground
at our mid-cycle meet-up in Paris in early July.

I wrote a rough strawman version of an InfluxDB driver in advance
of the mid-cycle to frame the discussion, and Paul Dix traveled
to the meet-up so we could have the discussion face-to-face. The
conclusion was that InfluxDB would indeed potentially be a great
fit, modulo some requirements that we identified during the detailed
discussions:

 * shard-space-based retention & backgrounded deletion
 * capability to merge individual timeseries for cross-aggregation
 * backfill-aware downsampling

The InfluxDB folks have committed to implementing those features
over July and August, and have made concrete progress on that score.

I hope that provides enough detail to answer your question?

Cheers,
Eoghan


Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project

2014-08-11 Thread Sandy Walsh
On 8/11/2014 4:22 PM, Eoghan Glynn wrote:

 Hi Eoghan,

 Thanks for the note below. However, one thing the overview below does not
 cover is why InfluxDB ( http://influxdb.com/ ) is not being leveraged. Many
 folks feel that this technology is a viable solution for the problem space
 discussed below.
 Great question Brad!

 As it happens we've been working closely with Paul Dix (lead
 developer of InfluxDB) to ensure that this metrics store would be
 usable as a backend driver. That conversation actually kicked off
 at the Juno summit in Atlanta, but it really got off the ground
 at our mid-cycle meet-up in Paris in early July.
...

 The InfluxDB folks have committed to implementing those features
 over July and August, and have made concrete progress on that score.

 I hope that provides enough detail to answer your question?

I guess it begs the question, if influxdb will do what you want and it's
open source (MIT) as well as commercially supported, how does gnocchi
differentiate?


Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project

2014-08-11 Thread Eoghan Glynn


 On 8/11/2014 4:22 PM, Eoghan Glynn wrote:
 
  Hi Eoghan,
 
  Thanks for the note below. However, one thing the overview below does not
  cover is why InfluxDB ( http://influxdb.com/ ) is not being leveraged.
  Many
  folks feel that this technology is a viable solution for the problem space
  discussed below.
  Great question Brad!
 
  As it happens we've been working closely with Paul Dix (lead
  developer of InfluxDB) to ensure that this metrics store would be
  usable as a backend driver. That conversation actually kicked off
  at the Juno summit in Atlanta, but it really got off the ground
  at our mid-cycle meet-up in Paris in early July.
 ...
 
  The InfluxDB folks have committed to implementing those features
  over July and August, and have made concrete progress on that score.
 
  I hope that provides enough detail to answer your question?
 
 I guess it begs the question, if influxdb will do what you want and it's
 open source (MIT) as well as commercially supported, how does gnocchi
 differentiate?

Hi Sandy,

One of the ideas behind gnocchi is to combine resource representation
and timeseries-oriented storage of metric data, providing an efficient
and convenient way to query for metric data associated with individual
resources.

Also, having an API layered above the storage driver avoids locking in
directly with a particular metrics-oriented DB, allowing for the
potential to support multiple storage driver options (e.g. to choose
between a canonical implementation based on Swift, an InfluxDB driver,
and an OpenTSDB driver, say).

A less compelling reason would be to provide a well-defined hook point
to innovate with aggregation/analytic logic not supported natively
in the underlying drivers (e.g. period-spanning statistics such as
exponentially-weighted moving average or even Holt-Winters).
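
For concreteness, an exponentially-weighted moving average computed
above the storage driver might look like the sketch below. The ewma()
function and its alpha parameter are illustrative assumptions, not
existing gnocchi code.

```python
def ewma(values, alpha):
    """Return the exponentially-weighted moving average of values.

    alpha is the smoothing factor in (0, 1]; a higher alpha weights
    recent datapoints more heavily. The first value seeds the average.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    current = None
    for value in values:
        if current is None:
            current = value
        else:
            current = alpha * value + (1 - alpha) * current
        smoothed.append(current)
    return smoothed
```

Because each output depends on all earlier datapoints, this is the
kind of period-spanning statistic that cannot be computed from
independently aggregated periods alone.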

Cheers,
Eoghan

 

Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project

2014-08-11 Thread Sandy Walsh
On 8/11/2014 5:29 PM, Eoghan Glynn wrote:

 On 8/11/2014 4:22 PM, Eoghan Glynn wrote:
 Hi Eoghan,

 Thanks for the note below. However, one thing the overview below does not
 cover is why InfluxDB ( http://influxdb.com/ ) is not being leveraged.
 Many
 folks feel that this technology is a viable solution for the problem space
 discussed below.
 Great question Brad!

 As it happens we've been working closely with Paul Dix (lead
 developer of InfluxDB) to ensure that this metrics store would be
 usable as a backend driver. That conversation actually kicked off
 at the Juno summit in Atlanta, but it really got off the ground
 at our mid-cycle meet-up in Paris in early July.
 ...
 The InfluxDB folks have committed to implementing those features
 over July and August, and have made concrete progress on that score.

 I hope that provides enough detail to answer your question?
 I guess it begs the question, if influxdb will do what you want and it's
 open source (MIT) as well as commercially supported, how does gnocchi
 differentiate?
 Hi Sandy,

 One of the ideas behind gnocchi is to combine resource representation
 and timeseries-oriented storage of metric data, providing an efficient
 and convenient way to query for metric data associated with individual
 resources.

Doesn't InfluxDB do the same?


 Also, having an API layered above the storage driver avoids locking in
 directly with a particular metrics-oriented DB, allowing for the
 potential to support multiple storage driver options (e.g. to choose
 between a canonical implementation based on Swift, an InfluxDB driver,
 and an OpenTSDB driver, say).
Right, I'm not suggesting to remove the storage abstraction layer. I'm
just curious what gnocchi does better/different than InfluxDB?

Or, am I missing the objective here and gnocchi is the abstraction layer
and not an influxdb alternative? If so, my apologies for the confusion.

 A less compelling reason would be to provide a well-defined hook point
 to innovate with aggregation/analytic logic not supported natively
 in the underlying drivers (e.g. period-spanning statistics such as
 exponentially-weighted moving average or even Holt-Winters).
 Cheers,
 Eoghan

  

Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project

2014-08-11 Thread Mathieu Gagné

On 2014-08-11 5:13 PM, Sandy Walsh wrote:

Right, I'm not suggesting to remove the storage abstraction layer. I'm
just curious what gnocchi does better/different than InfluxDB?



I was at the OpenStack Design Summit when Gnocchi was presented.

Soon after the basic goals and technical details of Gnocchi were 
presented, people wondered why InfluxDB wasn't used. AFAIK, people 
presenting Gnocchi didn't know about InfluxDB so they weren't able to 
answer the question.


I don't really blame them. At that time, I didn't know anything about 
Gnocchi, even less about InfluxDB but rapidly learned that both are 
DataSeries databases/services.



What I would have answered to that question is (IMO):

Gnocchi is a new project tackling the need for a DataSeries 
database/storage as a service. Pandas/Swift is used as an implementation 
reference. Some people love Swift and will use it everywhere they can, 
nothing wrong with it. (or let's not go down that path)



 Or, am I missing the objective here and gnocchi is the abstraction layer
 and not an influxdb alternative? If so, my apologies for the confusion.


InfluxDB can't be used as-is by OpenStack services. There needs to be an 
abstraction layer somewhere.


As Gnocchi is (or will be) well written, people will be free to drop the 
Swift implementation and replace it with whatever they want: InfluxDB, 
Blueflood, RRD, Whisper, plain text files, in-memory, /dev/null, etc.


But we first need to start somewhere with one implementation and 
Pandas/Swift was chosen.


I'm confident people will soon start proposing alternative storage 
backends/implementations better fitting their needs and tastes.


--
Mathieu


Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project

2014-08-11 Thread Eoghan Glynn


  On 8/11/2014 4:22 PM, Eoghan Glynn wrote:
  Hi Eoghan,
 
  Thanks for the note below. However, one thing the overview below does
  not
  cover is why InfluxDB ( http://influxdb.com/ ) is not being leveraged.
  Many
  folks feel that this technology is a viable solution for the problem
  space
  discussed below.
  Great question Brad!
 
  As it happens we've been working closely with Paul Dix (lead
  developer of InfluxDB) to ensure that this metrics store would be
  usable as a backend driver. That conversation actually kicked off
  at the Juno summit in Atlanta, but it really got off the ground
  at our mid-cycle meet-up in Paris in early July.
  ...
  The InfluxDB folks have committed to implementing those features
  over July and August, and have made concrete progress on that score.
 
  I hope that provides enough detail to answer your question?
  I guess it begs the question, if influxdb will do what you want and it's
  open source (MIT) as well as commercially supported, how does gnocchi
  differentiate?
  Hi Sandy,
 
  One of the ideas behind gnocchi is to combine resource representation
  and timeseries-oriented storage of metric data, providing an efficient
  and convenient way to query for metric data associated with individual
  resources.
 
 Doesn't InfluxDB do the same?

InfluxDB stores timeseries data primarily.

Gnocchi is intended to store strongly-typed OpenStack resource
representations (instance, images, etc.) in addition to providing
a means to access timeseries data associated with those resources.

So to answer your question: no, IIUC, it doesn't do the same thing.

Though of course these things are not a million miles from each
other, one is just a step up in the abstraction stack, having a
wider and more OpenStack-specific scope.
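
To make the distinction concrete, here is a minimal Python sketch of
typed resource representations sitting next to the timeseries they
own. Everything in it (the Instance class, its fields, the "group"
attribute, the mean_for_group helper) is a hypothetical illustration
and not gnocchi's actual data model or API.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Instance:
    """Hypothetical typed resource; field names are illustrative only."""
    id: str
    flavor: str
    user_metadata: dict = field(default_factory=dict)


# Resource representations, stored once rather than per datapoint.
resources = [
    Instance("i-1", "m1.small", {"group": "web"}),
    Instance("i-2", "m1.small", {"group": "web"}),
    Instance("i-3", "m1.large", {"group": "db"}),
]

# Timeseries keyed by (resource id, metric name): (timestamp, value) pairs.
measures = {
    ("i-1", "cpu_util"): [(0, 10.0), (60, 30.0)],
    ("i-2", "cpu_util"): [(0, 50.0), (60, 70.0)],
    ("i-3", "cpu_util"): [(0, 90.0)],
}


def mean_for_group(group, metric):
    """Average a metric over every resource whose metadata matches."""
    values = [
        value
        for res in resources
        if res.user_metadata.get("group") == group
        for _, value in measures.get((res.id, metric), [])
    ]
    return mean(values)


print(mean_for_group("web", "cpu_util"))
```

The resource lookup selects which series to read, and the timeseries
side only ever holds timestamps and values.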
 
  Also, having an API layered above the storage driver avoids locking in
  directly with a particular metrics-oriented DB, allowing for the
  potential to support multiple storage driver options (e.g. to choose
  between a canonical implementation based on Swift, an InfluxDB driver,
  and an OpenTSDB driver, say).
 Right, I'm not suggesting to remove the storage abstraction layer. I'm
 just curious what gnocchi does better/different than InfluxDB?
 
 Or, am I missing the objective here and gnocchi is the abstraction layer
 and not an influxdb alternative? If so, my apologies for the confusion.

No worries :)

The intention is for gnocchi to provide an abstraction over
timeseries, aggregation, downsampling and archiving/retention
policies, with a number of drivers mapping onto real timeseries
storage options. One of those drivers is based on Swift, another
is in the works based on InfluxDB, and a third based on OpenTSDB
has also been proposed.
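
As a rough illustration of that layering, the sketch below defines a
driver interface and one toy in-memory driver that downsamples on
read. The class and method names (StorageDriver, add_measures,
get_measures) are assumptions made for this example, not the real
gnocchi driver API, and retention policies are left out for brevity.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from statistics import mean


class StorageDriver(ABC):
    """Hypothetical driver interface; not gnocchi's actual API."""

    @abstractmethod
    def add_measures(self, metric, measures):
        """Store (timestamp, value) pairs for a metric."""

    @abstractmethod
    def get_measures(self, metric, granularity):
        """Return (bucket_start, mean_value) pairs at the given granularity."""


class InMemoryDriver(StorageDriver):
    """Toy backend; a Swift or InfluxDB driver would fill the same slot."""

    def __init__(self):
        self._data = defaultdict(list)

    def add_measures(self, metric, measures):
        self._data[metric].extend(measures)

    def get_measures(self, metric, granularity):
        # Downsample by grouping raw points into fixed-width time buckets.
        buckets = defaultdict(list)
        for ts, value in self._data[metric]:
            buckets[ts - ts % granularity].append(value)
        return [(start, mean(vals)) for start, vals in sorted(buckets.items())]


driver = InMemoryDriver()
driver.add_measures("cpu_util", [(0, 10.0), (30, 20.0), (60, 40.0), (90, 60.0)])
print(driver.get_measures("cpu_util", granularity=60))
```

Callers only see the abstract interface, so swapping the backend does
not change how measures are written or queried.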

Cheers,
Eoghan



Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project

2014-08-11 Thread Sandy Walsh
On 8/11/2014 6:49 PM, Eoghan Glynn wrote:

 On 8/11/2014 4:22 PM, Eoghan Glynn wrote:
 Hi Eoghan,

 Thanks for the note below. However, one thing the overview below does
 not cover is why InfluxDB ( http://influxdb.com/ ) is not being
 leveraged. Many folks feel that this technology is a viable solution
 for the problem space discussed below.
 Great question Brad!

 As it happens we've been working closely with Paul Dix (lead
 developer of InfluxDB) to ensure that this metrics store would be
 usable as a backend driver. That conversation actually kicked off
 at the Juno summit in Atlanta, but it really got off the ground
 at our mid-cycle meet-up in Paris in early July.
 ...
 The InfluxDB folks have committed to implementing those features
 over July and August, and have made concrete progress on that score.

 I hope that provides enough detail to answer your question?
 I guess it begs the question, if influxdb will do what you want and it's
 open source (MIT) as well as commercially supported, how does gnocchi
 differentiate?
 Hi Sandy,

 One of the ideas behind gnocchi is to combine resource representation
 and timeseries-oriented storage of metric data, providing an efficient
 and convenient way to query for metric data associated with individual
 resources.
 Doesn't InfluxDB do the same?
 InfluxDB stores timeseries data primarily.

 Gnocchi is intended to store strongly-typed OpenStack resource
 representations (instance, images, etc.) in addition to providing
 a means to access timeseries data associated with those resources.

 So to answer your question: no, IIUC, it doesn't do the same thing.

Ok, I think I'm getting closer on this.  Thanks for the clarification.
Sadly, I have more questions :)

Is this closer? a metadata repo for resources (instances, images, etc)
+ an abstraction to some TSDB(s)?

Hmm, thinking out loud ... if it's a metadata repo for resources, who is
the authoritative source for what the resource is? Ceilometer/Gnocchi or
the source service? For example, if I want to query instance power state
do I ask ceilometer or Nova?

Or is it metadata about the time-series data collected for that
resource? In which case, I think most tsdb's have some sort of series
description facilities. I guess my question is, what makes this
metadata unique and how would it differ from the metadata ceilometer
already collects?

Will it be using Glance, now that Glance is becoming a pure metadata repo?


 Though of course these things are not a million miles from each
 other, one is just a step up in the abstraction stack, having a
 wider and more OpenStack-specific scope.

Could it be a generic timeseries service? Is it openstack specific
because it uses stackforge/python/oslo? I assume the rules and schemas
will be data-driven (vs. hard-coded)? ... and since the ceilometer
collectors already do the bridge work, is it a pre-packaging of
definitions that target openstack specifically? (not sure about wider
and more specific)

Sorry if this was already hashed out in Atlanta.

  
 Also, having an API layered above the storage driver avoids locking in
 directly with a particular metrics-oriented DB, allowing for the
 potential to support multiple storage driver options (e.g. to choose
 between a canonical implementation based on Swift, an InfluxDB driver,
 and an OpenTSDB driver, say).
 Right, I'm not suggesting to remove the storage abstraction layer. I'm
 just curious what gnocchi does better/different than InfluxDB?

 Or, am I missing the objective here and gnocchi is the abstraction layer
 and not an influxdb alternative? If so, my apologies for the confusion.
 No worries :)

 The intention is for gnocchi to provide an abstraction over
 timeseries, aggregation, downsampling and archiving/retention
 policies, with a number of drivers mapping onto real timeseries
 storage options. One of those drivers is based on Swift, another
 is in the works based on InfluxDB, and a third based on OpenTSDB
 has also been proposed.

 Cheers,
 Eoghan






[openstack-dev] [tc][ceilometer] Some background on the gnocchi project

2014-08-06 Thread Eoghan Glynn

Folks,

It's come to our attention that some key individuals are not
fully up-to-date on gnocchi activities, so it being a good and
healthy thing to ensure we're as communicative as possible about
our roadmap, I've provided a high-level overview here of our
thinking. This is intended as a precursor to further discussion
with the TC.

Cheers,
Eoghan


What gnocchi is:
================

Gnocchi is a separate, but related, project spun up on stackforge
by Julien Danjou, with the objective of providing efficient
storage and retrieval of timeseries-oriented data and resource
representations.

The goal is to experiment with a potential approach to addressing
an architectural misstep made in the very earliest days of
ceilometer, specifically the decision to store snapshots of some
resource metadata alongside each metric datapoint. The core idea
is to move to storing datapoints shorn of metadata, and instead
allow the resource-state timeline to be reconstructed more cheaply
from much less frequently occurring events (e.g. instance resizes
or migrations).
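
A minimal Python sketch of that idea follows. The event and
datapoint layouts, the attribute names and the state_at helper are
all hypothetical, chosen only to show how resource state at any
datapoint's timestamp can be rebuilt from a handful of events
instead of being copied onto every sample.

```python
from bisect import bisect_right

# Infrequent state-change events for one resource: (timestamp, changes).
events = [
    (0, {"flavor": "m1.small", "host": "node-1"}),   # creation
    (100, {"flavor": "m1.large"}),                   # resize
    (250, {"host": "node-2"}),                       # migration
]

# Datapoints carry only a timestamp and a value, no metadata snapshot.
datapoints = [(50, 12.5), (150, 40.0), (300, 35.0)]


def state_at(events, timestamp):
    """Fold every event up to and including `timestamp` into one state."""
    times = [t for t, _ in events]
    state = {}
    for _, changes in events[:bisect_right(times, timestamp)]:
        state.update(changes)
    return state


for ts, value in datapoints:
    print(ts, value, state_at(events, ts))
```

Three events are enough to describe the resource across any number
of datapoints, which is where the storage saving comes from.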


What gnocchi isn't:
===================

Gnocchi is not a large-scale under-the-radar rewrite of a core
OpenStack component along the lines of keystone-lite.

The change is concentrated on the final data-storage phase of
the ceilometer pipeline, so will have little initial impact on the
data-acquiring agents, or on the transformation phase.

We've been totally open at the Atlanta summit and other forums
about this approach being a multi-cycle effort.


Why we decided to do it this way:
=================================

The intent behind spinning up a separate project on stackforge
was to allow the work to progress at arm's length from ceilometer,
allowing normalcy to be maintained on the core project and a
rapid rate of innovation on gnocchi.

Note that the developers primarily contributing to gnocchi
represent a cross-section of the core team, and there's a regular
feedback loop in the form of a recurring agenda item at the
weekly team meeting to avoid the effort becoming silo'd.


But isn't re-architecting frowned upon?
=======================================

Well, the architectures of other OpenStack projects have also
undergone change as the community understanding of the
implications of prior design decisions has evolved.

Take for example the move towards nova no-db-compute and the
unified-object-model in order to address issues in the nova
architecture that made progress towards rolling upgrades
unnecessarily difficult.

The point, in my understanding, is not to avoid doing the
course-correction where it's deemed necessary. Rather, the
principle is more that these corrections happen in an open
and planned way.


The path forward:
=================

A subset of the ceilometer community will continue to work on
gnocchi in parallel with the ceilometer core over the remainder
of the Juno cycle and into the Kilo timeframe. The goal is to
have an initial implementation of gnocchi ready for tech preview
by the end of Juno, and to have the integration/migration/
co-existence questions addressed in Kilo.

Moving the ceilometer core to using gnocchi will be contingent
on it demonstrating the required performance characteristics and
providing the semantics needed to support a v3 ceilometer API
that's fit-for-purpose.
