Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
On 8/11/2014 4:22 PM, Eoghan Glynn wrote:

Hi Eoghan, Thanks for the note below. However, one thing the overview below does not cover is why InfluxDB ( http://influxdb.com/ ) is not being leveraged. Many folks feel that this technology is a viable solution for the problem space discussed below.

Great question Brad! As it happens we've been working closely with Paul Dix (lead developer of InfluxDB) to ensure that this metrics store would be usable as a backend driver. That conversation actually kicked off at the Juno summit in Atlanta, but it really got off the ground at our mid-cycle meet-up in Paris in early July. ... The InfluxDB folks have committed to implementing those features over July and August, and have made concrete progress on that score. I hope that provides enough detail to answer your question?

I guess it begs the question: if InfluxDB will do what you want, and it's open source (MIT) as well as commercially supported, how does gnocchi differentiate?

Cheers, Eoghan

Thanks, Brad

Brad Topol, Ph.D. IBM Distinguished Engineer OpenStack (919) 543-0646 Internet: bto...@us.ibm.com Assistant: Kendra Witherspoon (919) 254-0680

From: Eoghan Glynn egl...@redhat.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Date: 08/06/2014 11:17 AM Subject: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project

Folks, It's come to our attention that some key individuals are not fully up-to-date on gnocchi activities. Since it's a good and healthy thing to ensure we're as communicative as possible about our roadmap, I've provided a high-level overview here of our thinking. This is intended as a precursor to further discussion with the TC. Cheers, Eoghan

What gnocchi is:
===
Gnocchi is a separate, but related, project spun up on stackforge by Julien Danjou, with the objective of providing efficient storage and retrieval of timeseries-oriented data and resource representations.
The goal is to experiment with a potential approach to addressing an architectural misstep made in the very earliest days of ceilometer, specifically the decision to store snapshots of some resource metadata alongside each metric datapoint. The core idea is to move to storing datapoints shorn of metadata, and instead allow the resource-state timeline to be reconstructed more cheaply from much less frequently occurring events (e.g. instance resizes or migrations).

What gnocchi isn't:
==
Gnocchi is not a large-scale under-the-radar rewrite of a core OpenStack component along the lines of keystone-lite. The change is concentrated on the final data-storage phase of the ceilometer pipeline, so will have little initial impact on the data-acquiring agents, or on the transformation phase. We've been totally open at the Atlanta summit and other forums about this approach being a multi-cycle effort.

Why we decided to do it this way:
The intent behind spinning up a separate project on stackforge was to allow the work to progress at arm's length from ceilometer, allowing normalcy to be maintained on the core project and a rapid rate of innovation on gnocchi. Note that the developers primarily contributing to gnocchi represent a cross-section of the core team, and there's a regular feedback loop in the form of a recurring agenda item at the weekly team meeting to avoid the effort becoming siloed.

But isn't re-architecting frowned upon?
==
Well, the architectures of other OpenStack projects have also undergone change as the community understanding of the implications of prior design decisions has evolved. Take for example the moves towards nova no-db-compute and the unified-object-model in order to address issues in the nova architecture that made progress towards rolling upgrades unnecessarily difficult. The point, in my understanding, is not to avoid doing the course-correction where it's deemed necessary.
Rather, the principle is more that these corrections happen in an open and planned way.

The path forward:
A subset of the ceilometer community will continue to work on gnocchi in parallel with the ceilometer core over the remainder of the Juno cycle and into the Kilo timeframe. The goal is to have an initial implementation of gnocchi ready for tech preview by the end of Juno, and to have the integration/migration/co-existence questions addressed in Kilo. Moving the ceilometer core to using gnocchi will be contingent on it demonstrating the required performance characteristics and providing the semantics needed to support a v3 ceilometer API that's fit-for-purpose.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org
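The split described above — bare datapoints plus a sparse event stream from which the resource-state timeline is reconstructed — can be sketched roughly as follows. This is a toy illustration only; every class and function name here is invented, not gnocchi's actual data model:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Datapoint:
    """A metric sample stored shorn of resource metadata."""
    metric: str       # e.g. "cpu_util"
    timestamp: float  # seconds since epoch
    value: float

@dataclass(frozen=True)
class ResourceEvent:
    """A much less frequent state-change event (resize, migration, ...)."""
    resource_id: str
    timestamp: float
    metadata: dict

def state_at(events, resource_id, when):
    """Reconstruct a resource's metadata at time `when` by replaying
    the latest event at or before that instant."""
    state = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.resource_id == resource_id and ev.timestamp <= when:
            state = ev.metadata
    return state

events = [
    ResourceEvent("inst-1", 100.0, {"flavor": "m1.small"}),
    ResourceEvent("inst-1", 500.0, {"flavor": "m1.large"}),  # a resize
]
```

With this shape, each sample stays small and uniform, and `state_at(events, "inst-1", 300.0)` recovers the pre-resize flavor without metadata having been copied onto every datapoint.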
Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
On 8/11/2014 5:29 PM, Eoghan Glynn wrote: [snip earlier InfluxDB discussion, quoted in full above]

I guess it begs the question: if InfluxDB will do what you want, and it's open source (MIT) as well as commercially supported, how does gnocchi differentiate?

Hi Sandy, One of the ideas behind gnocchi is to combine resource representation and timeseries-oriented storage of metric data, providing an efficient and convenient way to query for metric data associated with individual resources.

Doesn't InfluxDB do the same?

Also, having an API layered above the storage driver avoids locking in directly with a particular metrics-oriented DB, allowing for the potential to support multiple storage driver options (e.g. to choose between a canonical implementation based on Swift, an InfluxDB driver, and an OpenTSDB driver, say).

Right, I'm not suggesting to remove the storage abstraction layer. I'm just curious what gnocchi does better/differently than InfluxDB? Or, am I missing the objective here and gnocchi is the abstraction layer and not an InfluxDB alternative? If so, my apologies for the confusion.
A less compelling reason would be to provide a well-defined hook point to innovate with aggregation/analytic logic not supported natively in the underlying drivers (e.g. period-spanning statistics such as exponentially-weighted moving averages or even Holt-Winters).

Cheers, Eoghan

[snip re-quoted original announcement, reproduced in full above]
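For concreteness, the exponentially-weighted moving average mentioned above is only a few lines of generic textbook code (this sketch is unrelated to any actual gnocchi implementation):

```python
def ewma(values, alpha=0.5):
    """Exponentially-weighted moving average of a sample series.

    alpha in (0, 1]: larger values weight recent samples more heavily.
    Returns the running average after each sample.
    """
    avg = None
    out = []
    for v in values:
        # First sample seeds the average; afterwards blend new vs. old.
        avg = v if avg is None else alpha * v + (1 - alpha) * avg
        out.append(avg)
    return out
```

For example, `ewma([1, 2, 3])` yields `[1, 1.5, 2.25]` with the default alpha of 0.5. The point of the hook is that a statistic like this spans aggregation periods, so it can't simply be pushed down to drivers that only aggregate within a period.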
Re: [openstack-dev] [tc][ceilometer] Some background on the gnocchi project
On 8/11/2014 6:49 PM, Eoghan Glynn wrote: [snip earlier quoted exchange, reproduced in full above]

Doesn't InfluxDB do the same?

InfluxDB stores timeseries data primarily. Gnocchi is intended to store strongly-typed OpenStack resource representations (instances, images, etc.) in addition to providing a means to access timeseries data associated with those resources. So to answer your question: no, IIUC, it doesn't do the same thing.

Ok, I think I'm getting closer on this. Thanks for the clarification. Sadly, I have more questions :) Is this closer: a metadata repo for resources (instances, images, etc.) + an abstraction to some TSDB(s)? Hmm, thinking out loud ... if it's a metadata repo for resources, who is the authoritative source for what the resource is? Ceilometer/Gnocchi or the source service?
For example, if I want to query instance power state do I ask ceilometer or Nova? Or is it metadata about the time-series data collected for that resource? In which case, I think most TSDBs have some sort of series-description facility. I guess my question is: what makes this metadata unique, and how would it differ from the metadata ceilometer already collects? Will it be using Glance, now that Glance is becoming a pure metadata repo?

Though of course these things are not a million miles from each other; one is just a step up in the abstraction stack, having a wider and more OpenStack-specific scope.

Could it be a generic timeseries service? Is it openstack-specific because it uses stackforge/python/oslo? I assume the rules and schemas will be data-driven (vs. hard-coded)? ... and since the ceilometer collectors already do the bridge work, is it a pre-packaging of definitions that target openstack specifically? (not sure about wider and more specific) Sorry if this was already hashed out in Atlanta.

Also, having an API layered above the storage driver avoids locking in directly with a particular metrics-oriented DB, allowing for the potential to support multiple storage driver options (e.g. to choose between a canonical implementation based on Swift, an InfluxDB driver, and an OpenTSDB driver, say).

Right, I'm not suggesting to remove the storage abstraction layer. I'm just curious what gnocchi does better/differently than InfluxDB? Or, am I missing the objective here and gnocchi is the abstraction layer and not an InfluxDB alternative? If so, my apologies for the confusion.

No worries :) The intention is for gnocchi to provide an abstraction over timeseries, aggregation, downsampling and archiving/retention policies, with a number of drivers mapping onto real timeseries storage options. One of those drivers is based on Swift, another is in the works based on InfluxDB, and a third based on OpenTSDB has also been proposed.
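That driver-behind-an-API arrangement might look roughly like this. The interface and class names are invented for illustration; this is not gnocchi's real driver API:

```python
import abc

class TimeSeriesDriver(abc.ABC):
    """The API layer programs against this interface, so backends
    (Swift, InfluxDB, OpenTSDB, ...) remain interchangeable."""

    @abc.abstractmethod
    def write(self, metric, timestamp, value):
        """Store one sample for the named metric."""

    @abc.abstractmethod
    def fetch(self, metric, start, end):
        """Return (timestamp, value) pairs with start <= t < end."""

class InMemoryDriver(TimeSeriesDriver):
    """Toy reference backend, useful for tests."""

    def __init__(self):
        self._series = {}

    def write(self, metric, timestamp, value):
        self._series.setdefault(metric, []).append((timestamp, value))

    def fetch(self, metric, start, end):
        return [(t, v) for t, v in sorted(self._series.get(metric, ()))
                if start <= t < end]
```

Swapping the in-memory backend for a Swift- or InfluxDB-backed one would then be invisible to API consumers, which is the lock-in-avoidance argument in a nutshell.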
Cheers, Eoghan
Re: [openstack-dev] Announcing CloudKitty : an OpenSource Rating-as-a-Service project for OpenStack
Sounds very interesting. We're currently collecting detailed (and verified) usage information in StackTach and are keen to see what CloudKitty is able to offer. My one wish is that you keep the components as small pip redistributables with low coupling to promote reuse with other projects. Many tiny repos and clear APIs (internal and external) are good for adoption and contribution. All the best! -Sandy

From: Christophe Sauthier [christophe.sauth...@objectif-libre.com] Sent: Wednesday, August 13, 2014 10:40 AM To: openstack-dev@lists.openstack.org Subject: [openstack-dev] Announcing CloudKitty : an OpenSource Rating-as-a-Service project for OpenStack

We are very pleased at Objectif Libre to introduce CloudKitty, an effort to provide a fully OpenSource Rating-as-a-Service component in OpenStack. Following a first POC presented during the last summit in Atlanta to some Ceilometer devs (thanks again Julien Danjou for your great support!), we continued our effort to create a real service for rating. Today we are happy to share it with you all.

So what do we propose in CloudKitty?
- a service for collecting metrics (using the Ceilometer API)
- a modular rating architecture to enable/disable modules and create your own rules on-the-fly, allowing you to use the rating patterns you like
- an API to interact with the whole environment, from core components to every rating module
- a Horizon integration to allow configuration of the rating modules and display of pricing information in real time during instance creation
- a CLI client to access this information and easily configure everything

Technically we are using all the elements that are used in the various OpenStack projects like oslo, stevedore, pecan... CloudKitty is highly modular and allows integration / development of third-party collection and rating modules and output formats.
A roadmap is available on the project wiki page (the link is at the end of this email), but we are clearly hoping to have some feedback and ideas on how to improve the project and reach a tighter integration with OpenStack. The project source code is available at http://github.com/stackforge/cloudkitty More stuff will be available on stackforge as soon as the reviews get validated, like python-cloudkittyclient and cloudkitty-dashboard, so stay tuned. The project's wiki page (https://wiki.openstack.org/wiki/CloudKitty) provides more information, and you can reach us via IRC on freenode: #cloudkitty. Developer's documentation is on its way to readthedocs too. We plan to present CloudKitty in detail during the Paris Summit, but we would love to hear from you sooner...

Cheers, Christophe and Objectif Libre

Christophe Sauthier, CEO Fondateur
Objectif Libre - Infrastructure et Formations Linux
Mail : christophe.sauth...@objectif-libre.com
Mob : +33 (0) 6 16 98 63 96
URL : www.objectif-libre.com
Twitter : @objectiflibre
Re: [openstack-dev] [nova][core] Expectations of core reviewers
On 8/14/2014 11:28 AM, Russell Bryant wrote: On 08/14/2014 10:04 AM, CARVER, PAUL wrote: Daniel P. Berrange [mailto:berra...@redhat.com] wrote: Depending on the usage needs, I think Google hangouts is a quite useful technology. For many-to-many session its limit of 10 participants can be an issue, but for a few-to-many broadcast it could be practical. What I find particularly appealing is the way it can live stream the session over youtube which allows for unlimited number of viewers, as well as being available offline for later catchup. I can't actually offer ATT resources without getting some level of management approval first, but just for the sake of discussion here's some info about the telepresence system we use. -=-=-=-=-=-=-=-=-=- ATS B2B Telepresence conferences can be conducted with an external company's Telepresence room(s), which subscribe to the ATT Telepresence Solution, or a limited number of other Telepresence service provider's networks. Currently, the number of Telepresence rooms that can participate in a B2B conference is limited to a combined total of 20 rooms (19 of which can be ATT rooms, depending on the number of remote endpoints included). -=-=-=-=-=-=-=-=-=- We currently have B2B interconnect with over 100 companies and ATT has telepresence rooms in many of our locations around the US and around the world. If other large OpenStack companies also have telepresence rooms that we could interconnect with I think it might be possible to get management agreement to hold a couple OpenStack meetups per year. Most of our rooms are best suited for 6 people, but I know of at least one 18 person telepresence room near me. An ideal solution would allow attendees to join as individuals from anywhere. A lot of contributors work from home. Is that sort of thing compatible with your system? http://bluejeans.com/ was a good experience. What about Google Hangout OnAir for the PTL and core, while others are view-only with chat/irc questions? 
http://www.google.com/+/learnmore/hangouts/onair.html
Re: [openstack-dev] [nova][core] Expectations of core reviewers
Maybe we need to think about this from a distributed software perspective?

* Divide and Conquer? Can we split the topics to create more manageable sub-groups? This way it's not core-vs-non-core but interested-vs-moderately-interested. (Of course, this is much the way the mailing list works.) Perhaps OnAir would work well for that? How about geographic separation? Meetings per time-zone that roll up into larger meetings (see More Workers below). This is much the same way the regional openstack meetups work, but with specific topics. Of course, then we get replication latency :)

* More workers? Can we assign topic owners? Cores might delegate a topic to a non-core member to gather consensus, concerns, and suggestions, and summarize the result to present during weekly IRC meetings.

* Better threading? Are there other tools than mailing lists for talking about these topics? Would mind-mapping software [1] work better for keeping the threads manageable?

-Sandy

[1] http://en.wikipedia.org/wiki/Mind_map
Re: [openstack-dev] [all] The future of the integrated release
On 8/14/2014 6:42 PM, Doug Hellmann wrote: On Aug 14, 2014, at 4:41 PM, Joe Gordon joe.gord...@gmail.com wrote: On Wed, Aug 13, 2014 at 12:24 PM, Doug Hellmann d...@doughellmann.com wrote: On Aug 13, 2014, at 3:05 PM, Eoghan Glynn egl...@redhat.com wrote:

At the end of the day, that's probably going to mean saying No to more things. Every time I turn around everyone wants the TC to say No to things, just not to their particular thing. :) Which is human nature. But I think if we don't start saying No to more things we're going to end up with a pile of mud that no one is happy with.

That we're being so abstract about all of this is frustrating. I get that no-one wants to start a flamewar, but can someone be concrete about what they feel we should say 'no' to but are likely to say 'yes' to?

I'll bite, but please note this is a strawman. No:
* Accepting any more projects into incubation until we are comfortable with the state of things again
* Marconi
* Ceilometer

Well -1 to that, obviously, from me. Ceilometer is on track to fully execute on the gap analysis coverage plan agreed with the TC at the outset of this cycle, and has an active plan in progress to address architectural debt.

Yes, there seems to be an attitude among several people in the community that the Ceilometer team denies that there are issues and refuses to work on them. Neither of those things is the case from our perspective.

Totally agree.

Can you be more specific about the shortcomings you see in the project that aren't being addressed? Once again, this is just a straw man. You're not the first person to propose ceilometer as a project to kick out of the release, though, and so I would like to be talking about specific reasons rather than vague frustrations.

I'm just not sure OpenStack has 'blessed' the best solution out there.
https://wiki.openstack.org/wiki/Ceilometer/Graduation#Why_we_think_we.27re_ready

* Successfully passed the challenge of being adopted by 3 related projects which have agreed to join or use ceilometer:
  * Synaps
  * Healthnmon
  * StackTach

StackTach seems to still be under active development (http://git.openstack.org/cgit/stackforge/stacktach/log/), is used by Rackspace in production, and from everything I hear is more mature than ceilometer.

StackTach is older than ceilometer, but does not do all of the things ceilometer does now and aims to do in the future. It has been a while since I last looked at it, so the situation may have changed, but some of the reasons StackTach would not be a full replacement for ceilometer include: it only works with AMQP; it collects notification events, but doesn't offer any metering ability per se (no tracking of values like CPU or bandwidth utilization); it only collects notifications from some projects, and doesn't have a way to collect data from swift, which doesn't emit notifications; and it does not integrate with Heat to trigger autoscaling alarms.

Well, that's my cue. Yes, StackTach was started before the incubation process was established and it solves other problems, specifically around usage, billing and performance monitoring, things I wouldn't use Ceilometer for. But, if someone asked me what they should use for metering today, I'd point them towards Monasca in a heartbeat. Another non-blessed project. It is nice to see that Ceilometer is working to solve their problems, but there are other solutions operators should consider until that time comes. It would be nice to see the TC endorse those too. Solve the users' need first.

We did work with a few of the StackTach developers on bringing event collection into ceilometer, and that work is allowing us to modify the way we store the meter data that causes a lot of the performance issues we've seen.
That work is going on now and will be continued into Kilo, when we expect to be adding drivers for time-series databases more appropriate for that type of data. StackTach isn't actively contributing to Ceilometer any more. Square peg/round hole. We needed some room to experiment with alternative solutions and the rigidity of the process was a hindrance. Not a problem with the core team, just a problem with the dev process overall. I recently suggested that the Ceilometer API (and integration tests) be separated from the implementation (two repos) so others might plug in a different implementation while maintaining compatibility, but that wasn't well received. Personally, I'd like to see that model extended for all OpenStack projects. Keep compatible at the API level and welcome competing implementations. We'll be moving StackTach.v3 [1] to StackForge soon and following that model. The API and integration tests are one repo (with a bare-bones implementation to make the
Re: [openstack-dev] [all] The future of the integrated release
On 8/16/2014 10:09 AM, Chris Dent wrote: On Fri, 15 Aug 2014, Sandy Walsh wrote:

I recently suggested that the Ceilometer API (and integration tests) be separated from the implementation (two repos) so others might plug in a different implementation while maintaining compatibility, but that wasn't well received. Personally, I'd like to see that model extended for all OpenStack projects. Keep compatible at the API level and welcome competing implementations.

I think this is a _very_ interesting idea, especially the way it fits in with multiple themes that have bounced around the list lately, not just this thread:

* Improving project-side testing; that is, pre-gate integration testing.
* Providing a framework (at least conceptual) on which to inform the tempest-libification.
* Solidifying both intra- and inter-project API contracts (both HTTP and notifications).
* Providing a solid basis on which to enable healthy competition between implementations.
* Helping to ensure that the various projects work to the goals of their public-facing name rather than their internal name (e.g. Telemetry vs ceilometer).

+1 ... love that take on it.

Given the usual trouble with resource availability it seems best to find tactics that can be applied to multiple strategic goals.

Exactly! You get it.
Re: [openstack-dev] [all] The future of the integrated release
On 8/18/2014 9:27 AM, Thierry Carrez wrote: Clint Byrum wrote:

Here's why folk are questioning Ceilometer:

Nova is a set of tools to abstract virtualization implementations.
Neutron is a set of tools to abstract SDN/NFV implementations.
Cinder is a set of tools to abstract block-device implementations.
Trove is a set of tools to simplify consumption of existing databases.
Sahara is a set of tools to simplify Hadoop consumption.
Swift is a feature-complete implementation of object storage, none of which existed when it was started.
Keystone supports all of the above, unifying their auth.
Horizon supports all of the above, unifying their GUI.

Ceilometer is a complete implementation of data collection and alerting. There is no shortage of implementations that exist already.

I'm also core on two projects that are getting some push back these days: Heat is a complete implementation of orchestration. There are at least a few of these already in existence, though not as many as there are data collection and alerting systems. TripleO is an attempt to deploy OpenStack using tools that OpenStack provides. There are already quite a few other tools that _can_ deploy OpenStack, so it stands to reason that people will question why we don't just use those. It is my hope we'll push more into the "unifying the implementations" space and withdraw a bit from the "implementing stuff" space.

So, you see, people are happy to unify around a single abstraction, but not so much around a brand new implementation of things that already exist.

Right, most projects focus on providing abstraction above implementations, and that abstraction is where the real domain expertise of OpenStack should be (because no one else is going to do it for us). Every time we reinvent something, we are at larger risk because we are out of our common specialty, and we just may not be as good as the domain specialists.
That doesn't mean we should never reinvent something, but we need to be damn sure it's a good idea before we do. It's sometimes less fun to piggyback on existing implementations, but if they exist that's probably what we should do. While Ceilometer is far from alone in that space, what sets it apart is that even after it was blessed by the TC as the one we should all converge on, we keep on seeing competing implementations for some (if not all) of its scope. Convergence did not happen, and without convergence we struggle in adoption. We need to understand why, and if this is fixable.

So, here's what happened with StackTach ...

We had two teams working on StackTach: one group working on the original program (v2) and another working on Ceilometer integration of our new design. The problem was, there was no way we could compete with the speed of the v2 team. Every little thing we needed to do in OpenStack was a herculean effort. Submit a branch in one place, it needs to go somewhere else. Spend weeks trying to land a branch. Endlessly debate minutiae. It goes on. I know that's the nature of running a large project. And I know everyone is feeling it.

We quickly came to realize that, if the stars aligned and we did what we needed to do, we'd only be playing catch-up to the other StackTach team. And StackTach had growing pains. We needed this new architecture to solve real business problems *today*. This isn't "build it and they will come"; this is "we know it's valuable ... when can I have the new one?" Like everyone, we have incredible pressure to deliver and we can't accurately forecast with so many uncontrollable factors.

Much of what is now StackTach.v3 is (R)esearch, not (D)evelopment. With R, we need to be able to run a little fast-and-loose. Not every pull request is a masterpiece. Our plans are going to change. We need to have room to experiment. If it was all just D, yes, we could be more formal.
But we frequently go down a road to find a dead end and need to adjust. We started on StackTach.v3 outside of formal OpenStack. It's still open source. We still talk with interested parties (including ceilo) about the design and how we're going to fulfill their needs, but we're mostly head-down trying to get a production ready release in place. In the process, we're making all of StackTach.v3 as tiny repos that other groups (like Ceilo and Monasca) can adopt if they find them useful. Even our impending move to StackForge is going to be a big productivity hit, but it's necessary for some of our potential contributors. Will we later revisit integration with Ceilometer? Possibly, but it's not a priority. We have to serve the customers that are screaming for v3. Arguably this is more of a BDFL model, but in order to innovate quickly, get to large-scale production and remain competitive it may be necessary. This is why I'm pushing for an API-first model in OpenStack. Alternative implementations shouldn't have to live outside the tribe. (as always, my view only)
[openstack-dev] StackTach.v3 - Screencasts ...
Hey y'all, We've started a screencast series on the StackTach.v3 dev efforts [1]. It's still early days, so subscribe to the playlist for updates. The videos start with the StackTach/Ceilometer integration presentation at the Hong Kong summit, which is useful for background and motivation, but then get into our current dev strategy and state-of-the-union. If you're interested, we will be at the Ops Meetup in San Antonio next week and would love to chat about your monitoring, usage and billing requirements. All the best! -S

[1] https://www.youtube.com/playlist?list=PLmyM48VxCGaW5pPdyFNWCuwVT1bCBV5p3
Re: [openstack-dev] Treating notifications as a contract
Is there anything slated for the Paris summit around this? I just spent nearly a week parsing Nova notifications and the pain of no schema has overtaken me. We're chatting with IBM about CADF and getting down to specifics on its applicability to notifications. Once I get StackTach.v3 into production I'm keen to get started on revisiting the notification format and oslo.messaging support for notifications. Perhaps a hangout for those keenly interested in doing something about this? Thoughts? -S From: Eoghan Glynn [egl...@redhat.com] Sent: Monday, July 14, 2014 8:53 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [all] Treating notifications as a contract So what we need to figure out is how exactly this common structure can be accommodated without reverting back to what Sandy called the "wild west" in another post. I got the impression that "wild west" is what we've already got (within the payload)? Yeah, exactly, that was my interpretation too. So basically just to ensure that the lightweight-schema/common-structure notion doesn't land us back not too far beyond square one (if there are too many degrees-of-freedom in that declaration of a list of dicts with certain required fields that you had envisaged in an earlier post). For example you could write up a brief wiki walking through how an existing widely-consumed notification might look under your vision, say compute.instance.start.end. Then post a link back here as an RFC. Or, possibly better, maybe submit a strawman spec proposal to one of the relevant *-specs repos and invite folks to review in the usual way? Would oslo-specs (as in messaging) be the right place for that? That's a good question. Another approach would be to home in on the producer side that's currently the heaviest user of notifications, i.e. 
nova, and propose the strawman to nova-specs given that (a) that's where much of the change will be needed, and (b) many of the notification patterns originated in nova and were subsequently aped by other projects as they were spun up. My thinking is the right thing to do is bounce around some questions here (or perhaps in a new thread if this one has gone far enough off track to have dropped people) and catch up on some loose ends. Absolutely! For example: It appears that CADF was designed for this sort of thing and was considered at some point in the past. It would be useful to know more of that story if there are any pointers. My initial reaction is that CADF has the stank of enterprisey all over it rather than "less is more" and "worse is better", but that's a completely uninformed and thus unfair opinion. TBH I don't know enough about CADF, but I know a man who does ;) (gordc, I'm looking at you!) Another question (from elsewhere in the thread) is whether it is worth, in the Ironic notifications, trying to cook up something generic, or just carrying on with what's being used. Well, my gut instinct is that the content of the Ironic notifications is perhaps on the outlier end of the spectrum compared to the more traditional notifications we see emitted by nova, cinder etc. So it may make better sense to concentrate initially on how contractizing these more established notifications might play out. This feels like something that we should be thinking about with an eye to the K* cycle - would you agree? Yup. Thanks for helping to tease this all out and provide some direction on where to go next. Well thank *you* for picking up the baton on this and running with it :) Cheers, Eoghan
Re: [openstack-dev] Treating notifications as a contract
On 9/3/2014 11:32 AM, Chris Dent wrote: On Wed, 3 Sep 2014, Sandy Walsh wrote: We're chatting with IBM about CADF and getting down to specifics on its applicability to notifications. Once I get StackTach.v3 into production I'm keen to get started on revisiting the notification format and oslo.messaging support for notifications. Perhaps a hangout for those keenly interested in doing something about this? That seems like a good idea. I'd like to be a part of that. Unfortunately I won't be at summit but would like to contribute what I can before and after. I took some notes on this a few weeks ago and extracted what seemed to be the two main threads or ideas that were revealed by the conversation in this thread: * At the micro level, have versioned schemas for notifications such that one end can declare "I am sending version X of notification foo.bar.Y" and the other end can effectively deal. Yes, that's table-stakes I think. Putting structure around the payload section. Beyond type and version we should be able to attach meta information like public/private visibility and perhaps hints for external mapping (this trait maps to that trait in CADF, for example). * At the macro level, standardize a packaging or envelope for all notifications so that they can be consumed by very similar code. That is: constrain the notifications in some way so we can also constrain the consumer code. That's the intention of what we have now. The top-level traits are standard, the payload is open. We really only require: message_id, timestamp and event_type. For auditing we need to cover Who, What, When, Where, Why, OnWhat, OnWhere, FromWhere. These ideas serve two different purposes: One is to ensure that existing notification use cases are satisfied with robustness and provide a contract between two endpoints. The other is to allow a fecund notification environment that allows and enables many participants. Good goals. 
When Producer and Consumer know what to expect, things are good ... "I know to find the Instance ID here." When the consumer wants to deal with a notification as a generic object, things get tricky (where do I find the instance ID in the payload? What is the image type? Is this an error notification?). Basically, how do we define the principal artifacts for each service and grant the consumer easy/consistent access to them? (like the 7 W's above) I'd really like to find a way to solve that problem. Is that a good summary? What did I leave out or get wrong? Great start! Let's keep it simple and do-able. We should also review the oslo.messaging notification api ... I've got some concerns we've lost our way there. -S
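The minimal envelope described in the exchange above -- required top-level traits of message_id, timestamp and event_type, with an open payload -- might be sketched like this. Everything beyond the three required traits is an illustrative assumption, not an agreed schema:

```python
import uuid
from datetime import datetime, timezone

# The only traits the current convention actually requires.
REQUIRED_TRAITS = ("message_id", "timestamp", "event_type")

def make_notification(event_type, payload, **extra_traits):
    """Build a minimal envelope: standard top-level traits, open payload."""
    msg = {
        "message_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        "payload": payload,      # open: the schema is per event_type
    }
    msg.update(extra_traits)     # e.g. publisher_id, priority
    return msg

def validate_envelope(msg):
    """Consumer-side check that the standard traits are present."""
    missing = [t for t in REQUIRED_TRAITS if t not in msg]
    if missing:
        raise ValueError("missing required traits: %s" % missing)
    return True
```

The point of the envelope is that this validator is the *only* code a generic consumer needs before routing on event_type; everything payload-specific stays behind the per-event schema.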
Re: [openstack-dev] Treating notifications as a contract (CADF)
Yesterday, we had a great conversation with Matt Rutkowski from IBM, one of the authors of the CADF spec. I was having a disconnect on what CADF offers and got it clarified. My assumption was that CADF was a set of transformation/extraction rules for taking data from existing data structures and defining them as well-known things. For example, CADF needs to know who sent this notification. I thought CADF would give us a means to point at an existing data structure and say "that's where you find it". But I was wrong. CADF is a full-on schema/data structure of its own. It would be a fork-lift replacement for our existing notifications. However, if your service hasn't really adopted notifications yet (green field) or you can handle a fork-lift replacement, CADF is a good option. There are a few gotchas though. If you have required data that is outside of the CADF spec, it would need to go in the attachment section of the notification, and that still needs a separate schema to define it. Matt's team is very receptive to extending the spec to include these special cases, though. Anyway, I've written up all the options (as I see them) [1] with the advantages/disadvantages of each approach. It's just a strawman, so bend/spindle/mutilate. Look forward to feedback! -S [1] https://wiki.openstack.org/wiki/NotificationsAndCADF On 9/3/2014 12:30 PM, Sandy Walsh wrote: On 9/3/2014 11:32 AM, Chris Dent wrote: On Wed, 3 Sep 2014, Sandy Walsh wrote: We're chatting with IBM about CADF and getting down to specifics on its applicability to notifications. Once I get StackTach.v3 into production I'm keen to get started on revisiting the notification format and oslo.messaging support for notifications. Perhaps a hangout for those keenly interested in doing something about this? That seems like a good idea. I'd like to be a part of that. Unfortunately I won't be at summit but would like to contribute what I can before and after. 
[remainder of quoted text snipped]
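The CADF event structure discussed at the top of this thread is a full schema of its own, built around the who/what/when W's. A rough sketch of its shape, with field names based on a loose reading of the DMTF CADF spec (treat them as best-effort, not authoritative):

```python
# Components a CADF activity event carries, per my reading of the spec;
# service-specific data that falls outside the spec goes in attachments.
CADF_REQUIRED = ("typeURI", "id", "eventType", "action",
                 "outcome", "initiator", "target", "observer")

def make_cadf_event(event_id, action, outcome, initiator, target,
                    observer, event_time, attachments=None):
    event = {
        "typeURI": "http://schemas.dmtf.org/cloud/audit/1.0/event",
        "id": event_id,
        "eventType": "activity",
        "eventTime": event_time,   # When
        "action": action,          # What was done, e.g. "create"
        "outcome": outcome,        # success / failure / ...
        "initiator": initiator,    # Who did it
        "target": target,          # OnWhat it was done
        "observer": observer,      # the service reporting the event
    }
    if attachments:                # out-of-spec required data lives here
        event["attachments"] = attachments
    return event
```

This illustrates the "fork-lift" point in the message above: the event *is* this structure, rather than a mapping layer over an existing payload.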
Re: [openstack-dev] StackTach.v3 - Screencasts ...
For those of you playing the home game ... I've just added four new screencasts to the StackTach.v3 playlist. These are technical deep dives into the code added over the last week or so, with demos. For the more complex topics I spend a little time on the background and rationale. StackTach.v3: Stream debugging (24:22) StackTach.v3: Idempotent pipeline processing and debugging (12:16) StackTach.v3: Quincy Quince - the REST API (22:56) StackTach.v3: Klugman the versioned cmdline tool for Quincy (8:46) https://www.youtube.com/playlist?list=PLmyM48VxCGaW5pPdyFNWCuwVT1bCBV5p3 Please add any comments to the videos and I'll try to address them there. Next ... the move to StackForge! Have a great weekend! -S
Re: [openstack-dev] Treating notifications as a contract
Jay Pipes - Wednesday, September 10, 2014 3:56 PM On 09/03/2014 11:21 AM, Sandy Walsh wrote: On 9/3/2014 11:32 AM, Chris Dent wrote: I took some notes on this a few weeks ago and extracted what seemed to be the two main threads or ideas that were revealed by the conversation in this thread: * At the micro level, have versioned schemas for notifications such that one end can declare "I am sending version X of notification foo.bar.Y" and the other end can effectively deal. Yes, that's table-stakes I think. Putting structure around the payload section. Beyond type and version we should be able to attach meta information like public/private visibility and perhaps hints for external mapping (this trait maps to that trait in CADF, for example). CADF doesn't address the underlying problem that Chris mentions above: that our notification events themselves need to have a version associated with them. Instead of versioning the message payloads themselves, CADF focuses versioning on the CADF spec itself, which is less than useful, IMO, and a symptom of what I like to call XML-itis. Well, the spec is the payload, so you can't change the payload without changing the spec. Could be semantics, but I see your point. Where I *do* see some value in CADF is the primitive string codes it defines for resource classifications, actions, and outcomes (Sections A.2.5, A.3.5, and A.4.5 respectively in the CADF spec). I see no value in the XML-itis of the fully-qualified URI long-forms of those primitive string codes. +1 to the XML-itis, but do we really get any value from the resource classifications without them? Other than, yes, that's a good list to work from. For resource classifications, it defines things like compute, storage, service, etc., as well as a structured hierarchy for sub-classifications, like storage/volume or service/block. Actions are string codes for verbs like create, configure or authenticate. Outcomes are string codes for success, failure, etc. 
What I feel we need is a library that matches a (resource_type, action, version) tuple to a JSONSchema document that describes the payload for that combination of resource_type, action, and version. The 7 W's that CADF defines are quite useful and we should try to ensure our notification payloads address as many of them as possible: Who, What, When, Where, Why, On-What, To-Whom, To-Where ... not all are applicable for every notification type. Also, we need to define standard units of measure for numeric fields: MB vs. GB, bps vs. kbps, image type definitions ... ideally all of this should be part of the standard OpenStack nomenclature. These are the things that really belong in oslo and should be used by everything from notifications to the scheduler to flavor definitions, etc. If I were king for a day, I'd have a standardized notification message format that simply consisted of:

resource_class (string) -- from CADF, e.g. service/block
occurred_on (timestamp) -- when the event was published
action (string) -- from CADF, e.g. create
version (int or tuple) -- version of the (resource_class, action)
payload (json-encoded string) -- the message itself
outcome (string) -- still on the fence for this, versus just using payload

Yep, no problem with that, so long as the payload has all the other things we need (versioning, data types, visibility, etc.). There would be an oslo library that would store the codification of the resource classes and actions, along with the mapping of (resource_class, action, version) to the JSONSchema document describing the payload field. Producers of messages would consume the oslo lib like so:

```python
from oslo.notifications import resource_classes
from oslo.notifications import actions
from oslo.notifications import message
```

Not sure how this would look from a packaging perspective, but sure. I'm not sure I like having to define every resource/action type in code and then having an explosion of types in notification.actions ... perhaps that should just be part of the schema definition:

'action_type': string [acceptable values: create, delete, update]

I'd rather see these schemas defined in some machine-readable format (YAML or something) vs. code. Other languages are going to want to consume these notifications and should be able to reuse the definitions.

```python
from nova.compute import power_states
from nova.compute import task_states
...
msg = message.Message(resource_classes.compute.machine,
                      actions.update, version=1)
# msg is now an object that is guarded by the JSONSchema document
# that describes the version 1.0 schema of the UPDATE action
# for the resource class representing a VM (compute.machine).
# This means that if the producer attempts to set an
# attribute of the msg object that is *not* in that JSONSchema
# document, then an AttributeError would be raised.
```

This essentially
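The preference for machine-readable schema definitions keyed by (resource_class, action, version) could be sketched as follows. The registry contents and the hand-rolled checker are hypothetical stand-ins for a YAML file plus a real JSONSchema validator:

```python
# Hypothetical registry: (resource_class, action, version) -> payload schema.
# In practice this would be loaded from YAML so non-Python consumers can
# reuse the same definitions; the checker below just illustrates the idea.
SCHEMAS = {
    ("compute.machine", "update", 1): {
        "required": {"instance_id": str, "power_state": str},
        "optional": {"task_state": str},
    },
}

def check_payload(resource_class, action, version, payload):
    """Validate a payload against its registered schema."""
    schema = SCHEMAS[(resource_class, action, version)]
    for field, ftype in schema["required"].items():
        if field not in payload:
            raise ValueError("missing required field: %s" % field)
        if not isinstance(payload[field], ftype):
            raise TypeError("%s must be %s" % (field, ftype.__name__))
    # Reject attributes not in the schema, mirroring the AttributeError
    # behaviour described for the Message object above.
    allowed = set(schema["required"]) | set(schema["optional"])
    unknown = set(payload) - allowed
    if unknown:
        raise ValueError("fields not in schema: %s" % sorted(unknown))
    return True
```

Bumping the version key means registering a new schema rather than silently mutating an old one, which is the contract consumers need.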
Re: [openstack-dev] [Nova] - do we need .start and .end notifications in all cases ?
Hey Phil, (sorry for top-post, web client) There's no firm rule for requiring .start/.end, and I think your criteria defines it well: long-running transactions (or complex multi-step transactions). The main motivator behind the .start/.end code was .error notifications not getting generated in many cases. We had no idea where something was failing. Putting a .start before the db operation let us know "well, at least the service got the call". For some operations like resize, migrate, etc., the .start/.end is good for auditing and billing. Although, we could do a better job by simply managing the launched_at, deleted_at times better. Later, we found that by reviewing .start/.end deltas we were able to predict pending failures before timeouts actually occurred. But no, they're not mandatory, and a single notification should certainly be used for simple operations. Cheers! -S From: Day, Phil [philip@hp.com] Sent: Monday, September 22, 2014 8:03 AM To: OpenStack Development Mailing List (openstack-dev@lists.openstack.org) Subject: [openstack-dev] [Nova] - do we need .start and .end notifications in all cases ? Hi Folks, I’d like to get some opinions on the use of pairs of notification messages for simple events. I get that for complex operations on an instance (create, rebuild, etc.) a start and end message are useful to help instrument progress and how long the operations took. However we also use this pattern for things like aggregate creation, which is just a single DB operation – and it strikes me as kind of overkill and probably not all that useful to any external system compared to a single “.create” event after the DB operation. 
There is a change up for review to add notifications for service groups which is following this pattern (https://review.openstack.org/#/c/107954/) – the author isn’t doing anything wrong in that they're just following that pattern, but it made me wonder if we shouldn’t have some better guidance on when to use a single notification rather than a .start/.end pair. Does anyone else have thoughts on this, or know of external systems that would break if we restricted .start and .end usage to long-lived instance operations? Phil
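The .start/.end pattern described above -- emit .start before the risky work so a missing .end (or an explicit .error) is detectable -- can be sketched as a context manager. The `emit` callable here is a stand-in for whatever notifier a service actually uses, not a real oslo API:

```python
from contextlib import contextmanager

@contextmanager
def notify_span(emit, event_prefix, payload):
    """Wrap a long-running operation in .start/.end notifications.

    If the process dies between .start and .end, the orphaned .start
    tells you "at least the service got the call"; an exception turns
    into an explicit .error notification.
    """
    emit(event_prefix + ".start", payload)
    try:
        yield
    except Exception:
        emit(event_prefix + ".error", payload)
        raise
    else:
        emit(event_prefix + ".end", payload)
```

A simple single-DB-operation event, per the discussion above, would skip the wrapper and emit one ".create" notification after the fact.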
Re: [openstack-dev] [Nova] - do we need .start and .end notifications in all cases ?
+1, the high-level code should deal with top-level exceptions and generate .error notifications (though it's a little spotty). Ideally we shouldn't need three events for simple operations. The use of .start/.end vs. logging is a bit of a blurry line. At its heart a notification should provide context around an operation: What happened? Who did it? Who did they do it to? Where did it happen? Where is it going to? etc. Stuff that could be used for auditing/billing. That's their main purpose. But for mission-critical operations (create instance, etc.) notifications give us a hot-line to god. "Something is wrong!" vs. having to pore over log files looking for problems. Real-time. Low latency. I think it's a case-by-case judgement call which should be used. From: Day, Phil [philip@hp.com] I'm just a tad worried that this sounds like it's starting to use notifications as a replacement for logging. If we did this for every CRUD operation on an object, don't we risk flooding the notification system?
Re: [openstack-dev] [Nova] - do we need .start and .end notifications in all cases ?
From: Jay Pipes [jaypi...@gmail.com] Sent: Monday, September 22, 2014 11:51 AM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Nova] - do we need .start and .end notifications in all cases ? On 09/22/2014 07:37 AM, Sandy Walsh wrote: For some operations like resize, migrate, etc., the .start/.end is good for auditing and billing. Although, we could do a better job by simply managing the launched_at, deleted_at times better. I'm sure I'll get no real disagreement from you or Andrew Laski on this... but the above is one of the reasons we really should be moving with pace towards a fully task-driven system, both internally in Nova and externally via the Compute REST API. This would allow us to get rid of the launched_at, deleted_at, created_at, updated_at, etc. fields in many of the database tables and instead have a data store for tasks (RDBMS or otherwise) that had start and end times in the task record, along with codified task types. You can see what I had in mind for the public-facing side of this here: http://docs.oscomputevnext.apiary.io/#schema See the schema for server task and server task item. Totally agree. Though I would go one step further and say the task state transitions should be managed by notifications. Then oslo.messaging is reduced to the simple notifications interface (no RPC). Notifications follow proper retry semantics and control tasks. Tasks themselves can restart/retry/etc. (I'm sure I'm singing to the choir) -S
Re: [openstack-dev] [Ceilometer] Nomination of Sandy Walsh to core team
Thanks John, happy to help out if possible, and I agree that events could use some extra attention. -S From: Herndon, John Luke [john.hern...@hp.com] Sent: Monday, December 09, 2013 5:30 PM To: OpenStack Development Mailing List (not for usage questions) Subject: [openstack-dev] [Ceilometer] Nomination of Sandy Walsh to core team Hi There! I'm not 100% sure what the process is around electing an individual to the core team (i.e., can a non-core person nominate someone?). However, I believe the ceilometer core team could use a member who is more active in the development of the event pipeline. A core developer in this area will not only speed up review times for event patches, but will also help keep new contributions focused on the overall eventing vision. To that end, I would like to nominate Sandy Walsh from Rackspace to ceilometer-core. Sandy is one of the original authors of StackTach, and spearheaded the original stacktach-ceilometer integration. He has been instrumental in many of my code reviews, and has contributed much of the existing event storage and querying code. Thanks, John Herndon Software Engineer HP Cloud
Re: [openstack-dev] Healthnmon
On 12/18/2013 06:28 AM, Oleg Gelbukh wrote: I would echo that question. It looks like the integration plan didn't work out, and healthnmon development has either stalled or gone dark. Anyone have information on that? I think that's the case. There was no mention of Healthnmon at the last summit. -- Best regards, Oleg Gelbukh Mirantis Inc. On Tue, Dec 17, 2013 at 11:29 PM, David S Taylor da...@bluesunrise.com wrote: Could anyone tell me about the status of the Healthnmon project [1]? There is a proposal [2] to integrate Ceilometer and Healthnmon, which is about 1 year old. I am interested in developing a monitoring solution, and discovered that there may already be a project and community in place around OpenStack monitoring, or not. [1] https://github.com/stackforge/healthnmon/tree/master/healthnmon [2] https://wiki.openstack.org/wiki/Ceilometer/CeilometerAndHealthnmon Thanks, -- David S Taylor CTO, Bluesunrise 707 529-9194 da...@bluesunrise.com
Re: [openstack-dev] [nova] How do we format/version/deprecate things from notifications?
On 12/18/2013 01:44 PM, Nikola Đipanov wrote: On 12/18/2013 06:17 PM, Matt Riedemann wrote: On 12/18/2013 9:42 AM, Matt Riedemann wrote: The question came up in this patch [1]: how do we deprecate and remove keys in the notification payload? In this case I need to deprecate and replace the 'instance_type' key with 'flavor' per the associated blueprint. [1] https://review.openstack.org/#/c/62430/ By the way, my thinking is it's handled like a deprecated config option: you deprecate it for a release, make sure it's documented in the release notes, and then drop it in the next release. Anyone that hasn't switched over is broken until they start consuming the new key. FWIW - I am OK with this approach - but we should at least document it. I am also thinking that we may want to make it explicit like oslo.config does it. Likewise ... until we get defined schemas and versioning on notifications, it seems reasonable. A post to the ML is nice too :) Thanks, N.
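The deprecate-like-a-config-option approach discussed above amounts to emitting both keys for one release, then dropping the old one. A hypothetical sketch of the producer side, using the instance_type/flavor example from the patch:

```python
def build_payload(flavor, include_deprecated=True):
    """Build the notification payload during the deprecation cycle.

    For one release, emit both the new 'flavor' key and the old
    'instance_type' alias so consumers have time to switch; the
    following release flips include_deprecated off and drops the alias.
    """
    payload = {"flavor": flavor}
    if include_deprecated:
        payload["instance_type"] = flavor   # deprecated, remove next cycle
    return payload
```

Until notifications grow real schema versioning, a flag like this plus a release note (and an ML post) is about the best contract a consumer gets.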
Re: [openstack-dev] [nova] How do we format/version/deprecate things from notifications?
On 12/18/2013 03:00 PM, Russell Bryant wrote: We really need proper versioning for notifications. We've had a blueprint open for about a year, but AFAICT, nobody is actively working on it. https://blueprints.launchpad.net/nova/+spec/versioned-notifications IBM is behind this effort now and is keen to get CADF support around notifications. It seems to handle all of our use cases.
Re: [openstack-dev] [Heat] [Nova] [oslo] [Ceilometer] about notifications : huge and may be non secure
On 01/29/2014 11:50 AM, Swann Croiset wrote: Hi stackers, I would like to share my concerns here about notifications. I'm working [1] on Heat notifications and I noticed that: 1/ Heat uses its context to store 'password' 2/ Heat and Nova store 'auth_token' in the context too. I didn't check other projects, except for Neutron, which doesn't store auth_token. This information is consequently sent through their notifications. I guess we consider the broker to be secure, and network communications with the services too, BUT shouldn't we delete this data anyway, since IIRC it is never used (at least by Ceilometer)? That would do away with the security question entirely. My other concern is the size (KB) of notifications: 70% of it is the auth_token (with PKI)! We can reduce the volume drastically and easily by deleting this data from notifications. I know that RabbitMQ (or others) is very robust and can handle this volume, but when I see this kind of improvement, I'm tempted to do it. I see an easy way to fix that in oslo-incubator [2]: delete these keys from the context if present, config-driven, with password and auth_token by default. Thoughts? Yeah, there was a bunch of work in nova to eliminate these sorts of fields from the notification payload. They should certainly be eliminated from other services as well. Ideally, as you mention, at the oslo layer. We assume the notifications can be large, but they shouldn't be that large. The CADF work that IBM is doing to provide versioning and schemas to notifications will go a long way here. They have provisions for marking fields as private. I think this is the right way to go, but we may have to do some hack fixes in the short term. 
-S [1] https://blueprints.launchpad.net/ceilometer/+spec/handle-heat-notifications [2] https://github.com/openstack/oslo-incubator/blob/master/openstack/common/notifier/rpc_notifier.py and others
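The short-term fix proposed in this thread -- dropping password and auth_token from the context before publishing -- might look like this. The key list is an assumption mirroring the proposal above (config-driven, those two keys by default), not actual oslo-incubator code:

```python
# Default sensitive keys; in the real fix this list would come from config.
SENSITIVE_KEYS = ("password", "auth_token")

def scrub_context(context, keys=SENSITIVE_KEYS):
    """Return a copy of the notification context with sensitive (and,
    for PKI tokens, very large) fields removed before publishing.

    Copying rather than mutating means the service's own in-memory
    context is untouched; only the wire payload shrinks.
    """
    return {k: v for k, v in context.items() if k not in keys}
```

Dropping a multi-KB PKI auth_token this way also addresses the message-size concern, independent of the security question.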
Re: [openstack-dev] [oslo-notify] notifications consumed by multiple subscribers
The notification system can specify multiple queues to publish to, so each of your dependent services can feed from a separate queue. However, there is a critical bug in oslo.messaging that has broken this feature: https://bugs.launchpad.net/nova/+bug/1277204 Hopefully it'll get fixed quickly and you'll be able to do what you need. There were plans for a notification consumer in oslo.messaging, but I don't know where that stands. I'm working on a standalone notification consumer library for rabbit. -S From: Sanchez, Cristian A [cristian.a.sanc...@intel.com] Sent: Tuesday, February 11, 2014 2:28 PM To: OpenStack Development Mailing List (not for usage questions) Subject: [openstack-dev] [oslo-notify] notifications consumed by multiple subscribers Hi, I’m planning to use oslo.notify mechanisms to implement a climate blueprint: https://blueprints.launchpad.net/climate/+spec/notifications. Ideally, the notifications sent by climate should be received by multiple services subscribed to the same topic. Is that possible with oslo.notify? And moreover, is there any mechanism for removing items from the queue? Or should one subscriber be responsible for removing items from it? Thanks Cristian
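The behaviour described above -- one notification fanned out to several queues so each dependent service consumes (and removes) messages independently -- can be mimicked in-process like this. This is an illustration of the semantics, not the oslo.messaging or RabbitMQ implementation:

```python
from collections import deque

class FanoutNotifier:
    """Publish each notification onto every subscriber's own queue.

    Because each subscriber owns a separate queue, one consumer popping
    a message doesn't steal it from the others -- which answers the
    'who removes items from the queue?' question: each subscriber
    removes items only from its own queue.
    """

    def __init__(self):
        self.queues = {}

    def subscribe(self, name):
        q = deque()
        self.queues[name] = q
        return q

    def publish(self, message):
        for q in self.queues.values():
            q.append(message)
```

With a real broker, the same effect comes from binding one queue per consumer group to the notification topic exchange.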
[openstack-dev] Gamification and on-boarding ...
At the Nova mid-cycle meetup we've been talking about the problem of helping new contributors. It got into a discussion of karma, code reviews, bug fixes and establishing a name for yourself before screaming "can someone look at my branch?" in a chat room. We want this experience to be positive, but not everyone has time to hand-hold new people through the dance. The informal OpenStack motto is "automate everything", so perhaps we should consider some form of gamification [1] to help us. Can we offer badges, quests and challenges to new users to lead them on the way to being strong contributors? "Fixed your first bug" badge. "Updated the docs" badge. "Got your blueprint approved" badge. "Triaged a bug" badge. "Reviewed a branch" badge. "Contributed to 3 OpenStack projects" badge. "Fixed a Cells bug" badge. "Constructive in IRC" badge. "Freed the gate" badge. "Reverted a branch from a core" badge. Etc. These can be strung together as quests to lead people along the path. It's more than karma and less sterile than Stackalytics. The Foundation could even promote the rising stars and highlight the leader board. There are gamification-as-a-service offerings out there [2] as well as Fedora Badges [3] (Python and open source) that we may want to consider. Thoughts? -Sandy [1] http://en.wikipedia.org/wiki/Gamification [2] http://gamify.com/ (and many others) [3] https://badges.fedoraproject.org/
Re: [openstack-dev] [Nova] What is the currently accepted way to do plugins
This brings up something that's been gnawing at me for a while now ... why use entry-point based loaders at all? I don't see the problem they're trying to solve. (I thought I got it for a while, but I was clearly fooling myself)

1. If you use the "load all drivers in this category" feature, that's a security risk since any compromised python library could hold a trojan.
2. Otherwise you have to explicitly name the plugins you want (or don't want) anyway, so why have the extra indirection of the entry-point? Why not just name the desired modules directly?
3. The real value of a loader would be to also extend/manage the python path ... that's where the deployment pain is. Give me a fully-qualified driver name and take care of the pathing for me. Abstracting the module/class/function name isn't a great win.

I don't see where the value is for the added pain (entry-point management/package metadata) it brings. CMV, -S From: Russell Bryant [rbry...@redhat.com] Sent: Tuesday, March 04, 2014 1:29 PM To: Murray, Paul (HP Cloud Services); OpenStack Development Mailing List Subject: Re: [openstack-dev] [Nova] What is the currently accepted way to do plugins On 03/04/2014 06:27 AM, Murray, Paul (HP Cloud Services) wrote: One of my patches has a query asking if I am using the agreed way to load plugins: https://review.openstack.org/#/c/71557/ I followed the same approach as filters/weights/metrics using nova.loadables. Was there an agreement to do it a different way? And if so, what is the agreed way of doing it? A pointer to an example or even documentation/wiki page would be appreciated. The short version is entry-point based plugins using stevedore. We should be careful though. We need to limit what we expose as external plug points, even if we consider them unstable. If we don't want it to be public, it may not make sense for it to be a plugin interface at all.
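To make point 2 concrete, a direct-naming loader needs nothing more than the stdlib — no package metadata, no entry points. This is a hypothetical helper for illustration, not Nova code:

```python
import importlib

def load_driver(path):
    """Load a driver from a fully-qualified "module:attribute" path.

    This is the direct-naming alternative to entry-point indirection:
    the config value names the code explicitly, so nothing gets
    auto-loaded behind the operator's back.
    """
    module_name, _, attr = path.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr)

# Point straight at the code you want; stdlib "json:dumps" is used
# here purely as a stand-in for a real driver path.
dumps = load_driver("json:dumps")
print(dumps({"driver": "loaded"}))
```

The trade-off, as noted above, is that this does nothing to manage the python path for you; it only removes the entry-point layer.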
-- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] What is the currently accepted way to do plugins
And sorry, as to your original problem, the loadables approach is kinda messy since only the classes that are loaded when *that* module is loaded are used (vs. explicitly specifying them in a config). You may get different results when the flow changes. Either entry-points or config would give reliable results. On 03/04/2014 03:21 PM, Murray, Paul (HP Cloud Services) wrote: In a chat with Dan Smith on IRC, he was suggesting that the important thing was not to use class paths in the config file. I can see that internal implementation should not be exposed in the config files - that way the implementation can change without impacting the nova users/operators. There are plenty of easy ways to deal with that problem vs. entry points. MyModule.get_my_plugin() ... which can point to anywhere in the module permanently. Also, we don't have any of the headaches of merging setup.cfg sections (as we see with oslo.* integration). Sandy, I'm not sure I really get the security argument. Python provides every means possible to inject code, not sure plugins are so different. Certainly agree on choosing which plugins you want to use though. The concern is that any compromised part of the python eco-system can get auto-loaded with the entry-point mechanism. Let's say Nova auto-loads all modules with entry-points in the [foo] section. All I have to do is create a setup that has a [foo] section and my code is loaded. Explicit is better than implicit. So, assuming we don't auto-load modules ... what does the entry-point approach buy us? From: Russell Bryant [rbry...@redhat.com] We should be careful though. We need to limit what we expose as external plug points, even if we consider them unstable. If we don't want it to be public, it may not make sense for it to be a plugin interface at all. I'm not sure what the concern with introducing new extension points is? OpenStack is basically just a big bag of plugins. If it's optional, it's supposed to be a plugin (according to the design tenets).
-- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] What is the currently accepted way to do plugins
On 03/04/2014 05:00 PM, Kevin L. Mitchell wrote: On Tue, 2014-03-04 at 12:11 -0800, Dan Smith wrote: Now, the actual concern is not related to any of that, but about whether we're going to open this up as a new thing we support. In general, my reaction to adding new APIs people expect to be stable is no. However, I understand why things like the resource reporting and even my events mechanism are very useful for deployers to do some plumbing and monitoring of their environment -- things that don't belong upstream anyway. So I'm conflicted. I think that for these two cases, as long as we can say that it's not a stable interface, I think it's probably okay. However, things like we've had in the past, where we provide a clear plug point for something like Compute manager API class are clearly off the table, IMHO. How about using 'unstable' as a component of the entrypoint group? E.g., nova.unstable.events… Wouldn't that defeat the point of entry points ... immutable endpoints? What happens when an unstable event is deemed stable? ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Mistral] Crack at a Real life workflow
DSL's are tricky beasts. On one hand I like giving a tool to non-developers so they can do their jobs, but I always cringe when the DSL reinvents the wheel for basic stuff (compound assignment expressions, conditionals, etc). YAML isn't really a DSL per se, in the sense that it has no language constructs. As compared to a Ruby-based DSL (for example) where you still have Ruby under the hood for the basic stuff and extensions to the language for the domain-specific stuff. Honestly, I'd like to see a killer object model for defining these workflows as a first step. What would a python-based equivalent of that real-world workflow look like? Then we can ask ourselves, does the DSL make this better or worse? Would we need to expose things like email handlers, or leave that to the general python libraries? $0.02 -S On 03/05/2014 10:50 PM, Dmitri Zimine wrote: Folks, I took a crack at using our DSL to build a real-world workflow. Just to see how it feels to write it. And how it compares with alternative tools. This one automates a page from OpenStack operation guide: http://docs.openstack.org/trunk/openstack-ops/content/maintenance.html#planned_maintenance_compute_node Here it is https://gist.github.com/dzimine/9380941 or here http://paste.openstack.org/show/72741/ I have a bunch of comments, implicit assumptions, and questions which came to mind while writing it. Want your and other people's opinions on it. But gist and paste don't let you annotate lines!!! :( Maybe we can put it on the review board, even with no intention to check in, to use for discussion? Any interest? DZ ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
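For the sake of discussion, a python-based equivalent might start from something like this — a minimal sketch with invented names, not a proposal for Mistral's actual API. The point is that conditionals, assignment and composition come from the host language for free:

```python
class Task:
    """One step in a workflow: a name, a plain Python callable, and an
    optional next task to run on success."""

    def __init__(self, name, action, on_success=None):
        self.name = name
        self.action = action          # plain Python callable
        self.on_success = on_success  # next Task, or None

class Workflow:
    """Walks the task chain, threading a context dict through it."""

    def __init__(self, start):
        self.start = start

    def run(self, context):
        task, trace = self.start, []
        while task is not None:
            context = task.action(context)
            trace.append(task.name)
            task = task.on_success
        return context, trace

# Two-step toy version of a maintenance flow: evacuate, then confirm.
confirm = Task("confirm", lambda ctx: dict(ctx, confirmed=True))
evacuate = Task("evacuate", lambda ctx: dict(ctx, evacuated=True),
                on_success=confirm)
ctx, trace = Workflow(evacuate).run({"host": "compute-1"})
print(trace)  # ['evacuate', 'confirm']
```

Comparing a model like this against the YAML version is exactly the "does the DSL make this better or worse?" question posed above.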
Re: [openstack-dev] [Mistral] Crack at a Real life workflow
On 03/06/2014 02:16 PM, Renat Akhmerov wrote: IMO, it looks not bad (sorry, I’m biased too) even now. Keep in mind this is not the final version, we keep making it more expressive and concise. As for killer object model it’s not 100% clear what you mean. As always, the devil is in the details. This is a web service with all the consequences. I assume what you call “object model” here is nothing else but a python binding for the web service which we’re also working on. Custom python logic you mentioned will also be possible to easily integrate. Like I said, it’s still a pilot stage of the project. Yeah, the REST aspect is where the tricky part comes in :) Basically, in order to make a grammar expressive enough to work across a web interface, we essentially end up writing a crappy language. Instead, we should focus on the callback hooks to something higher level to deal with these issues. Mistral should just say "I'm done with this task, what should I do next?" and the callback service can make decisions on where in the graph to go next. Likewise with things like sending emails from the backend. Mistral should just call a webhook and let the receiver deal with active states as they choose. Which is why modelling this stuff in code is almost always better and why I'd lean towards the TaskFlow approach to the problem. They're tackling this from a library perspective first and then (possibly) turning it into a service. Just seems like a better fit. It's also the approach taken by Amazon Simple Workflow and many BPEL engines. -S Renat Akhmerov @ Mirantis Inc. On 06 Mar 2014, at 22:26, Joshua Harlow harlo...@yahoo-inc.com wrote: That sounds a little similar to what taskflow is trying to do (I am of course biased). I agree with letting the native language implement the basics (expressions, assignment...) and then building the domain on top of that. Just seems more natural IMHO, and is similar to what linq (in c#) has done. My 3 cents. Sent from my really tiny device...
On Mar 6, 2014, at 5:33 AM, Sandy Walsh sandy.wa...@rackspace.com wrote: DSL's are tricky beasts. On one hand I like giving a tool to non-developers so they can do their jobs, but I always cringe when the DSL reinvents the wheel for basic stuff (compound assignment expressions, conditionals, etc). YAML isn't really a DSL per se, in the sense that it has no language constructs. As compared to a Ruby-based DSL (for example) where you still have Ruby under the hood for the basic stuff and extensions to the language for the domain-specific stuff. Honestly, I'd like to see a killer object model for defining these workflows as a first step. What would a python-based equivalent of that real-world workflow look like? Then we can ask ourselves, does the DSL make this better or worse? Would we need to expose things like email handlers, or leave that to the general python libraries? $0.02 -S On 03/05/2014 10:50 PM, Dmitri Zimine wrote: Folks, I took a crack at using our DSL to build a real-world workflow. Just to see how it feels to write it. And how it compares with alternative tools. This one automates a page from OpenStack operation guide: http://docs.openstack.org/trunk/openstack-ops/content/maintenance.html#planned_maintenance_compute_node Here it is https://gist.github.com/dzimine/9380941 or here http://paste.openstack.org/show/72741/ I have a bunch of comments, implicit assumptions, and questions which came to mind while writing it. Want your and other people's opinions on it. But gist and paste don't let annotate lines!!! :( May be we can put it on the review board, even with no intention to check in, to use for discussion? Any interest? 
DZ ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] RFC - using Gerrit for Nova Blueprint review approval
Yep, great idea. Do it. On 03/07/2014 02:53 AM, Chris Behrens wrote: On Mar 6, 2014, at 11:09 AM, Russell Bryant rbry...@redhat.com wrote: […] I think a dedicated git repo for this makes sense. openstack/nova-blueprints or something, or openstack/nova-proposals if we want to be a bit less tied to launchpad terminology. +1 to this whole idea.. and we definitely should have a dedicated repo for this. I’m indifferent to its name. :) Either one of those works for me. - Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo.messaging] mongodb notification driver
You may want to consider StackTach for troubleshooting (that's what it was initially created for) https://github.com/rackerlabs/stacktach It will consume and record the events as well as give you a gui and cmdline tools for tracing calls by server, request_id, event type, etc. Ping me if you have any issues getting it going. Cheers -S From: Hiroyuki Eguchi [h-egu...@az.jp.nec.com] Sent: Tuesday, March 11, 2014 11:09 PM To: openstack-dev@lists.openstack.org Subject: [openstack-dev] [oslo.messaging] mongodb notification driver I'm envisioning a mongodb notification driver. Currently, for troubleshooting, I'm using the log notification driver: notifications are sent to an rsyslog server and stored in a database using the rsyslog-mysql package. I would like to make it more simple, so I came up with this feature. Ceilometer can manage notifications using mongodb, but Ceilometer should have the role of Metering, not Troubleshooting. If you have any comments or suggestions, please let me know. And please let me know if there's any discussion about this. Thanks. --hiroyuki ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Marconi][TC] Withdraw graduation request
Big +1 From: Jay Pipes [jaypi...@gmail.com] Sent: Thursday, March 20, 2014 8:18 PM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Marconi][TC] Withdraw graduation request This is a very mature stance and well-written email. Thanks, Flavio and all of the Marconi team for having thick skin and responding to the various issues professionally. Cheers, -jay On Thu, 2014-03-20 at 23:59 +0100, Flavio Percoco wrote: Greetings, I'm sending this email on behalf of Marconi's team. As you already know, we submitted our graduation request a couple of weeks ago and the meeting was held on Tuesday, March 18th. During the meeting very important questions and issues were raised that made us think, analyse our current situation, and re-think what would be best for OpenStack and Marconi at this very moment. After some consideration, we've reached the conclusion that this is probably not the right time for this project to graduate and that it'll be fruitful for the project and the OpenStack community if we take another development cycle before coming out of incubation. Here are some things we took into consideration: 1. It's still not clear to the overall community what the goals of the project are. It is not fair for Marconi as a project nor for OpenStack as a community to move forward with this integration when there are still open questions about the project goals. 2. Some critical issues came out of our attempt to have a gate job. For the team, the project and the community this is a very critical point. We've managed to have the gate working but we're still not happy with the results. 3. The drivers currently supported by the project don't cover some important cases related to deploying it. One of them solves a licensing issue but introduces a scale issue whereas the other one solves the scale issue and introduces a licensing issue. Moreover, these drivers have created quite a bit of confusion with regard to what the project's goals are, too. 4.
We've seen the value - and believe in it - of OpenStack's incubation period. During this period, the project has gained maturity in its API, supported drivers and integration with the overall community. 5. Several important questions were brought up in the recent ML discussions. These questions take time and effort, but also represent a key point in the support, development and integration of the project with the rest of OpenStack. We'd like to dedicate to these questions the time they deserve. 6. There are still some open questions in the OpenStack community related to the graduation requirements and the required supported technologies of integrated projects. Based on the aforementioned points, the team would like to withdraw the graduation request and remain an incubated project for one more development cycle. During the upcoming months, the team will focus on solving the issues that arose as part of last Tuesday's meeting. If possible, we would like to request a meeting where we can discuss with the TC - and whoever wants to participate - a set of *most pressing issues* that should be solved before requesting another graduation meeting. The team will be focused on solving those issues and other issues down the road. Although the team believes in the project's technical maturity, we think this is what is best for OpenStack and the project itself community-wise. The open questions are way too important for the team and the community and they shouldn't be ignored or rushed. I'd also like to thank the team and the overall community: the team for its hard work during the last cycle, and the community for being there and providing such important feedback in this process. We look forward to seeing Marconi graduate from incubation. Bests, Marconi's team.
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] ordering of notification 'events' in oslo.messaging
On 03/31/2014 10:55 AM, Gordon Sim wrote: I believe that ordering of notifications at different levels is not guaranteed when receiving those notifications using a notification listener in oslo.messaging. I.e. with something like: notifier = notifier.Notifier(get_transport(CONF), 'compute') notifier.info(ctxt, event_type, payload_1) notifier.warn(ctxt, event_type, payload_2) it's possible that payload_1 is received after payload_2. The root cause is that a different queue is used for events of each level. In practice this is easier to observe with rabbit than qpid, as the qpid driver sends every message synchronously which reduces the likelihood of there being more than one message on the listener's queues from the same notifier. Even for rabbit it takes a couple of thousand events before it usually occurs. Load on either the receiving client or the broker could increase the likelihood of out of order deliveries. Not sure if this is intended, but as it isn't immediately obvious, I thought it would be worth a note to the list. If they're on different queues, the order they appear depends on the consumer(s). It's not really an oslo.messaging issue. You can reproduce it with just two events: warn.publish(Foo) info.publish(Blah) consume from info consume from warn info is out of order. And, it's going to happen anyway if we get into a timeout and requeue() scenario. I think we have to assume that ordering cannot be guaranteed and it's the consumer's responsibility to handle it. -S ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
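One consumer-side way to handle it, assuming each notification carries a comparable timestamp (an assumption, not something oslo.messaging guarantees), is to drain the per-priority queues and merge by timestamp before processing. A minimal sketch:

```python
import heapq

def merge_by_timestamp(*queues):
    """Re-order messages drained from separate per-priority queues.

    Each input list is assumed to be internally ordered (FIFO from a
    single notifier), so a heap-based merge restores global order by
    the embedded 'timestamp' field.
    """
    return list(heapq.merge(*queues, key=lambda m: m["timestamp"]))

# info and warn arrived on different queues, so consuming them
# queue-by-queue would interleave them out of order.
info_queue = [{"timestamp": 1, "event": "compute.start"},
              {"timestamp": 4, "event": "compute.end"}]
warn_queue = [{"timestamp": 2, "event": "compute.retry"}]
ordered = merge_by_timestamp(info_queue, warn_queue)
print([m["event"] for m in ordered])
# ['compute.start', 'compute.retry', 'compute.end']
```

This doesn't help with the requeue() scenario mentioned above, which can reorder messages within a single queue; truly order-sensitive consumers need to tolerate that too.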
Re: [openstack-dev] [ceilometer] PTL candidacy
On 04/02/2014 05:47 PM, Gordon Chung wrote: I'd like to announce my candidacy for PTL of Ceilometer. Woot! ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ceilometer] Notifications from non-local exchanges
On 10/10/2013 06:16 PM, Neal, Phil wrote: Greetings all, I'm looking at how to expand the ability of our CM instance to consume notifications and have a quick question about the configuration and flow... For the notifications central agent, we rely on the services (i.e. glance, cinder) to drop messages on the same messaging host as used by Ceilometer. From there the listener picks it up and cycles through the plugin logic to convert it to a sample. It's apparent that we can't pass an alternate hostname via the control_exchange values, so is there another method for harvesting messages off of other instances (e.g. another compute node)? Hey Phil, You don't really need to specify the exchange name to consume notifications. It will default to the control-exchange if not specified anyway. How it works isn't so obvious. Depending on the priority of the notification, the oslo notifier will publish on topic.priority using the service's control-exchange. If that queue doesn't exist it'll create it and bind the control-exchange to it. This is so we can publish even if there are no consumers yet. Oslo.rpc creates a 1:1 mapping of routing_key and queue to topic (no wildcards). So we get exchange:service -> binding: routing_key topic.priority -> queue topic.priority (essentially, 1 queue per priority). Which is why, if you want to enable services to generate notifications, you just have to set the driver and the topic(s) to publish on. Exchange is implied and routing key/queue are inferred from topic. Likewise we only have to specify the queue name to consume, since we only need an exchange to publish. I have a bare-bones oslo notifier consumer and client here if you want to mess around with it (and a bare-bones kombu version in the parent). https://github.com/SandyWalsh/amqp_sandbox/tree/master/oslo Not sure if that answered your question or made it worse?
:) Cheers -S - Phil ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
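The naming scheme described above boils down to a couple of lines — this is just an illustration of the convention (the real work happens inside oslo.rpc), using the "notifications.info" default mentioned later in the thread:

```python
def notification_routing(topic, priority):
    """Sketch of the oslo notifier naming convention: messages are
    published with routing key "<topic>.<priority>" on the service's
    control-exchange, and the 1:1 routing_key/queue mapping (no
    wildcards) means the queue gets the same name."""
    routing_key = "%s.%s" % (topic, priority)
    queue_name = routing_key  # 1:1 mapping, essentially 1 queue per priority
    return routing_key, queue_name

print(notification_routing("notifications", "info"))
# ('notifications.info', 'notifications.info')
```

This is also why separate priorities land on separate queues, which is the root of the ordering discussion elsewhere in this digest.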
Re: [openstack-dev] [Ceilometer] Notifications from non-local exchanges
On 10/15/2013 12:28 PM, Neal, Phil wrote: -Original Message- From: Sandy Walsh [mailto:sandy.wa...@rackspace.com] Sent: Thursday, October 10, 2013 6:20 PM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Ceilometer] Notifications from non-local exchanges On 10/10/2013 06:16 PM, Neal, Phil wrote: Greetings all, I'm looking at how to expand the ability of our CM instance to consume notifications and have a quick question about the configuration and flow... For the notifications central agent , we rely on the services (i.e. glance, cinder) to drop messages on the same messaging host as used by Ceilometer. From there the listener picks it up and cycles through the plugin logic to convert it to a sample. It's apparent that we can't pass an alternate hostname via the control_exchange values, so is there another method for harvesting messages off of other instances (e.g. another compute node)? Hey Phil, You don't really need to specify the exchange name to consume notifications. It will default to the control-exchange if not specified anyway. How it works isn't so obvious. Depending on the priority of then notification the oslo notifier will publish on topic.priority using the service's control-exchange. If that queue doesn't exist it'll create it and bind the control-exchange to it. This is so we can publish even if there are no consumers yet. I think the common default is notifications.info, yes? Oslo.rpc creates a 1:1 mapping of routing_key and queue to topic (no wildcards). So we get exchange:service - binding: routing_key topic.priority - queue topic.priority (essentially, 1 queue per priority) Which is why, if you want to enable services to generate notifications, you just have to set the driver and the topic(s) to publish on. Exchange is implied and routing key/queue are inferred from topic. Yep, following up to this point: Oslo takes care of the setup of exchanges on behalf of the services. 
When, say, Glance wants to push notifications onto the message bus, they can set the control_exchange value and the driver (rabbit, for example) and voila! An exchange is set up with a default queue bound to the key. Correct. Likewise we only have to specify the queue name to consume, since we only need an exchange to publish. Here's where my gap is: the notification plugins seem to assume that Ceilometer is sitting on the same messaging node/endpoint as the service. The config file allows us to specify the exchange names for the services , but not endpoints, so if Glance is publishing to notifications.info on rabbit.glance.hpcloud.net, and ceilometer is publishing/consuming from the rabbit.ceil.hpcloud.net node then the Glance notifications won't be collected. Hmm, I think I see your point. All the rabbit endpoints are determined by these switches: https://github.com/openstack/nova/blob/master/etc/nova/nova.conf.sample#L1532-L1592 We will need a way in CM to pull from multiple rabbits. I took another look at the Ceilometer config options...rabbit_hosts takes multiple hosts (i.e. rabbit.glance.hpcloud.net:, rabbit.ceil.hpcloud.net:) but it's not clear whether that's for publishing, collection, or both? The impl_kombu module does cycle through that list to create the connection pool, but it's not clear to me how it all comes together in the plugin instantiation... Nice catch. I'll have a look at that as well. Regardless, I think CM should have separate switches for each collector we run and break out the consume rabbit from the service rabbit. I may be in a position to work on this shortly if that's needed. I have a bare-bones oslo notifier consumer and client here if you want to mess around with it (and a bare-bones kombu version in the parent). Will take a look! https://github.com/SandyWalsh/amqp_sandbox/tree/master/oslo Not sure if that answered your question or made it worse? 
:) Cheers -S - Phil ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Does openstack have a notification system that will let us know when a server changes state ?
Notifications work great. Actually StackTach has a web interface where you can watch the notifications coming through in real-time. We're slowly trying to get Ceilometer to have this functionality. StackTach works with Nova and Glance events currently. https://github.com/rackerlabs/stacktach Here is a video of how to use it (and Stacky, the cmdline interface to it) http://www.sandywalsh.com/2012/10/debugging-openstack-with-stacktach-and.html And, if you're a roll-your-own kinda guy, I have a bare-bones Oslo-based notifier service here you can look at to see how it works: https://github.com/SandyWalsh/amqp_sandbox Feel free to reach out if you have any other questions about it. Notifications are awesome. -S From: openstack learner [openstacklea...@gmail.com] Sent: Friday, October 18, 2013 3:56 PM To: openst...@lists.openstack.org; openstack-dev@lists.openstack.org Subject: [openstack-dev] Does openstack have a notification system that will let us know when a server changes state ? Hi all, I am using the openstack python api. After I boot an instance, I will keep polling the instance status to check if its status changes from BUILD to ACTIVE. My question is: does openstack have a notification system that will let us know when a vm changes state (e.g. goes into ACTIVE state)? Then we won't have to keep on polling it when we need to know the change of the machine state. Thanks xin ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
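To make the contrast with polling concrete, here is a toy callback-style consumer — names and event shapes are invented for illustration, not a real OpenStack API:

```python
class InstanceWatcher:
    """Toy illustration of the notification idea: run callbacks when a
    server reaches ACTIVE, instead of polling its status in a loop."""

    def __init__(self):
        self._callbacks = []

    def on_active(self, callback):
        self._callbacks.append(callback)

    def handle_notification(self, notification):
        # In a real deployment this would be fed by events like
        # compute.instance.update arriving off the message bus.
        if notification.get("state") == "ACTIVE":
            for cb in self._callbacks:
                cb(notification)

seen = []
watcher = InstanceWatcher()
watcher.on_active(lambda n: seen.append(n["instance_id"]))
watcher.handle_notification({"instance_id": "abc", "state": "BUILD"})
watcher.handle_notification({"instance_id": "abc", "state": "ACTIVE"})
print(seen)  # ['abc']
```

The callback fires once, on the BUILD-to-ACTIVE transition, with no polling loop on the client side.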
Re: [openstack-dev] [TripleO][Nova][neutron][Heat][Oslo][Ceilometer][Havana]Single Subscription Point for event notification
Here's the current adoption of notifications in OpenStack ... hope it helps! http://www.sandywalsh.com/2013/09/notification-usage-in-openstack-report.html -S From: Qing He [qing...@radisys.com] Sent: Monday, October 28, 2013 8:48 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [TripleO][Nova][neutron][Heat][Oslo][Ceilometer][Havana]Single Subscription Point for event notification Thanks Angus! Yes, if this rpc notification mechanism works for all other components, e.g., Neutron, in addition to Nova, which seems to be the only documented component working with this notification system. For example, can we do something like Network.instance.shutdown/.end Or Storage.instance.shutdown/.end Or Image.instance.shutdown/.end ... -Original Message- From: Angus Salkeld [mailto:asalk...@redhat.com] Sent: Monday, October 28, 2013 4:36 PM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [TripleO][Nova][neutron][Heat][Oslo][Ceilometer][Havana]Single Subscription Point for event notification On 28/10/13 22:30 +, Qing He wrote: All, I found multiple places/components you can get event alarms, e.g., Heat, Ceilometer, Oslo, Nova etc, notification. But I fail to find any documents as to how to do it in the respective component documents. I 'm wondering if there is document as to if there is a single API entry point where you can subscribe and get event notification from all components, such as Nova, Neutron. Hi, If you are talking about rpc notifications, then this is one wiki page I know about: https://wiki.openstack.org/wiki/SystemUsageData (I have just added some heat notifications to it). 
-Angus Thanks, Qing ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Ceilometer] Suggestions for alarm improvements ...
Hey y'all, Here are a few notes I put together around some ideas for alarm improvements. In order to set it up I spent a little time talking about the Ceilometer architecture in general, including some of the things we have planned for IceHouse. I think Parts 1-3 will be useful to anyone looking into Ceilometer. Part 4 is where the meat of it is. https://wiki.openstack.org/wiki/Ceilometer/AlarmImprovements Look forward to feedback from everyone and chatting about it at the summit. If I missed something obvious, please mark it up so we can address it. -S ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Heat] Locking and ZooKeeper - a space oddysey
On 10/30/2013 03:10 PM, Steven Dake wrote: I will -2 any patch that adds zookeeper as a dependency to Heat. Certainly any distributed locking solution should be plugin based and optional. Just as a database-oriented solution could be the default plugin. Re: the Java issue, we already have optional components in other languages. I know Java is a different league of pain, but if it's an optional component and left as a choice of the deployer, should we care? -S PS As an aside, what are your issues with ZK?
Re: [openstack-dev] [Heat] Locking and ZooKeeper - a space oddysey
Doh, sorry, left out the important part I had originally intended. The ZK unit tests could be split to not run by default, but if you're a ZK shop ... run them yourself. They might not be included in the gerrit tests, but that should be the nature of heavy-weight drivers. We need to do more of this test splitting in general anyway. -S On 10/30/2013 04:20 PM, Sandy Walsh wrote: On 10/30/2013 03:10 PM, Steven Dake wrote: I will -2 any patch that adds zookeeper as a dependency to Heat. Certainly any distributed locking solution should be plugin based and optional. Just as a database-oriented solution could be the default plugin. Re: the Java issue, we already have optional components in other languages. I know Java is a different league of pain, but if it's an optional component and left as a choice of the deployer, should we care? -S PS As an aside, what are your issues with ZK?
Re: [openstack-dev] [Heat] Locking and ZooKeeper - a space oddysey
On 10/30/2013 04:44 PM, Robert Collins wrote: On 31 October 2013 08:37, Sandy Walsh sandy.wa...@rackspace.com wrote: Doh, sorry, left out the important part I had originally intended. The ZK unit tests could be split to not run by default, but if you're a ZK shop ... run them yourself. They might not be included in the gerrit tests, but that should be the nature of heavy-weight drivers. We need to do more of this test splitting in general anyway. Yes... but. We need to aim at production. If ZK is going to be the production sane way of doing it with the reference OpenStack code base, then we absolutely have to have our functional and integration tests run with ZK. Unit tests shouldn't be talking to a live ZK anyhow, so they don't concern me. Totally agree at the functional/integration test level. My concern was having to bring ZK into a dev env. We've already set the precedent with Erlang (rabbitmq). There are HBase (Java) drivers out there and Torpedo tests against a variety of other databases. I think the horse has left the barn. -Rob
Re: [openstack-dev] [Heat] Locking and ZooKeeper - a space oddysey
On 10/30/2013 08:08 PM, Steven Dake wrote: On 10/30/2013 12:20 PM, Sandy Walsh wrote: On 10/30/2013 03:10 PM, Steven Dake wrote: I will -2 any patch that adds zookeeper as a dependency to Heat. Certainly any distributed locking solution should be plugin based and optional. Just as a database-oriented solution could be the default plugin. Sandy, Even if it is optional, some percentage of the userbase will enable it and expect the Heat community to debug and support it. But, that's the nature of every openstack project. I don't support HyperV in Nova or HBase in Ceilometer. The implementers deal with that support. I can help guide someone to those people but have no intentions of standing up those environments. Re: the Java issue, we already have optional components in other languages. I know Java is a different league of pain, but if it's an optional component and left as a choice of the deployer, should we care? -S PS As an aside, what are your issues with ZK? I realize zookeeper exists for a reason. But unfortunately Zookeeper is a server, rather than an in-process library. This means someone needs to figure out how to document, scale, secure, and provide high availability for this component. Yes, that's why we would use it. Same goes for rabbit and mysql. This is extremely challenging for the two server infrastructure components OpenStack server processes depend on today (AMQP, SQL). If the entire OpenStack community saw value in biting the bullet and accepting zookeeper as a dependency and taking on this work, I might be more amenable. Why do other services need to agree on adopting ZK? If some Heat users need it, they can use it. Nova shouldn't care. What we are talking about in the review, however, is that the Heat team bite that bullet, which is a big addition to the scope of work we already execute for the ability to gain a distributed lock.
I would expect there are simpler approaches to solve the problem without dragging the baggage of a new server component into the OpenStack deployment. Yes, there probably are, and alternatives are good. But, as others have attested, ZK is tried and true. Why not support it also? Using zookeeper as is suggested in the review is far different than the way Nova uses Zookeeper. With the Nova use case, Nova still operates just dandy without zookeeper. With zookeeper in the Heat use case, it essentially becomes the default way people are expected to deploy Heat. Why, if it's a plugin? What I would prefer is taskflow over AMQP, to leverage existing server infrastructure (that has already been documented, scaled, secured, and HA-ified). Same problem exists, we're just pushing the ZK decision to another service. Regards -steve
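The "plugin based and optional" locking idea raised earlier in this thread can be sketched as a small driver interface. This is purely illustrative: the interface and class names are made up, the in-memory backend stands in for a database-oriented default plugin, and a ZooKeeper backend (e.g. built on kazoo's lock recipe) would implement the same two methods behind a deployer's config option, without Heat core ever hard-depending on ZK.

```python
import abc
import threading

class LockDriver(abc.ABC):
    """Hypothetical pluggable lock interface: the deployer picks the
    backend, so the service never hard-depends on ZooKeeper."""

    @abc.abstractmethod
    def acquire(self, name, timeout=None):
        """Return True if the named lock was obtained."""

    @abc.abstractmethod
    def release(self, name):
        """Release the named lock."""

class LocalLockDriver(LockDriver):
    """Single-process default backend, standing in for the
    database-oriented default plugin mentioned above."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def acquire(self, name, timeout=None):
        # create the named lock lazily, under a guard so concurrent
        # callers agree on which Lock object backs each name
        with self._guard:
            lock = self._locks.setdefault(name, threading.Lock())
        return lock.acquire(timeout=-1 if timeout is None else timeout)

    def release(self, name):
        self._locks[name].release()

driver = LocalLockDriver()
got = driver.acquire("stack-123", timeout=1)
driver.release("stack-123")
```

The point of the abstraction is that functional/integration gate jobs could exercise whichever backend the reference deployment uses, while a ZK shop runs the ZK driver's tests themselves.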
Re: [openstack-dev] [Heat] Locking and ZooKeeper - a space oddysey
On 10/31/2013 11:43 AM, Monty Taylor wrote: Yes. I'm strongly opposed to ZooKeeper finding its way into the already complex pile of things we use. Monty, is that just because the stack is very complicated now, or something personal against ZK (or Java specifically)? Curious. -S
Re: [openstack-dev] When is it okay for submitters to say 'I don't want to add tests' ?
On 10/30/2013 11:37 PM, Robert Collins wrote: This is a bit of a social norms thread. I've been consistently asking for tests in reviews for a while now, and I get the occasional push-back. I think this falls into a few broad camps: A - there is no test suite at all, adding one is unreasonable B - this thing cannot be tested in this context (e.g. functional tests are defined in a different tree) C - this particular thing is very hard to test D - testing this won't offer benefit E - other things like this in the project don't have tests F - submitter doesn't know how to write tests G - submitter doesn't have time to write tests Now, of these, I think it's fine not to add tests in cases A, B, C in combination with D, and D. I don't think E, F or G are sufficient reasons to merge something without tests, when reviewers are asking for them. G in the special case that the project really wants the patch landed - but then I'd expect reviewers to not ask for tests or to volunteer that they might be optional. Now, if I'm wrong, and folk have different norms about when to accept 'reason X not to write tests' as a response from the submitter - please let me know! I've done a lot of thinking around this topic [1][2] and really it comes down to this: everything can be tested and should be. There is an argument to A, but that goes beyond the scope of our use case I think. If I hear B, I would suspect the tests aren't unit tests, but are functional/integration tests (a common problem in OpenStack). Functional tests are brittle and usually have painful setup sequences. The other cases fall into the -1 camp for me. Tests required. That said, recently I was -1'ed for not updating a test, because I added code that didn't change the program flow, but introduced a new call. According to my rules, that didn't need a test, but I agreed with the logic that people would be upset if the call wasn't made (it was a notification). So a test was added. Totally valid argument.
TL;DR: Tests are always required. We need to fix our tests to be proper unit tests and not functional/integration tests so it's easy to add new ones. -S [1] http://www.sandywalsh.com/2011/06/effective-units-tests-and-integration.html [2] http://www.sandywalsh.com/2011/08/pain-of-unit-tests-and-dynamically.html -Rob
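The anecdote above (a new call that doesn't change program flow still deserving a test) is easy to cover with a mock-based unit test, no broker or functional environment needed. The function and event names here are invented for illustration.

```python
from unittest import mock

def resize_instance(instance_uuid, notifier):
    # the "new call" from the anecdote: no change to program flow,
    # but downstream consumers depend on it being emitted
    notifier.info("compute.instance.resize.start", {"uuid": instance_uuid})
    # ... the actual resize work would go here ...
    notifier.info("compute.instance.resize.end", {"uuid": instance_uuid})

# a Mock stands in for the notifier, so the test asserts the calls
# were made without any messaging infrastructure
notifier = mock.Mock()
resize_instance("abc-123", notifier)
emitted = [call.args[0] for call in notifier.info.call_args_list]
```

This is the kind of cheap, non-brittle unit test the TL;DR argues for: it pins down the contract (the notification fires) without any of the functional-test setup pain.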
Re: [openstack-dev] [Oslo] Improving oslo-incubator update.py
Seeing this thread reminded me: We need support in the update script for entry points in oslo setup.cfg to make their way into the target project. So, if update is getting some love, please keep that in mind.
Re: [openstack-dev] Adding notifications to Horizon
+1 on the inline method. It makes it clear when a notification should be emitted and, as you say, handles exceptions better. Also, if it makes sense for Horizon, consider bracketing long-running operations in .start/.end pairs. This will help with performance tuning and early error detection. More info on well-behaved notifications here: http://www.sandywalsh.com/2013/09/notification-usage-in-openstack-report.html Great to see! -S On 11/25/2013 11:58 AM, Florent Flament wrote: Hi, I am interested in adding AMQP notifications to the Horizon dashboard, as described in the following blueprint: https://blueprints.launchpad.net/horizon/+spec/horizon-notifications There are currently several implementations in OpenStack. While Nova and Cinder define `notify_about_*` methods that are called whenever a notification has to be sent, Keystone uses decorators, which send appropriate notifications when decorated methods are called. I added an implementation proposal to the blueprint's whiteboard, based on the Nova and Cinder implementations. I would be interested in having your opinion about which method would fit best, and whether these notifications make sense at all. Cheers, Florent Flament
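A minimal sketch of the inline .start/.end bracketing being recommended above. The notify helper, event names, and operation are all illustrative, not Horizon's actual API; the shape to note is the explicit .start, the .error-and-reraise on failure, and the .end only on success.

```python
events = []

def notify(event_type, payload):
    # stand-in for the AMQP notifier; name and signature are
    # illustrative, not Horizon's real notification API
    events.append((event_type, payload))

def create_volume(name):
    # .start lets monitoring detect operations that never complete
    notify("dashboard.volume.create.start", {"name": name})
    try:
        volume_id = "vol-0001"  # the real API call would go here
    except Exception as exc:
        # inline emission keeps the failure path explicit and auditable
        notify("dashboard.volume.create.error",
               {"name": name, "reason": str(exc)})
        raise
    # .end carries the outcome; the start/end delta gives the duration
    notify("dashboard.volume.create.end",
           {"name": name, "volume_id": volume_id})
    return volume_id

create_volume("backups")
```

Compared with the decorator approach, the inline calls make it obvious at the call site exactly which payload fields each event carries and what happens on the exception path.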
[openstack-dev] [Ceilometer] storage driver testing
Hey! We've ballparked that we need to store a million events per day. To that end, we're flip-flopping between sql and no-sql solutions, hybrid solutions that include elastic search and other schemes. Seems every road we go down has some limitations. So, we've started working on a test suite for load testing the ceilometer storage drivers. The intent is to have a common place to record our findings and compare with the efforts of others. There's an etherpad where we're tracking our results [1] and a test suite that we're building out [2]. The test suite works against a fork of ceilometer where we can keep our experimental storage driver tweaks [3]. The test suite hits the storage drivers directly, bypassing the api, but still uses the ceilometer models. We've added support for dumping the results to statsd/graphite for charting of performance results in real-time. If you're interested in large scale deployments of ceilometer, we would welcome any assistance. Thanks! -Sandy [1] https://etherpad.openstack.org/p/ceilometer-data-store-scale-testing [2] https://github.com/rackerlabs/ceilometer-load-tests [3] https://github.com/rackerlabs/instrumented-ceilometer
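The harness described above (hit the storage drivers directly, time each insert, push the samples to statsd/graphite) can be sketched roughly like this. The FakeStatsd and function names are invented; a real run would use a statsd client and an actual driver's record_events call.

```python
import time

class FakeStatsd:
    """Collects timing samples in memory rather than sending UDP to a
    real statsd daemon; the (metric, milliseconds) shape is what a
    graphite dashboard would chart in real time."""
    def __init__(self):
        self.samples = []

    def timing(self, metric, ms):
        self.samples.append((metric, ms))

def drive_inserts(store_event, events, stats):
    # hit the storage driver directly (bypassing the api), timing
    # each insert so throughput and latency can be charted
    for event in events:
        start = time.perf_counter()
        store_event(event)
        stats.timing("storage.record_events",
                     (time.perf_counter() - start) * 1000.0)

backend = []           # stand-in for a real storage driver
stats = FakeStatsd()
events = [{"event_type": "compute.instance.create.end", "seq": i}
          for i in range(100)]
drive_inserts(backend.append, events, stats)
```

At a million events per day (roughly 12/second sustained, with much higher bursts), per-insert timings like these are what expose whether a given sql/no-sql backend keeps up.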
Re: [openstack-dev] [Ceilometer] storage driver testing
On 11/29/2013 11:41 AM, Julien Danjou wrote: On Fri, Nov 29 2013, Nadya Privalova wrote: I'm very interested in performance results for Ceilometer. Now we have successfully installed Ceilometer in the HA-lab with 200 computes and 3 controllers. Now it works pretty good with MySQL. Our next steps are: What I'd like to know in both your and Sandy's tests, is the number of collectors you are running in parallel. For our purposes we aren't interested in the collector. We're purely testing the performance of the storage drivers and the underlying databases.
Re: [openstack-dev] [Ceilometer] storage driver testing
On 11/29/2013 11:32 AM, Nadya Privalova wrote: Hello Sandy, I'm very interested in performance results for Ceilometer. Now we have successfully installed Ceilometer in the HA-lab with 200 computes and 3 controllers. Now it works pretty good with MySQL. Our next steps are: 1. Configure alarms 2. Try to use Rally for OpenStack performance with MySQL and MongoDB (https://wiki.openstack.org/wiki/Rally) We are open to any suggestions. Awesome, as a group we really need to start an effort like the storage driver tests for ceilometer in general. I assume you're just pulling Samples via the agent? We're really just focused on event storage and retrieval. There seem to be three levels of load testing required: 1. testing through the collectors (either sample or event collection) 2. testing load on the CM api 3. testing the storage drivers. Sounds like you're addressing #1, we're addressing #3 and Tempest integration tests will be handling #2. I should also add that we've instrumented the db and ceilometer hosts using Diamond to statsd/graphite for tracking load on the hosts while the tests are underway. This will help with determining how many collectors we need, where the bottlenecks are coming from, etc. It might be nice to standardize on that so we can compare results? -S Thanks, Nadya On Wed, Nov 27, 2013 at 9:42 PM, Sandy Walsh sandy.wa...@rackspace.com wrote: Hey! We've ballparked that we need to store a million events per day. To that end, we're flip-flopping between sql and no-sql solutions, hybrid solutions that include elastic search and other schemes. Seems every road we go down has some limitations. So, we've started working on a test suite for load testing the ceilometer storage drivers. The intent is to have a common place to record our findings and compare with the efforts of others. There's an etherpad where we're tracking our results [1] and a test suite that we're building out [2].
The test suite works against a fork of ceilometer where we can keep our experimental storage driver tweaks [3]. The test suite hits the storage drivers directly, bypassing the api, but still uses the ceilometer models. We've added support for dumping the results to statsd/graphite for charting of performance results in real-time. If you're interested in large scale deployments of ceilometer, we would welcome any assistance. Thanks! -Sandy [1] https://etherpad.openstack.org/p/ceilometer-data-store-scale-testing [2] https://github.com/rackerlabs/ceilometer-load-tests [3] https://github.com/rackerlabs/instrumented-ceilometer
Re: [openstack-dev] [oslo] maintenance policy for code graduating from the incubator
So, as I mention in the branch, what about deployments that haven't transitioned to the library but would like to cherry pick this feature? "after it starts moving into a library" can leave a very big gap when the functionality isn't available to users. -S From: Eric Windisch [e...@cloudscaling.com] Sent: Friday, November 29, 2013 2:47 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [oslo] maintenance policy for code graduating from the incubator Based on that, I would like to say that we do not add new features to incubated code after it starts moving into a library, and only provide stable-like bug fix support until integrated projects are moved over to the graduated library (although even that is up for discussion). After all integrated projects that use the code are using the library instead of the incubator, we can delete the module(s) from the incubator. +1 Although never formalized, this is how I had expected we would handle the graduation process. It is also how we have been responding to patches and blueprints offering improvements and feature requests for oslo.messaging. -- Regards, Eric Windisch
Re: [openstack-dev] [oslo] maintenance policy for code graduating from the incubator
On 11/29/2013 03:58 PM, Doug Hellmann wrote: On Fri, Nov 29, 2013 at 2:14 PM, Sandy Walsh sandy.wa...@rackspace.com wrote: So, as I mention in the branch, what about deployments that haven't transitioned to the library but would like to cherry pick this feature? "after it starts moving into a library" can leave a very big gap when the functionality isn't available to users. Are those deployments tracking trunk or a stable branch? Because IIUC, we don't add features like this to stable branches for the main components, either, and if they are tracking trunk then they will get the new feature when it ships in a project that uses it. Are you suggesting something in between? Tracking trunk. If the messaging branch has already landed in Nova, then this is a moot discussion. Otherwise we'll still need it in incubator. That said, consider if messaging wasn't in nova trunk. According to this policy the new functionality would have to wait until it was. And, as we've seen with messaging, that was a very long time. That doesn't seem reasonable. Doug -S From: Eric Windisch [e...@cloudscaling.com] Sent: Friday, November 29, 2013 2:47 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [oslo] maintenance policy for code graduating from the incubator Based on that, I would like to say that we do not add new features to incubated code after it starts moving into a library, and only provide stable-like bug fix support until integrated projects are moved over to the graduated library (although even that is up for discussion). After all integrated projects that use the code are using the library instead of the incubator, we can delete the module(s) from the incubator. +1 Although never formalized, this is how I had expected we would handle the graduation process.
It is also how we have been responding to patches and blueprints offering improvements and feature requests for oslo.messaging. -- Regards, Eric Windisch
Re: [openstack-dev] [oslo] maintenance policy for code graduating from the incubator
On 12/01/2013 06:40 PM, Doug Hellmann wrote: On Sat, Nov 30, 2013 at 3:52 PM, Sandy Walsh sandy.wa...@rackspace.com wrote: On 11/29/2013 03:58 PM, Doug Hellmann wrote: On Fri, Nov 29, 2013 at 2:14 PM, Sandy Walsh sandy.wa...@rackspace.com wrote: So, as I mention in the branch, what about deployments that haven't transitioned to the library but would like to cherry pick this feature? "after it starts moving into a library" can leave a very big gap when the functionality isn't available to users. Are those deployments tracking trunk or a stable branch? Because IIUC, we don't add features like this to stable branches for the main components, either, and if they are tracking trunk then they will get the new feature when it ships in a project that uses it. Are you suggesting something in between? Tracking trunk. If the messaging branch has already landed in Nova, then this is a moot discussion. Otherwise we'll still need it in incubator. That said, consider if messaging wasn't in nova trunk. According to this policy the new functionality would have to wait until it was. And, as we've seen with messaging, that was a very long time. That doesn't seem reasonable. The alternative is feature drift between the incubated version of rpc and oslo.messaging, which makes the task of moving the other projects to messaging even *harder*. What I'm proposing seems like a standard deprecation/backport policy; I'm not sure why you see the situation as different. Sandy, can you elaborate on how you would expect to maintain feature parity between the incubator and library while projects are in transition? Deprecation usually assumes there is something in place to replace the old way. If I'm reading this correctly, you're proposing we stop adding to the existing library as soon as the new library has started? Shipping code always wins out.
We can't stop development simply based on the promise that something new is on the way. Leaving the existing code to bug fix only status is far too limiting. In the case of messaging this would have meant an entire release cycle with no new features in oslo.rpc. Until the new code replaces the old, we have to suffer the pain of updating both codebases. Doug -S From: Eric Windisch [e...@cloudscaling.com] Sent: Friday, November 29, 2013 2:47 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [oslo] maintenance policy for code graduating from the incubator Based on that, I would like to say that we do not add new features to incubated code after it starts moving into a library, and only provide stable-like bug fix support until integrated projects are moved over to the graduated library (although even that is up for discussion). After all integrated projects that use the code are using the library instead of the incubator, we can delete the module(s) from the incubator. +1 Although never formalized, this is how I had expected we would handle the graduation process. It is also how we have been responding to patches and blueprints offering improvements and feature requests for oslo.messaging.
-- Regards, Eric Windisch
Re: [openstack-dev] Unified Guest Agent proposal
On 12/06/2013 03:45 PM, Dmitry Mescheryakov wrote: Hello all, We would like to push the discussion on a unified guest agent further. You may find the details of our proposal at [1]. Also let me clarify why we started this conversation. Savanna currently utilizes SSH to install/configure Hadoop on VMs. We were happy with that approach until recently we realized that in many OpenStack deployments VMs are not accessible from the controller. That brought us to the idea of using a guest agent for VM configuration instead. That approach is already used by Trove, Murano and Heat and we can do the same. Uniting the efforts on a single guest agent brings a couple of advantages: 1. Code reuse across several projects. 2. Simplified deployment of OpenStack. A guest agent requires additional facilities for transport like a message queue or something similar. Sharing an agent means projects can share transport/config and hence ease the life of deployers. We see it as a library and we think that Oslo is a good place for it. Naturally, since this is going to be a _unified_ agent we seek input from all interested parties. It might be worthwhile to consider building from the Rackspace guest agents for linux [2] and windows [3]. Perhaps get them moved over to stackforge and scrubbed? These are geared towards Xen, but that would be a good first step in making the HV-Guest pipe configurable. [2] https://github.com/rackerlabs/openstack-guest-agents-unix [3] https://github.com/rackerlabs/openstack-guest-agents-windows-xenserver -S [1] https://wiki.openstack.org/wiki/UnifiedGuestAgent Thanks, Dmitry
Re: [openstack-dev] [taskflow] Recommendations for the granularity of tasks and their stickiness to workers
On 6/17/2014 7:04 AM, Eoghan Glynn wrote: Folks, A question for the taskflow ninjas. Any thoughts on best practice WRT $subject? Specifically I have in mind this ceilometer review[1] which adopts the approach of using very fine-grained tasks (at the level of an individual alarm evaluation) combined with short-term assignments to individual workers. But I'm also thinking of future potential usage of taskflow within ceilometer, to support partitioning of work over a scaled-out array of central agents. Does taskflow also naturally support a model whereby more chunky tasks (possibly including ongoing periodic work) are assigned to workers in a stickier fashion, such that re-balancing of workload can easily be triggered when a change is detected in the pool of available workers? I don't think taskflow today is really focused on load balancing of tasks. Something like gearman [1] might be better suited in the near term? My understanding is that taskflow is really focused on in-process tasks (with retry, restart, etc) and later will support distributed tasks. But my data could be stale too. (jharlow?) Even still, the decision of smaller tasks vs. chunky ones really comes down to how much work you want to re-do if there is a failure. I've seen some uses of taskflow where the breakdown of tasks seemed artificially small. Meaning, the overhead of going back to the library on an undo/rewind is greater than the undo itself. -S [1] http://gearman.org/
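The granularity tradeoff above (how much work gets redone on an undo/rewind) can be illustrated with a toy execute/revert runner. This is not taskflow's actual API, just a minimal stand-in showing why fine-grained tasks redo less work per failure, at the cost of more per-task library overhead.

```python
class Task:
    """Tiny stand-in for a taskflow-style task with execute/revert;
    taskflow's real API differs -- this only illustrates the tradeoff."""
    def __init__(self, name, log, fail=False):
        self.name, self.log, self.fail = name, log, fail

    def execute(self):
        if self.fail:
            raise RuntimeError(self.name)
        self.log.append(("execute", self.name))

    def revert(self):
        self.log.append(("revert", self.name))

def run_flow(tasks):
    """Run tasks in order; on failure, revert completed tasks in
    reverse. The finer-grained the tasks, the less work each revert
    redoes -- but the more per-task overhead accumulates."""
    done = []
    for t in tasks:
        try:
            t.execute()
        except Exception:
            for completed in reversed(done):
                completed.revert()
            raise
        done.append(t)

log = []
try:
    run_flow([Task("evaluate-alarm-1", log),
              Task("evaluate-alarm-2", log),
              Task("evaluate-alarm-3", log, fail=True)])
except RuntimeError:
    pass  # the first two tasks were reverted in reverse order
```

If each task in the flow is trivially cheap (e.g. a single alarm evaluation), the bookkeeping around execute/revert can dominate the work itself, which is the "artificially small" breakdown cautioned against above.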
Re: [openstack-dev] [nova] Why is there a 'None' task_state between 'SCHEDULING' 'BLOCK_DEVICE_MAPPING'?
Nice ... that's always bugged me. From: wu jiang [win...@gmail.com] Sent: Thursday, June 26, 2014 9:30 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] Why is there a 'None' task_state between 'SCHEDULING' 'BLOCK_DEVICE_MAPPING'? Hi Phil, Ok, I'll submit a patch to add a new task_state (like 'STARTING_BUILD') in these two days. And related modifications will definitely be added in the Doc. Thanks for your help. :) WingWJ On Thu, Jun 26, 2014 at 6:42 PM, Day, Phil philip@hp.com wrote: What do others think - do we want a spec to add an additional task_state value that will be set in a well defined place? Kind of feels overkill for me in terms of the review effort that would take compared to just reviewing the code - it's not as if there are going to be lots of alternatives to consider here. From: wu jiang [mailto:win...@gmail.com] Sent: 26 June 2014 09:19 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] Why is there a 'None' task_state between 'SCHEDULING' 'BLOCK_DEVICE_MAPPING'? Hi Phil, thanks for your reply. So do I need to submit a patch/spec to add it now? On Wed, Jun 25, 2014 at 5:53 PM, Day, Phil philip@hp.com wrote: Looking at this a bit deeper, the comment in _start_building() says that it's doing this to "Save the host and launched_on fields and log appropriately". But as far as I can see those don't actually get set until the claim is made against the resource tracker a bit later in the process, so this whole update might just be not needed - although I still like the idea of a state to show that the request has been taken off the queue by the compute manager. From: Day, Phil Sent: 25 June 2014 10:35 To: OpenStack Development Mailing List Subject: RE: [openstack-dev] [nova] Why is there a 'None' task_state between 'SCHEDULING' 'BLOCK_DEVICE_MAPPING'?
Hi WingWJ, I agree that we shouldn't have a task state of None while an operation is in progress. I'm pretty sure back in the day this didn't use to be the case and task_state stayed as Scheduling until it went to Networking (now of course networking and BDM happen in parallel, so you have to be very quick to see the Networking state). Personally I would like to see the extra granularity of knowing that a request has been started on the compute manager (and knowing that the request was started rather than still sitting on the queue makes the decision to put it into an error state when the manager is re-started more robust). Maybe a task state of "STARTING_BUILD" for this case? BTW I don't think _start_building() is called anymore now that we've switched to conductor calling build_and_run_instance() - but the same task_state issue exists in there as well. From: wu jiang [mailto:win...@gmail.com] Sent: 25 June 2014 08:19 To: OpenStack Development Mailing List Subject: [openstack-dev] [nova] Why is there a 'None' task_state between 'SCHEDULING' 'BLOCK_DEVICE_MAPPING'? Hi all, Recently, some of my instances were stuck in task_state 'None' during VM creation in my environment. So I checked and found there's a 'None' task_state between 'SCHEDULING' and 'BLOCK_DEVICE_MAPPING'. The related code looks like this:

def _start_building():
    self._instance_update(context, instance['uuid'],
                          vm_state=vm_states.BUILDING,
                          task_state=None,
                          expected_task_state=(task_states.SCHEDULING,
                                               None))

So if the compute node is rebooted after that update, all building VMs on it will always stay in the 'None' task_state. That's useless and inconvenient for locating problems. Why not a new task_state for this step?
WingWJ ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
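The fix WingWJ agreed to submit can be sketched roughly as follows; all names here are illustrative stand-ins, not actual Nova code:

```python
# Illustrative sketch (not actual Nova code) of the proposed fix:
# set an explicit STARTING_BUILD task_state instead of None when the
# compute manager takes a build request off the queue.

SCHEDULING = 'scheduling'
STARTING_BUILD = 'starting_build'   # the proposed new task_state

class FakeInstance:
    def __init__(self):
        self.task_state = SCHEDULING

def start_building(instance):
    # Previously this update set task_state=None, so a compute-node
    # reboot left building VMs stuck in an ambiguous None state.
    instance.task_state = STARTING_BUILD

def should_error_out_after_reboot(instance):
    # With an explicit state, the manager knows the request was
    # already taken off the queue, so erroring it out is safe.
    return instance.task_state == STARTING_BUILD

instance = FakeInstance()
start_building(instance)
print(instance.task_state)   # starting_build
```

The point is simply that the in-progress window between 'SCHEDULING' and 'BLOCK_DEVICE_MAPPING' carries a named state instead of None.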
Re: [openstack-dev] [oslo][messaging] Further improvements and refactoring
Something to consider is that the create-the-queue-in-advance feature is done for notifications, so we don't drop important messages on the floor by having an Exchange with no associated Queue. For RPC operations, this may not be required (we assume the service is available). If this check is truly a time-sink we could ignore that check for rpc calls. -S On 6/10/2014 9:31 AM, Alexei Kornienko wrote: Hi, Please find some answers inline. Regards, Alexei On 06/10/2014 03:06 PM, Flavio Percoco wrote: On 10/06/14 15:03 +0400, Dina Belova wrote: Hello, stackers! Oslo.messaging is the future of how different OpenStack components communicate with each other, and really I’d love to start a discussion about how we can make this library even better than it is now and how we can refactor it to make it more production-ready. As we all remember, oslo.messaging was initially created as a logical continuation of nova.rpc - as a separate library, with lots of transports supported, etc. That’s why oslo.messaging inherited not only the advantages of how nova.rpc worked (and there were lots of them), but also some architectural decisions that currently sometimes lead to performance issues (we met some of them during Ceilometer performance testing [1] in the Icehouse cycle). For instance, a simple test messaging server (with connection pool and eventlet) can process 700 messages per second. The same functionality implemented using a plain kombu driver (without connection pool and eventlet) processes ten times more - 7000-8000 messages per second. So we have the following suggestions about how we may make this process better and quicker (and really I’d love to collect your feedback, folks): 1) Currently we have the main loop running in the Executor class, and I guess it would be much better to move it to the Server class, as that will make the relationship between the classes simpler and leave the Executor only one task - process the message, and that’s it (in blocking or eventlet mode).
Moreover, this will make further refactoring much easier. To some extent, the executors are part of the server class, since the latter is the one actually controlling them. If I understood your proposal, the server class would implement the event loop, which means we would have an EventletServer / BlockingServer, right? If what I said is what you meant, then I disagree. Executors keep the event loop isolated from other parts of the library and this is really important for us. One of the reasons is to easily support multiple python versions - by having different event loops. Is my assumption correct? Could you elaborate more? No, that's not how we plan it. The Server will do the loop and pass each received message to the dispatcher and executor. It means that we would still have a blocking executor and an eventlet executor in the same server class. We would just change the implementation part to make it more consistent and easier to control. 2) Some of the driver implementations (such as impl_rabbit and impl_qpid, for instance) are full of needlessly separate classes that in reality could be folded into other ones. There are already some changes making the whole structure easier [2], and after the 1st issue is solved, Dispatcher and Listener will also be able to be refactored. This was done on purpose. The idea was to focus on backwards compatibility rather than cleaning up/improving the drivers. That said, it sounds like those drivers could use some clean-up. However, I think we should first extend the test suite a bit more before hacking the existing drivers. 3) If we separate RPC functionality from messaging functionality, it will make the code base cleaner and more easily reused. What do you mean by this? We mean that the current drivers are written with RPC code hardcoded inside (ReplyWaiter, etc.). That's not how a messaging library is supposed to work.
We can move RPC to a separate layer, and this would be beneficial for both RPC (the code will become cleaner and less error-prone) and the core messaging part (we'll be able to implement messaging in a way that works much faster). 4) The connection pool can be refactored to implement more efficient connection reuse. Please, elaborate. What changes do you envision? Currently there is a class called ConnectionContext that is used to manage the pool. Additionally, it can be accessed/configured in several other places. If we refactor it a little it would be much easier to use connections from the pool. As Dims suggested, I think filing some specs for this (and keeping the proposals separate) would help a lot in understanding what the exact plan is. Glad to know you're looking forward to helping improve oslo.messaging. Thanks, Flavio Folks, are you ok with such a plan? Alexey Kornienko already started some of this work [2], but really we want to be sure that we chose the correct vector of development here. Thanks! [1] https://docs.google.com/document/d/
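Proposal (1) above - moving the main loop from the Executor into the Server - can be sketched in miniature like this; the class and method names are hypothetical, not oslo.messaging's actual API:

```python
# Hypothetical miniature of the proposed split: the Server owns the
# receive loop and hands each message to a pluggable executor, whose
# only job is to run the dispatch (blocking here; an eventlet-based
# executor would be another implementation of the same interface).
import queue

class BlockingExecutor:
    def submit(self, dispatch, message):
        dispatch(message)   # process the message inline

class Server:
    def __init__(self, incoming, dispatcher, executor):
        self._incoming = incoming        # stands in for a driver's listener
        self._dispatcher = dispatcher
        self._executor = executor

    def serve(self):
        # The main loop lives in the Server, not in the Executor.
        while True:
            msg = self._incoming.get()
            if msg is None:              # sentinel: stop serving
                break
            self._executor.submit(self._dispatcher, msg)

incoming = queue.Queue()
handled = []
incoming.put({'method': 'ping'})
incoming.put(None)
Server(incoming, handled.append, BlockingExecutor()).serve()
print(handled)   # [{'method': 'ping'}]
```

Swapping BlockingExecutor for an eventlet variant would then change only how dispatch runs, not who owns the loop - which is the consistency the proposal is after.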
Re: [openstack-dev] [oslo][messaging] Further improvements and refactoring
On 6/27/2014 11:27 AM, Alexei Kornienko wrote: Hi, Why should we create a queue in advance? Notifications are used for communicating with downstream systems (which may or may not be online at the time). This includes dashboards, monitoring systems, billing systems, etc. They can't afford to lose these important updates. So, a queue has to exist and the events just build up until they are eaten. RPC doesn't need this though. Let's consider the following use cases: 1) * listener starts and creates a queue * publishers connect to the exchange and start publishing No need to create a queue in advance here, since the listener does it when it starts Right, this is the RPC case. 2) * publishers create a queue in advance and start publishing Creating it is not correct, since there is no guarantee that anyone would ever use this queue... This is why notifications are turned off by default. IMHO the listener should create the queue and publishers should not care about it at all. What do you think? See above. There are definite use-cases where the queue has to be created in advance. But, as I say, RPC isn't one of them. So, for 90% of the AMQP traffic, we don't need this feature. We should be able to disable it for RPC in oslo.messaging. (I say should because I'm not positive some aspect of OpenStack doesn't depend on the queue existing. Thinking about the scheduler mostly.) -S On 06/27/2014 05:16 PM, Sandy Walsh wrote: Something to consider is that the create-the-queue-in-advance feature is done for notifications, so we don't drop important messages on the floor by having an Exchange with no associated Queue. For RPC operations, this may not be required (we assume the service is available). If this check is truly a time-sink we could ignore that check for rpc calls. -S On 6/10/2014 9:31 AM, Alexei Kornienko wrote: Hi, Please find some answers inline. Regards, Alexei On 06/10/2014 03:06 PM, Flavio Percoco wrote: On 10/06/14 15:03 +0400, Dina Belova wrote: Hello, stackers!
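A toy model of the behaviour Sandy describes - an exchange with no bound queue silently dropping messages - helps show why notification queues get declared in advance while RPC can rely on the listener declaring its own queue. This is illustrative code, not real AMQP or kombu semantics:

```python
# Toy broker model (not real AMQP): a message published to an
# exchange with no bound queue is dropped on the floor, which is why
# notification queues are declared in advance while RPC relies on
# the listener declaring its own queue at startup.

class ToyExchange:
    def __init__(self):
        self._queues = {}

    def declare_queue(self, name):
        self._queues.setdefault(name, [])

    def publish(self, routing_key, message):
        q = self._queues.get(routing_key)
        if q is None:
            return False     # no bound queue: message silently lost
        q.append(message)
        return True

ex = ToyExchange()
# Notification case: queue declared up front, consumer may be offline;
# events just build up until they are eaten.
ex.declare_queue('notifications.info')
print(ex.publish('notifications.info',
                 {'event_type': 'compute.instance.create.end'}))  # True
# RPC case: nothing declared yet, so an early publish is simply lost.
print(ex.publish('rpc.compute', {'method': 'ping'}))              # False
```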
Re: [openstack-dev] [oslo] Openstack and SQLAlchemy
woot! From: Mike Bayer [mba...@redhat.com] Sent: Monday, June 30, 2014 1:56 PM To: OpenStack Development Mailing List (not for usage questions) Subject: [openstack-dev] [oslo] Openstack and SQLAlchemy Hi all - For those who don't know me, I'm Mike Bayer, creator/maintainer of SQLAlchemy, Alembic migrations and Dogpile caching. In the past month I've become a full time Openstack developer working for Red Hat, given the task of carrying Openstack's database integration story forward. To that extent I am focused on the oslo.db project which going forward will serve as the basis for database patterns used by other Openstack applications. I've summarized what I've learned from the community over the past month in a wiki entry at: https://wiki.openstack.org/wiki/Openstack_and_SQLAlchemy The page also refers to an ORM performance proof of concept which you can see at https://github.com/zzzeek/nova_poc. The goal of this wiki page is to publish to the community what's come up for me so far, to get additional information and comments, and finally to help me narrow down the areas in which the community would most benefit by my contributions. I'd like to get a discussion going here, on the wiki, on IRC (where I am on freenode with the nickname zzzeek) with the goal of solidifying the blueprints, issues, and SQLAlchemy / Alembic features I'll be focusing on as well as recruiting contributors to help in all those areas. I would welcome contributors on the SQLAlchemy / Alembic projects directly as well, as we have many areas that are directly applicable to Openstack. I'd like to thank Red Hat and the Openstack community for welcoming me on board and I'm looking forward to digging in more deeply in the coming months! - mike ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] Treating notifications as a contract
On 7/10/2014 5:52 AM, Eoghan Glynn wrote: TL;DR: do we need to stabilize notifications behind a versioned and discoverable contract? Thanks for dusting this off. Versioning and published schemas for notifications are important to the StackTach team. It would be nice to get this resolved. We're happy to help out. Folks, One of the issues that has been raised in the recent discussions with the QA team about branchless Tempest relates to some legacy defects in the OpenStack notification system. Now, I don't personally subscribe to the PoV that ceilometer, or indeed any other consumer of these notifications (e.g. StackTach), was at fault for going ahead and depending on this pre-existing mechanism without first fixing it. But be that as it may, we have a shortcoming here that needs to be called out explicitly, and possible solutions explored. In many ways it's akin to the un-versioned RPC that existed in nova before the versioned-rpc-apis BP[1] was landed back in Folsom IIRC, except that notification consumers tend to be at arms-length from the producer, and the effect of a notification is generally more advisory than actionable. A great outcome would include some or all of the following: 1. more complete in-tree test coverage of notification logic on the producer side Ultimately this is the core problem. A breaking change in the notifications caused tests to fail in other systems. Should we be adding more tests or simply add version checking at the lower levels (like the first pass of RPC versioning did)? (more on this below) 2. versioned notification payloads to protect consumers from breaking changes in payload format Yep, like RPC the biggies are: 1. removal of fields from notifications 2. change in semantics of a particular field 3. addition of new fields (not a biggie) The urgency for notifications is a little different than RPC where there is a method on the other end expecting a certain format. 
Notification consumers have to be a little more forgiving when things don't come in as expected. This isn't a justification for breaking changes. Just stating that we have some leeway. I guess it really comes down to systems that are using notifications for critical synchronization vs. purely informational. 3. external discoverability of which event types a service is emitting These questions can be saved for later, but ... Is the use-case that a downstream system can learn which queue to subscribe to programmatically? Is this a nice-to-have? Would / should this belong in a metadata service? 4. external discoverability of which event types a service is consuming Isn't this what the topic queues are for? Consumers should only subscribe to the topics they're interested in. If you're thinking that sounds like a substantial chunk of cross-project work co-ordination, you'd be right :) Perhaps notification schemas should be broken out into a separate repo(s)? That way we can test independently of the publishing system. For example, our notigen event simulator [5] could use it. These could just be dependent libraries/plugins to oslo.messaging. So the purpose of this thread is simply to get a read on the appetite in the community for such an effort. At the least it would require: * thrashing out the details in, say, a cross-project-track session at the K* summit * buy-in from the producer-side projects (nova, glance, cinder etc.) in terms of stepping up to make the changes * acquiescence from non-integrated projects that currently consume these notifications (we shouldn't, as good citizens, simply pull the rug out from under projects such as StackTach without discussion upfront) We'll adapt StackTach.v2 accordingly. StackTach.v3 is far less impacted by notification changes since they are offloaded and processed in a secondary step. Breaking changes will just stall the processing. I suspect .v3 will be in place before .v2 is affected.
Adding version handling to Stack-Distiller (our notification-event translator) should be pretty easy (and useful) [6] * dunno if the TC would need to give their imprimatur to such an approach, or whether we could simply self-organize and get it done without the need for governance resolutions etc. Any opinions on how desirable or necessary this is, and how the detailed mechanics might work, would be welcome. A published set of schemas would be very useful for StackTach, we'd love to help out in any way possible. In the near-term we have to press on under the assumption notification definitions are fragile. Apologies BTW if this has already been discussed and rejected as unworkable. I see a stalled versioned-notifications BP[2] and some references to the CADF versioning scheme in the LP fossil-record. Also an inconclusive ML thread from 2012[3], and a related grizzly summit design session[4], but it's unclear to me whether these aspirations got much traction in the
Re: [openstack-dev] [all] Treating notifications as a contract
On 7/10/2014 2:59 PM, Daniel Dyer wrote: From my perspective, the requirement is to be able to have a consistent and predictable format for notifications that are being sent from all services. This means: 1. a set of required fields that all events contain and have consistent meaning 2. a set of optional fields, you don’t have to include these but if you do then you follow the same format and meaning That is the design of notifications [7]. I guess we're debating the schema of the Payload section on a per-event basis. (as opposed to the somewhat loose definitions we have for those sections currently [8]) [7] https://wiki.openstack.org/wiki/NotificationSystem [8] https://wiki.openstack.org/wiki/SystemUsageData 3. versioning of events: the version is updated whenever the required fields are changed. Managing optional fields can be done via a specification Discovery of events would be interesting from an automated testing perspective, but I am not sure how effective this would be for an application actually consuming the events. Not sure how you would make use of enumerating the consumption of events ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
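Daniel's points (1)-(3) could be sketched as an envelope check like the following; the field names follow the SystemUsageData envelope, but the version-handling policy and values are assumptions for illustration:

```python
# Illustrative envelope check: required common fields plus a payload
# version, where added optional fields bump the minor version and
# breaking changes (removed or re-purposed fields) bump the major
# one. The policy here is a sketch, not an agreed-upon convention.

REQUIRED = ('message_id', 'publisher_id', 'event_type',
            'timestamp', 'payload_version', 'payload')

def is_consumable(notification, supported_major=1):
    missing = [f for f in REQUIRED if f not in notification]
    if missing:
        raise ValueError('missing required fields: %s' % missing)
    major, _minor = (int(p) for p in
                     notification['payload_version'].split('.'))
    # A newer minor version is tolerated; a major bump is rejected.
    return major == supported_major

note = {
    'message_id': '9e1afde0-0000-0000-0000-000000000001',
    'publisher_id': 'compute.host1',
    'event_type': 'compute.instance.create.end',
    'timestamp': '2014-07-10 12:00:00',
    'payload_version': '1.2',
    'payload': {'instance_id': 'some-uuid'},
}
print(is_consumable(note))                               # True
print(is_consumable(dict(note, payload_version='2.0')))  # False
```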
Re: [openstack-dev] [all] Treating notifications as a contract
On 7/10/2014 12:10 PM, Chris Dent wrote: On Thu, 10 Jul 2014, Julien Danjou wrote: My initial plan was to leverage a library like voluptuous to do schema-based validation on the sender side. That would allow a receiver to introspect the schema and know the data structure to expect. I didn't think deeply on how to handle versioning, but that should be doable too. It's not clear to me in this discussion what it is that is being versioned, contracted or standardized. Is it each of the many different notifications that various services produce now? Is it the general concept of a notification which can be considered a sample that something like Ceilometer or StackTach might like to consume? The only real differences between a sample and an event are: 1. the size of the context. Host X CPU = 70% tells you nearly everything you need to know. But compute.scheduler.host_selected will require lots of information to tell you why and how host X was selected. The event payload should be atomic and not depend on previous events for context. With samples, the context is sort of implied by the key or queue name. 2. The handling of Samples can be sloppy. If you miss a CPU sample, just wait for the next one. But if you drop an Event, a billing report is going to be wrong or a dependent system loses sync. 3. There are a *lot* more samples emitted than events. Samples are a shotgun blast while events are registered mail. This is why samples don't usually have the schema problems of events. They are so tiny, there's not much to change. Putting a lot of metadata in a sample is generally a bad idea. Leave it to the queue or key name. That said, Monasca is doing some really cool stuff with high-speed sample processing such that the likelihood of dropping a sample is so low that event support should be able to come from the same framework. The difference is simply the size of the payload and whether the system can handle it at volume (quickly and reliably).
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] Treating notifications as a contract
On 7/15/2014 3:51 AM, Mark McLoughlin wrote: On Fri, 2014-07-11 at 10:04 +0100, Chris Dent wrote: On Fri, 11 Jul 2014, Lucas Alvares Gomes wrote: The data format that Ironic will send was part of the spec proposed and could have been reviewed. I think there's still time to change it tho, if you have a better format talk to Haomeng, who is the person responsible for that work in Ironic, and see if he can change it (We can put up a follow-up patch to fix the spec with the new format as well). But we need to do this ASAP because we want to get it landed in Ironic soon. It was only after doing the work that I realized how it might be an example for the sake of this discussion. As the architecture of Ceilometer currently exists there still needs to be some measure of custom code, even if the notifications are as I described them. However, if we want to take this opportunity to move some of the smarts from Ceilometer into the Ironic code then the paste that I created might be a guide to make it possible: http://paste.openstack.org/show/86071/ So you're proposing that all payloads should contain something like:

    'events': [
        # one or more dicts with something like
        {
            # some kind of identifier for the type of event
            'class': 'hardware.ipmi.temperature',
            'type': '#thing that indicates threshold, discrete, cumulative',
            'id': 'DIMM GH VR Temp (0x3b)',
            'value': '26',
            'unit': 'C',
            'extra': { ... }
        }
    ]

i.e. a class, type, id, value, unit and a space to put additional metadata. This looks like a particular schema for one event-type (let's say foo.sample). It's hard to extrapolate this one schema to a generic set of common metadata applicable to all events. Really the only common stuff we can agree on is the stuff already there: tenant, user, server, message_id, request_id, timestamp, event_type, etc. Side note on using notifications for sample data: 1. you should generate a proper notification when the rules of a sample change (limits, alarms, sources, etc) ... but no actual measurements.
This would be something like an ironic.alarm-rule-change notification or something 2. you should generate a minimal event for the actual samples (CPU-xxx: 70%) that relates to the previous rule-changing notification. And do this on a queue named something like foo.sample. This way, we can keep important notifications in a priority queue and handle them accordingly (since they hold important data), but let the samples get routed across less-reliable transports (like UDP) via the RoutingNotifier. Also, send the samples one-at-a-time and let them either a) drop on the floor (udp) or b) let the aggregator roll them up into something smaller (sliding window, etc). Making these large notifications contain a list of samples means we'd have to store state somewhere on the server until transmission time. Ideally something we wouldn't want to rely on. On the subject of notifications as a contract, calling the additional metadata field 'extra' suggests to me that there are no stability promises being made about those fields. Was that intentional? However on that however, if there's some chance that a large change could happen, it might be better to wait, I don't know. Unlikely that a larger change will be made in Juno - take the small window of opportunity to rationalize Ironic's payload IMHO. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] Treating notifications as a contract
On 7/11/2014 6:08 AM, Chris Dent wrote: On Fri, 11 Jul 2014, Lucas Alvares Gomes wrote: The data format that Ironic will send was part of the spec proposed and could have been reviewed. I think there's still time to change it tho, if you have a better format talk to Haomeng, who is the person responsible for that work in Ironic, and see if he can change it (We can put up a follow-up patch to fix the spec with the new format as well). But we need to do this ASAP because we want to get it landed in Ironic soon. It was only after doing the work that I realized how it might be an example for the sake of this discussion. As the architecture of Ceilometer currently exists there still needs to be some measure of custom code, even if the notifications are as I described them. However, if we want to take this opportunity to move some of the smarts from Ceilometer into the Ironic code then the paste that I created might be a guide to make it possible: http://paste.openstack.org/show/86071/ However on that however, if there's some chance that a large change could happen, it might be better to wait, I don't know. Just to give a sense of what we're dealing with, a while back I wrote a little script to dump the schema of all events StackTach collected from Nova. The value fields are replaced with types (or ? if it was a class object). http://paste.openstack.org/show/54140/ ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ceilometer] Generate Event or Notification in Ceilometer
If all you want to do is publish a notification you can use oslo.messaging directly. Or, for something lighter weight, we have Notabene, which is a small wrapper on Kombu. An example of how our notification simulator/generator uses it is available here: https://github.com/StackTach/notigen/blob/master/bin/event_pump.py Of course, you'll have to ensure you fabricate a proper event payload. Hope it helps -S From: Duan, Li-Gong (Gary@HPServers-Core-OE-PSC) [li-gong.d...@hp.com] Sent: Tuesday, July 29, 2014 6:05 AM To: openstack-dev@lists.openstack.org Subject: [openstack-dev] [Ceilometer] Generate Event or Notification in Ceilometer Hi Folks, Are there any guides or examples to show how to produce a new event or notification and add a handler for this event in ceilometer? I am asked to implement OpenStack service monitoring which will send an event and trigger the handler once a service, say nova-compute, crashes, in a short time. :( The link (http://docs.openstack.org/developer/ceilometer/events.html) does a good job on the explanation of the concept, and hence I know that I need to emit notifications to the message queue and ceilometer-collector will process them and generate events, but it is far from a real implementation. Regards, Gary ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
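Whichever publisher is used (oslo.messaging, Notabene, plain kombu), the "proper event payload" Sandy mentions is the standard notification envelope. A minimal sketch of fabricating one by hand follows; the publisher_id and event_type values are made up for the service-monitoring use case, not an existing Ceilometer event type:

```python
# Hedged sketch: build the standard notification envelope (per the
# SystemUsageData wiki format) before handing it to a publisher.
# The publisher_id / event_type values here are illustrative only.
import uuid
import datetime

def make_notification(publisher_id, event_type, payload,
                      priority='INFO'):
    return {
        'message_id': str(uuid.uuid4()),
        'publisher_id': publisher_id,
        'event_type': event_type,
        'priority': priority,
        'timestamp': str(datetime.datetime.utcnow()),
        'payload': payload,
    }

note = make_notification('servicemonitor.host1', 'service.down',
                         {'service': 'nova-compute', 'host': 'host1'})
print(note['event_type'])   # service.down
```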
Re: [openstack-dev] Payload within RabbitMQ messages for Nova related exchanges
On 04/15/2014 10:07 AM, George Monday wrote: Hey there, I've got a quick question about the RabbitMQ exchanges. We are writing listeners for the RabbitMQ exchanges. The basic information about the tasks like compute.instance.create.[start|stop] etc. as stored in the 'payload' attribute of the json message is my concern at the moment. Does this follow a certain predefined structure that's consistent for the lifetime of, say, a specific nova api version? Will this change in major releases (from havana to icehouse)? Is this subject to change without notice? Is there a definition available somewhere? Like for the api versions? In short, how reliable is the json structure of the payload attribute in a rabbitMQ message? We just want to make sure that an update to the OpenStack controller wouldn't break our listeners. Hey George, Most of the notifications are documented here https://wiki.openstack.org/wiki/SystemUsageData But, you're correct that there is no versioning on these currently, though there are some efforts to fix this (specifically around CADF-support) Here's some more info on notifications if you're interested: http://www.sandywalsh.com/2013/09/notification-usage-in-openstack-report.html Hope it helps! -S My Best, George ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
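Until the payload format is versioned, listeners are safer extracting fields defensively rather than assuming one layout; an illustrative sketch (the alternate key names are examples, not an exhaustive or authoritative list):

```python
# Defensive extraction sketch for an unversioned payload: probe the
# known spellings of a field rather than assume a single one, and
# degrade gracefully when it is absent. Key names are illustrative.

def get_instance_id(notification):
    payload = notification.get('payload') or {}
    for key in ('instance_id', 'instance_uuid', 'uuid'):
        if key in payload:
            return payload[key]
    return None   # absent: log and skip rather than crash

print(get_instance_id({'payload': {'instance_uuid': 'abc-123'}}))  # abc-123
print(get_instance_id({'payload': {}}))                            # None
```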
Re: [openstack-dev] How to re-compile Devstack Code
Also, I find setting this in my localrc/local.conf helps debugging:

    # get an actual log file vs. screen scrollback
    LOGFILE=/opt/stack/logs/stack.sh.log
    # gimme all the info
    VERBOSE=True
    # don't pull from git every time I run stack.sh
    RECLONE=False
    # make the logs readable
    LOG_COLOR=False

From: shiva m [anjane...@gmail.com] Sent: Thursday, April 24, 2014 2:42 AM To: openstack-dev@lists.openstack.org Subject: [openstack-dev] How to re-compile Devstack Code Hi, I have a Devstack havana setup on Ubuntu 13.10. I am trying to modify some files in the /opt/stack/* folder. How do I re-compile Devstack to make my changes take effect? Does unstacking and stacking work? I see unstacking and stacking installs everything fresh. Correct me if wrong. Thanks, Shiva ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Monitoring as a Service
On 5/6/2014 10:04 AM, Thierry Carrez wrote: John Dickinson wrote: One of the advantages of the program concept within OpenStack is that separate code projects with complementary goals can be managed under the same program without needing to be the same codebase. The most obvious example across every program is the server and client projects under most programs. This may be something that can be used here, if it doesn't make sense to extend the ceilometer codebase itself. +1 Being under the Telemetry umbrella lets you make the right technical decision between same or separate codebase, as both would be supported by the organizational structure. It also would likely give you an initial set of contributors interested in the same end goals. So at this point I'd focus on engaging with the Telemetry program folks and see if they would be interested in that capability (inside or outside of the Ceilometer code base). This is interesting. I'd be curious to know more about what "managed" means in this situation. Is the core project expected to allocate time in the IRC meeting to the concerns of these adjacent projects? What if the core project doesn't agree with the direction or deems there's too much overlap? Does the core team instantly have sway over the adjacent project? Or does it simply mean we tag ML discussions with [Telemetry] and people can filter accordingly? I mean, this all sounds good in theory, but I'd like to know more about the practical implementation of it. Related client and server projects seem like the low-hanging fruit. -S Cheers, ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Monitoring as a Service
On 5/6/2014 1:48 PM, Thierry Carrez wrote: Sandy Walsh wrote: I'd be curious to know more about what "managed" means in this situation. Is the core project expected to allocate time in the IRC meeting to the concerns of these adjacent projects? What if the core project doesn't agree with the direction or deems there's too much overlap? Does the core team instantly have sway over the adjacent project? It has to be basically the same team of people working on the two projects. The goals and the direction of the project are shared. There is no way it can work if you consider some core and some adjacent; that would quickly create an us vs. them mentality and not work out that well in reviews. Of course, there can be contributors that are not interested in one project or another. But if you end up with completely-separated subteams, then there is little value in living under the same umbrella and sharing a core team. Ok, that's what I thought. Thanks for the clarification. -S ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Treating notifications as a contract
From: Chris Dent [chd...@redhat.com] Tuesday, October 07, 2014 12:07 PM On Wed, 3 Sep 2014, Sandy Walsh wrote: Good goals. When Producer and Consumer know what to expect, things are good ... I know to find the Instance ID here. When the consumer wants to deal with a notification as a generic object, things get tricky (find the instance ID in the payload, What is the image type?, Is this an error notification?) Basically, how do we define the principal artifacts for each service and grant the consumer easy/consistent access to them? (like the 7-W's above) I'd really like to find a way to solve that problem. Is that a good summary? What did I leave out or get wrong? Great start! Let's keep it simple and do-able. Has there been any further thinking on these topics? Summit is soon and kilo specs are starting, so I imagine more people than just me are hoping to get rolling on plans. If there is going to be a discussion at summit I hope people will be good about keeping some artifacts for those of us watching from afar. It seems to me that if the notifications ecosystem becomes sufficiently robust and resilient we ought to be able to achieve some interesting scale and distributed-ness opportunities throughout OpenStack, not just in telemetry/metering/eventing (choose your term of art). Haven't had any time to get anything written down (pressing deadlines with StackTach.v3) but open to suggestions. Perhaps we should just add something to the oslo.messaging etherpad to find time at the summit to talk about it? -S
Re: [openstack-dev] Treating notifications as a contract
From: Sandy Walsh [sandy.wa...@rackspace.com] Tuesday, October 07, 2014 6:07 PM Haven't had any time to get anything written down (pressing deadlines with StackTach.v3) but open to suggestions. Perhaps we should just add something to the oslo.messaging etherpad to find time at the summit to talk about it? -S Actually, that's not really true. The Monasca team has been playing with schema definitions for their wire format (a variation on the kind of notification we ultimately want). And http://apiary.io/ is introducing support for structured schemas soon. Perhaps we can start with some schema proposals there? JSON-Schema based? For green-field installations, CADF is a possibility, but for already established services we will need to document what's in place first. At some point we'll need a cross-project effort to identify all the important characteristics of the various services. Also, we've been finding no end of problems with the wild-west payload section. For example, look at all the different places we have to look to find the instance UUID from Nova. https://github.com/SandyWalsh/stacktach-sandbox/blob/verifier/winchester/event_definitions.yaml#L12-L17 Likewise for project_id, flavor_id, deleted_at, etc. Definitely need a solution to this.
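To make the wild-west payload problem concrete, here is a minimal Python sketch of what consumers end up writing today just to find the instance UUID. This is not actual StackTach or Nova code; the event-type patterns and payload keys are illustrative, loosely modeled on the event_definitions.yaml linked above:

```python
import fnmatch

# Map of event_type pattern -> candidate payload keys where the UUID
# might live. The patterns and keys are illustrative assumptions.
UUID_PATHS = {
    "compute.instance.*": ["instance_id", "instance_uuid"],
    "scheduler.run_instance.*": ["instance_id"],
}

def extract_instance_id(event_type, payload):
    """Try each known location for the instance UUID."""
    for pattern, keys in UUID_PATHS.items():
        if fnmatch.fnmatch(event_type, pattern):
            for key in keys:
                if key in payload:
                    return payload[key]
    return None

notification = {"event_type": "compute.instance.create.end",
                "payload": {"instance_uuid": "abc-123"}}
print(extract_instance_id(notification["event_type"],
                          notification["payload"]))  # abc-123
```

With an agreed schema, this lookup table (and its equivalents for project_id, flavor_id, deleted_at, etc.) would collapse to a single well-known field.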
Re: [openstack-dev] Treating notifications as a contract
From: Doug Hellmann [d...@doughellmann.com] Tuesday, October 14, 2014 7:19 PM It might be more appropriate to put it on the cross-project session list: https://etherpad.openstack.org/p/kilo-crossproject-summit-topics Done ... thanks!
Re: [openstack-dev] [oslo] request_id deprecation strategy question
Does this mean we're losing request-id's? Will they still appear in the Context objects? And there was the effort to keep consistent request-id's in cross-service requests; will this deprecation affect that? -S From: Steven Hardy [sha...@redhat.com] Sent: Monday, October 20, 2014 10:58 AM To: openstack-dev@lists.openstack.org Subject: [openstack-dev] [oslo] request_id deprecation strategy question Hi all, I have a question re the deprecation strategy for the request_id module, which was identified as a candidate for removal in Doug's recent message[1], as it's moved from oslo-incubator to oslo.middleware. The problem I see is that oslo-incubator deprecated this in Juno, but (AFAICS) all projects shipped Juno without the versionutils deprecation warning sync'd [2]. Thus, we can't remove the local openstack.common.middleware.request_id; otherwise, operators upgrading from Juno to Kilo without changing their api-paste.ini files will experience breakage without any deprecation warning. I'm sure I've read and been told that all backwards-incompatible config file changes require a deprecation period of at least one cycle, so does this mean all projects just sync the Juno oslo-incubator request_id into their kilo trees, leave it there until kilo releases, while simultaneously switching their API configs to point to oslo.middleware? Guidance on how to proceed would be great, if folks have thoughts on how best to handle this. Thanks! Steve [1] http://lists.openstack.org/pipermail/openstack-dev/2014-October/048303.html [2] https://github.com/openstack/oslo-incubator/blob/stable/juno/openstack/common/middleware/request_id.py#L33
Re: [openstack-dev] [oslo] request_id deprecation strategy question
Phew :) Thanks Steve. From: Steven Hardy [sha...@redhat.com] Sent: Monday, October 20, 2014 12:52 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [oslo] request_id deprecation strategy question On Mon, Oct 20, 2014 at 02:17:54PM +, Sandy Walsh wrote: Does this mean we're losing request-id's? No, it just means the implementation has moved from oslo-incubator[1] to oslo.middleware[2]. The issue I'm highlighting is that those projects using the code now have to update their api-paste.ini files to import from the new location, presumably while giving some warning to operators about the impending removal of the old code. All I'm seeking to clarify is the most operator-sensitive way to handle this transition, given that we seem to have missed the boat on including a nice deprecation warning for Juno. Steve [1] https://github.com/openstack/oslo-incubator/blob/stable/juno/openstack/common/middleware/request_id.py#L33 [2] https://github.com/openstack/oslo.middleware/blob/master/oslo/middleware/request_id.py
Re: [openstack-dev] [all] How can we get more feedback from users?
Nice work Angus ... great idea. Would love to see more of this. -S From: Angus Salkeld [asalk...@mirantis.com] Sent: Friday, October 24, 2014 1:32 AM To: OpenStack Development Mailing List (not for usage questions) Subject: [openstack-dev] [all] How can we get more feedback from users? Hi all I have felt some grumblings about usability issues with Heat templates/client/etc. and wanted a way that users could come and give us feedback easily (low barrier). I started an etherpad (https://etherpad.openstack.org/p/heat-useablity-improvements) - the first win is it is spelt wrong :-O We now have some great feedback there in a very short time, most of which we should be able to solve. This led me to think: should OpenStack have a more general mechanism for users to provide feedback? The idea is this is not for bugs or support, but for users to express pain points, requests for features and docs/howtos. It's not easy to improve your software unless you are listening to your users. Ideas? -Angus
[openstack-dev] StackTach users?
Hey y'all! I'm taking a page from Angus and trying to pull together a list of StackTach users. We're moving quickly on our V3 implementation and I'd like to ensure we're addressing the problems you've faced/are facing with older versions. For example, I know initial setup has been a concern and we're starting with an ansible installer in V3. Would that help? We're also ditching the web gui (for now) and buffing up the REST API and client tools. Is that a bad thing? Feel free to contact me directly if you don't like the public forums. Or we can chat at the summit. Cheers! -S
Re: [openstack-dev] [Ceilometer] Notifications as a contract summit prep
Thanks ... we'll be sure to address your concerns. And there's the list we've compiled here: https://etherpad.openstack.org/p/kilo-crossproject-summit-topics (section 4) -S From: Chris Dent [chd...@redhat.com] Sent: Friday, October 24, 2014 2:45 PM To: OpenStack-dev@lists.openstack.org Subject: [openstack-dev] [Ceilometer] Notifications as a contract summit prep Since I'm not going to be at summit and since I care about notifications I was asked to write down some thoughts prior to summit so my notions didn't get missed. The notes are at: https://tank.peermore.com/tanks/cdent-rhat/SummitNotifications TL;DR: make sure that adding new stuff (producers, consumers, notifications) is easy. -- Chris Dent tw:@anticdent freenode:cdent https://tank.peermore.com/tanks/cdent
[openstack-dev] Summit Recap: Notification Schema
https://etherpad.openstack.org/p/kilo-crossproject-notifications The big takeaways:
1. We want the schemas to be external so other languages can utilize them.
2. JSON-Schema seems fine, but AVRO has traction in the Big Data world and should be considered.
3. The challenge of having text-file-based schemas is how to make them available for CI and deployments. Packaging problems. There is no simple pip install for text files. Talked about the possibility of making them available by the service API itself or exposing their location via a Service Catalog entry.
4. There are a lot of other services that need a solution to this problem. Monasca needs to define a message bus schema. Nova Objects has its own for RPC calls. It would be nice to solve this problem once.
5. The CADF group is very open to making changes to the spec to accommodate our needs. Regardless, we need a way to transform existing notifications to whatever the new format is. So, we not only need a schema definition grammar, but we will also need a transformation grammar out of the gate for backwards compatibility.
6. Like Nova Objects, it would be nice to make a single smart schema object that can read a schema file and become that object with proper setters and getters (and validation, version up-conversion/down-conversion, etc.).
7. If we can nail down the schema grammar, the transformation grammar and perhaps the schema object in Kilo, we can start to promote it for adoption in the L release.
8. People should be freed up to work on this around Kilo-2 (new year).
Lots of other details in the etherpad. It would be good to arrange a meeting soon to discuss the schema grammar again. And how to distribute the schemas in test and prod env's. Perhaps come up with some concrete recommendations.
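Takeaway 6 could look something like the following minimal Python sketch. This is an assumption about the design, not an agreed one; the inline schema dict stands in for a real JSON-Schema or AVRO file read from disk:

```python
# A simplified schema definition; in practice this would be loaded
# from a versioned text file, not hard-coded.
SCHEMA = {
    "name": "compute.instance.create.end",
    "version": "1.0",
    "fields": {"instance_id": str, "memory_mb": int},
}

class SchemaObject:
    """Reads a schema definition and becomes an object with validated
    attributes for each declared field."""

    def __init__(self, schema, **values):
        self._schema = schema
        for field, ftype in schema["fields"].items():
            value = values.get(field)
            if value is not None and not isinstance(value, ftype):
                raise TypeError("%s must be %s" % (field, ftype.__name__))
            setattr(self, field, value)

    @property
    def version(self):
        return self._schema["version"]

event = SchemaObject(SCHEMA, instance_id="abc-123", memory_mb=512)
print(event.instance_id, event.version)  # abc-123 1.0
```

A real implementation would also hang version up/down-conversion methods off the same object, as the recap suggests.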
[openstack-dev] Where should Schema files live?
Hey y'all, To avoid cross-posting, please inform your -infra / -operations buddies about this post. We've just started thinking about where notification schema files should live and how they should be deployed. Kind of a tricky problem. We could really use your input on this problem ... The assumptions:
1. Schema files will be text files. They'll live in their own git repo (stackforge for now, ideally oslo eventually).
2. Unit tests will need access to these files for local dev.
3. Gating tests will need access to these files for integration tests.
4. Many different services are going to want to access these files during staging and production.
5. There are going to be many different versions of these files. There are going to be a lot of schema updates.
Some problems / options:
a. Unlike Python, there is no simple pip install for text files. No version control per se. Basically whatever we pull from the repo. The problem with a git clone is we need to tweak config files to point to a directory and that's a pain for gating tests and CD. Could we assume a symlink to some well-known location?
a': I suppose we could make a python installer for them, but that's a pain for other language consumers.
b. In production, each openstack service could expose the schema files via their REST API, but that doesn't help gating tests or unit tests. Also, this means every service will need to support exposing schema files. Big coordination problem.
c. In production, we could add an endpoint to the Keystone Service Catalog to each schema file. This could come from a separate metadata-like service. Again, yet-another-service to deploy and make highly available.
d. Should we make separate distro packages? Install to a well-known location all the time? This would work for local dev and integration testing and we could fall back on B and C for production distribution. Of course, this will likely require people to add a new distro repo. Is that a concern?
Personally, I'm leaning towards option D, but I'm not sure what the implications are. We're early in thinking about these problems, but would like to start the conversation now to get your opinions. Look forward to your feedback. Thanks -Sandy
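As a rough illustration of option (d) (and the symlink idea in option (a)), the consumer-side lookup might be no more than the following Python sketch. The directory name and environment variable are assumptions for illustration only, not an agreed convention:

```python
import os

# Assumed well-known install location for distro packages, with an
# env-var override for local dev and gating tests.
WELL_KNOWN = "/usr/share/openstack/notification-schemas"

def schema_path(service, name):
    """Resolve the on-disk path of a schema file for a service."""
    base = os.environ.get("NOTIFICATION_SCHEMA_DIR", WELL_KNOWN)
    return os.path.join(base, service, name + ".json")

print(schema_path("nova", "compute.instance.create.end"))
```

The appeal of a fixed location is that no config files need tweaking across unit tests, gate jobs and production.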
Re: [openstack-dev] Where should Schema files live?
From: Doug Hellmann [d...@doughellmann.com] Thursday, November 20, 2014 3:51 PM On Nov 20, 2014, at 8:12 AM, Sandy Walsh sandy.wa...@rackspace.com wrote: Hey y'all, To avoid cross-posting, please inform your -infra / -operations buddies about this post. We've just started thinking about where notification schema files should live and how they should be deployed. Kind of a tricky problem. We could really use your input on this problem ... The assumptions: 1. Schema files will be text files. They'll live in their own git repo (stackforge for now, ideally oslo eventually). Why wouldn’t they live in the repo of the application that generates the notification, like we do with the database schema and APIs defined by those apps? That would mean downstream consumers (potentially in different languages) would need to pull all repos and extract just the schema parts. A separate repo would make it more accessible.
Re: [openstack-dev] Where should Schema files live?
From: Eoghan Glynn [egl...@redhat.com] Thursday, November 20, 2014 5:34 PM Some questions/observations inline. Hey y'all, To avoid cross-posting, please inform your -infra / -operations buddies about this post. We've just started thinking about where notification schema files should live and how they should be deployed. Kind of a tricky problem. We could really use your input on this problem ... The assumptions: 1. Schema files will be text files. They'll live in their own git repo (stackforge for now, ideally oslo eventually). 2. Unit tests will need access to these files for local dev 3. Gating tests will need access to these files for integration tests 4. Many different services are going to want to access these files during staging and production. 5. There are going to be many different versions of these files. There are going to be a lot of schema updates. Some problems / options: a. Unlike Python, there is no simple pip install for text files. No version control per se. Basically whatever we pull from the repo. The problem with a git clone is we need to tweak config files to point to a directory and that's a pain for gating tests and CD. Could we assume a symlink to some well-known location? a': I suppose we could make a python installer for them, but that's a pain for other language consumers. Would it be unfair to push that burden onto the writers of clients in other languages? i.e. OpenStack, being largely python-centric, would take responsibility for both: 1. Maintaining the text versions of the schema in-tree (e.g. as json) and: 2. Producing a python-specific installer based on #1 whereas, the first Java-based consumer of these schema would take #1 and package it up in their native format, i.e. as a jar or OSGi bundle. Certainly an option. My gut says it will lead to abandoned/fragmented efforts. If I was a ruby developer, would I want to take on the burden of maintaining yet another package? 
I think we need to treat this data as a form of API, and so it's our responsibility to make it easily consumable. (I'm not hard-line on this, again, just my gut feeling) b. In production, each openstack service could expose the schema files via their REST API, but that doesn't help gating tests or unit tests. Also, this means every service will need to support exposing schema files. Big coordination problem. I kind of liked this schemaURL endpoint idea when it was first mooted at summit. The attraction for me was that it would allow the consumer of the notifications to always have access to the actual version of the schema currently used on the emitter side, independent of the (possibly out-of-date) version of the schema that the consumer has itself installed locally via a static dependency. However IIRC there were also concerns expressed about the churn during some future rolling upgrades - i.e. if some instances of the nova-api schemaURL endpoint are still serving out the old schema, after others in the same deployment have already been updated to emit the new notification version. Yeah, I like this idea too. In the production / staging phase this seems like the best route. The local dev / testing situation seems to be the real tough nut to crack. WRT rolling upgrades we have to ensure we update the service catalog first, the rest should be fine. c. In production, we could add an endpoint to the Keystone Service Catalog to each schema file. This could come from a separate metadata-like service. Again, yet-another-service to deploy and make highly available. Also to {puppetize|chef|ansible|...}-ize. Yeah, agreed, we probably don't want to go down that road. Which is kinda unfortunate since it's the lowest impact on other projects. d. Should we make separate distro packages? Install to a well-known location all the time? This would work for local dev and integration testing and we could fall back on B and C for production distribution.
Of course, this will likely require people to add a new distro repo. Is that a concern? Quick clarification ... when you say distro packages, do you mean Linux-distro-specific package formats such as .rpm or .deb? Yep. Cheers, Eoghan Thanks for the feedback!
Re: [openstack-dev] Where should Schema files live?
From: Doug Hellmann [d...@doughellmann.com] Thursday, November 20, 2014 5:09 PM On Nov 20, 2014, at 3:40 PM, Sandy Walsh sandy.wa...@rackspace.com wrote: From: Doug Hellmann [d...@doughellmann.com] Thursday, November 20, 2014 3:51 PM On Nov 20, 2014, at 8:12 AM, Sandy Walsh sandy.wa...@rackspace.com wrote: The assumptions: 1. Schema files will be text files. They'll live in their own git repo (stackforge for now, ideally oslo eventually). Why wouldn’t they live in the repo of the application that generates the notification, like we do with the database schema and APIs defined by those apps? That would mean downstream consumers (potentially in different languages) would need to pull all repos and extract just the schema parts. A separate repo would make it more accessible. OK, fair. Could we address that by publishing the schemas for an app in a tarball using a post-merge job? That's something to consider. At first blush it feels a little clunky to pull all projects to extract schemas whenever any of the projects change. But there is something to be said about having the schema files next to the code that's going to generate the data.
Re: [openstack-dev] Where should Schema files live?
From: Eoghan Glynn [egl...@redhat.com] Friday, November 21, 2014 11:03 AM Some problems / options: a. Unlike Python, there is no simple pip install for text files. No version control per se. Basically whatever we pull from the repo. The problem with a git clone is we need to tweak config files to point to a directory and that's a pain for gating tests and CD. Could we assume a symlink to some well-known location? a': I suppose we could make a python installer for them, but that's a pain for other language consumers. Would it be unfair to push that burden onto the writers of clients in other languages? i.e. OpenStack, being largely python-centric, would take responsibility for both: 1. Maintaining the text versions of the schema in-tree (e.g. as json) and: 2. Producing a python-specific installer based on #1 whereas, the first Java-based consumer of these schema would take #1 and package it up in their native format, i.e. as a jar or OSGi bundle. I think Doug's suggestion of keeping the schema files in-tree and pushing them to a well-known tarball maker in a build step is best so far. It's still a little clunky, but not as clunky as having to sync two repos. [snip] d. Should we make separate distro packages? Install to a well known location all the time? This would work for local dev and integration testing and we could fall back on B and C for production distribution. Of course, this will likely require people to add a new distro repo. Is that a concern? Quick clarification ... when you say distro packages, do you mean Linux-distro-specific package formats such as .rpm or .deb? Yep. So that would indeed work, but just to sound a small note of caution that keeping an oft-changing package (assumption #5) up-to-date for fedora20/21 epel6/7, or precise/trusty, would involve some work. I don't know much about the Debian/Ubuntu packaging pipeline, in particular how it could be automated. 
But in my small experience of Fedora/EL packaging, the process is somewhat resistant to many fine-grained updates. Ah, good to know. So, if we go with the tarball approach, we should be able to avoid this. And it allows the service to easily serve up the schema using their existing REST API. Should we proceed under the assumption we'll push to a tarball in a post-build step? It could change if we find it's too messy. -S
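The post-build tarball step being discussed could be as small as the following Python sketch. The artifact naming and directory layout here are assumptions, not an agreed convention:

```python
import os
import tarfile

def publish_schemas(schema_dir, version, out_dir):
    """Bundle an in-tree schemas/ directory into a versioned tarball
    that consumers in any language can fetch from a well-known URL."""
    out = os.path.join(out_dir, "nova-schemas-%s.tar.gz" % version)
    with tarfile.open(out, "w:gz") as tar:
        # arcname keeps the paths inside the tarball stable no matter
        # where the source repo was checked out
        tar.add(schema_dir, arcname="schemas")
    return out
```

A CI post-merge job would run this and upload the result; local dev and gate jobs would simply unpack the same artifact.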
Re: [openstack-dev] [nova] is there a way to simulate thousands or millions of compute nodes?
From: Michael Still [mi...@stillhq.com] Thursday, November 27, 2014 6:57 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] is there a way to simulate thousands or millions of compute nodes? I would say that supporting millions of compute nodes is not a current priority for nova... We are actively working on improving support for thousands of compute nodes, but that is via cells (so each nova deploy except the top is still in the hundreds of nodes). ramble on Agreed, it wouldn't make much sense to simulate this on a single machine. That said, if one *was* to simulate this, there are the well known bottlenecks: 1. the API. How much can one node handle with given hardware specs? Which operations hit the DB the hardest? 2. the Scheduler. There's your API bottleneck and big load on the DB for Create operations. 3. the Conductor. Shouldn't be too bad, essentially just a proxy. 4. child-to-global-cell updates. Assuming a two-cell deployment. 5. the virt driver. YMMV. ... and that's excluding networking, volumes, etc. The virt driver should be load tested independently. So FakeDriver would be fine (with some delays added for common operations as Gareth suggests). Something like Bees-with-MachineGuns could be used to get a baseline metric for the API. Then it comes down to DB performance in the scheduler and conductor (for a single cell). Finally, inter-cell loads. Who blows out the queue first? All-in-all, I think you'd be better off load testing each piece independently on a fixed hardware platform and faking out all the incoming/outgoing services. Test the API with fake everything. Test the Scheduler with fake API calls and fake compute nodes. Test the conductor with fake compute nodes (not FakeDriver). Test the compute node directly. Probably all going to come down to the DB and I think there is some good performance data around that already? But I'm just spit-ballin' ... 
and I agree, not something I could see the Nova team taking on in the near term ;) -S
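The FakeDriver-with-delays idea mentioned above might be sketched like this in Python. The method names are illustrative, not Nova's actual virt driver interface:

```python
import time

class DelayingFakeDriver:
    """Stand-in virt driver: sleeps a configurable time per operation so
    the control plane (API/scheduler/conductor) can be load-tested
    without real hypervisors."""

    def __init__(self, spawn_delay=0.5, destroy_delay=0.2):
        self.spawn_delay = spawn_delay
        self.destroy_delay = destroy_delay
        self.instances = set()

    def spawn(self, instance_id):
        time.sleep(self.spawn_delay)   # simulate hypervisor work
        self.instances.add(instance_id)

    def destroy(self, instance_id):
        time.sleep(self.destroy_delay)
        self.instances.discard(instance_id)

driver = DelayingFakeDriver(spawn_delay=0.01, destroy_delay=0.01)
driver.spawn("abc-123")
print(sorted(driver.instances))  # ['abc-123']
```

Tuning the delays per operation lets you model slow hypervisors and see which control-plane component saturates first.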
Re: [openstack-dev] Where should Schema files live?
From: Duncan Thomas [duncan.tho...@gmail.com] Sent: Sunday, November 30, 2014 5:40 AM To: OpenStack Development Mailing List Subject: Re: [openstack-dev] Where should Schema files live? Duncan Thomas On Nov 27, 2014 10:32 PM, Sandy Walsh sandy.wa...@rackspace.com wrote: We were thinking each service API would expose their schema via a new /schema resource (or something). Nova would expose its schema. Glance its own. etc. This would also work well for installations still using older deployments. This feels like externally exposing info that need not be external (since the notifications are not external to the deploy), and it sounds like it will potentially leak fine-grained version and maybe deployment config details that you don't want to make public - either for commercial reasons or to make targeted attacks harder. Yep, good point. Makes a good case for standing up our own service or just relying on the tarballs being in a well-known place. Thanks for the feedback. -S
Re: [openstack-dev] Event Service
https://wiki.openstack.org/wiki/SystemUsageData Both Ceilometer and StackTach can be used to consume these notifications. https://github.com/openstack/ceilometer https://github.com/rackerlabs/stacktach (StackTach functionality is slowly being merged into Ceilometer) Hope it helps! -S From: Michael Still [mi...@stillhq.com] Sent: Friday, July 12, 2013 10:38 PM To: OpenStack Development Mailing List Subject: Re: [openstack-dev] Event Service OpenStack has a system called notifications which does what you're looking for. I've never used it, but I am sure it's documented. Cheers, Michael On Sat, Jul 13, 2013 at 10:12 AM, Qing He qing...@radisys.com wrote: All, Does OpenStack have a pub/sub event service? I would like to be notified of the event of VM creation/deletion/migration, etc. What is the best way to do this? Thanks, Qing -- Rackspace Australia
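For reference, notifications in the SystemUsageData format are JSON dicts carrying event_type, publisher_id, timestamp and payload fields. A consumer that only cares about VM lifecycle events might filter them like this minimal Python sketch (the payload keys shown are illustrative; where fields live varies by event, which is exactly the pain point raised elsewhere in this thread):

```python
# Lifecycle events the original poster asked about (create/delete/etc.)
INTERESTING = ("compute.instance.create.end",
               "compute.instance.delete.end",
               "compute.instance.resize.end")

def handle(notification):
    """Return a summary line for interesting events, else None."""
    if notification.get("event_type") in INTERESTING:
        return "%s: %s" % (notification["event_type"],
                           notification["payload"].get("instance_id"))
    return None

msg = {"event_type": "compute.instance.create.end",
       "publisher_id": "compute.host1",
       "payload": {"instance_id": "abc-123"}}
print(handle(msg))  # compute.instance.create.end: abc-123
```

In a real deployment the dict would arrive from the message bus (e.g. via a RabbitMQ consumer) rather than being constructed inline.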
Re: [openstack-dev] [Change I30b127d6] Cheetah vs Jinja
There's a ton of reviews/comparisons out there, only a google away. From: Doug Hellmann [doug.hellm...@dreamhost.com] Sent: Tuesday, July 16, 2013 1:45 PM To: OpenStack Development Mailing List Subject: Re: [openstack-dev] [Change I30b127d6] Cheetah vs Jinja Great, I think I had the Mako syntax mixed up with a different templating language that depended on having a DOM to work on. Can someone put together a more concrete analysis than "this is working" so we can compare the tools? :-) Doug On Tue, Jul 16, 2013 at 12:29 PM, Nachi Ueno na...@ntti3.com wrote: Hi Doug Mako looks OK for config generation This is code in review. https://review.openstack.org/#/c/33148/23/neutron/services/vpn/device_drivers/template/ipsec.conf.template 2013/7/16 Doug Hellmann doug.hellm...@dreamhost.com: On Tue, Jul 16, 2013 at 9:51 AM, Daniel P. Berrange berra...@redhat.com wrote: On Tue, Jul 16, 2013 at 09:41:55AM -0400, Solly Ross wrote: (This email is with regards to https://review.openstack.org/#/c/36316/) Hello All, I have been implementing the Guru Meditation Report blueprint (https://blueprints.launchpad.net/oslo/+spec/guru-meditation-report), and the question of a templating engine was raised. Currently, my version of the code includes the Jinja2 templating engine (http://jinja.pocoo.org/), which is modeled after the Django templating engine (it was designed to be an implementation of the Django templating engine without requiring the use of Django) and is used in Horizon. Apparently, the Cheetah templating engine (http://www.cheetahtemplate.org/) is used in a couple of places in Nova. IMO, the Jinja template language produces much more readable templates, and I think it is the better choice for inclusion in the Report framework. It also shares a common format with Django (making it slightly easier to write for people coming from that area), and is also similar to template engines for other languages.
What does everyone else think? Repeating my comments from the review... I don't have an opinion on whether Jinja or Cheetah is a better choice, since I've essentially never used either of them (beyond deleting usage of Cheetah from libvirt). I do, however, feel we should not needlessly use multiple different templating libraries across OpenStack. We should take care to standardize on one option that is suitable for all our needs. So if the consensus is that Jinja is better, then IMHO, there would need to be a blueprint + expected timeframe to port existing Cheetah usage to use Jinja. Regards, Daniel The most current release of Cheetah is from 2010. I don't have a problem adding a new dependency on a tool that is actively maintained, with a plan to migrate off of the older tool to come later. The Neutron team seems to want to use Mako (https://review.openstack.org/#/c/37177/). Maybe we should pick one? Keep in mind that we won't always be generating XML or HTML, so my first question is how well does Mako work for plain text? Doug -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
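For readers unfamiliar with the two syntaxes under discussion, here is a rough side-by-side of the same plain-text loop. These fragments are illustrative only; see each project's documentation for the full syntax:

```
## Cheetah: directives start with '#', variables with '$'
#for $server in $servers
$server.name: $server.status
#end for

{# Jinja2: statements in {% %}, expressions in {{ }} #}
{% for server in servers %}
{{ server.name }}: {{ server.status }}
{% endfor %}
```

Both render plain text fine; the readability argument in the thread is mostly about the delimiter style.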
[openstack-dev] Opinions needed: Changing method signature in RPC callback ...
Hey y'all! Running into an interesting little dilemma with a branch I'm working on. Recently, I introduced a branch in oslo-common to optionally .reject() a kombu message on an exception. Currently, we always .ack() all messages even if the processing callback fails. For Ceilometer, this is a problem ... we have to guarantee we get all notifications. The patch itself was pretty simple, but didn't work :) The spawn_n() call was eating the exceptions coming from the callback. So, in order to get the exceptions it's simple enough to re-wrap the callback, but I need to pool.waitall() after the spawn_n() to ensure none of the consumers failed. Sad, but a necessary evil. And remember, it's only used in a special case; normal openstack rpc is unaffected and remains async. But it does introduce a larger problem ... I have to change the rpc callback signature. Old: callback(message) New: callback(message, delivery_info=None, wait_for_consumers=False) (The delivery_info is another thing: we were dumping the message info on the floor, but this has important info in it.) My worry is busting all the other callbacks out there that use oslo-common.rpc Some options: 1. embed all these flags and extra data in the message structure message = {'_context_stuff': ..., 'payload': {...}, '_extra_magic': {...}} 2. make a generic CallContext() object to include with message that has anything else we need (a one-time signature break) call_context = CallContext({delivery_info: {...}, wait: False}) callback(message, call_context) 3. some other ugly python hack that I haven't thought of yet. Look forward to your thoughts on a solution! Thanks -S My work-in-progress is here: https://github.com/SandyWalsh/openstack-common/blob/callback_exceptions/openstack/common/rpc/amqp.py#L373
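[Editor's note] An illustrative sketch of option 2 above: a context object carries delivery_info and the wait flag alongside the message, so future additions don't break the signature again. All names here (CallContext, the callbacks) are hypothetical, not the actual oslo code.

```python
# Hypothetical context object modelling option 2 from the message above.
class CallContext(object):
    def __init__(self, delivery_info=None, wait_for_consumers=False):
        self.delivery_info = delivery_info or {}
        self.wait_for_consumers = wait_for_consumers

# Old-style callback: breaks as soon as the driver passes extra args.
def old_callback(message):
    return message["payload"]

# New-style callback: a one-time signature break; anything new later
# goes into the context object instead of the signature.
def new_callback(message, context):
    if context.wait_for_consumers:
        pass  # the caller would pool.waitall() here
    return message["payload"], context.delivery_info

msg = {"payload": {"event": "compute.instance.create"}}
ctx = CallContext(delivery_info={"routing_key": "notifications.info"})
payload, info = new_callback(msg, ctx)
```

The `callback(message, **kwargs)` variant suggested in the reply below trades this explicitness for looser coupling: old callbacks keep working if they accept `**kwargs`, but typos in keyword names go unnoticed.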
Re: [openstack-dev] Opinions needed: Changing method signature in RPC callback ...
On 07/18/2013 11:09 AM, Sandy Walsh wrote: 2. make a generic CallContext() object to include with message that has anything else we need (a one-time signature break) call_context = CallContext({delivery_info: {...}, wait: False}) callback(message, call_context) or just callback(message, **kwargs) of course.
Re: [openstack-dev] Opinions needed: Changing method signature in RPC callback ...
On 07/18/2013 03:55 PM, Eric Windisch wrote: On Thu, Jul 18, 2013 at 10:09 AM, Sandy Walsh sandy.wa...@rackspace.com wrote: My worry is busting all the other callbacks out there that use oslo-common.rpc These callback methods are part of the Kombu driver (and maybe part of Qpid), but are NOT part of the RPC abstraction. These are private methods. They can be broken for external consumers of these methods, because there shouldn't be any. It will be a good lesson to anyone that tries to abuse private methods. I was wondering about that, but I assumed some parts of amqp.py were used by other transports as well (and not just impl_kombu.py) There are several callbacks in amqp.py that would be affected. -- Regards, Eric Windisch
Re: [openstack-dev] Opinions needed: Changing method signature in RPC callback ...
On 07/18/2013 05:56 PM, Eric Windisch wrote: These callback methods are part of the Kombu driver (and maybe part of Qpid), but are NOT part of the RPC abstraction. These are private methods. They can be broken for external consumers of these methods, because there shouldn't be any. It will be a good lesson to anyone that tries to abuse private methods. I was wondering about that, but I assumed some parts of amqp.py were used by other transports as well (and not just impl_kombu.py) There are several callbacks in amqp.py that would be affected. The code in amqp.py is used by the Kombu and Qpid drivers and might implement the public methods expected by the abstraction, but does not define it. The RPC abstraction is defined in __init__.py, and does not define callbacks. Other drivers (granted, only the ZeroMQ driver at present) are not expected to define a callback method and, as a private method, it would give them no template to follow nor an expectation to have this method. I'm not saying your proposed changes are bad or invalid, but there is no need to make concessions to the possibility that code outside of oslo would be using callback(). This opens up the option, besides creating a new method, to simply update all the existing method calls that exist in amqp.py, impl_kombu.py, and impl_qpid.py. Gotcha ... thanks Eric. Yeah, the outer api is very generic. I did a little more research and, unfortunately, it seems the inner amqp implementations are being used by others. So I'll have to be careful with the callback signature. Ceilometer, for example, seems to be leaving zeromq support as an exercise for the reader. Perhaps oslo-messaging will make this abstraction easier to enforce. Cheers!
-S -- Regards, Eric Windisch
Re: [openstack-dev] [Nova] New DB column or new DB table?
On 07/18/2013 11:12 PM, Lu, Lianhao wrote: Sean Dague wrote on 2013-07-18: On 07/17/2013 10:54 PM, Lu, Lianhao wrote: Hi fellows, Currently we're implementing the BP https://blueprints.launchpad.net/nova/+spec/utilization-aware-scheduling. The main idea is to have an extensible plugin framework on nova-compute where every plugin can get different metrics (e.g. CPU utilization, memory cache utilization, network bandwidth, etc.) to store into the DB, and the nova-scheduler will use that data from the DB for scheduling decisions. Currently we add a new table to store all the metric data and have nova-scheduler join-load the new table with the compute_nodes table to get all the data (https://review.openstack.org/35759). Someone is concerned about the performance penalty of the join-load operation when there is a lot of metric data stored in the DB for every single compute node. Don suggested adding a new column to the current compute_nodes table in the DB, putting all metric data into a dictionary key/value format and storing the json-encoded string of the dictionary in that new column. I'm just wondering which way has less performance impact: join-loading a new table with quite a lot of rows, or json encoding/decoding a dictionary with a lot of key/value pairs? Thanks, -Lianhao I'm really confused. Why are we talking about collecting host metrics in nova when we've got a whole project to do that in ceilometer? I think utilization-based scheduling would be a great thing, but it really ought to be interfacing with ceilometer to get that data. Storing it again in nova (or even worse, collecting it a second time in nova) seems like the wrong direction. I think there was an equiv patch series at the end of Grizzly that was pushed out for the same reasons. If there is a reason ceilometer can't be used in this case, we should have that discussion here on the list.
Because my initial reading of this blueprint and the code patches is that it partially duplicates ceilometer function, which we definitely don't want to do. Would be happy to be proved wrong on that. -Sean Using ceilometer as the source of those metrics was discussed in the nova-scheduler subgroup meeting (see #topic extending data in host state in the following link): http://eavesdrop.openstack.org/meetings/scheduler/2013/scheduler.2013-04-30-15.04.log.html In that meeting, all agreed that ceilometer would be a great source of metrics for the scheduler, but many of them don't want to make ceilometer a mandatory dependency for the nova scheduler. This was also discussed at the Havana summit and rejected, since we didn't want to introduce the external dependency of Ceilometer into Nova. That said, we already have hooks at the virt layer for collecting host metrics, and we're talking about removing the pollsters from nova compute nodes if the data can be collected from these existing hooks. Whatever solution the scheduler group decides to use should utilize the existing (and maintained/growing) mechanisms we have in place there. That is, it should likely be a special notification driver that can get the data back to the scheduler in a timely fashion. It wouldn't have to use the rpc mechanism if it didn't want to, but it should be a plug-in at the notification layer. Please don't add yet another way of pulling metric data out of the hosts. -S Besides, currently ceilometer doesn't have host metrics, like the cpu/network/cache utilization data of the compute node host, which will affect the scheduling decision. What ceilometer has currently is VM metrics, like the cpu/network utilization of each VM instance. After the nova compute node collects the host metrics, those metrics could also be fed into the ceilometer framework (e.g. through a ceilometer listener) for further processing, like alarming, etc.
-Lianhao
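[Editor's note] A toy comparison (not the actual nova schema) of the two storage options being debated: per-metric rows that must be join-loaded against compute_nodes, versus a single json-encoded text column on compute_nodes. Table and metric names are made up.

```python
import json

# Option A: one row per metric, joined to compute_nodes at query time.
# With N hosts and M metrics each, the scheduler's query touches N*M rows.
metric_rows = [
    {"compute_node_id": 1, "name": "cpu.user", "value": 0.42},
    {"compute_node_id": 1, "name": "net.bandwidth", "value": 125.0},
]

# Option B: the same data flattened into one JSON blob stored in a new
# text column on compute_nodes -- one row per host, no join.
metrics = {r["name"]: r["value"] for r in metric_rows}
json_column = json.dumps(metrics)

# The scheduler-side read for option B is a single loads() per host,
# trading join cost for encode/decode cost on every read and losing
# the ability to filter on individual metrics in SQL.
decoded = json.loads(json_column)
```

That last point is the real trade-off: the JSON column is opaque to the query planner, so any per-metric filtering has to happen in Python after the decode.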
Re: [openstack-dev] [Nova] New DB column or new DB table?
On 07/19/2013 09:43 AM, Sandy Walsh wrote: [full quote of the preceding exchange trimmed] I should also add that if you go the notification route, that doesn't close the door on ceilometer integration. All you need is a means to get the data from the notification driver to the scheduler; that part could easily be replaced with a ceilometer driver if an operator wanted to go that route. The benefits of using Ceilometer would be having access to the downstream events/meters and generated statistics that could be produced there.
We certainly don't want to add an advanced statistical package or event-stream manager to Nova when Ceilometer already has aspirations of that. The out-of-the-box nova experience should be better scheduling when simple host metrics are used internally, but really great scheduling when integrated with Ceilometer. [remainder of the quoted message trimmed]
Re: [openstack-dev] [Nova] Ceilometer vs. Nova internal metrics collector for scheduling
On 07/19/2013 12:30 PM, Sean Dague wrote: On 07/19/2013 10:37 AM, Murray, Paul (HP Cloud Services) wrote: If we agree that something like capabilities should go through Nova, what do you suggest should be done with the change that sparked this debate: https://review.openstack.org/#/c/35760/ I would be happy to use it or a modified version. CPU sys, user, idle, iowait time isn't capabilities though. That's a dynamically changing value. I also think the current approach, where this is point-in-time sampling (because we only keep a single value), is going to cause some oddly pathological behavior if you try to use it as scheduling criteria. I'd really appreciate the views of more nova core folks on this thread, as it looks like these blueprints have seen pretty minimal code review at this point. H3 isn't that far away, and there are a lot of high-priority things ahead of this, and only so much coffee and review time in a day. You really need to have a moving window average of these meters in order to have anything sensible. Also, some sort of view into the pipeline of scheduler requests (what's coming up?) Capabilities are only really used in the host filtering phase. The host weighing phase is where these measurements would be applied. -Sean
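[Editor's note] A minimal sketch of the moving window average Sandy suggests above: a single point-in-time CPU sample is noisy scheduling input, while a window smooths out spikes. Class name and window size are illustrative, not from any nova code.

```python
from collections import deque

class MovingAverage(object):
    """Keep the last `window` samples and average over them."""

    def __init__(self, window=5):
        self.samples = deque(maxlen=window)  # old samples fall off the end

    def update(self, value):
        self.samples.append(value)
        return self.average()

    def average(self):
        return sum(self.samples) / float(len(self.samples))

# A CPU-load spike followed by calm: the window damps the spike instead
# of letting one sample dominate the weighing decision.
avg = MovingAverage(window=3)
for sample in [0.90, 0.10, 0.20]:
    smoothed = avg.update(sample)
```

A scheduler weigher consuming `smoothed` instead of the raw last sample avoids the pathological flip-flopping Sean describes, at the cost of reacting a few periods late to genuine load changes.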
Re: [openstack-dev] A simple way to improve nova scheduler
On 07/19/2013 05:01 PM, Boris Pavlovic wrote: Sandy, Hmm, I don't know that algorithm. But our approach doesn't have exponential exchange. I don't think that in a 10k-node cloud we will have problems with 150 RPC calls/sec. Even at 100k we will have only 1.5k RPC calls/sec. More than that, compute nodes update their state in the DB through the conductor, which produces the same number of RPC calls. So I don't see any explosion here. Sorry, I was commenting on Soren's suggestion from way back (essentially listening on a separate exchange for each unique flavor ... so no scheduler was needed at all). It was a great idea, but fell apart rather quickly. The existing approach the scheduler takes is expensive (asking the db for the state of all hosts), and polling the compute nodes might be do-able, but you're still going to have latency problems waiting for the responses (the states are invalid nearly immediately, especially if a fill-first scheduling algorithm is used). We ran into this problem before in an earlier scheduler implementation. The round-tripping kills. We have a lot of really great information on host state in the form of notifications right now. I think having a service (or notification driver) listening for these and keeping the HostState incrementally updated (and reported back to all of the schedulers via the fanout queue) would be a better approach. -S Best regards, Boris Pavlovic Mirantis Inc. On Fri, Jul 19, 2013 at 11:47 PM, Sandy Walsh sandy.wa...@rackspace.com wrote: On 07/19/2013 04:25 PM, Brian Schott wrote: I think Soren suggested this way back in Cactus: to use MQ for compute node state rather than the database, and it was a good idea then. The problem with that approach was the number of queues went exponential as soon as you went beyond simple flavors. Add capabilities or other criteria and you get an explosion of exchanges to listen to.
On Jul 19, 2013, at 10:52 AM, Boris Pavlovic bo...@pavlovic.me wrote: Hi all, In Mirantis, Alexey Ovtchinnikov and I are working on nova scheduler improvements. As far as we can see the problem, the scheduler now has two major issues: 1) Scalability. Factors that contribute to bad scalability are these: *) Each compute node, every periodic task interval (60 sec by default), updates its resource state in the DB. *) On every boot request the scheduler has to fetch information about all compute nodes from the DB. 2) Flexibility. Flexibility perishes due to problems with: *) Adding new complex resources (such as big lists of complex objects, e.g. required by PCI Passthrough https://review.openstack.org/#/c/34644/5/nova/db/sqlalchemy/models.py) *) Using different sources of data in the scheduler, for example from cinder or ceilometer (as required by the Volume Affinity Filter https://review.openstack.org/#/c/29343/). We found a simple way to mitigate these issues by avoiding DB usage for host state storage. A more detailed discussion of the problem state and one possible solution can be found here: https://docs.google.com/document/d/1_DRv7it_mwalEZzLy5WO92TJcummpmWL4NWsWf0UWiQ/edit# Best regards, Boris Pavlovic Mirantis Inc.
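[Editor's note] A toy model of the approach Sandy describes above: rather than each scheduler polling the DB, a listener consumes host-state notifications and keeps an in-memory map current, updated incrementally. Everything here (class, event shape, field names) is illustrative, not nova scheduler code.

```python
class HostStateCache(object):
    """Fold incoming notifications into a cached view of host state."""

    def __init__(self):
        self.hosts = {}

    def on_notification(self, event):
        # Incremental update: only the fields present in this event change,
        # so no full DB fetch is needed per scheduling request.
        host = event["host"]
        self.hosts.setdefault(host, {}).update(event["payload"])

    def candidates(self, min_free_ram_mb):
        # A scheduler filter pass against the cached state.
        return [h for h, s in self.hosts.items()
                if s.get("free_ram_mb", 0) >= min_free_ram_mb]

cache = HostStateCache()
cache.on_notification({"host": "node1", "payload": {"free_ram_mb": 2048}})
cache.on_notification({"host": "node2", "payload": {"free_ram_mb": 512}})
# A later notification supersedes node2's earlier state.
cache.on_notification({"host": "node2", "payload": {"free_ram_mb": 4096}})
```

In the real proposal this cache would be fed from the fanout queue so every scheduler instance converges on the same view; the race Boris and Sandy debate below is exactly the window between a notification being emitted and every cache applying it.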
Re: [openstack-dev] A simple way to improve nova scheduler
On 07/19/2013 05:36 PM, Boris Pavlovic wrote: Sandy, I don't think that we have such problems here, because the scheduler doesn't poll compute_nodes; it's the other way around: compute_nodes notify the scheduler about their state (instead of updating their state in the DB). So, for example, if the scheduler sends a request to a compute_node, the compute_node is able to make an rpc call to the schedulers immediately (not after 60 sec). So there are almost no races. There are races that occur between the eventlet request threads. This is why the scheduler has been switched to single threaded and we can only run one scheduler. This problem may have been eliminated with the work that Chris Behrens and Brian Elliott were doing, but I'm not sure. But certainly, the old approach of having the compute node broadcast status every N seconds is not suitable and was eliminated a long time ago. Best regards, Boris Pavlovic Mirantis Inc. [remainder of the earlier quoted exchange trimmed]
Re: [openstack-dev] [Openstack] Ceilometer and notifications
On 08/01/2013 07:02 AM, Mark McLoughlin wrote: On Thu, 2013-08-01 at 10:36 +0200, Julien Danjou wrote: On Thu, Aug 01 2013, Sam Morrison wrote: OK, so is it that ceilometer just leaves the message on the queue or only consumes certain messages? Ceilometer uses its own queue. There might be other processes consuming these notifications, so removing them may not be a good idea. The problem may be that the notification sender creates a queue by default even if there's no consumer on it. Maybe that's something we should avoid doing in Oslo (Cc'ing -dev to get advice on that). I'm missing the context here, but it sounds like the default notifications queue created isn't the one consumed by ceilometer, so it fills up and we just shouldn't be creating that queue. Sounds reasonable to me. Definitely file a bug for it. Hmm, if notifications are turned on, it should fill up. For billing purposes we don't want to lose events simply because there is no consumer. Operations would alert on it and someone would need to put out the fire. That's the reason we create the queue up front in the first place. Ideally, we could write only to the exchange, but we need the queue to ensure we don't lose any events. The CM Collector consumes from two queues: its internal queue and the Nova queue (if configured). If CM is looking at the wrong nova queue by default, the bug would be over there. Cheers, Mark.
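[Editor's note] A minimal pure-Python model of the AMQP behaviour under discussion: an exchange only dispatches to bound queues, so a message published before any queue exists is simply gone, whereas declaring the queue up front (as the notifier does) lets events accumulate until a consumer arrives. This is a toy illustration of the semantics, not kombu or broker code.

```python
class Exchange(object):
    """Toy AMQP-style exchange: dispatches, never stores."""

    def __init__(self):
        self.queues = []

    def declare_queue(self):
        q = []
        self.queues.append(q)
        return q

    def publish(self, msg):
        # With no queues bound, this loop does nothing: the message is lost.
        for q in self.queues:
            q.append(msg)

# No queue declared up front: the billing event is dropped on the floor.
lossy = Exchange()
lossy.publish("compute.instance.create")

# Queue declared before any publishing: events are retained even though
# no consumer has attached yet.
durable = Exchange()
billing_queue = durable.declare_queue()
durable.publish("compute.instance.create")
```

This is why Sandy argues the pre-created queue is a feature for billing, and the flip side of Julien's complaint: an unconsumed pre-created queue retains everything, so it fills up.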
Re: [openstack-dev] Weight normalization in scheduler
On 08/01/2013 04:24 AM, Álvaro López García wrote: Hi all. TL;DR: I've created a blueprint [1] regarding weight normalization. I would be very glad if somebody could examine and comment on it. Something must have changed. It's been a while since I've done anything with the scheduler, but normalized weights is the way it was designed and implemented. The separate weighing plug-ins are responsible for taking the specific units (cpu load, disk, ram, etc.) and converting them into normalized 0.0-1.0 weights. Internally the plug-ins can work however they like, but their output should be 0-1. The multiplier, however, could scale this outside that range (if disk is more important than cpu, for example). Actually, I remember it being offset + scale * weight, so you could put certain factors in bands: cpu: 1000+, disk: 1+, etc. Hopefully offset is still there too? -S Recently I've been developing some weighers to be used within nova and I found that the weight system was using raw values. This makes it difficult for an operator to establish the importance of a weigher against the rest of them, since the values can range freely and one big magnitude returned by a weigher could shade another one. One solution is to inflate either the multiplier or the weight that is returned by the weigher, but this is an ugly hack (for example, if you increase the RAM on your systems, you will need to adjust the multipliers again). A much better approach is to use weight normalization before actually using the weights. With weight normalization a weigher will still return a list of RAW values, but the BaseWeightHandler will normalize all of them into a range of values (0.0 and 1.0) before adding them up. This way, the weight for a given object will be: weight = w1_multiplier * norm(w1) + w2_multiplier * norm(w2) + ... This makes it easier to establish the importance of a weigher relative to the rest, by just adjusting the multiplier.
This is explained in [1], and implemented in [2] (with some suggestions by the reviewers). [1] https://blueprints.launchpad.net/openstack/?searchtext=normalize-scheduler-weights [2] https://review.openstack.org/#/c/27160/ Thanks for your feedback,
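[Editor's note] A sketch of the normalization the blueprint proposes, implementing the formula above: raw weigher outputs are rescaled into [0.0, 1.0] per weigher before the multiplier is applied, so one large-magnitude weigher (free RAM in MB) can't drown out a small-magnitude one (CPU idle fraction). Function names and numbers are illustrative.

```python
def normalize(raw_weights):
    """Min-max rescale a weigher's raw outputs into [0.0, 1.0]."""
    lo, hi = min(raw_weights), max(raw_weights)
    if hi == lo:
        return [0.0 for _ in raw_weights]  # all hosts equal on this axis
    return [(w - lo) / float(hi - lo) for w in raw_weights]

def combined_weights(weighers):
    """weighers: list of (multiplier, raw_values_per_host) tuples.

    Implements weight = m1 * norm(w1) + m2 * norm(w2) + ... per host.
    """
    per_weigher = [[m * w for w in normalize(raws)] for m, raws in weighers]
    return [sum(ws) for ws in zip(*per_weigher)]

# Two hosts, two weighers. Without normalization, the 2048 vs 8192 RAM
# values would completely shade the 0.9 vs 0.1 CPU-idle values.
ram = (1.0, [2048, 8192])
cpu = (2.0, [0.9, 0.1])
weights = combined_weights([ram, cpu])  # [host0, host1]
```

With the CPU multiplier set to 2.0, host0 (idle CPU, little RAM) now outweighs host1 (busy CPU, lots of RAM), which is exactly the operator control the blueprint is after. Sandy's `offset + scale * weight` banding can be layered on the same normalized values.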
Re: [openstack-dev] [Openstack] Ceilometer and notifications
On 08/01/2013 09:19 AM, Julien Danjou wrote: On Thu, Aug 01 2013, Sandy Walsh wrote: Hmm, if notifications are turned on, it should fill up. For billing purposes we don't want to lose events simply because there is no consumer. Operations would alert on it and someone would need to put out the fire. So currently, are we possibly losing events because we don't use the standard queue but one defined by Ceilometer upon connection? We can't consume events from the default notifications queue or we would break any tool possibly using it. Each consuming tool needs its own copy of them. Right, that is a concern. Within RAX we have two downstream services that consume notifications (StackTach and Yagi), and we've configured nova to write to two queues. --notification_topics can take a list. Isn't there a way to queue the message in exchanges if there's no queue at all? I don't think so, but if that was possible it would solve our problem. AFAIK, amqp only uses the exchange as a dispatcher and all storage is done in the queue ... but I could be wrong. I vaguely recall there being a durable exchange setting as well as a durable queue. I'll do some investigating.
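[Editor's note] A hedged example of the multi-queue setup Sandy describes: fanning the same notifications out to two topics so a billing consumer and a monitoring consumer each get their own copy. Option names follow the oslo notifier of that era and the topic names are made up; verify both against your deployment.

```ini
[DEFAULT]
notification_driver = nova.openstack.common.notifier.rpc_notifier
# One queue per downstream consumer; each tool gets its own copy.
notification_topics = notifications,monitor
```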