Re: [openstack-dev] [nova][libvirt] Deprecating the live_migration_flag and block_migration_flag config options
On Fri, 2016-01-08 at 14:11 +, Daniel P. Berrange wrote: > On Thu, Jan 07, 2016 at 09:07:00PM +0000, Mark McLoughlin wrote: > > On Thu, 2016-01-07 at 12:23 +0100, Sahid Orentino Ferdjaoui wrote: > > > On Mon, Jan 04, 2016 at 09:12:06PM +0000, Mark McLoughlin wrote: > > > > Hi > > > > > > > > commit 8ecf93e[1] got me thinking - the live_migration_flag config > > > > option unnecessarily allows operators choose arbitrary behavior of the > > > > migrateToURI() libvirt call, to the extent that we allow the operator > > > > to configure a behavior that can result in data loss[1]. > > > > > > > > I see that danpb recently said something similar: > > > > > > > > https://review.openstack.org/171098 > > > > > > > > "Honestly, I wish we'd just kill off 'live_migration_flag' and > > > > 'block_migration_flag' as config options. We really should not be > > > > exposing low level libvirt API flags as admin tunable settings. > > > > > > > > Nova should really be in charge of picking the correct set of flags > > > > for the current libvirt version, and the operation it needs to > > > > perform. We might need to add other more sensible config options in > > > > their place [..]" > > > > > > Nova should really handle internal flags and this serie is running in > > > the right way. > > > > > > > ... > > > > > > > 4) Add a new config option for tunneled versus native: > > > > > > > > [libvirt] > > > > live_migration_tunneled = true > > > > > > > > This enables the use of the VIR_MIGRATE_TUNNELLED flag. We have > > > > historically defaulted to tunneled mode because it requires the > > > > least configuration and is currently the only way to have a > > > > secure migration channel. > > > > > > > > danpb's quote above continues with: > > > > > > > > "perhaps a "live_migration_secure_channel" to indicate that > > > > migration must use encryption, which would imply use of > > > > TUNNELLED flag" > > > > > > > > So we need to discuss whether the config option should express the > > > > choice of tunneled vs native, or whether it should express another > > > > choice which implies tunneled vs native. > > > > > > > > https://review.openstack.org/263434 > > > > > > We probably have to consider that operator does not know much about > > > internal libvirt flags, so options we are exposing for him should > > > reflect benefice of using them. I commented on your review we should > > > at least explain benefice of using this option whatever the name is. > > > > As predicted, plenty of discussion on this point in the review :) > > > > You're right that we don't give the operator any guidance in the help > > message about how to choose true or false for this: > > > > Whether to use tunneled migration, where migration data is > > transported over the libvirtd connection. 
If True, > > we use the VIR_MIGRATE_TUNNELLED migration flag > > > > libvirt's own docs on this are here: > > > > https://libvirt.org/migration.html#transport > > > > which emphasizes: > > > > - the data copies involved in tunneling > > - the extra configuration steps required for native > > - the encryption support you get when tunneling > > > > The discussions I've seen on this topic wrt Nova have revolved around: > > > > - that tunneling allows for an encrypted transport[1] > > - that qemu's NBD based drive-mirror block migration isn't supported > > using tunneled mode, and that danpb is working on fixing this > > limitation in libvirt > > - "selective" block migration[2] won't work with the fallback qemu > > block migration support, and so won't currently work in tunneled > > mode > > I'm not working on fixing it, but IIRC some other dev had proposed > patches. > > > > > So, the advise to operators would be: > > > > - You may want to choose tunneled=False for improved block migration > > capabilities, but this limitation will go away in future. > > - You may want to choose tunneled=False if you wish to trade and > > encrypted tra
Re: [openstack-dev] [nova][libvirt] Deprecating the live_migration_flag and block_migration_flag config options
On Thu, 2016-01-07 at 12:23 +0100, Sahid Orentino Ferdjaoui wrote: > On Mon, Jan 04, 2016 at 09:12:06PM +0000, Mark McLoughlin wrote: > > Hi > > > > commit 8ecf93e[1] got me thinking - the live_migration_flag config > > option unnecessarily allows operators choose arbitrary behavior of the > > migrateToURI() libvirt call, to the extent that we allow the operator > > to configure a behavior that can result in data loss[1]. > > > > I see that danpb recently said something similar: > > > > https://review.openstack.org/171098 > > > > "Honestly, I wish we'd just kill off 'live_migration_flag' and > > 'block_migration_flag' as config options. We really should not be > > exposing low level libvirt API flags as admin tunable settings. > > > > Nova should really be in charge of picking the correct set of flags > > for the current libvirt version, and the operation it needs to > > perform. We might need to add other more sensible config options in > > their place [..]" > > Nova should really handle internal flags and this serie is running in > the right way. > > > ... > > > 4) Add a new config option for tunneled versus native: > > > > [libvirt] > > live_migration_tunneled = true > > > > This enables the use of the VIR_MIGRATE_TUNNELLED flag. We have > > historically defaulted to tunneled mode because it requires the > > least configuration and is currently the only way to have a > > secure migration channel. > > > > danpb's quote above continues with: > > > > "perhaps a "live_migration_secure_channel" to indicate that > > migration must use encryption, which would imply use of > > TUNNELLED flag" > > > > So we need to discuss whether the config option should express the > > choice of tunneled vs native, or whether it should express another > > choice which implies tunneled vs native. > > > > https://review.openstack.org/263434 > > We probably have to consider that operator does not know much about > internal libvirt flags, so options we are exposing for him should > reflect benefice of using them. I commented on your review we should > at least explain benefice of using this option whatever the name is. As predicted, plenty of discussion on this point in the review :) You're right that we don't give the operator any guidance in the help message about how to choose true or false for this: Whether to use tunneled migration, where migration data is transported over the libvirtd connection. If True, we use the VIR_MIGRATE_TUNNELLED migration flag libvirt's own docs on this are here: https://libvirt.org/migration.html#transport which emphasizes: - the data copies involved in tunneling - the extra configuration steps required for native - the encryption support you get when tunneling The discussions I've seen on this topic wrt Nova have revolved around: - that tunneling allows for an encrypted transport[1] - that qemu's NBD based drive-mirror block migration isn't supported using tunneled mode, and that danpb is working on fixing this limitation in libvirt - "selective" block migration[2] won't work with the fallback qemu block migration support, and so won't currently work in tunneled mode So, the advise to operators would be: - You may want to choose tunneled=False for improved block migration capabilities, but this limitation will go away in future. - You may want to choose tunneled=False if you wish to trade and encrypted transport for a (potentially negligible) performance improvement. Does that make sense? 
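To make that advice concrete, the option's help text could spell out those trade-offs directly. A rough sketch using oslo.config - illustrative wording only, not the actual Nova option definition:

    from oslo_config import cfg

    # Illustrative only: a help string that captures the operator guidance
    # above, rather than just naming the libvirt flag.
    live_migration_tunneled = cfg.BoolOpt(
        'live_migration_tunneled',
        default=True,
        help='Whether to use tunneled migration, where migration data is '
             'transported over the libvirtd connection '
             '(VIR_MIGRATE_TUNNELLED). Tunneling provides an encrypted '
             'channel with no extra configuration, at the cost of extra '
             'data copies and, currently, no support for NBD-based '
             '(selective) block migration. Set to False to use the native '
             'transport if you need those block migration capabilities or '
             'want to trade the encrypted channel for a potentially '
             'negligible performance improvement.')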
As for how to name the option, and as I said in the review, I think it makes sense to be straightforward here and make it clearly about choosing to disable libvirt's tunneled transport. If we name it any other way, I think our explanation for operators will immediately jump to explaining (a) that it influences the TUNNELLED flag, and (b) the differences between the tunneled and native transports. So, if we're going to have to talk about tunneled versus native, why obscure that detail? But, Pawel strongly disagrees. One last point I'd make is this isn't about adding a *new* configuration capability for operators. As we deprecate and remove these configuration options, we need to be careful not to remove a capability that operators are currently depending on for arguably reasonable reasons. [1] - https://review.openstack.org/#/c/171098/ [2] - https://review.openstack.org/#/c/227278 > > 5) Add a new config option for additional migration flags: > > > > [libvirt] > >
[openstack-dev] [nova][libvirt] Deprecating the live_migration_flag and block_migration_flag config options
Hi

commit 8ecf93e[1] got me thinking - the live_migration_flag config option unnecessarily allows operators to choose arbitrary behavior of the migrateToURI() libvirt call, to the extent that we allow the operator to configure a behavior that can result in data loss[1].

I see that danpb recently said something similar:

  https://review.openstack.org/171098

  "Honestly, I wish we'd just kill off 'live_migration_flag' and
  'block_migration_flag' as config options. We really should not be
  exposing low level libvirt API flags as admin tunable settings.

  Nova should really be in charge of picking the correct set of flags
  for the current libvirt version, and the operation it needs to
  perform. We might need to add other more sensible config options in
  their place [..]"

I've just proposed a series of patches, which boils down to the following steps:

1) Modify the approach taken in commit 8ecf93e so that instead of just warning about unsafe use of NON_SHARED_INC, we fix up the config option to a safe value.

   https://review.openstack.org/263431

2) Hard-code the P2P flag for live and block migrations as appropriate for the libvirt driver being used. For the qemu driver, we should always use VIR_MIGRATE_PEER2PEER for both live and block migrations. Without this option, you get:

   Live Migration failure: Requested operation is not valid:
   direct migration is not supported by the connection driver

   OTOH, the Xen driver does not support P2P, and only supports "unmanaged direct connection".

   https://review.openstack.org/263432

3) Require the use of the UNDEFINE_SOURCE flag, and the non-use of the PERSIST_DEST flag. Nova itself persists the domain configuration on the destination host, but it assumes the libvirt migration call removes it from the source host. So it makes no sense to allow operators to configure these flags.

   https://review.openstack.org/263433

4) Add a new config option for tunneled versus native:

   [libvirt]
   live_migration_tunneled = true

   This enables the use of the VIR_MIGRATE_TUNNELLED flag. We have historically defaulted to tunneled mode because it requires the least configuration and is currently the only way to have a secure migration channel.

   danpb's quote above continues with:

   "perhaps a "live_migration_secure_channel" to indicate that
   migration must use encryption, which would imply use of
   TUNNELLED flag"

   So we need to discuss whether the config option should express the choice of tunneled vs native, or whether it should express another choice which implies tunneled vs native.

   https://review.openstack.org/263434

5) Add a new config option for additional migration flags:

   [libvirt]
   live_migration_extra_flags = VIR_MIGRATE_COMPRESSED

   This allows operators to continue to experiment with libvirt behaviors in safe ways without each use case having to be accounted for.

   https://review.openstack.org/263435

   We would disallow setting the following flags via this option:

     VIR_MIGRATE_LIVE
     VIR_MIGRATE_PEER2PEER
     VIR_MIGRATE_TUNNELLED
     VIR_MIGRATE_PERSIST_DEST
     VIR_MIGRATE_UNDEFINE_SOURCE
     VIR_MIGRATE_NON_SHARED_INC
     VIR_MIGRATE_NON_SHARED_DISK

   which would allow the following currently available flags to be set:

     VIR_MIGRATE_PAUSED
     VIR_MIGRATE_CHANGE_PROTECTION
     VIR_MIGRATE_UNSAFE
     VIR_MIGRATE_OFFLINE
     VIR_MIGRATE_COMPRESSED
     VIR_MIGRATE_ABORT_ON_ERROR
     VIR_MIGRATE_AUTO_CONVERGE
     VIR_MIGRATE_RDMA_PIN_ALL

6) Deprecate the existing live_migration_flag and block_migration_flag config options. Operators would be expected to migrate to using the live_migration_tunneled or live_migration_extra_flags config options.
During the deprecation period we would invite feedback as to whether additional config options are needed to cover unanticipated use cases.

   https://review.openstack.org/263436

Thanks in advance for any feedback. I'm going to guess that one piece of feedback will be that some subset of this needs a blueprint (and maybe a spec), and that the blueprint freeze was a month ago, so that subset needs to be punted until after Mitaka? I'd love to be wrong about that, though :)

Thanks,
Mark.

[1] - https://review.openstack.org/228853
[2] - Data loss can occur when you have disk images on shared storage and you specify the VIR_MIGRATE_NON_SHARED_INC or VIR_MIGRATE_NON_SHARED_DISK because during the block migration the disk is copied back over itself while it is in use from another node.
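To illustrate where steps 2) to 5) leave the driver, here is a minimal sketch of how the flag selection could be composed once the low-level flags are no longer operator-controlled. It assumes the libvirt Python binding's VIR_MIGRATE_* constants and the proposed option values; it is a sketch of the idea only, not the actual Nova implementation.

    import libvirt

    # Flags the proposal would refuse to accept via live_migration_extra_flags.
    DISALLOWED_EXTRA_FLAGS = set([
        'VIR_MIGRATE_LIVE', 'VIR_MIGRATE_PEER2PEER', 'VIR_MIGRATE_TUNNELLED',
        'VIR_MIGRATE_PERSIST_DEST', 'VIR_MIGRATE_UNDEFINE_SOURCE',
        'VIR_MIGRATE_NON_SHARED_INC', 'VIR_MIGRATE_NON_SHARED_DISK',
    ])

    def migration_flags(tunneled, extra_flags, block_migration=False):
        """Compose migrateToURI() flags from the proposed config options."""
        # Flags Nova always needs for the qemu driver: live migration,
        # peer-to-peer, and undefine-on-source (Nova persists the domain
        # on the destination itself).
        flags = (libvirt.VIR_MIGRATE_LIVE |
                 libvirt.VIR_MIGRATE_PEER2PEER |
                 libvirt.VIR_MIGRATE_UNDEFINE_SOURCE)
        if block_migration:
            # Incremental copy of non-shared storage; never set for
            # shared-storage migrations (the data loss case in [2]).
            flags |= libvirt.VIR_MIGRATE_NON_SHARED_INC
        if tunneled:
            # live_migration_tunneled = true
            flags |= libvirt.VIR_MIGRATE_TUNNELLED
        for name in extra_flags:
            # live_migration_extra_flags, e.g. "VIR_MIGRATE_COMPRESSED"
            if name in DISALLOWED_EXTRA_FLAGS:
                raise ValueError('%s may not be set via extra flags' % name)
            flags |= getattr(libvirt, name)
        return flags

A tunneled live migration with compression would then be migration_flags(True, ['VIR_MIGRATE_COMPRESSED']), while a block migration over non-shared storage would add block_migration=True.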
Re: [openstack-dev] [puppet] naming of the project
Hi Emilien, On Fri, 2015-04-17 at 10:52 -0400, Emilien Macchi wrote: On 04/16/2015 02:32 PM, Emilien Macchi wrote: On 04/16/2015 02:23 PM, Richard Raseley wrote: Emilien Macchi wrote: Hi all, I sent a patch to openstack/governance to move our project under the big tent, and it came up [1] that we should decide of a project name and be careful about trademarks issues with Puppet name. I would like to hear from Puppetlabs if there is any issue to use Puppet in the project title; also, I open a new etherpad so people can suggest some names: https://etherpad.openstack.org/p/puppet-openstack-naming Thanks, [1] https://review.openstack.org/#/c/172112/1/reference/projects.yaml,cm Emilien, I went ahead and had a discussion with Puppet's legal team on this issue. Unfortunately at this time we are unable to sanction the use of Puppet's name or registered trademarks as part of the project's name. To be clear, this decision is in no way indicative of Puppet not feeling the project is 'worthy' or 'high quality' (in fact the opposite is true), but rather is a purely defensive decision. We are in the process of reevaluating our usage guidelines, but there is no firm timetable as of this moment. I guess our best option is to choose a name without Puppet in the title. We will proceed to a vote after all proposals on the etherpad. While we hear from Puppetlabs about the trademark potential issue, I would like to run a vote for a name that does not contain `Puppet`, so we can go ahead on the governance thing. I took all proposals on the etherpad [1] and created a poll that will close on next Tuesday 3pm, just before our weekly meeting so we will make it official. Anyone is welcome to vote: http://civs.cs.cornell.edu/cgi-bin/vote.pl?id=E_6c81ad92b71422d6akey=f2e85294f17caa9a Any feedback on the vote itself is also welcome. Thanks, [1] https://etherpad.openstack.org/p/puppet-openstack-naming Another idea on this ... a number of OpenStack projects have purely descriptive names (and no 'service' attribute), for example Infrastructure, Documentation, Security, Quality Assurance, and Release Cycle Management. Simply calling the project OpenStack Puppet Modules would follow that pattern, and a straightforward descriptive use of the Puppet name may not be objectionable. Mark. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] interesting problem with config filter
Hi Doug,

On Mon, 2014-12-08 at 15:58 -0500, Doug Hellmann wrote:

As we’ve discussed a few times, we want to isolate applications from the configuration options defined by libraries. One way we have of doing that is the ConfigFilter class in oslo.config. When a regular ConfigOpts instance is wrapped with a filter, a library can register new options on the filter that are not visible to anything that doesn’t have the filter object. Or to put it more simply, the configuration options registered by the library should not be part of the public API of the library.

Unfortunately, the Neutron team has identified an issue with this approach. We have a bug report [1] from them about the way we’re using config filters in oslo.concurrency specifically, but the issue applies to their use everywhere. The neutron tests set the default for oslo.concurrency’s lock_path variable to “$state_path/lock”, and the state_path option is defined in their application. With the filter in place, interpolation of $state_path to generate the lock_path value fails because state_path is not known to the ConfigFilter instance.

It seems that Neutron sets this default in its etc/neutron.conf file in its git tree:

  lock_path = $state_path/lock

I think we should be aiming for defaults like this to be set in code, and for the sample config files to contain nothing but comments. So, neutron should do:

  lockutils.set_defaults(lock_path='$state_path/lock')

That's a side detail, however.

The reverse would also happen (if the value of state_path was somehow defined to depend on lock_path),

This dependency wouldn't/shouldn't be code - because Neutron *code* shouldn't know about the existence of library config options. Neutron deployers absolutely will be aware of lock_path however.

and that’s actually a bigger concern to me. A deployer should be able to use interpolation anywhere, and not worry about whether the options are in parts of the code that can see each other. The values are all in one file, as far as they know, and so interpolation should “just work”.

Yes, if a deployer looks at a sample configuration file, all options listed in there seem like they're in-play for substitution use within the value of another option. For string substitution only, I'd say there should be a global namespace where all options are registered.

Now ... one caveat on all of this ... I do think the string substitution feature is pretty obscure and mostly just used in default values.

I see a few solutions:

1. Don’t use the config filter at all.
2. Make the config filter able to add new options and still see everything else that is already defined (only filter in one direction).
3. Leave things as they are, and make the error message better.
4. Just tackle this specific case by making lock_path implicitly relative to a base path the application can set via an API, so Neutron would do: lockutils.set_base_path(CONF.state_path) at startup.
5. Make the toplevel ConfigOpts aware of all filters hanging off it, and somehow cycle through all of those filters just when doing string substitution.

Because of the deployment implications of using the filter, I’m inclined to go with choice 1 or 2. However, choice 2 leaves open the possibility of a deployer wanting to use the value of an option defined by one filtered set of code when defining another. I don’t know how frequently that might come up, but it seems like the error would be very confusing, especially if both options are set in the same config file.
I think that leaves option 1, which means our plans for hiding options from applications need to be rethought. Does anyone else see another solution that I’m missing?

I'd do something like (3) and (4), then wait to see if it crops up multiple times in the future before tackling a more general solution.

With option (1), the basic thing to think about is how to maintain API compatibility - if we expose the options through the API, how do we deal with future moves, removals, renames, and changing semantics of those config options.

Mark.
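For readers who have not hit it, the failure mode under discussion looks roughly like this. A minimal sketch, assuming oslo.config's ConfigOpts/ConfigFilter API; the option names mirror the Neutron case, and the exact exception raised depends on the oslo.config version.

    from oslo_config import cfg, cfgfilter

    conf = cfg.ConfigOpts()
    # Application-level option, registered on the unfiltered config object.
    conf.register_opt(cfg.StrOpt('state_path', default='/var/lib/neutron'))
    conf([])  # parse an empty command line so options become readable

    # A library wraps the config object in a filter so that the options it
    # registers stay out of the application's public namespace.
    filtered = cfgfilter.ConfigFilter(conf)
    filtered.register_opt(cfg.StrOpt('lock_path', default='$state_path/lock'))

    # The application never sees the library's option:
    #   conf.lock_path  ->  NoSuchOptError
    # and, per the bug above, resolving the library's option can fail because
    # '$state_path' has to be interpolated against options the filtered view
    # does not know about:
    print(filtered.lock_path)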
Re: [openstack-dev] Thoughts on OpenStack Layers and a Big Tent model
Hey On Thu, 2014-09-18 at 11:53 -0700, Monty Taylor wrote: Hey all, I've recently been thinking a lot about Sean's Layers stuff. So I wrote a blog post which Jim Blair and Devananda were kind enough to help me edit. http://inaugust.com/post/108 Lots of great stuff here, but too much to respond to everything in detail. I love the way you've framed this in terms of the needs of developers, distributors, deployers and end users. I'd like to make sure we're focused on tackling those places where we're failing these groups, so: - Developers I think we're catering pretty well to developers with the big tent concept of Programs. There's been some good discussion about how Programs could be better at embracing projects in their related area, and that would be great to pursue. But the general concept - of endorsing and empowering teams of people collaborating in the OpenStack way - has a lot of legs, I think. I also think our release cadence does a pretty good job of serving developers. We've talked many times about the benefit of it, and I'd like to see it applied to more than just the server projects. OTOH, the integrated gate is straining, and a source of frustration for everyone. You raise the question of whether everything currently in the integrated release needs to be co-gated, and I totally agree that needs re-visiting. - Distributors We may be doing a better job of catering to distributors than any other group. For example, things like the release cadence, stable branch and common set of dependencies works pretty well. . The concept of an integrated release (with an incubation process) is great, because it nicely defines a set of stuff that distributors should ship. Certainly, life would be more difficult for distributors if there was a smaller set of projects in the release and a whole bunch of other projects which are interesting to distro users, but with an ambiguous level of commitment from our community. Right now, our integration process has a huge amount of influence over what gets shipped by distros, and that in turn serves distro users by ensuring a greater level of commonality between distros. - Operators I think the feedback we've been getting over the past few cycles suggests we are failing this group the most. Operators want to offer a compelling set of services to their users, but they want those services to be stable, performant, and perhaps most importantly, cost-effective. No operator wants to have to invest a huge amount of time in getting a new service running. You suggest a Production Ready tag. Absolutely - our graduation of projects has been interpreted as meaning production ready, when it's actually more useful as a signal to distros rather than operators. Graduation does not necessarily imply that a service is ready for production, no matter how you define production. I'd like to think we could give more nuanced advice to operators than a simple tag, but perhaps the information gathering process that projects would need to go through to be awarded that tag would uncover the more detailed information for operators. You could question whether the TC is the right body for this process. How might it work if the User Committee owned this? There are many other ways we can and should help operators, obviously, but this setting expectations is the aspect most relevant to this discussion. - End users You're right that we don't pay sufficient attention to this group. For me, the highest priority challenge here is interoperability. 
Particularly interoperability between public clouds. The only real interop effort to date you can point to is the board-owned DefCore and RefStack efforts. The idea being that a trademark program with API testing requirements will focus minds on interoperability. I'd love us (as a community) to be making more rapid progress on interoperability, but at least there are no encouraging signs that we should make some definite progress soon. Your end-user focused concrete suggestions (#7-#10) are interesting, and I find myself thinking about how much of a positive effect on interop each of them would have. For example, making our tools multi-cloud aware would help encourage people to demand interop from their providers. I also agree that end-user tools should support older versions of our APIs, but don't think that necessarily implies rolling releases. So, if I was to pick the areas which I think would address our most pressing challenges: 1) Shrinking the integrated gate, and allowing per-project testing strategies other than shoving every integrated project into the gate. 2) Giving more direction to operators about the readiness of our projects for different use cases. A process around awarding
Re: [openstack-dev] [Zaqar] Zaqar graduation (round 2) [was: Comments on the concerns arose during the TC meeting]
On Wed, 2014-09-10 at 14:51 +0200, Thierry Carrez wrote: Flavio Percoco wrote: [...] Based on the feedback from the meeting[3], the current main concern is: - Do we need a messaging service with a feature-set akin to SQS+SNS? [...] I think we do need, as Samuel puts it, some sort of durable message-broker/queue-server thing. It's a basic application building block. Some claim it's THE basic application building block, more useful than database provisioning. It's definitely a layer above pure IaaS, so if we end up splitting OpenStack into layers this clearly won't be in the inner one. But I think IaaS+ basic application building blocks belong in OpenStack one way or another. That's the reason I supported Designate (everyone needs DNS) and Trove (everyone needs DBs). With that said, I think yesterday there was a concern that Zaqar might not fill the some sort of durable message-broker/queue-server thing role well. The argument goes something like: if it was a queue-server then it should actually be built on top of Rabbit; if it was a message-broker it should be built on top of postfix/dovecot; the current architecture is only justified because it's something in between, so it's broken. I guess I don't mind that much zaqar being something in between: unless I misunderstood, exposing extra primitives doesn't prevent the queue-server use case from being filled. Even considering the message-broker case, I'm also not convinced building it on top of postfix/dovecot would be a net win compared to building it on top of Redis, to be honest. AFAICT, this part of the debate boils down to the following argument: If Zaqar implemented messaging-as-a-service with only queuing semantics (and no random access semantics), it's design would naturally be dramatically different and simply implement a multi-tenant REST API in front of AMQP queues like this: https://www.dropbox.com/s/yonloa9ytlf8fdh/ZaqarQueueOnly.png?dl=0 and that this architecture would allow for dramatically improved throughput for end-users while not making the cost of providing the service prohibitive to operators. You can't dismiss that argument out-of-hand, but I wonder (a) whether the claimed performance improvement is going to make a dramatic difference to the SQS-like use case and (b) whether backing this thing with an RDBMS and multiple highly available, durable AMQP broker clusters is going to be too much of a burden on operators for whatever performance improvements it does gain. But the troubling part of this debate is where we repeatedly batter the Zaqar team with hypotheses like these and appear to only barely entertain their carefully considered justification for their design decisions like: https://wiki.openstack.org/wiki/Frequently_asked_questions_%28Zaqar%29#Is_Zaqar_a_provisioning_service_or_a_data_API.3F https://wiki.openstack.org/wiki/Frequently_asked_questions_%28Zaqar%29#What_messaging_patterns_does_Zaqar_support.3F I would like to see an SQS-like API provided by OpenStack, I accept the reasons for Zaqar's design decisions to date, I respect that those decisions were made carefully by highly competent members of our community and I expect Zaqar to evolve (like all projects) in the years ahead based on more real-world feedback, new hypotheses or ideas, and lessons learned from trying things out. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting
On Wed, 2014-09-10 at 12:46 -0700, Monty Taylor wrote: On 09/09/2014 07:04 PM, Samuel Merritt wrote: On 9/9/14, 4:47 PM, Devananda van der Veen wrote: The questions now before us are: - should OpenStack include, in the integrated release, a messaging-as-a-service component? I certainly think so. I've worked on a few reasonable-scale web applications, and they all followed the same pattern: HTTP app servers serving requests quickly, background workers for long-running tasks, and some sort of durable message-broker/queue-server thing for conveying work from the first to the second. A quick straw poll of my nearby coworkers shows that every non-trivial web application that they've worked on in the last decade follows the same pattern. While not *every* application needs such a thing, web apps are quite common these days, and Zaqar satisfies one of their big requirements. Not only that, it does so in a way that requires much less babysitting than run-your-own-broker does. Right. But here's the thing. What you just described is what we all thought zaqar was aiming to be in the beginning. We did not think it was a GOOD implementation of that, so while we agreed that it would be useful to have one of those, we were not crazy about the implementation. Those generalizations are uncomfortably sweeping. What Samuel just described is one of the messaging patterns that Zaqar implements and some (members of the TC?) believed that this messaging pattern was the only pattern that Zaqar aimed to implement. Some (members of the TC?) formed strong, negative opinions about how this messaging pattern was implemented, but some/all of those same people agreed a messaging API implementing those semantics would be a useful thing to have. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] libvirt version_cap, a postmortem
Hey The libvirt version_cap debacle continues to come up in conversation and one perception of the whole thing appears to be: A controversial patch was ninjaed by three Red Hat nova-cores and then the same individuals piled on with -2s when a revert was proposed to allow further discussion. I hope it's clear to everyone why that's a pretty painful thing to hear. However, I do see that I didn't behave perfectly here. I apologize for that. In order to understand where this perception came from, I've gone back over the discussions spread across gerrit and the mailing list in order to piece together a precise timeline. I've appended that below. Some conclusions I draw from that tedious exercise: - Some people came at this from the perspective that we already have a firm, unwritten policy that all code must have functional written tests. Others see that test all the things is interpreted as a worthy aspiration, but is only one of a number of nuanced factors that needs to be taken into account when considering the addition of a new feature. i.e. the former camp saw Dan Smith's devref addition as attempting to document an existing policy (perhaps even a more forgiving version of an existing policy), whereas other see it as a dramatic shift to a draconian implementation of test all the things. - Dan Berrange, Russell and I didn't feel like we were ninjaing a controversial patch - you can see our perspective expressed in multiple places. The patch would have helped the live snapshot issue, and has other useful applications. It does not affect the broader testing debate. Johannes was a solitary voice expressing concerns with the patch, and you could see that Dan was particularly engaged in trying to address those concerns and repeating his feeling that the patch was orthogonal to the testing debate. That all being said - the patch did merge too quickly. - What exacerbates the situation - particularly when people attempt to look back at what happened - is how spread out our conversations are. You look at the version_cap review and don't see any of the related discussions on the devref policy review nor the mailing list threads. Our disjoint methods of communicating contribute to misunderstandings. - When it came to the revert, a couple of things resulted in misunderstandings, hurt feelings and frayed tempers - (a) that our retrospective veto revert policy wasn't well understood and (b) a feeling that there was private, in-person grumbling about us at the mid-cycle while we were absent, with no attempt to talk to us directly. To take an even further step back - successful communities like ours require a huge amount of trust between the participants. Trust requires communication and empathy. If communication breaks down and the pressure we're all under erodes our empathy for each others' positions, then situations can easily get horribly out of control. This isn't a pleasant situation and we should all strive for better. However, I tend to measure our flamewars against this: https://mail.gnome.org/archives/gnome-2-0-list/2001-June/msg00132.html GNOME in June 2001 was my introduction to full-time open-source development, so this episode sticks out in my mind. The two individuals in that email were/are immensely capable and reasonable people, yet ... So far, we're doing pretty okay compared to that and many other open-source flamewars. Let's make sure we continue that way by avoiding letting situations fester. Thanks, and sorry for being a windbag, Mark. 
--- = July 1 = The starting point is this review: https://review.openstack.org/103923 Dan Smith proposes a policy that the libvirt driver may not use libvirt features until they have been available in Ubuntu or Fedora for at least 30 days. The commit message mentions: broken us in the past when we add a new feature that requires a newer libvirt than we test with, and we discover that it's totally broken when we upgrade in the gate. which AIUI is a reference to the libvirt live snapshot issue the previous week, which is described here: https://review.openstack.org/102643 where upgrading to Ubuntu Trusty meant the libvirt version in use in the gate went from 0.9.8 to 1.2.2, which caused the live snapshot code paths in Nova for the first time, which appeared to be related to some serious gate instability (although the exact root cause wasn't identified). Some background on the libvirt version upgrade can be seen here: http://lists.openstack.org/pipermail/openstack-dev/2014-March/thread.html#30284 = July 1 - July 8 = Back and forth debate mostly between Dan Smith and Dan Berrange. Sean votes +2, Dan Berrange votes -2. = July 14 = Russell adds his support to Dan Berrange's position, votes -2. Some debate between Dan and Dan continues. Joe Gordon votes +2. Matt Riedemann expresses support-in-principal for
Re: [openstack-dev] [oslo.messaging] Request to include AMQP 1.0 support in Juno-3
On Thu, 2014-08-28 at 13:24 +0200, Flavio Percoco wrote: On 08/27/2014 03:35 PM, Ken Giusti wrote: Hi All, I believe Juno-3 is our last chance to get this feature [1] included into olso.messaging. I honestly believe this patch is about as low risk as possible for a change that introduces a whole new transport into oslo.messaging. The patch shouldn't affect the existing transports at all, and doesn't come into play unless the application specifically turns on the new 'amqp' transport, which won't be the case for existing applications. The patch includes a set of functional tests which exercise all the messaging patterns, timeouts, and even broker failover. These tests do not mock out any part of the driver - a simple test broker is included which allows the full driver codepath to be executed and verified. IFAIK, the only remaining technical block to adding this feature, aside from core reviews [2], is sufficient infrastructure test coverage. We discussed this a bit at the last design summit. The root of the issue is that this feature is dependent on a platform-specific library (proton) that isn't in the base repos for most of the CI platforms. But it is available via EPEL, and the Apache QPID team is actively working towards getting the packages into Debian (a PPA is available in the meantime). In the interim I've proposed a non-voting CI check job that will sanity check the new driver on EPEL based systems [3]. I'm also working towards adding devstack support [4], which won't be done in time for Juno but nevertheless I'm making it happen. I fear that this feature's inclusion is stuck in a chicken/egg deadlock: the driver won't get merged until there is CI support, but the CI support won't run correctly (and probably won't get merged) until the driver is available. The driver really has to be merged first, before I can continue with CI/devstack development. [1] https://blueprints.launchpad.net/oslo.messaging/+spec/amqp10-driver-implementation [2] https://review.openstack.org/#/c/75815/ [3] https://review.openstack.org/#/c/115752/ [4] https://review.openstack.org/#/c/109118/ Hi Ken, Thanks a lot for your hard work here. As I stated in my last comment on the driver's review, I think we should let this driver land and let future patches improve it where/when needed. I agreed on letting the driver land as-is based on the fact that there are patches already submitted ready to enable the gates for this driver. I feel bad that the driver has been in a pretty complete state for quite a while but hasn't received a whole lot of reviews. There's a lot of promise to this idea, so it would be ideal if we could unblock it. One thing I've been meaning to do this cycle is add concrete advice for operators on the state of each driver. I think we'd be a lot more comfortable merging this in Juno if we could somehow make it clear to operators that it's experimental right now. My idea was: - Write up some notes which discusses the state of each driver e.g. - RabbitMQ - the default, used by the majority of OpenStack deployments, perhaps list some of the known bugs, particularly around HA. - Qpid - suitable for production, but used in a limited number of deployments. Again, list known issues. Mention that it will probably be removed with the amqp10 driver matures. 
  - Proton/AMQP 1.0 - experimental, in active development, will support multiple brokers and topologies, perhaps a pointer to a wiki page with the current TODO list
  - ZeroMQ - unmaintained and deprecated, planned for removal in Kilo

- Propose this addition to the API docs and ask the operators list for feedback
- Propose a patch which adds a load-time deprecation warning to the ZeroMQ driver
- Include a load-time experimental warning in the proton driver

Thoughts on that?

(I understand the ZeroMQ situation needs further discussion - I don't think that's on-topic for the thread, I was just using it as an example of what kind of advice we'd be giving in these docs)

Mark.
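A load-time warning of the kind suggested in the last bullet can be as small as a log message when the driver is instantiated; a rough sketch only, not the actual oslo.messaging code:

    import logging

    LOG = logging.getLogger(__name__)

    class ProtonDriver(object):
        """AMQP 1.0 driver (hypothetical sketch of a status warning)."""

        def __init__(self, conf, url):
            LOG.warning('The AMQP 1.0 (proton) driver is experimental and '
                        'under active development; it is not yet recommended '
                        'for production use.')
            self.conf = conf
            self.url = url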
Re: [openstack-dev] [oslo] usage patterns for oslo.config
On Tue, 2014-08-26 at 10:00 -0400, Doug Hellmann wrote: On Aug 26, 2014, at 6:30 AM, Mark McLoughlin mar...@redhat.com wrote: On Mon, 2014-08-11 at 15:06 -0400, Doug Hellmann wrote: On Aug 8, 2014, at 7:22 PM, Devananda van der Veen devananda@gmail.com wrote: On Fri, Aug 8, 2014 at 12:41 PM, Doug Hellmann d...@doughellmann.com wrote: That’s right. The preferred approach is to put the register_opt() in *runtime* code somewhere before the option will be used. That might be in the constructor for a class that uses an option, for example, as described in http://docs.openstack.org/developer/oslo.config/cfg.html#registering-options Doug Interesting. I've been following the prevailing example in Nova, which is to register opts at the top of a module, immediately after defining them. Is there a situation in which one approach is better than the other? The approach used in Nova is the “old” way of doing it. It works, but assumes that all of the application code is modifying a global configuration object. The runtime approach allows you to pass a configuration object to a library, which makes it easier to mock the configuration for testing and avoids having the configuration options bleed into the public API of the library. We’ve started using the runtime approach in new Oslo libraries that have configuration options, but changing the implementation in existing application code isn’t strictly necessary. I've been meaning to dig up some of the old threads and reviews to document how we got here. But briefly: * this global CONF variable originates from the gflags FLAGS variable in Nova before oslo.config * I was initially determined to get rid of any global variable use and did a lot of work to allow glance use oslo.config without a global variable * one example detail of this work - when you use paste.deploy to load an app, you have no ability to pass a config object through paste.deploy to the app. I wrote a little helper that used a thread-local variable to mimic this pass-through. * with glance done, I moved on to making keystone use oslo.config and initially didn't use the global variable. Then I ran into a veto from termie who felt very strongly that a global variable should be used. * in the end, I bought the argument that the use of a global variable was pretty deeply ingrained (especially in Nova) and that we should aim for consistent coding patterns across projects (i.e. Oslo shouldn't be just about shared code, but also shared patterns). The only realistic standard pattern we could hope for was the use of the global variable. * with that agreed, we reverted glance back to using a global variable and all projects followed suit * the case of libraries is different IMO - we'd be foolish to design APIs which lock us into using the global object So ... I wouldn't quite agree that this is the new way vs the old way, but I think it would be reasonable to re-open the discussion about using the global object in our applications. Perhaps, at least, we could reduce our dependence on it. The aspect I was calling “old” was the “register options at import time” pattern, not the use of a global. Whether we use a global or not, registering options at runtime in a code path that will be using them is better than relying on import ordering to ensure options are registered before they are used. I don't think I've seen code (except for obscure cases) which uses the CONF global directly (as opposed to being passed CONF as a parameter) but doesn't register the options at import time. Mark. 
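Side by side, the two registration styles being contrasted look roughly like this; a minimal sketch with an invented option name:

    from oslo_config import cfg

    _opt = cfg.StrOpt('scheduler_topic', default='scheduler')

    # Import-time style (Nova and most applications): register against the
    # global CONF object as soon as the module is loaded.
    CONF = cfg.CONF
    CONF.register_opt(_opt)

    # Runtime style (newer Oslo libraries): register on whatever config
    # object is handed in, just before the option is used, keeping the
    # option out of the library's public API and easier to mock in tests.
    class SchedulerClient(object):
        def __init__(self, conf=None):
            self.conf = conf or cfg.CONF
            self.conf.register_opt(_opt)

        @property
        def topic(self):
            return self.conf.scheduler_topic

    # Usage (after the application has called cfg.CONF(argv) to parse its
    # command line and config files):
    #   SchedulerClient().topic  ->  'scheduler'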
Re: [openstack-dev] [oslo] Launchpad tracking of oslo projects
On Fri, 2014-08-22 at 11:59 +0200, Thierry Carrez wrote: TL;DR: Let's create an Oslo projectgroup in Launchpad to track work across all Oslo libraries. In library projects, let's use milestones connected to published versions rather than the common milestones. Sounds good to me, Thierry. Thanks for the thoughtful proposal. The part about using integrated release milestones was more about highlighting that we follow a similar development model and cadence - i.e. it's helpful from a planning perspective to predict whether a given feature is likely to land in juno-1, juno-2 or juno-3. When it comes to release time, though, I'd much rather have a launchpad milestone that reflects the release itself rather than the development milestone. Sounds like we need to choose between using launchpad milestones for planning or releases, and choosing the latter makes sense to me. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] usage patterns for oslo.config
On Mon, 2014-08-11 at 15:06 -0400, Doug Hellmann wrote: On Aug 8, 2014, at 7:22 PM, Devananda van der Veen devananda@gmail.com wrote: On Fri, Aug 8, 2014 at 12:41 PM, Doug Hellmann d...@doughellmann.com wrote: That’s right. The preferred approach is to put the register_opt() in *runtime* code somewhere before the option will be used. That might be in the constructor for a class that uses an option, for example, as described in http://docs.openstack.org/developer/oslo.config/cfg.html#registering-options Doug Interesting. I've been following the prevailing example in Nova, which is to register opts at the top of a module, immediately after defining them. Is there a situation in which one approach is better than the other? The approach used in Nova is the “old” way of doing it. It works, but assumes that all of the application code is modifying a global configuration object. The runtime approach allows you to pass a configuration object to a library, which makes it easier to mock the configuration for testing and avoids having the configuration options bleed into the public API of the library. We’ve started using the runtime approach in new Oslo libraries that have configuration options, but changing the implementation in existing application code isn’t strictly necessary. I've been meaning to dig up some of the old threads and reviews to document how we got here. But briefly: * this global CONF variable originates from the gflags FLAGS variable in Nova before oslo.config * I was initially determined to get rid of any global variable use and did a lot of work to allow glance use oslo.config without a global variable * one example detail of this work - when you use paste.deploy to load an app, you have no ability to pass a config object through paste.deploy to the app. I wrote a little helper that used a thread-local variable to mimic this pass-through. * with glance done, I moved on to making keystone use oslo.config and initially didn't use the global variable. Then I ran into a veto from termie who felt very strongly that a global variable should be used. * in the end, I bought the argument that the use of a global variable was pretty deeply ingrained (especially in Nova) and that we should aim for consistent coding patterns across projects (i.e. Oslo shouldn't be just about shared code, but also shared patterns). The only realistic standard pattern we could hope for was the use of the global variable. * with that agreed, we reverted glance back to using a global variable and all projects followed suit * the case of libraries is different IMO - we'd be foolish to design APIs which lock us into using the global object So ... I wouldn't quite agree that this is the new way vs the old way, but I think it would be reasonable to re-open the discussion about using the global object in our applications. Perhaps, at least, we could reduce our dependence on it. Oh look, we have a FAQ on this: https://wiki.openstack.org/wiki/Oslo#Why_does_oslo.config_have_a_CONF_object.3F_Global_object_SUCK.21 Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] [ptls] The Czar system, or how to scale PTLs
On Fri, 2014-08-22 at 11:01 -0400, Zane Bitter wrote: I don't see that as something the wider OpenStack community needs to dictate. We have a heavyweight election process for PTLs once every cycle because that used to be the process for electing the TC. Now that it no longer serves this dual purpose, PTL elections have outlived their usefulness. If projects want to have a designated tech lead, let them. If they want to have the lead elected in a form of representative democracy, let them. But there's no need to impose that process on every project. If they want to rotate the tech lead every week instead of every 6 months, why not let them? We'll soon see from experimentation which models work. Let a thousand flowers bloom, c. I like the idea of projects being free to experiment with their governance rather than the TC mandating detailed governance models from above. But I also like the way Thierry is taking a trend we're seeing work out well across multiple projects, and generalizing it. If individual projects are to adopt explicit PTL duty delegation, then all the better if those projects adopt it in similar ways. i.e. this should turn out to be an optional best practice model that projects can choose to adopt, in much the way the *-specs repo idea took hold. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] The future of the integrated release
On Mon, 2014-08-18 at 14:23 +0200, Thierry Carrez wrote: Clint Byrum wrote: Here's why folk are questioning Ceilometer: Nova is a set of tools to abstract virtualization implementations. Neutron is a set of tools to abstract SDN/NFV implementations. Cinder is a set of tools to abstract block-device implementations. Trove is a set of tools to simplify consumption of existing databases. Sahara is a set of tools to simplify Hadoop consumption. Swift is a feature-complete implementation of object storage, none of which existed when it was started. Keystone supports all of the above, unifying their auth. Horizon supports all of the above, unifying their GUI. Ceilometer is a complete implementation of data collection and alerting. There is no shortage of implementations that exist already. I'm also core on two projects that are getting some push back these days: Heat is a complete implementation of orchestration. There are at least a few of these already in existence, though not as many as their are data collection and alerting systems. TripleO is an attempt to deploy OpenStack using tools that OpenStack provides. There are already quite a few other tools that _can_ deploy OpenStack, so it stands to reason that people will question why we don't just use those. It is my hope we'll push more into the unifying the implementations space and withdraw a bit from the implementing stuff space. So, you see, people are happy to unify around a single abstraction, but not so much around a brand new implementation of things that already exist. Right, most projects focus on providing abstraction above implementations, and that abstraction is where the real domain expertise of OpenStack should be (because no one else is going to do it for us). Every time we reinvent something, we are at larger risk because we are out of our common specialty, and we just may not be as good as the domain specialists. That doesn't mean we should never reinvent something, but we need to be damn sure it's a good idea before we do. It's sometimes less fun to piggyback on existing implementations, but if they exist that's probably what we should do. It's certainly a valid angle to evaluate projects on, but it's also easy to be overly reductive about it - e.g. that rather than re-implement virtualization management, Nova should just be a thin abstraction over vSphere, XenServer and oVirt. To take that example, I don't think we as a project should be afraid of having such discussions but it wouldn't be productive to frame that conversation as the sky is falling, Nova re-implements the wheel, we should de-integrate it. While Ceilometer is far from alone in that space, what sets it apart is that even after it was blessed by the TC as the one we should all converge on, we keep on seeing competing implementations for some (if not all) of its scope. Convergence did not happen, and without convergence we struggle in adoption. We need to understand why, and if this is fixable. Convergence did not happen is a little unfair. It's certainly a busy space, and things like Monasca and InfluxDB are new developments. I'm impressed at how hard the Ceilometer team works to embrace such developments and patiently talks through possibilities for convergence. This attitude is something we should be applauding in an integrated project. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Retrospective veto revert policy
On Tue, 2014-08-12 at 15:56 +0100, Mark McLoughlin wrote: Hey (Terrible name for a policy, I know) From the version_cap saga here: https://review.openstack.org/110754 I think we need a better understanding of how to approach situations like this. Here's my attempt at documenting what I think we're expecting the procedure to be: https://etherpad.openstack.org/p/nova-retrospective-veto-revert-policy If it sounds reasonably sane, I can propose its addition to the Development policies doc. (In the spirit of we really need to step back and laugh at ourselves sometimes ... ) Two years ago, we were worried about patches getting merged in less than 2 hours and had a discussion about imposing a minimum review time. How times have changed! Is it even possible to land a patch in less than two hours now? :) Looking back over the thread, this part stopped me in my tracks: https://lists.launchpad.net/openstack/msg08625.html On Tue, Mar 13, 2012, Mark McLoughlin markmc@xx wrote: Sometimes there can be a few folks working through an issue together and the patch gets pushed and approved so quickly that no-one else gets a chance to review. Everyone has an opportunity to review even after a patch gets merged. JE It's not quite perfect, but if you squint you could conclude that Johannes and I have both completely reversed our opinions in the intervening two years :) The lesson I take from that is to not get too caught up in the current moment. We're growing and evolving rapidly. If we assume everyone is acting in good faith, and allow each other to debate earnestly without feelings getting hurt ... we should be able to work through anything. Now, back on topic - digging through that thread, it doesn't seem we settled on the idea of we can just revert it later if someone has an objection in this thread. Does anyone recall when that idea first came up? Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Retrospective veto revert policy
On Tue, 2014-08-12 at 15:56 +0100, Mark McLoughlin wrote: Hey (Terrible name for a policy, I know) From the version_cap saga here: https://review.openstack.org/110754 I think we need a better understanding of how to approach situations like this. Here's my attempt at documenting what I think we're expecting the procedure to be: https://etherpad.openstack.org/p/nova-retrospective-veto-revert-policy If it sounds reasonably sane, I can propose its addition to the Development policies doc. Proposed here: https://review.openstack.org/114188 Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] The future of the integrated release
On Tue, 2014-08-05 at 18:03 +0200, Thierry Carrez wrote: Hi everyone, With the incredible growth of OpenStack, our development community is facing complex challenges. How we handle those might determine the ultimate success or failure of OpenStack. With this cycle we hit new limits in our processes, tools and cultural setup. This resulted in new limiting factors on our overall velocity, which is frustrating for developers. This resulted in the burnout of key firefighting resources. This resulted in tension between people who try to get specific work done and people who try to keep a handle on the big picture. Always fun catching up on threads like this after being away ... :) I think the thread has revolved around three distinct areas: 1) The per-project review backlog, its implications for per-project velocity, and ideas for new workflows or tooling 2) Cross-project scaling issues that get worse as we add more integrated projects 3) The factors that go into deciding whether a project belongs in the integrated release - including the appropriateness of its scope, the soundness of its architecture and how production ready it is. The first is important - hugely important - but I don't think it has any bearing on the makeup, scope or contents of the integrated release, but certainly will have a huge bearing on the success of the release and the project more generally. The third strikes me as a part of the natural evolution around how we think about the integrated release. I don't think there's any particular crisis or massive urgency here. As the TC considers proposals to integrate (or de-integrate) projects, we'll continue to work through this. These debates are contentious enough that we should avoid adding unnecessary drama to them by conflating the issues with more pressing, urgent issues. I think the second area is where we should focus. We're concerned that we're hitting a breaking point with some cross-project issues - like release management, the gate, a high level of non-deterministic test failures, insufficient cross-project collaboration on technical debt (e.g. via Oslo), difficulty in reaching consensus on new cross-project initiatives (Sean gave the examples of Group Based Policy and Rally) - such that drastic measures are required. Like maybe we should not accept any new integrated projects in this cycle while we work through those issues. Digging deeper into that means itemizing these cross-project scaling issues, figuring out which of them need drastic intervention, discussing what the intervention might be and the realistic overall effects of those interventions. AFAICT, the closest we've come in the thread to that level of detail is Sean's email here: http://lists.openstack.org/pipermail/openstack-dev/2014-August/042277.html Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] The future of the integrated release
On Thu, 2014-08-07 at 09:30 -0400, Sean Dague wrote: While I definitely think re-balancing our quality responsibilities back into the projects will provide an overall better release, I think it's going to take a long time before it lightens our load to the point where we get more breathing room again. I'd love to hear more about this re-balancing idea. It sounds like we have some concrete ideas here and we're saying they're not relevant to this thread because they won't be an immediate solution? This isn't just QA issues, it's a coordination issue on overall consistency across projects. Something that worked fine at 5 integrated projects, got strained at 9, and I think is completely untenable at 15. I can certainly relate to that from experience with Oslo. But if you take a concrete example - as more new projects emerged, it became harder to get them all using oslo.messaging and using it in consistent ways. That's become a lot better with Doug's idea of Oslo project delegates. But if we had not added those projects to the release, the only reason that the problem would be more manageable is that the use of oslo.messaging would effectively become a requirement for integration. So, projects requesting integration have to take cross-project responsibilities more seriously for fear their application would be denied. That's a very sad conclusion. Our only tool for encouraging people to take this cross-project issue seriously is being accepted into the release and, once achieved, the cross-project responsibilities aren't taken so seriously? I don't think it's as bleak as that - given the proper support, direction and tracking, I think we're seeing in Oslo how projects will play their part in getting to cross-project consistency. I think one of the big issues with a large number of projects is that the implications of one project's implementation decisions impact others, but people don't always realize it. Locally correct decisions for each project may not be globally correct for OpenStack. The GBP discussion, the Rally discussion, all are flavors of this. I think we need two things here - good examples of how these cross-project initiatives can succeed so people can learn from them, and for the initiatives themselves to be patiently led by those whose goal is a cross-project solution. It's hard work, absolutely no doubt. The point again, though, is that it is possible to do this type of work in such a way that once a small number of projects adopt the approach, most of the others will follow quite naturally. If I was trying to get a consistent cross-project approach in a particular area, the least of my concerns would be whether Ironic, Marconi, Barbican or Designate would be willing to fall in line behind a cross-project consensus. People are frustrated by infra load, for instance. It's probably worth noting that the 'config' repo currently has more commits landed than any other project in OpenStack besides 'nova' in this release. It has 30% of the core team size of Nova (http://stackalytics.com/?metric=commits). Yes, infra is an extremely busy project. I'm not sure I'd compare infra/config commits to Nova commits in order to illustrate that, though. Infra is a massive endeavor, it's as critical a part of the project as any project in the integrated release, and like other strategic efforts it struggles to attract contributors from as diverse a range of companies as the integrated projects.
So I do think we need to really think about what *must* be in OpenStack for it to be successful, and ensure that story is well thought out, and that the pieces which provide those features in OpenStack are clearly best of breed, so they are deployed in all OpenStack deployments, and can be counted on by users of OpenStack. I do think we try hard to think this through, but no doubt we need to do better. Is this conversation concrete enough to really move our thinking along sufficiently, though? Because if every version of OpenStack deploys with a different Auth API (an example that's current but going away), we can't grow an ecosystem of tools around it. There's a nice concrete example, but it's going away? What's the best current example to talk through? This is organic definition of OpenStack through feedback with operators and developers on what's minimum needed and currently working well enough that people are happy to maintain it. And make that solid. Having a TC that is independently selected separate from the PTLs allows that group to try to make some holistic calls here. At the end of the day, that's probably going to mean saying No to more things. Every time I turn around everyone wants the TC to say No to things, just not to their particular thing. :) Which is human nature. But I think if we don't start saying No to more things we're going to end up with a pile of mud that no one is happy with. That we're being so abstract about all of this is frustrating. I get that no-one wants to start a
Re: [openstack-dev] [all] The future of the integrated release
On Fri, 2014-08-08 at 15:36 -0700, Devananda van der Veen wrote: On Tue, Aug 5, 2014 at 10:02 AM, Monty Taylor mord...@inaugust.com wrote: Yes. Additionally, and I think we've been getting better at this in the 2 cycles that we've had an all-elected TC, I think we need to learn how to say no on technical merit - and we need to learn how to say "thank you for your effort, but this isn't working out". Breaking up with someone is hard to do, but sometimes it's best for everyone involved. I agree. The challenge is scaling the technical assessment of projects. We're all busy, and digging deeply enough into a new project to make an accurate assessment of it is time consuming. Sometimes, there are impartial subject-matter experts who can spot problems very quickly, but how do we actually gauge fitness? Yes, it's important the TC does this and it's obvious we need to get a lot better at it. The Marconi architecture threads are an example of us trying harder (and kudos to you for taking the time), but it's a little disappointing how it has turned out. On the one hand there's what seems like a "this doesn't make any sense" gut feeling and on the other hand an earnest, but hardly bite-sized, justification for how the API was chosen and how it led to the architecture. It's frustrating that this appears not to be resulting in either improved shared understanding or improved architecture. Yet everyone is trying really hard. Letting the industry field-test a project and feed their experience back into the community is a slow process, but that is the best measure of a project's success. I seem to recall this being an implicit expectation a few years ago, but haven't seen it discussed in a while. I think I recall us discussing a "must have feedback that it's successfully deployed" requirement in the last cycle, but we recognized that deployers often wait until a project is integrated. I'm not suggesting we make a policy of it, but if, after a few cycles, a project is still not meeting the needs of users, I think that's a very good reason to free up the hold on that role within the stack so other projects can try and fill it (assuming that is even a role we would want filled). I'm certainly not against discussing de-integration proposals. But I could imagine a case for de-integrating every single one of our integrated projects. None of our software is perfect. How do we make sure we approach this sanely, rather than run the risk of someone starting a witch hunt because of a particular pet peeve? I could imagine a really useful dashboard showing the current state of projects along a bunch of different lines - summary of latest deployments data from the user survey, links to known scalability issues, limitations that operators should take into account, some capturing of trends so we know whether things are improving. All of this data would be useful to the TC, but also hugely useful to operators. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] The future of the integrated release
On Tue, 2014-08-12 at 14:26 -0400, Eoghan Glynn wrote: It seems like this is exactly what the slots give us, though. The core review team picks a number of slots indicating how much work they think they can actually do (less than the available number of blueprints), and then blueprints queue up to get a slot based on priorities and turnaround time and other criteria that try to make slot allocation fair. By having the slots, not only is the review priority communicated to the review team, it is also communicated to anyone watching the project. One thing I'm not seeing shine through in this discussion of slots is whether any notion of individual cores, or small subsets of the core team with aligned interests, can champion blueprints that they have a particular interest in. For example it might address some pain-point they've encountered, or impact on some functional area that they themselves have worked on in the past, or line up with their thinking on some architectural point. But for whatever motivation, such small groups of cores currently have the freedom to self-organize in a fairly emergent way and champion individual BPs that are important to them, simply by *independently* giving those BPs review attention. Whereas under the slots initiative, presumably this power would be subsumed by the group will, as expressed by the prioritization applied to the holding pattern feeding the runways? I'm not saying this is good or bad, just pointing out a change that we should have our eyes open to. Yeah, I'm really nervous about that aspect. Say a contributor proposes a new feature, a couple of core reviewers think it's important and exciting enough for them to champion it, but somehow the 'group will' is that it's not a high enough priority for this release, even if everyone agrees that it is actually cool and useful. What does imposing that 'group will' on the two core reviewers and the contributor achieve? That the contributor and reviewers will happily turn their attention to some of the higher priority work? Or we lose a contributor and two reviewers because they feel disenfranchised? Probably somewhere in the middle. On the other hand, what happens if work proceeds ahead even if not deemed a high priority? I don't think we can say that the contributor and two core reviewers were distracted from higher priority work, because blocking this work is probably unlikely to shift their focus in a productive way. Perhaps other reviewers are distracted because they feel the work needs more oversight than just the two core reviewers? It places more of a burden on the gate? I dunno ... the consequences of imposing group will worry me more than the consequences of allowing small groups to self-organize like this. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] The future of the integrated release
On Tue, 2014-08-12 at 14:12 -0700, Joe Gordon wrote: Here is the full nova proposal on Blueprint in Kilo: Runways and Project Priorities https://review.openstack.org/#/c/112733/ http://docs-draft.openstack.org/33/112733/4/check/gate-nova-docs/5f38603/doc/build/html/devref/runways.html Thanks again for doing this. Four points in the discussion jump out at me. Let's see if I can paraphrase without misrepresenting :) - ttx - we need tools to be able to visualize these runways - danpb - the real problem here is that we don't have good tools to help reviewers maintain a todo list which feeds, in part, off blueprint prioritization - eglynn - what are the implications for our current ability for groups within the project to self-organize? - russellb - why is this different from reviewers sponsoring blueprints, and how will it work better? I've been struggling to articulate a tooling idea for a while now. Let me try again based on the runways idea and the thoughts above ... When a reviewer sits down to do some reviews, their goal should be to work through the small number of runways they're signed up to and drive the list of reviews that need their attention to zero. Reviewers should be able to create their own runways and allow others to sign up to them. The reviewers responsible for that runway are responsible for pulling new reviews from explicitly defined feeder runways. Some feeder runways could be automated; no more than a search query for, say, new libvirt patches which aren't already in the libvirt driver runway. All of this activity should be visible to everyone. It should be possible to look at all the runways, see what runways a patch is in, understand the flow between runways, etc. There's a lot of detail that would have to be worked out, but I'm pretty convinced there's an opportunity to carve up the review backlog, empower people to help out with managing the backlog, give reviewers manageable queues for them to stay on top of, help ensure that project prioritization is one of the drivers of reviewer activity, and increase contributor visibility into how decisions are made. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
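To make the runway/feeder idea above slightly more concrete, here is a rough sketch of how a runway might be modelled. None of these class names, methods or query strings exist in any real tool - they are purely illustrative:

    # Purely illustrative sketch of the runway/feeder idea; not a proposal
    # for an actual implementation.
    class Runway(object):
        def __init__(self, name, reviewers, feeder_query=None, max_size=10):
            self.name = name
            self.reviewers = reviewers        # who has signed up to this runway
            self.feeder_query = feeder_query  # optional automated feeder
            self.max_size = max_size          # keep the queue manageable
            self.reviews = []

        def pull_from_feeder(self, gerrit):
            # e.g. feeder_query could be something like
            # 'project:openstack/nova file:^nova/virt/libvirt/.* status:open'
            if not self.feeder_query:
                return
            for change in gerrit.query(self.feeder_query):
                if len(self.reviews) >= self.max_size:
                    break
                if change not in self.reviews:
                    self.reviews.append(change)

The point of the sketch is only that each runway is a small, bounded queue with named owners, and that feeder queries do the triage work automatically where possible.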
Re: [openstack-dev] [nova] so what do i do about libvirt-python if i'm on precise?
On Wed, 2014-08-13 at 10:26 +0100, Daniel P. Berrange wrote: On Tue, Aug 12, 2014 at 10:09:52PM +0100, Mark McLoughlin wrote: On Wed, 2014-07-30 at 15:34 -0700, Clark Boylan wrote: On Wed, Jul 30, 2014, at 03:23 PM, Jeremy Stanley wrote: On 2014-07-30 13:21:10 -0700 (-0700), Joe Gordon wrote: While forcing people to move to a newer version of libvirt is doable on most environments, do we want to do that now? What is the benefit of doing so? [...] The only dog I have in this fight is that using the split-out libvirt-python on PyPI means we finally get to run Nova unit tests in virtualenvs which aren't built with system-site-packages enabled. It's been a long-running headache which I'd like to see eradicated everywhere we can. I understand though if we have to go about it more slowly, I'm just excited to see it finally within our grasp. -- Jeremy Stanley We aren't quite forcing people to move to newer versions. Only those installing nova test-requirements need newer libvirt. Yeah, I'm a bit confused about the problem here. Is it that people want to satisfy test-requirements through packages rather than using a virtualenv? (i.e. if people just use virtualenvs for unit tests, there's no problem right?) If so, is it possible/easy to create new, alternate packages of the libvirt python bindings (from PyPI) on their own separately from the libvirt.so and libvirtd packages? The libvirt python API is (mostly) automatically generated from a description of the XML that is built from the C source files. In tree we have fakelibvirt, which is a semi-crappy attempt to provide a pure python libvirt client API with the same signature. IIUC, what you are saying is that we should get a better fakelibvirt that is truly identical, with the same API coverage/signatures as real libvirt? No, I'm saying that people are installing packaged versions of recent releases of python libraries. But they're skeptical about upgrading their libvirt packages. With the work done to enable libvirt to be uploaded to PyPI, can't the two be decoupled? Can't we have packaged versions of the recent python bindings on PyPI that are independent of the base packages containing libvirt.so and libvirtd? (Or I could be completely misunderstanding the issue people are seeing) Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
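For what it's worth, the fallback pattern being discussed boils down to something like the following. The module path of the fake is an assumption for illustration, not necessarily Nova's actual layout:

    # Illustrative only - the fakelibvirt import path is an assumption.
    try:
        # the split-out libvirt-python bindings, installed into the
        # virtualenv from PyPI via test-requirements
        import libvirt
    except ImportError:
        # a pure-python stand-in with (ideally) the same API signatures
        from nova.tests.virt.libvirt import fakelibvirt as libvirt

Whether the PyPI bindings can be fully decoupled from the libvirt.so/libvirtd packages on the host is exactly the open question in the exchange above.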
Re: [openstack-dev] [nova] stable branches failure to handle review backlog
On Tue, 2014-07-29 at 14:04 +0200, Thierry Carrez wrote: Ihar Hrachyshka wrote: On 29/07/14 12:15, Daniel P. Berrange wrote: Looking at the current review backlog I think that we have to seriously question whether our stable branch review process in Nova is working to an acceptable level On Havana - 43 patches pending - 19 patches with a single +2 - 1 patch with a -1 - 0 patches with a -2 - Stalest waiting 111 days since most recent patch upload - Oldest waiting 250 days since first patch upload - 26 patches waiting more than 1 month since most recent upload - 40 patches waiting more than 1 month since first upload On Icehouse: - 45 patches pending - 17 patches with a single +2 - 4 patches with a -1 - 1 patch with a -2 - Stalest waiting 84 days since most recent patch upload - Oldest waiting 88 days since first patch upload - 10 patches waiting more than 1 month since most recent upload - 29 patches waiting more than 1 month since first upload I think those stats paint a pretty poor picture of our stable branch review process, particularly Havana. It should not take us 250 days for our review team to figure out whether a patch is suitable material for a stable branch, nor should we have nearly all the patches waiting more than 1 month in Havana. These branches are not getting sufficient reviewer attention and we need to take steps to fix that. If I had to set a benchmark, assuming CI passes, I'd expect us to either approve or reject submissions for stable within a 2 week window in the common case, 1 month at the worst case. Totally agreed. A bit of history. At the dawn of time there were no OpenStack stable branches, each distribution was maintaining its own stable branches, duplicating the backporting work. I'm not sure how much backporting was going on at the time of the Essex summit. I'm sure Ubuntu had some backports, but that was probably about it? At some point it was suggested (mostly by RedHat and Canonical folks) that there should be collaboration around that task, and the OpenStack project decided to set up official stable branches where all distributions could share the backporting work. The stable team group was seeded with package maintainers from all over the distro world. During that first design summit session, it was mainly you, me and Daviey discussing. Both you and Daviey saw this primarily as being about distros collaborating, but I never saw it that way. I don't see how any self-respecting open-source project can throw a release over the wall and have no ability to address critical bugs with that release until the next release 6 months later which will also include a bunch of new feature work with new bugs. That's not a distro maintainer point of view. At that Essex summit, we were lamenting how many critical bugs in Nova had been discovered shortly after the Diablo release. Our inability to do a bugfix release of Nova for Diablo seemed like a huge problem to me. So these branches originally only exist as a convenient place to collaborate on backporting work. This is completely separate from development work, even if these days backports are often proposed by developers themselves. The stable branch team is separate from the rest of OpenStack teams. We have always been very clear that if the stable branches are no longer maintained (i.e. if the distributions don't see the value of those anymore), then we'll consider removing them. We, as a project, only signed up to support those as long as the distros wanted them.
You can certainly argue that the project never signed up for the responsibility. I don't see it that way, but there was certainly always a debate whether this was the project taking responsibility for bugfix releases or whether it was just downstream distros collaborating. The thing about branches going away if they're not maintained isn't anything unusual. If *any* effort within the project becomes so unmaintained due to a lack of interest such that we can't stand over it, then we should consider retiring it. We have been adding new members to the stable branch teams recently, but those tend to come from development teams rather than downstream distributions, and that starts to bend the original landscape. Basically, the stable branch needs to be very conservative to be a source of safe updates -- downstream distributions understand the need to weigh the benefit of the patch vs. the disruption it may cause. Developers have another type of incentive, which is to get the fix they worked on into stable releases, without necessarily being very conservative. Adding more -core people to the stable team to compensate the absence of distro maintainers will ultimately kill those branches. That's quite a leap to say that -core team members will be so incapable of the appropriate level of conservatism that the branch will be
Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit
On Wed, 2014-08-13 at 12:05 -0700, James E. Blair wrote: cor...@inaugust.com (James E. Blair) writes: Sean Dague s...@dague.net writes: This has all gone far enough that someone actually wrote a Grease Monkey script to purge all the 3rd Party CI content out of the Jenkins UI. People are writing mail filters to dump all the notifications. Dan Berrange filters them all out of his gerrit query tools. I should also mention that there is a pending change to do something similar via site-local Javascript in our Gerrit: https://review.openstack.org/#/c/95743/ I don't think it's an ideal long-term solution, but if it works, we may have some immediate relief without all having to install greasemonkey scripts. You may have noticed that this has merged, along with a further change that shows the latest results in a table format. (You may need to force-reload in your browser to see the change.) Beautiful! Thank you so much to everyone involved. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On Mon, 2014-08-11 at 15:25 -0700, Joe Gordon wrote: On Sun, Aug 10, 2014 at 11:59 PM, Mark McLoughlin mar...@redhat.com wrote: On Fri, 2014-08-08 at 09:06 -0400, Russell Bryant wrote: On 08/07/2014 08:06 PM, Michael Still wrote: It seems to me that the tension here is that there are groups who would really like to use features in newer libvirts that we don't CI on in the gate. Is it naive to think that a possible solution here is to do the following: - revert the libvirt version_cap flag I don't feel strongly either way on this. It seemed useful at the time for being able to decouple upgrading libvirt and enabling features that come with that. Right, I suggested the flag as a more deliberate way of avoiding the issue that was previously seen in the gate with live snapshots. I still think it's a pretty elegant and useful little feature, and don't think we need to use it as a proxy battle over testing requirements for new libvirt features. Mark, I am not sure if I follow. The gate issue with live snapshots has been worked around by turning it off [0], so presumably this patch is forward facing. I fail to see how this patch is needed to help the gate in the future. On the live snapshot issue specifically, we disabled it by requiring 1.3.0 for the feature. With the version cap set to 1.2.2, we won't automatically enable this code path again if we update to 1.3.0. No question that's a bit of a mess, though. The point was a more general one - we learned from the live snapshot issue that having a libvirt upgrade immediately enable new code paths was a bad idea. The patch is a simple, elegant way of avoiding that. Wouldn't it just delay the issues until we change the version_cap? Yes, that's the idea. Rather than having to scramble when the new devstack-gate image shows up, we'd be able to work on any issues in the context of a patch series to bump the version_cap. The issue I see with the libvirt version_cap [1] is best captured in its commit message: The end user can override the limit if they wish to opt-in to use of untested features via the 'version_cap' setting in the 'libvirt' group. This goes against the very direction nova has been moving in for some time now. We have been moving away from merging untested (re: no integration testing) features. This patch changes the very direction the project is going in over testing without so much as a discussion. While I think it may be time that we revisited this discussion, the discussion needs to happen before any patches are merged. You put it well - some apparently see us moving towards a zero-tolerance policy of not having any code which isn't functionally tested in the gate. That obviously is not the case right now. The sentiment is great, but any zero-tolerance policy is dangerous. I'm very much in favor of discussing this further. We should have some principles and goals around this, but rather than argue this in the abstract we should be open to discussing the tradeoffs involved with individual patches. I am less concerned about the contents of this patch, and more concerned with how such a big de facto change in nova policy (we accept untested code sometimes) was made without any discussion or consensus. In your comment on the revert [2], you say the 'whether not-CI-tested features should be allowed to be merged' debate is 'clearly unresolved.' How did you get to that conclusion? This was never brought up in the mid-cycles as an unresolved topic to be discussed.
In our specs template we say Is this untestable in gate given current limitations (specific hardware / software configurations available)? If so, are there mitigation plans (3rd party testing, gate enhancements, etc) [3]. We have been blocking untested features for some time now. Asking is this tested in a spec template makes a tonne of sense. Requiring some thought to be put into mitigation where a feature is untestable in the gate makes sense. Requiring that the code is tested where possible makes sense. It's a zero-tolerance get your code functionally tested or GTFO policy that I'm concerned about. I am further perplexed by what Daniel Berrange, the patch author, meant when he commented [2] Regardless of the outcome of the testing discussion we believe this is a useful feature to have. Who is 'we'? Because I don't see how that can be nova-core or even nova-specs-core, especially considering how many members of those groups are +2 on the revert. So if 'we' is neither of those groups then who is 'we'? That's for Dan to answer, but I think you're either nitpicking or have a very serious concern. If nitpicking, Dan could just be using the Royal 'We' :) Or he could just mean
[openstack-dev] [nova] Retrospective veto revert policy
Hey (Terrible name for a policy, I know) From the version_cap saga here: https://review.openstack.org/110754 I think we need a better understanding of how to approach situations like this. Here's my attempt at documenting what I think we're expecting the procedure to be: https://etherpad.openstack.org/p/nova-retrospective-veto-revert-policy If it sounds reasonably sane, I can propose its addition to the Development policies doc. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] so what do i do about libvirt-python if i'm on precise?
On Wed, 2014-07-30 at 15:34 -0700, Clark Boylan wrote: On Wed, Jul 30, 2014, at 03:23 PM, Jeremy Stanley wrote: On 2014-07-30 13:21:10 -0700 (-0700), Joe Gordon wrote: While forcing people to move to a newer version of libvirt is doable on most environments, do we want to do that now? What is the benefit of doing so? [...] The only dog I have in this fight is that using the split-out libvirt-python on PyPI means we finally get to run Nova unit tests in virtualenvs which aren't built with system-site-packages enabled. It's been a long-running headache which I'd like to see eradicated everywhere we can. I understand though if we have to go about it more slowly, I'm just excited to see it finally within our grasp. -- Jeremy Stanley We aren't quite forcing people to move to newer versions. Only those installing nova test-requirements need newer libvirt. Yeah, I'm a bit confused about the problem here. Is it that people want to satisfy test-requirements through packages rather than using a virtualenv? (i.e. if people just use virtualenvs for unit tests, there's no problem right?) If so, is it possible/easy to create new, alternate packages of the libvirt python bindings (from PyPI) on their own separately from the libvirt.so and libvirtd packages? Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Nominating Jay Pipes for nova-core
On Wed, 2014-07-30 at 14:02 -0700, Michael Still wrote: Greetings, I would like to nominate Jay Pipes for the nova-core team. Jay has been involved with nova for a long time now. He's previously been a nova core, as well as a glance core (and PTL). He's been around so long that there are probably other types of core status I have missed. Please respond with +1s or any concerns. Was away, but +1 for the record. Would have been happy to see this some time ago. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On Fri, 2014-08-08 at 09:06 -0400, Russell Bryant wrote: On 08/07/2014 08:06 PM, Michael Still wrote: It seems to me that the tension here is that there are groups who would really like to use features in newer libvirts that we don't CI on in the gate. Is it naive to think that a possible solution here is to do the following: - revert the libvirt version_cap flag I don't feel strongly either way on this. It seemed useful at the time for being able to decouple upgrading libvirt and enabling features that come with that. Right, I suggested the flag as a more deliberate way of avoiding the issue that was previously seen in the gate with live snapshots. I still think it's a pretty elegant and useful little feature, and don't think we need to use it as proxy battle over testing requirements for new libvirt features. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
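For context, the flag being discussed is essentially a ceiling applied when deciding whether a libvirt-version-gated code path may be used, so that upgrading libvirtd on the host doesn't silently flip behaviour. A simplified sketch of the idea follows - this is not the actual Nova driver code, just an illustration:

    # Simplified sketch of the version_cap idea; not Nova's actual code.
    def _version_to_int(ver):
        major, minor, micro = ver
        return major * 1000000 + minor * 1000 + micro

    def has_min_version(conn_version, minimum, version_cap=None):
        # conn_version: integer version reported by the libvirt connection
        # minimum: (major, minor, micro) the feature needs
        # version_cap: operator-configured ceiling, or None for no cap
        effective = conn_version
        if version_cap is not None:
            effective = min(conn_version, _version_to_int(version_cap))
        return effective >= _version_to_int(minimum)

    # e.g. only use live snapshots once libvirt >= 1.3.0 *and* the
    # deployment has opted in by raising (or unsetting) the cap
    MIN_LIBVIRT_LIVESNAPSHOT_VERSION = (1, 3, 0)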
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On Thu, 2014-07-17 at 09:58 +0100, Daniel P. Berrange wrote: On Thu, Jul 17, 2014 at 08:46:12AM +1000, Michael Still wrote: Top posting to the original email because I want this to stand out... I've added this to the agenda for the nova mid cycle meetup, I think most of the contributors to this thread will be there. So, if we can nail this down here then that's great, but if we think we'd be more productive in person chatting about this then we have that option too. FYI, I'm afraid I won't be at the mid-cycle meetup since it clashed with my being on holiday. So I'd really prefer if we keep the discussion on this mailing list where everyone has a chance to participate. Same here. Pre-arranged vacation, otherwise I'd have been there. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On Wed, 2014-07-16 at 16:15 +0200, Sean Dague wrote: .. Based on these experiences, libvirt version differences seem to be as substantial as major hypervisor differences. There is a proposal here - https://review.openstack.org/#/c/103923/ to hold newer versions of libvirt to the same standard we hold xen, vmware, hyperv, docker, ironic, etc. That's a bit of a mis-characterization - in terms of functional test coverage, the libvirt driver is the bar that all the other drivers struggle to meet. And I doubt any of us pay too close attention to the feature coverage that the 3rd party CI test jobs have. I'm somewhat concerned that the -2 pile on in this review is a double standard of libvirt features, and features exploiting really new upstream features. I feel like a lot of the language being used here about the burden of doing this testing is exactly the same as was presented by the docker team before their driver was removed, which was ignored by the Nova team at the time. Personally, I wasn't very comfortable with the docker driver move. It certainly gave an outward impression that we're an unfriendly community. The mitigating factor was that a lot of friendly, collaborative, coaching work went on in the background for months. Expectations were communicated well in advance. Kicking the docker driver out of the tree has resulted in an uptick in the amount of work happening on it, but I suspect most people involved have a bad taste in their mouths. I guess there's incentives at play which mean they'll continue plugging away at it, but those incentives aren't always at play. It was the concern by the freebsd team, which was also ignored and they were told to go land libvirt patches instead. I'm ok with us as a project changing our mind and deciding that the test bar needs to be taken down a notch or two because it's too burdensome to contributors and vendors, but if we are doing that, we need to do it for everyone. A lot of other organizations have put a ton of time and energy into this, and are carrying a maintenance cost of running these systems to get results back in a timely basis. I don't agree that we need to apply the same rules equally to everyone. At least part of the reasoning behind the emphasis on 3rd party CI testing was that projects (Neutron in particular) were being overwhelmed by contributions to drivers from developers who never contributed in any way to the core. The corollary of that is the contributors who do contribute to the core should be given a bit more leeway in return. There's a natural building of trust and element of human relationships here. As a reviewer, you learn to trust contributors with a good track record and perhaps prioritize contributions from them. As we seem deadlocked in the review, I think the mailing list is probably a better place for this. If we want to reduce the standards for libvirt we should reconsider what's being asked of 3rd party CI teams, and things like the docker driver, as well as the A, B, C driver classification. Because clearly libvirt 1.2.5+ isn't actually class A supported. No, there are features or code paths of the libvirt 1.2.5+ driver that aren't as well tested as the class A designation implies. And we have a proposal to make sure these aren't used by default: https://review.openstack.org/107119 i.e. to stray off the class A path, an operator has to opt into it by changing a configuration option that explains they will be enabling code paths which aren't yet tested upstream. 
These features have value to some people now, they don't risk regressing the class A driver and there's a clear path to them being elevated to class A in time. We should value these contributions and nurture these contributors. Appending some of my comments from the review below. The tl;dr is that I think we're losing sight of the importance of welcoming and nurturing contributors, and valuing whatever contributions they can make. That terrifies me. Mark. --- Compared to other open source projects, we have done an awesome job in OpenStack of having good functional test coverage. Arguably, given the complexity of the system, we couldn't have got this far without it. I can take zero credit for any of it. However, not everything is tested now, nor are the tests we have foolproof. When you consider the number of configuration options we have, the supported distros, the ranges of library versions we claim to support, etc., etc., I don't think we can ever get to an "everything is tested" point. In the absence of that, I think we should aim to be more clear about what *is* tested. The config option I suggest does that, which is a big part of its merit IMHO. We've had some success with the "be nasty enough to driver contributors and they'll do what we want" approach so far, but IMHO that was an exceptional approach for an exceptional situation - drivers that were completely broken, and driver developers who didn't contribute to the core
Re: [openstack-dev] [all] Treating notifications as a contract
On Fri, 2014-07-11 at 10:04 +0100, Chris Dent wrote: On Fri, 11 Jul 2014, Lucas Alvares Gomes wrote: The data format that Ironic will send was part of the spec proposed and could have been reviewed. I think there's still time to change it tho, if you have a better format talk to Haomeng, who is the person responsible for that work in Ironic, and see if he can change it (We can put up a following patch to fix the spec with the new format as well). But we need to do this ASAP because we want to get it landed in Ironic soon. It was only after doing the work that I realized how it might be an example for the sake of this discussion. As the architecture of Ceilometer currently exists there still needs to be some measure of custom code, even if the notifications are as I described them. However, if we want to take this opportunity to move some of the smarts from Ceilometer into the Ironic code then the paste that I created might be a guide to make it possible: http://paste.openstack.org/show/86071/ So you're proposing that all payloads should contain something like:

    'events': [
        # one or more dicts with something like
        {
            # some kind of identifier for the type of event
            'class': 'hardware.ipmi.temperature',
            'type': '#thing that indicates threshold, discrete, cumulative',
            'id': 'DIMM GH VR Temp (0x3b)',
            'value': '26',
            'unit': 'C',
            'extra': { ... }
        }
    ]

i.e. a class, type, id, value, unit and a space to put additional metadata. On the subject of notifications as a contract, calling the additional metadata field 'extra' suggests to me that there are no stability promises being made about those fields. Was that intentional? However on that however, if there's some chance that a large change could happen, it might be better to wait, I don't know. Unlikely that a larger change will be made in Juno - take the small window of opportunity to rationalize Ironic's payload, IMHO. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
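The appeal of a regular structure like that, from the consuming side, is that a handler needs no per-driver knowledge. A hedged sketch of what such a consumer might look like - this is not Ceilometer's actual code, just an illustration built on the field names in the example payload above:

    # Hedged sketch - the handler is generic precisely because the payload
    # is regular; nothing here is Ceilometer's actual code.
    def handle_notification(payload):
        samples = []
        for event in payload.get('events', []):
            samples.append({
                'name': event['class'],
                'resource': event['id'],
                'value': event['value'],
                'unit': event.get('unit'),
                'kind': event['type'],
            })
        return samples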
Re: [openstack-dev] [all] Treating notifications as a contract
On Thu, 2014-07-10 at 16:21 -0400, Eoghan Glynn wrote: One of the issues that has been raised in the recent discussions with the QA team about branchless Tempest relates to some legacy defects in the OpenStack notification system. Got links to specifics? I thought the consensus was that there was a contract here which we need to maintain, so I'd be curious where that broke down. Well I could go digging in the LP fossil-record for specific bugs, but it's late, so for now I'll simply appeal to anecdata and tribal memory of ceilometer being broken by notification changes on the nova side. Versioning, and the ability to move to newer contract versions, would be good too, but in the absence of such things we should maintain backwards compat. Yes, I think that was the aspiration, but not always backed up by practice in reality. The reason I ask about specifics is to figure out which is more important - versioned payloads, or automated testing of payload format. i.e. have we been accidentally or purposefully changing the format? If the latter, would the change have warranted a new incompatible version of the payload format? Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] REST API access to configuration options
On Tue, 2014-07-15 at 08:54 +0100, Henry Nash wrote: HI As the number of configuration options increases and OpenStack installations become more complex, the chances of incorrect configuration increases. There is no better way of enabling cloud providers to be able to check the configuration state of an OpenStack service than providing a direct REST API that allows the current running values to be inspected. Having an API to provide this information becomes increasingly important for dev/ops style operation. As part of Keystone we are considering adding such an ability (see: https://review.openstack.org/#/c/106558/). However, since this is the sort of thing that might be relevant to and/or affect other projects, I wanted to get views from the wider dev audience. Any such change obviously has to take security in mind - and as the spec says, just like when we log config options, any options marked as secret will be obfuscated. In addition, the API will be protected by the normal policy mechanism and is likely in most installations to be left as admin required. And of course, since it is an extension, if a particular installation does not want to use it, they don't need to load it. Do people think this is a good idea? Useful in other projects? Concerned about the risks? I would have thought operators would be comfortable gleaning this information from the log files? Also, this is going to tell you how the API service you connected to was configured. Where there are multiple API servers, what about the others? How do operators verify all of the API servers behind a load balancer with this? And in the case of something like Nova, what about the many other nodes behind the API server? Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
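On the gleaning it from the log files point: oslo.config already lets a service dump its effective configuration at startup, which is typically how operators capture this today. Assuming the service wires it up something like the following (each project differs in exactly where it does this), the running values end up in the logs with secret options obfuscated:

    # Illustrative - assumes the service calls this at startup.
    import logging
    from oslo.config import cfg

    LOG = logging.getLogger(__name__)
    CONF = cfg.CONF

    def log_effective_config():
        # writes every registered option and its current value to the logs;
        # options registered with secret=True are obfuscated
        CONF.log_opt_values(LOG, logging.DEBUG)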
Re: [openstack-dev] REST API access to configuration options
On Tue, 2014-07-15 at 13:00 +0100, Henry Nash wrote: Mark, Thanks for your comments (as well as remarks on the WIP code-review). So clearly gathering and analysing log files is an alternative approach, perhaps not as immediate as an API call. In general, I believe that the more capability we provide via easy-to-consume APIs (with appropriate permissions) the more effective (and innovative) ways of management of OpenStack we will achieve (easier to build automated management systems). I'm skeptical - like Joe says, this is a general problem and management tooling will have generic ways of tackling this without using a REST API. In terms of multi API servers, obviously each server would respond to the API with the values it has set, so operators could check any or all of the servers, and this actually becomes more important as people distribute config files around to the various servers (since there's more chance of something getting out of sync). The fact that it only deals with API servers, and that you need to bypass the load balancer in order to iterate over all API servers, makes this of very limited use IMHO. Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] Asyncio and oslo.messaging
On Thu, 2014-07-03 at 16:27 +0100, Mark McLoughlin wrote: Hey This is an attempt to summarize a really useful discussion that Victor, Flavio and I have been having today. At the bottom are some background links - basically what I have open in my browser right now thinking through all of this. We're attempting to take baby-steps towards moving completely from eventlet to asyncio/trollius. The thinking is for Ceilometer to be the first victim. I got a little behind on this thread, but maybe it'd be helpful to summarize some things from this good discussion: - Where/when was this decided?!? Victor is working on prototyping how an OpenStack service would move to using asyncio. Whether a move to asyncio across the board makes sense - and what exactly it would look like - hasn't been *decided*. The idea is merely being explored at this point. - Is moving to asyncio really a priority compared to other things? I think Victor has made a good case on "what's wrong with eventlet"[1] and, personally, I'm excited about the prospect of the Python community more generally converging on asyncio. Understanding what OpenStack would need in order to move to asyncio will help the asyncio effort more generally. Figuring through some of this stuff is a priority for Victor and others, but no-one is saying it's an immediate priority for the whole project. - Moving from an implicitly async to an explicitly async programming model has enormous implications and we need to figure out what it means for libraries like SQLAlchemy and abstraction layers like ORMs. I think that's well understood - the topic of this thread is merely how to make a small addition to oslo.messaging (the ability to dispatch asyncio coroutines on eventlet) so that we can move on to figuring out the next piece of the puzzle. - Some people are clearly skeptical about whether asyncio is the right thing for Python generally, whether it's the right thing for OpenStack, whatever. Personally, I'm optimistic but I don't find the conversation all that interesting right now - I want to see how the prototype efforts work out before making a call about whether it's feasible and useful. - Taskflow vs asyncio - good discussion, plenty to figure out. They're mostly orthogonal concerns IMHO but *maybe* we decide adopting both makes sense and that both should be adopted together. I'd like to see more concrete examples showing taskflow vs asyncio vs taskflow/asyncio to understand better. So, the tl;dr is that lots of work remains to even begin to understand how exactly asyncio could be adopted and whether that makes sense. The thread raises some interesting viewpoints, but I don't think it moves our understanding along all that much. The initial mail was simply about unlocking one very small piece of the puzzle. Mark. [1] - http://techs.enovance.com/6562/asyncio-openstack-python3 ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] Asyncio and oslo.messaging
On Mon, 2014-07-07 at 12:48 +0200, Nikola Đipanov wrote: When I read all of this stuff and got my head around it (took some time :) ), a glaring drawback of such an approach, as I mentioned on the spec proposing it [1], is that we would not really be doing asyncio, we would just be pretending we are by using a subset of its APIs, and having all of the really important stuff for overall design of the code (code that needs to do IO in the callbacks for example) and ultimately - performance, completely unavailable to us when porting. So in Mark's example above:

    @asyncio.coroutine
    def foo(self):
        result = yield from some_async_op(...)
        return do_stuff(result)

A developer would not need to do anything that asyncio requires, like make sure that some_async_op() registers a callback with the eventloop (using for example event_loop.add_reader/writer methods) - you could just simply make it use a 'greened' call and things would continue working happily. Yes, Victor and I noticed this problem and wondered whether there was a way to e.g. turn off the monkey-patching at runtime in a single greenthread, or even just make any attempt to context switch raise an exception. i.e. a way to run the foo() coroutine above in a greenthread such that context switching is disallowed, or logged, or whatever while the function is running. The only way context switching would be allowed to happen would be if the coroutine yielded. I have a feeling this will in turn have a lot of people writing code that they don't understand, and as library writers - we are not doing an excellent job at that point. Now porting an OpenStack project to another IO library with a completely different design is a huge job and there is unlikely a single 'right' way to do it, so treat this as a discussion starter, that will hopefully give us a better understanding of the problem we are trying to tackle. So I hacked together a small POC of a different approach. In short - we actually use a real asyncio selector eventloop in a separate thread, and dispatch stuff to it when we figure out that our callback is in fact a coroutine. More will be clear from the code (warning - hacky code ahead): [2] I will probably be updating it - but if you just clone the repo, all the history is there. I wrote it without the oslo.messaging abstractions like listener and dispatcher, but it is relatively easy to see which bits of code would go in those. Several things are worth noting as you read the above. The first one is that we do not monkeypatch until we have fired off the asyncio thread (Victor correctly noticed this would be a problem in a comment on [1]). This may seem hacky (and it is) but if we decide to go further down this road - we would probably not be 'greening the world' but rather importing patched non-ported modules when we need to dispatch to them. This may sound like a big deal, and it is, but it is critical to actually running ported code in a real asyncio eventloop. I have not yet tested this further, but from briefly reading eventlet code - it seems like it should work. Another interesting problem is (as I have briefly mentioned in [1]) - what happens when we need to synchronize between eventlet-run and asyncio-run callbacks while we are in the process of porting. I don't have a good answer to that yet, but it is worth noting that the proposed approach doesn't either, and this is a thing we should have some idea about before going in with a knife.
Now for some marketing :) - I can see several advantages of such an approach, the obvious one being as stated, that we are in fact doing asyncio, so we are all in. Also as you can see [2] the implementation is far from magical - it's (surprisingly?) simple, and requires no other additional dependencies apart from trollius itself (granted greenio is not too complex either). I am sure that we would hit some other problems that were not clear from this basic POC (it was done in ~3 hours on a bus), but it seems to me that those problems will likely need to be solved anyhow if we are to port Ceilometer (or any other project) to asyncio, we will just hit them sooner this way. It was a fun approach to ponder anyway - so I am looking forward to comments and thoughts. It's an interesting idea and I'd certainly welcome a more detailed analysis of what the approach would mean for a service like Ceilometer. My instinct is that adding an additional native thread where there is only one native thread now will lead to tricky concurrency issues and a more significant change of behavior than with the greenio approach. The reason I like the greenio idea is that it allows us to make the programming model changes without very significantly changing what happens at runtime - the behavior, order of execution, concurrency concerns, etc. shouldn't be all that different. Mark. ___ OpenStack-dev mailing list
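For anyone trying to picture the separate-thread approach, its core boils down to something like the following. This is an illustration only, written in the yield-from style used elsewhere in the thread but with asyncio.run_coroutine_threadsafe(), a helper that is newer than the trollius-era API the actual POC was written against:

    # Illustration of the dedicated-asyncio-thread idea; not the POC code.
    import asyncio
    import threading

    loop = asyncio.new_event_loop()

    def _run_loop():
        asyncio.set_event_loop(loop)
        loop.run_forever()

    threading.Thread(target=_run_loop, daemon=True).start()

    @asyncio.coroutine
    def dispatch(message):
        # stand-in for a dispatcher invoking a coroutine endpoint
        yield from asyncio.sleep(0.1)
        return 'handled %s' % message

    # from the eventlet side of the house, hand the coroutine over to the
    # asyncio thread and block on the concurrent.futures.Future it returns
    future = asyncio.run_coroutine_threadsafe(dispatch('ping'), loop)
    print(future.result())

The concurrency concern raised above is visible even in this tiny sketch: anything shared between the eventlet side and the asyncio thread now needs real thread-safety, not just greenthread-safety.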
Re: [openstack-dev] [all] oslo.messaging 1.4.0.0a3 released
On Wed, 2014-07-09 at 03:53 +, Paul Michali (pcm) wrote: Mark, What is the status of adding the newer oslo.messaging releases to global requirements? I had tried to get 1.4.0.0a2 added to requirements (https://review.openstack.org/#/c/103536/), but it was failing Jenkins. Wondering how we get that version (or newer) into global requirements (some issue with pre-releases?). Yeah, I don't know what the latest status is with this beyond bandersnatch: http://lists.openstack.org/pipermail/openstack-dev/2014-July/039089.html https://review.openstack.org/#/q/project:openstack-infra/config+topic:bandersnatch,n,z https://review.openstack.org/103256 Mark. Thanks, PCM (Paul Michali) MAIL …..…. p...@cisco.com IRC ……..… pcm_ (irc.freenode.com) TW ………... @pmichali GPG Key … 4525ECC253E31A83 Fingerprint .. 307A 96BB 1A4C D2C7 931D 8D2D 4525 ECC2 53E3 1A83 On Jul 8, 2014, at 4:58 PM, Mark McLoughlin mar...@redhat.com wrote: The Oslo team is pleased to announce the release of oslo.messaging 1.4.0.0a3, another pre-release in the 1.4.0 series for oslo.messaging during the Juno cycle: https://pypi.python.org/pypi/oslo.messaging/1.4.0.0a3 oslo.messaging provides an API which supports RPC and notifications over a number of different messaging transports. Full details of the 1.4.0.0a3 release is available here: http://docs.openstack.org/developer/oslo.messaging/#a3 Please report problems using the oslo.messaging bug tracker: https://bugs.launchpad.net/oslo.messaging Thanks to all those who contributed to the release! Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] Treating notifications as a contract
On Thu, 2014-07-10 at 04:48 -0400, Eoghan Glynn wrote: TL;DR: do we need to stabilize notifications behind a versioned and discoverable contract? Folks, One of the issues that has been raised in the recent discussions with the QA team about branchless Tempest relates to some legacy defects in the OpenStack notification system. Got links to specifics? I thought the consensus was that there was a contract here which we need to maintain, so I'd be curious where that broke down. Versioning, and the ability to move to newer contract versions, would be good too, but in the absence of such things we should maintain backwards compat. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
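If the versioned-contract route were taken, the minimal form of it is probably nothing more than stamping each payload and having consumers check the stamp before parsing. A sketch of the shape only - the field names here are illustrative, not an agreed contract:

    # Sketch only - field names are illustrative, not an agreed contract.
    NOTIFICATION = {
        'event_type': 'compute.instance.create.end',
        'payload_version': '1.0',    # bumped on incompatible changes
        'payload': {
            'instance_id': 'some-uuid',
        },
    }

    def consume(notification):
        major = int(notification.get('payload_version', '1.0').split('.')[0])
        if major != 1:
            raise ValueError('unsupported payload version')
        return notification['payload']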
[openstack-dev] [all] oslo.config 1.4.0.0a2 released
The Oslo team is pleased to announce the release of oslo.config 1.4.0.0a2, another pre-release in the 1.4.0 series for oslo.config during the Juno cycle: https://pypi.python.org/pypi/oslo.config/1.4.0.0a2 oslo.config provides an API which supports parsing command line arguments and .ini style configuration files. Full details of the 1.4.0.0a2 release are available here: http://docs.openstack.org/developer/oslo.config/#a2 Please report problems using the oslo bug tracker: https://bugs.launchpad.net/oslo Thanks to all those who contributed to the release! Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
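For anyone unfamiliar with the library, typical usage looks something like this minimal example (not specific to the a2 release):

    # Minimal oslo.config usage example.
    import sys
    from oslo.config import cfg

    opts = [
        cfg.StrOpt('bind_host', default='0.0.0.0',
                   help='Address to bind the server to'),
        cfg.IntOpt('bind_port', default=9292,
                   help='Port to listen on'),
    ]

    CONF = cfg.CONF
    CONF.register_opts(opts)

    # parses --config-file / --config-dir and reads values from the
    # configured .ini style files
    CONF(sys.argv[1:], project='example')
    print(CONF.bind_host, CONF.bind_port)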
Re: [openstack-dev] Policy around Requirements Adds (was: New class of requirements for Stackforge projects)
On Mon, 2014-07-07 at 16:46 -0400, Sean Dague wrote: This thread was unfortunately hidden under a project specific tag (I have thus stripped all the tags). The crux of the argument here is the following: Is a stackforge project able to propose additions to global-requirements.txt that aren't used by any projects in OpenStack. I believe the answer is firmly *no*. global-requirements.txt provides a way for us to have a single point of vetting for requirements for OpenStack. It lets us assess licensing, maturity, current state of packaging, python3 support, all in one place. And it lets us enforce that integration of OpenStack projects all run under a well understood set of requirements. Allowing Stackforge projects to use this as their base set of dependencies, while still taking additional dependencies, makes sense to me. I don't really understand this GTFO stance. Solum wants to depend on mistralclient - that seems like a perfectly reasonable thing to want to do. And they also appear to not want to stray any further from the base set of dependencies shared by OpenStack projects - that also seems like a good thing. Now, perhaps the mechanics are tricky, and perhaps we don't want to enable Stackforge projects to do stuff like pin to a different version of SQLAlchemy, and perhaps this proposal isn't the ideal solution, and perhaps infra/others don't want to spend a lot of energy on something specifically for Stackforge projects ... but I don't see anything fundamentally wrong with what they want to do. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Policy around Requirements Adds
On Tue, 2014-07-08 at 06:26 -0400, Sean Dague wrote: On 07/08/2014 04:33 AM, Mark McLoughlin wrote: On Mon, 2014-07-07 at 16:46 -0400, Sean Dague wrote: This thread was unfortunately hidden under a project specific tag (I have thus stripped all the tags). The crux of the argument here is the following: Is a stackforge project project able to propose additions to global-requirements.txt that aren't used by any projects in OpenStack. I believe the answer is firmly *no*. global-requirements.txt provides a way for us to have a single point of vetting for requirements for OpenStack. It lets us assess licensing, maturity, current state of packaging, python3 support, all in one place. And it lets us enforce that integration of OpenStack projects all run under a well understood set of requirements. Allowing Stackforge projects use this as their base set of dependencies, while still taking additional dependencies makes sense to me. I don't really understand this GTFO stance. Solum wants to depend on mistralclient - that seems like a perfectly reasonable thing to want to do. And they also appear to not want to stray any further from the base set of dependencies shared by OpenStack projects - that also seems like a good thing. Now, perhaps the mechanics are tricky, and perhaps we don't want to enable Stackforge projects do stuff like pin to a different version of SQLalchemy, and perhaps this proposal isn't the ideal solution, and perhaps infra/others don't want to spend a lot of energy on something specifically for Stackforge projects ... but I don't see something fundamentally wrong with what they want to do. Once it's in global requirements, any OpenStack project can include it in their requirements. Modifying that file for only stackforge projects is what I'm against. If the solum team would like to write up a partial sync mechanism, that's fine. It just needs to not be impacting the enforcement mechanism we actually need for OpenStack projects. Totally agree. Solum taking a dependency on mistralclient shouldn't e.g. allow glance to take a dependency on mistralclient. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
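Conceptually, a partial sync mechanism of the sort mentioned above is just a check that a project's requirements stay within global-requirements plus an explicitly declared list of extras. A hand-wavy sketch - the file names and the very idea of an 'extra-requirements.txt' are invented here purely for illustration:

    # Hand-wavy sketch of a partial requirements sync check; the file names
    # and 'extra-requirements.txt' are invented for illustration.
    def canonical_name(line):
        # crude: strip version specifiers and environment markers
        for sep in ('==', '>=', '<=', '!=', '<', '>', ';', '['):
            line = line.split(sep)[0]
        return line.strip().lower()

    def read_requirements(path):
        with open(path) as f:
            return {canonical_name(line) for line in f
                    if line.strip() and not line.startswith('#')}

    def check(project_reqs, global_reqs, allowed_extras):
        stray = (read_requirements(project_reqs)
                 - read_requirements(global_reqs)
                 - read_requirements(allowed_extras))
        if stray:
            raise SystemExit('requirements not in global-requirements '
                             'or the declared extras: %s' % sorted(stray))

The key property is the one Sean and Mark agree on above: nothing in the extras list leaks back into what integrated projects are allowed to depend on.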
[openstack-dev] [all] oslo.messaging 1.4.0.0a3 released
The Oslo team is pleased to announce the release of oslo.messaging 1.4.0.0a3, another pre-release in the 1.4.0 series for oslo.messaging during the Juno cycle: https://pypi.python.org/pypi/oslo.messaging/1.4.0.0a3 oslo.messaging provides an API which supports RPC and notifications over a number of different messaging transports. Full details of the 1.4.0.0a3 release is available here: http://docs.openstack.org/developer/oslo.messaging/#a3 Please report problems using the oslo.messaging bug tracker: https://bugs.launchpad.net/oslo.messaging Thanks to all those who contributed to the release! Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] Asyncio and oslo.messaging
On Sun, 2014-07-06 at 09:28 -0400, Eoghan Glynn wrote: This is an attempt to summarize a really useful discussion that Victor, Flavio and I have been having today. At the bottom are some background links - basically what I have open in my browser right now thinking through all of this. Thanks for the detailed summary, it puts a more flesh on the bones than a brief conversation on the fringes of the Paris mid-cycle. Just a few clarifications and suggestions inline to add into the mix. We're attempting to take baby-steps towards moving completely from eventlet to asyncio/trollius. The thinking is for Ceilometer to be the first victim. First beneficiary, I hope :) Ceilometer's code is run in response to various I/O events like REST API requests, RPC calls, notifications received, etc. We eventually want the asyncio event loop to be what schedules Ceilometer's code in response to these events. Right now, it is eventlet doing that. Yes. And there is one other class of stimulus, also related to eventlet, that is very important for triggering the execution of ceilometer logic. That would be the timed tasks that drive polling of: * REST APIs provided by other openstack services * the local hypervisor running on each compute node * the SNMP daemons running at host-level etc. and also trigger periodic alarm evaluation. IIUC these tasks are all mediated via the oslo threadgroup's usage of eventlet.greenpool[1]. Would this logic also be replaced as part of this effort? As part of the broader switch from eventlet to asyncio effort, yes absolutely. At the core of any event loop is code to do select() (or equivalents) waiting for file descriptors to become readable or writable, or timers to expire. We want to switch from the eventlet event loop to the asyncio event loop. The ThreadGroup abstraction from oslo-incubator is an interface to the eventlet event loop. When you do: self.tg.add_timer(interval, self._evaluate_assigned_alarms) You're saying run evaluate_assigned_alarms() every $interval seconds, using select() to sleep between executions. When you do: self.tg.add_thread(self.start_udp) you're saying run some code which will either run to completion or set wait for fd or timer events using select(). The asyncio versions of those will be: event_loop.call_later(delay, callback) event_loop.call_soon(callback) where the supplied callbacks will be asyncio 'coroutines' which rather than doing: def foo(...): buf = read(fd) and rely on eventlet's monkey patch to cause us to enter the event loop's select() when the read() blocks, we instead do: @asyncio.coroutine def foo(...): buf = yield from read(fd) which shows exactly where we might yield to the event loop. The challenge is that porting code like the foo() function above is pretty invasive and we can't simply port an entire service at once. So, we need to be able to support a service using both eventlet-reliant code and asyncio coroutines. In your example of the openstack.common.threadgroup API - we would initially need to add support for scheduling asyncio coroutine callback arguments as eventlet greenthreads in add_timer() and add_thread(), and later we would port threadgroup itself to rely completely on asyncio. Now, because we're using eventlet, the code that is run in response to these events looks like synchronous code that makes a bunch of synchronous calls. 
For example, the code might do some_sync_op() and that will cause a context switch to a different greenthread (within the same native thread) where we might handle another I/O event (like a REST API request) Just to make the point that most of the agents in the ceilometer zoo tend to react to just a single type of stimulus, as opposed to a mix of dispatching from both message bus and the REST API. So to classify, we'd have: * compute-agent: timer tasks for polling * central-agent: timer tasks for polling * notification-agent: dispatch of external notifications from the message bus * collector: dispatch of internal metering messages from the message bus * api-service: dispatch of REST API calls * alarm-evaluator: timer tasks for alarm evaluation * alarm-notifier: dispatch of internal alarm notifications IIRC, the only case where there's a significant mix of trigger styles is the partitioned alarm evaluator, where assignments of alarm subsets for evaluation is driven over RPC, whereas the actual thresholding is triggered by a timer. Cool, that's helpful. I think the key thing is deciding which stimulus (and hence agent) we should start with. Porting from eventlet's implicit async approach to asyncio's explicit async API will be seriously time consuming and we need to be able to do it piece-by-piece. Yes, I agree, a step-wise approach is the key here. So I'd love to have some sense of the time horizon for this effort. It clearly feels like a
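For anyone trying to picture the timer mapping discussed above, here is a minimal, illustrative sketch (not real Ceilometer or oslo code - the function name and the 60 second interval are placeholders) of how a ThreadGroup.add_timer() style periodic task might be expressed directly against the asyncio event loop:

    import asyncio

    def evaluate_assigned_alarms():
        pass  # placeholder for the real evaluation logic

    def add_timer(loop, interval, callback):
        # rough analogue of ThreadGroup.add_timer(): run the callback every
        # 'interval' seconds by re-arming a call_later() timer after each run
        def _tick():
            callback()
            loop.call_later(interval, _tick)
        loop.call_later(interval, _tick)

    loop = asyncio.get_event_loop()
    add_timer(loop, 60.0, evaluate_assigned_alarms)
    # loop.run_forever()  # uncomment to actually start the periodic task

Once the callback itself becomes a coroutine, it would be scheduled on the loop rather than called directly, which is exactly the piece-by-piece porting work described above.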
Re: [openstack-dev] [oslo] Asyncio and oslo.messaging
On Mon, 2014-07-07 at 15:53 +0100, Gordon Sim wrote: On 07/07/2014 03:12 PM, Victor Stinner wrote: The first step is to patch endpoints to add @trollius.coroutine to the methods, and add yield From(...) on asynchronous tasks. What are the 'endpoints' here? Are these internal to the oslo.messaging library, or external to it? The callback functions we dispatch to are called 'endpoint methods' - e.g. they are methods on the 'endpoints' objects passed to get_rpc_server(). Later we may modify Oslo Messaging to be able to call an RPC method asynchronously, a method which would return a Trollius coroutine or task directly. The problem is that Oslo Messaging currently hides implementation details like eventlet. I guess my question is how effectively does it hide it? If the answer to the above is that this change can be contained within the oslo.messaging implementation itself, then that would suggest its hidden reasonably well. If, as I first understood (perhaps wrongly) it required changes to every use of the oslo.messaging API, then it wouldn't really be hidden. Returning a Trollius object means that Oslo Messaging will use explicitly Trollius. I'm not sure that OpenStack is ready for that today. The oslo.messaging API could evolve/expand to include explicitly asynchronous methods that did not directly expose Trollius. I'd expect us to add e.g. @asyncio.coroutine def call_async(self, ctxt, method, **kwargs): ... to RPCClient. Perhaps we'd need to add an AsyncRPCClient in a separate module and only add the method there - I don't have a good sense of it yet. However, the key thing is that I don't anticipate us needing to change the current API in a backwards incompatible way. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
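To make the shape of that hypothetical API a little more concrete, here is a sketch of how an explicitly asynchronous call might look from the caller's side; call_async() does not exist in oslo.messaging today, so the method name, target and arguments are all assumptions, and the Python 3.4-era coroutine spelling used elsewhere in this thread is kept:

    import asyncio

    @asyncio.coroutine
    def get_host_uptime(client, ctxt, host):
        # 'client' is assumed to be an RPCClient that has grown a call_async()
        # coroutine; the call yields to the event loop instead of blocking
        uptime = yield from client.call_async(ctxt, 'get_host_uptime', host=host)
        return uptime

The existing synchronous call() would remain untouched, which is why this could be added without changing the current API in a backwards incompatible way.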
Re: [openstack-dev] [oslo] Asyncio and oslo.messaging
On Mon, 2014-07-07 at 18:11 +, Angus Salkeld wrote: On 03/07/14 05:30, Mark McLoughlin wrote: Hey This is an attempt to summarize a really useful discussion that Victor, Flavio and I have been having today. At the bottom are some background links - basically what I have open in my browser right now thinking through all of this. We're attempting to take baby-steps towards moving completely from eventlet to asyncio/trollius. The thinking is for Ceilometer to be the first victim. Has this been widely agreed on? It seems to me like we are mixing two issues: 1) we need to move to py3 2) some people want to move from eventlet (I am not convinced that the volume of code changes warrants the end goal - and review load) To achieve 1) in a lower risk change, shouldn't we rather run eventlet on top of asyncio? - i.e. not require widespread code changes. So we can maintain the main loop API but move to py3. I am not sure on the feasibility, but seems to me like a more contained change. Right - it's important that we see these orthogonal questions, particularly now that it appears eventlet is likely to be available for Python 3 soon. For example, if it was generally agreed that we all want to end up on Python 3 with asyncio in the long term, you could imagine deploying (picking random examples) Glance with Python 3 and eventlet, but Ceilometer with Python 2 and asyncio/trollius. However, I don't have a good handle on how your suggestion of switching to the asyncio event loop without widespread code changes would work? Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo][all] autosync incubator to projects
On Fri, 2014-07-04 at 15:31 +0200, Ihar Hrachyshka wrote: Hi all, at the moment we have several bot jobs that sync contents to affected projects: - translations are copied from transifex; - requirements are copied from global requirements repo. We have another source of common code - oslo-incubator, though we still rely on people manually copying the new code from there to affected projects. This results in old, buggy, and sometimes completely different versions of the same code in all projects. I wonder why don't we set another bot to sync code from incubator? In that way, we would: - reduce work to do for developers [I hope everyone knows how boring it is to fill in commit message with all commits synchronized and create sync requests for 10 projects at once]; - make sure all projects use (almost) the same code; - ensure projects are notified in advance in case API changed in one of the modules that resulted in failures in gate; - our LOC statistics will be a bit more fair ;) (currently, the one who syncs a large piece of code from incubator to a project, gets all the LOC credit at e.g. stackalytics.com). The changes will still be gated, so any failures and incompatibilities will be caught. I even don't expect most of sync requests to fail at all, meaning it will be just a matter of two +2's from cores. I know that Oslo team works hard to graduate lots of modules from incubator to separate libraries with stable API. Still, I guess we'll live with incubator at least another cycle or two. What are your thoughts on that? Just repeating what I said on IRC ... The point of oslo-incubator is that it's a place where APIs can be cleaned up so that they are ready for graduation. Code living in oslo-incubator for a long time with unchanging APIs is not the idea. An automated sync job would IMHO discourage API cleanup work. I'd expect people would start adding lots of ugly backwards API compat hacks with their API cleanups just to stop people complaining about failing auto-syncs. That would be the opposite of what we're trying to achieve. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [oslo] Asyncio and oslo.messaging
Hey This is an attempt to summarize a really useful discussion that Victor, Flavio and I have been having today. At the bottom are some background links - basically what I have open in my browser right now thinking through all of this. We're attempting to take baby-steps towards moving completely from eventlet to asyncio/trollius. The thinking is for Ceilometer to be the first victim. Ceilometer's code is run in response to various I/O events like REST API requests, RPC calls, notifications received, etc. We eventually want the asyncio event loop to be what schedules Ceilometer's code in response to these events. Right now, it is eventlet doing that. Now, because we're using eventlet, the code that is run in response to these events looks like synchronous code that makes a bunch of synchronous calls. For example, the code might do some_sync_op() and that will cause a context switch to a different greenthread (within the same native thread) where we might handle another I/O event (like a REST API request) while we're waiting for some_sync_op() to return: def foo(self): result = some_sync_op() # this may yield to another greenlet return do_stuff(result) Eventlet's infamous monkey patching is what make this magic happen. When we switch to asyncio's event loop, all of this code needs to be ported to asyncio's explicitly asynchronous approach. We might do: @asyncio.coroutine def foo(self): result = yield from some_async_op(...) return do_stuff(result) or: @asyncio.coroutine def foo(self): fut = Future() some_async_op(callback=fut.set_result) ... result = yield from fut return do_stuff(result) Porting from eventlet's implicit async approach to asyncio's explicit async API will be seriously time consuming and we need to be able to do it piece-by-piece. The question then becomes what do we need to do in order to port a single oslo.messaging RPC endpoint method in Ceilometer to asyncio's explicit async approach? The plan is: - we stick with eventlet; everything gets monkey patched as normal - we register the greenio event loop with asyncio - this means that e.g. when you schedule an asyncio coroutine, greenio runs it in a greenlet using eventlet's event loop - oslo.messaging will need a new variant of eventlet executor which knows how to dispatch an asyncio coroutine. For example: while True: incoming = self.listener.poll() method = dispatcher.get_endpoint_method(incoming) if asyncio.iscoroutinefunc(method): result = method() self._greenpool.spawn_n(incoming.reply, result) else: self._greenpool.spawn_n(method) it's important that even with a coroutine endpoint method, we send the reply in a greenthread so that the dispatch greenthread doesn't get blocked if the incoming.reply() call causes a greenlet context switch - when all of ceilometer has been ported over to asyncio coroutines, we can stop monkey patching, stop using greenio and switch to the asyncio event loop - when we make this change, we'll want a completely native asyncio oslo.messaging executor. Unless the oslo.messaging drivers support asyncio themselves, that executor will probably need a separate native thread to poll for messages and send replies. If you're confused, that's normal. We had to take several breaks to get even this far because our brains kept getting fried. HTH, Mark. 
Victor's excellent docs on asyncio and trollius: https://docs.python.org/3/library/asyncio.html http://trollius.readthedocs.org/ Victor's proposed asyncio executor: https://review.openstack.org/70948 The case for adopting asyncio in OpenStack: https://wiki.openstack.org/wiki/Oslo/blueprints/asyncio A previous email I wrote about an asyncio executor: http://lists.openstack.org/pipermail/openstack-dev/2013-June/009934.html The mock-up of an asyncio executor I wrote: https://github.com/markmc/oslo-incubator/blob/8509b8b/openstack/common/messaging/_executors/impl_tulip.py My blog post on async I/O and Python: http://blogs.gnome.org/markmc/2013/06/04/async-io-and-python/ greenio - greelets support for asyncio: https://github.com/1st1/greenio/ ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
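As a rough, self-contained illustration of the executor change sketched in the mail above - this is not the real oslo.messaging code; listener.poll(), incoming.reply() and get_endpoint_method() are stand-ins for internal details, the greenio event loop is assumed to already be registered, and the real predicate is asyncio.iscoroutinefunction(), not iscoroutinefunc():

    import asyncio
    import eventlet

    class CoroutineAwareEventletExecutor(object):
        def __init__(self, listener, dispatcher):
            self.listener = listener
            self.dispatcher = dispatcher
            self._greenpool = eventlet.GreenPool()

        def _dispatch_coroutine(self, method, incoming):
            # with greenio registered, running the coroutine here just
            # schedules it as a greenlet on eventlet's hub
            loop = asyncio.get_event_loop()
            result = loop.run_until_complete(method())
            # reply from its own greenthread so this one isn't blocked if
            # incoming.reply() causes a greenlet context switch
            self._greenpool.spawn_n(incoming.reply, result)

        def run(self):
            while True:
                incoming = self.listener.poll()
                method = self.dispatcher.get_endpoint_method(incoming)
                if asyncio.iscoroutinefunction(method):
                    self._greenpool.spawn_n(self._dispatch_coroutine,
                                            method, incoming)
                else:
                    self._greenpool.spawn_n(method)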
Re: [openstack-dev] [infra][oslo][neutron] Need help getting oslo.messaging 1.4.0.0a2 in global requirements
On Mon, 2014-06-30 at 16:52 +, Paul Michali (pcm) wrote: I have out for review 103536 to add this version to global requirements, so that Neutron has an oslo fix (review 102909) for encoding failure, which affects some gate runs. This review for global requirements is failing requirements check (http://logs.openstack.org/36/103536/1/check/check-requirements-integration-dsvm/6d9581c/console.html#_2014-06-30_12_34_56_921). I did a recheck bug 1334898, but see the same error, with the release not found, even though it is in PyPI. Infra folks say this is a known issue with pushing out pre-releases. Do we have a work-around? Any proposed solution to try? That makes two oslo alpha releases which are failing openstack/requirements checks: https://review.openstack.org/103256 https://review.openstack.org/103536 and an issue with the py27 stable/icehouse test jobs seemingly pulling in oslo.messaging 1.4.0.0a2: http://lists.openstack.org/pipermail/openstack-dev/2014-June/039021.html and these comments on IRC: http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2014-06-30.log 2014-06-30T15:27:33 pcm__ hi. Need help with getting latest oslo.messaging release added to global requirements. Can someone advise on the issues I see. 2014-06-30T15:28:06 mordred pcm__: there are issues adding oslo pre-releases to the mirror right now - we're working on a solution ... so you're not alone at least :) 2014-06-30T15:29:02 pcm__ mordred: Jenkins failed saying that it could not find the release, but it is available. 2014-06-30T15:29:31 bknudson pcm__: mordred: is the fix to remove the check for --no-use-wheel in the check-requirements-integration-dsvm ? 2014-06-30T15:29:55 mordred bknudson: nope. it's to completely change our mirroring infrastructure :) Presumably there's more information somewhere on what solution infra are working on, but that's all I got ... We knew this pre-release-with-wheels stuff was going to be a little rocky, so this isn't surprising. Hopefully it'll get sorted out soon. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [OpenStack-Dev] OSLO messaging update and icehouse config files
On Mon, 2014-06-30 at 15:35 -0600, John Griffith wrote: On Mon, Jun 30, 2014 at 3:17 PM, Mark McLoughlin mar...@redhat.com wrote: On Mon, 2014-06-30 at 12:04 -0600, John Griffith wrote: Hey Everyone, So I sent a note out yesterday asking about config changes brought in to Icehouse due to the OSLO Messaging update that went out over the week-end here. My initial email prior to realizing the update that caused the problem was OSLO Messaging update here [1]. (Periodic reminder that Oslo is not an acronym) In the meantime I tried updating the cinder.conf sample in Cinder's stable Icehouse branch, but noticed that py26 doesn't seem to pick up the changes when running the oslo conf generation tools against oslo.messaging. I haven't spent any time digging into this yet, was hoping that perhaps somebody from the OSLO team or somewhere else maybe had some insight as to what's going on here. Here's the patch I submitted that shows the failure on py26 and success on py27 [2]. I'll get around to this eventually if nobody else knows anything off the top of their head. Thanks, John [1]: http://lists.openstack.org/pipermail/openstack-dev/2014-June/038926.html [2]: https://review.openstack.org/#/c/103426/ Ok, that new cinder.conf.sample is showing changes caused by these oslo.messaging changes: https://review.openstack.org/101583 https://review.openstack.org/99291 Both of those changes were first released in 1.4.0.0a1 which is an alpha version targeting Juno and are not available in the 1.3.0 Icehouse version - i.e. 1.4.0.0a1 should not be used with stable/icehouse Cinder. It seems 1.3.0 *is* being used: http://logs.openstack.org/26/103426/1/check/gate-cinder-python26/5c6c1dd/console.html 2014-06-29 19:17:50.154 | oslo.messaging==1.3.0 and the output is just confusing: 2014-06-29 19:17:49.900 | --- /tmp/cinder.UtGHjm/cinder.conf.sample 2014-06-29 19:17:50.270071741 + 2014-06-29 19:17:49.900 | +++ etc/cinder/cinder.conf.sample 2014-06-29 19:10:48.396072037 + ... 2014-06-29 19:17:49.903 | +[matchmaker_redis] 2014-06-29 19:17:49.903 | + i.e. it's showing that the file you proposed was generated with 1.4.0.0a1 and the file generated during the test job was generated with 1.3.0. Which is what I'd expect - the update you proposed is not appropriate for stable/icehouse. So why is the py27 job passing? http://logs.openstack.org/26/103426/1/check/gate-cinder-python27/7844c61/console.html 2014-06-29 19:21:12.875 | oslo.messaging==1.4.0.0a2 That's the problem right there - 1.4.0.0a2 should not be getting installed on the stable/icehouse branch. I'm not sure why it is. Someone on #openstack-infra could probably help figure it out. Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev Thanks Mark... so the problem is Oslo messaging in requirements is >= in stable/icehouse [1](please note I used Oslo not OSLO). [1]: https://github.com/openstack/requirements/blob/stable/icehouse/global-requirements.txt#L49 Thanks for pointing me in the right direction. Ah, yes! This is the problem: oslo.messaging>=1.3.0a9 This essentially allows *any* alpha release of oslo.messaging to be used. We should change stable/icehouse to simply be: oslo.messaging>=1.3.0 I'm happy to do that tomorrow, but I suspect you'll get there first Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] oslo.messaging 1.4.0.0a1 released
On Fri, 2014-06-27 at 13:28 +, Paul Michali (pcm) wrote: Mark, When would we be able to get a release of Oslo with 102909 fix in? It’s preventing Jenkins passing for some commits in Neutron. I've just pushed 1.4.0.0a2 with the following changes: 244a902 Fix the notifier example a7f01d9 Fix slow notification listener tests da2abaa Fix formatting of TransportURL.parse() docs 13fc9f2 Fix info method of ListenerSetupMixin 0cfafac encoding error in file 0102aa9 Replace usage of str() with six.text_type Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] 'retry' option
On Fri, 2014-06-27 at 17:02 +0100, Gordon Sim wrote: A question about the new 'retry' option. The doc says: By default, cast() and call() will block until the message is successfully sent. What does 'successfully sent' mean here? Unclear, ambiguous, probably driver dependent etc. The 'blocking' we're talking about here is establishing a connection with the broker. If the connection has been lost, then cast() will block until the connection has been re-established and the message 'sent'. Does it mean 'written to the wire' or 'accepted by the broker'? For the impl_qpid.py driver, each send is synchronous, so it means accepted by the broker[1]. What does the impl_rabbit.py driver do? Does it just mean 'written to the wire', or is it using RabbitMQ confirmations to get notified when the broker accepts it (standard 0-9-1 has no way of doing this). I don't know, but it would be nice if someone did take the time to figure it out and document it :) Seriously, some docs around the subtle ways that the drivers differ from one another would be helpful ... particularly if it exposed incorrect assumptions API users are currently making. If the intention is to block until accepted by the broker that has obvious performance implications. On the other hand if it means block until written to the wire, what is the advantage of that? Was that a deliberate feature or perhaps just an accident of implementation? The use case for the new parameter, as described in the git commit, seems to be motivated by wanting to avoid the blocking when sending notifications. I can certainly understand that desire. However, notifications and casts feel like inherently asynchronous things to me, and perhaps having/needing the synchronous behaviour is the real issue? It's not so much about sync vs async, but a failure mode. By default, if we lose our connection with the broker, we wait until we can re-establish it rather than throwing exceptions (requiring the API caller to have its own retry logic) or quietly dropping the message. The use case for ceilometer is to allow its RPCPublisher to have a publishing policy - block until the samples have been sent, queue (in an in-memory, fixed-length queue) if we don't have a connection to the broker, or drop it if we don't have a connection to the broker. https://review.openstack.org/77845 I do understand the ambiguity around what message delivery guarantees are implicit in cast() isn't ideal, but that's not what adding this 'retry' parameter was about. Calls by contrast, are inherently synchronous, but at present the retry controls only the sending of the request. If the server fails, the call may timeout regardless of the value of 'retry'. Just in passing, I'd suggest that renaming the new parameter max_reconnects, would make it's current behaviour and values clearer. The name 'retry' sounds like a yes/no type value, and retry=0 v. retry=1 is the reverse of what I would intuitively expect. Sounds reasonable. Would you like to submit a patch? Quick turnaround is important, because if Ceilometer starts using this retry parameter before we rename it, I'm not sure it'll be worth the hassle. Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
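For reference, a hedged sketch of what using the new knob looks like from the API side, keeping the current 'retry' name; the import style is the Juno-era one, and the publisher_id, topics and retry values are purely illustrative (retry=0 meaning no reconnection attempts, retry=None meaning retry forever):

    from oslo.config import cfg
    from oslo import messaging

    transport = messaging.get_transport(cfg.CONF)

    # notifications: let the caller queue or drop samples itself rather
    # than blocking if the connection to the broker is down
    notifier = messaging.Notifier(transport, publisher_id='compute.host1',
                                  driver='messaging', topic='notifications',
                                  retry=0)

    # RPC: give up after two reconnection attempts instead of blocking
    target = messaging.Target(topic='metering', version='1.0')
    client = messaging.RPCClient(transport, target, retry=2)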
Re: [openstack-dev] [hacking] community consensus and removing rules
On Mon, 2014-06-23 at 19:55 -0700, Joe Gordon wrote: * Add a new directory, contrib, for local rules that multiple projects use but are not generally considered acceptable to be enabled by default. This way we can reduce the amount of cut and pasted code (thank you to Ben Nemec for this idea). All sounds good to me, apart from a pet peeve on 'contrib' directories. What does 'contrib' mean? 'contributed'? What exactly *isn't* contributed? Often it has connotations of 'contributed by outsiders'. It also often has connotations of 'bucket for crap', 'unmaintained and untested', YMMV, etc. etc. Often the name is just chosen out of laziness - I can't think of a good name for this, and projects often have a contrib directory with random stuff in it, so that works. Let's be precise - these are optional rules, right? How about calling the directory 'optional'? Say no to contrib directories! :-P Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [hacking] rules for removal
On Tue, 2014-06-24 at 09:51 -0700, Clint Byrum wrote: Excerpts from Monty Taylor's message of 2014-06-24 06:48:06 -0700: On 06/22/2014 02:49 PM, Duncan Thomas wrote: On 22 June 2014 14:41, Amrith Kumar amr...@tesora.com wrote: In addition to making changes to the hacking rules, why don't we mandate also that perceived problems in the commit message shall not be an acceptable reason to -1 a change. -1. There are some /really/ bad commit messages out there, and some of us try to use the commit messages to usefully sort through the changes (i.e. I often -1 in cinder a change only affects one driver and that isn't clear from the summary). If the perceived problem is grammatical, I'm a bit more on board with it not a reason to rev a patch, but core reviewers can +2/A over the top of a -1 anyway... 100% agree. Spelling and grammar are rude to review on - especially since we have (and want) a LOT of non-native English speakers. It's not our job to teach people better grammar. Heck - we have people from different English backgrounds with differing disagreements on what good grammar _IS_ We shouldn't quibble over _anything_ grammatical in a commit message. If there is a disagreement about it, the comments should be ignored. There are definitely a few grammar rules that are loose and those should be largely ignored. However, we should correct grammar when there is a clear solution, as those same people who do not speak English as their first language are likely to be confused by poor grammar. We're not doing it to teach grammar. We're doing it to ensure readability. The importance of clear English varies with context, but commit messages are a place where we should try hard to just let it go, particularly with those who do not speak English as their first language. Commit messages stick around forever and it's important that they are useful, but they will be read by a small number of people who are going to be in a position to spend a small amount of time getting over whatever dissonance is caused by a typo or imperfect grammar. I think specs are pretty similar and don't warrant much additional grammar nitpicking. Sure, they're longer pieces of text and slightly more people will rely on them for information, but they're not intended to be complete documentation. Where grammar is so poor that readers would be easily misled in important ways, then sure that should be fixed. But there comes a point when we're no longer working to avoid confusion and instead just being pendants. Taking issue[1] with this: whatever scaling mechanism Heat and we end up going with. because it has a dangling preposition is an example of going way beyond the point of productive pedantry IMHO :-) Mark. [1] - https://review.openstack.org/#/c/97939/5/specs/juno/remove-mergepy.rst ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [hacking] rules for removal
On Tue, 2014-06-24 at 13:56 -0700, Clint Byrum wrote: Excerpts from Mark McLoughlin's message of 2014-06-24 12:49:52 -0700: On Tue, 2014-06-24 at 09:51 -0700, Clint Byrum wrote: Excerpts from Monty Taylor's message of 2014-06-24 06:48:06 -0700: On 06/22/2014 02:49 PM, Duncan Thomas wrote: On 22 June 2014 14:41, Amrith Kumar amr...@tesora.com wrote: In addition to making changes to the hacking rules, why don't we mandate also that perceived problems in the commit message shall not be an acceptable reason to -1 a change. -1. There are some /really/ bad commit messages out there, and some of us try to use the commit messages to usefully sort through the changes (i.e. I often -1 in cinder a change only affects one driver and that isn't clear from the summary). If the perceived problem is grammatical, I'm a bit more on board with it not a reason to rev a patch, but core reviewers can +2/A over the top of a -1 anyway... 100% agree. Spelling and grammar are rude to review on - especially since we have (and want) a LOT of non-native English speakers. It's not our job to teach people better grammar. Heck - we have people from different English backgrounds with differing disagreements on what good grammar _IS_ We shouldn't quibble over _anything_ grammatical in a commit message. If there is a disagreement about it, the comments should be ignored. There are definitely a few grammar rules that are loose and those should be largely ignored. However, we should correct grammar when there is a clear solution, as those same people who do not speak English as their first language are likely to be confused by poor grammar. We're not doing it to teach grammar. We're doing it to ensure readability. The importance of clear English varies with context, but commit messages are a place where we should try hard to just let it go, particularly with those who do not speak English as their first language. Commit messages stick around forever and it's important that they are useful, but they will be read by a small number of people who are going to be in a position to spend a small amount of time getting over whatever dissonance is caused by a typo or imperfect grammar. The times that one is reading git messages are often the most stressful such as when a regression has occurred in production. Given that, I believe it is entirely worth it to me that the commit messages on my patches are accurate and understandable. I embrace all feedback which leads to them being more clear. I will of course stand back from grammar correcting and not block patches if there are many who disagree. I think specs are pretty similar and don't warrant much additional grammar nitpicking. Sure, they're longer pieces of text and slightly more people will rely on them for information, but they're not intended to be complete documentation. Disagree. I will only state this one more time as I think everyone knows how I feel: if we are going to grow beyond the english-as-a-first-language world we simply cannot assume that those reading specs will be native speakers. Good spelling and grammar helps us grow. Bad spelling and grammar holds us back. There's two sides to this coin - concern about alienating non-english-as-a-first-language speakers who feel undervalued because their language is nitpicked to death and concern about alienating english-as-a-first-language speakers who struggle to understand unclear or incorrect language. 
Obviously there's a balance to be struck there and different people will judge that differently, but I'm personally far more concerned about the former rather than the latter case. I expect many beyond the english-as-a-first-language world are pretty used to dealing with imperfect language but aren't so delighted with being constantly reminded that their use language is imperfect. Where grammar is so poor that readers would be easily misled in important ways, then sure that should be fixed. But there comes a point when we're no longer working to avoid confusion and instead just being pendants. Taking issue[1] with this: whatever scaling mechanism Heat and we end up going with. because it has a dangling preposition is an example of going way beyond the point of productive pedantry IMHO :-) I actually agree that it would not at all be a reason to block a patch. However, there is some ambiguity in that sentence that may not be clear to a native speaker. It is not 100% clear if we are going with Heat, or with the scaling mechanism. That is the only reason for the dangling preposition debate. I'd wager you'd seriously struggle to find anyone who would interpret that sentence as we are going with Heat, even if they were non-english-as-a-first-language speakers who had never heard of OpenStack or
Re: [openstack-dev] [hacking] rules for removal
On Sat, 2014-06-21 at 07:36 -0700, Clint Byrum wrote: Excerpts from Sean Dague's message of 2014-06-21 05:08:01 -0700: Pedantic reviewers that are reviewing for this kind of thing only should be scorned. I realistically like the idea markmc came up with - https://twitter.com/markmc_/status/480073387600269312 I also agree it is really fun to think about shaming those annoying actions. It is also not fun _at all_ to be publicly shamed. In fact I'd say it is at least an order of magnitude less fun. There is an old saying, praise in public, punish in private. It is one reason the -1 comments I give always include praise for whatever is right for new contributors. Not everyone is a grizzled veteran. It is far more interesting to me to solve the grouping problem in a way that works for us long term (python 2 and 3) than it is to develop a culture that builds any of its core activities on negative emotional feedback. That's not to say we can't say hey you're doing it wrong. I mean to say that direct feedback like that belongs in private IRC messages or email, not in public everyone can see that reviews. Give people a chance to save face. Meanwhile, the less we have to have one on one negative feedback, the easier the job of reviewers is. The last thing we want to do is have more reasons for people to NOT do reviews. You're right that something like I suggested could easily lead to more negative energy in the project, not less. What I had in mind was that we could laugh at ourselves about this. Assuming that the reviewers called out would be fully on-board and willing to laugh along at being the most pedantic nerd of the week. Yeah, that's probably wishful thinking. Maybe it could be anonymous. Maybe instead it could be a weekly mailing list discussion so that we could all discuss as a community whether that kind of feedback on a review is appropriate. The main point is that this is something worth addressing as a wider community rather than in individual reviews with a limited audience. And that doing it with a bit of humor might help take the sting out of it. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Octavia] PTL and core team members
On Thu, 2014-06-19 at 20:36 -0700, Dustin Lundquist wrote: Dolph, I appreciate the suggestion. In the mean time how does the review process work without core developers to approve gerrit submissions? If you're just getting started, have a small number (possibly just 1 to begin with) of developers collaborate closely, with the minimum possible process and then use that list of developers as your core review team when you gradually start adopting some process. Aim to get from zero to bootstrapped with that core team in a small number of weeks at most. Minimum possible process could mean a git repo anywhere that those initial developers have direct push access to. You could use stackforge from the beginning and the developers just approve their own changes, but that's a bit annoying. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Oslo] Translating log and exception messages in Oslo libraries
Hi I'm not sure we've ever discussed this before, but I had previously figured that we shouldn't translate log and exception messages in oslo.messaging. My thinking is: - it seems like an odd thing for a library to do, I don't know of examples of other libraries doing this .. but I haven't gone looking - it involves a dependency on oslo.i18n - more than just marking strings for translation and using gettextutils, you also need to set up the infrastructure for pushing the .pot files to transifex, pulling the .po files from transifex and installing the .mo files at install time. I don't feel terribly strongly about this except that unless someone is willing to see this through and do the transifex and install-time work, we shouldn't be doing the use-oslo.i18n and mark-strings-for-translation work. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
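For a sense of what the mark-strings-for-translation half of that work actually involves, here is a rough sketch using oslo.i18n's TranslatorFactory (the gettext domain and message are illustrative, and the later oslo_i18n module spelling is used):

    import oslo_i18n

    # each library sets up its own translators under its own gettext domain
    _translators = oslo_i18n.TranslatorFactory(domain='oslo.messaging')
    _ = _translators.primary       # user-facing/exception messages
    _LE = _translators.log_error   # error-level log messages

    def connect(broker_url):
        raise RuntimeError(_('Unable to connect to broker at %s') % broker_url)

The point above stands, though: none of this is worth doing unless the .pot/.po/.mo pipeline behind it is also set up.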
Re: [openstack-dev] [nova] A modest proposal to reduce reviewer load
Hi Armando, On Tue, 2014-06-17 at 14:51 +0200, Armando M. wrote: I wonder what the turnaround of trivial patches actually is, I bet you it's very very small, and as Daniel said, the human burden is rather minimal (I would be more concerned about slowing them down in the gate, but I digress). I think that introducing a two-tier level for patch approval can only mitigate the problem, but I wonder if we'd need to go a lot further, and rather figure out a way to borrow concepts from queueing theory so that they can be applied in the context of Gerrit. For instance Little's law [1] says: The long-term average number of customers (in this context reviews) in a stable system L is equal to the long-term average effective arrival rate, λ, multiplied by the average time a customer spends in the system, W; or expressed algebraically: L = λW. L can be used to determine the number of core reviewers that a project will need at any given time, in order to meet a certain arrival rate and average time spent in the queue. If the number of core reviewers is a lot less than L then that core team is understaffed and will need to increase. If we figured out how to model and measure Gerrit as a queuing system, then we could improve its performance a lot more effectively; for instance, this idea of privileging trivial patches over longer patches has roots in a popular scheduling policy [3] for M/G/1 queues, but that does not really help aging of 'longer service time' patches and does not have a preemption mechanism built-in to avoid starvation. Just a crazy opinion... Armando [1] - http://en.wikipedia.org/wiki/Little's_law [2] - http://en.wikipedia.org/wiki/Shortest_job_first [3] - http://en.wikipedia.org/wiki/M/G/1_queue This isn't crazy at all. We do have a problem that surely could be studied and solved/improved by applying queueing theory or lessons from fields like lean manufacturing. Right now, we're simply applying our intuition and the little I've read about these sorts of problems is that your intuition can easily take you down the wrong path. There's a bunch of things that occur just glancing through those articles: - Do we have an unstable system? Would it be useful to have arrival and exit rate metrics to help highlight this? Over what time period would those rates need to be averaged to be useful? Daily, weekly, monthly, an entire release cycle? - What are we trying to optimize for? The length of time in the queue? The number of patches waiting in the queue? The response time to a new patch revision? - We have a single queue, with a bunch of service nodes with a wide variance between their service rates, very little in the way of scheduling policy, a huge rate of service nodes sending jobs back for rework, a cost associated with maintaining a job while it sits in the queue, the tendency for some jobs to disrupt many other jobs with merge conflicts ... not simple. - Is there any sort of natural limit in our queue size that makes the system stable - e.g. do people naturally just stop submitting patches at some point? My intuition on all of this lately is that we need some way to model and experiment with this queue, and I think we could make some interesting progress if we could turn it into a queueing network rather than a single, extremely complex queue. 
Say we had a front-end for gerrit which tracked which queue a patch is in, we could experiment with things like: - a triage queue, with non-cores signed up as triagers looking for obvious mistakes and choosing the next queue for a patch to enter into - queues having a small number of cores signed up as owners - e.g. high priority bugfix, API, scheduler, object conversion, libvirt driver, vmware driver, etc. - we'd allow for a large number of queues so that cores could aim for an inbox zero approach on individual queues, something that would probably help keep cores motivated. - we could apply different scheduling policies to each of the different queues - i.e. explicit guidance for cores about which patches they should pick off the queue next. - we could track metrics on individual queues as well as the whole network, identifying bottlenecks and properly recognizing which reviewers are doing a small number of difficult reviews versus those doing a high number of trivial reviews. - we could require some queues to feed into a final approval queue where some people are responsible for giving an approved patch a final sanity check - i.e. there would be a class of reviewer with good instincts who quickly churn through already-reviewed patches looking for the kind of mistakes people tend to mistake when they're down in the weeds. - explicit queues for large, cross-cutting changes like coding style changes. Perhaps we could stop servicing these queues
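To put rough numbers on the Little's law argument quoted at the top of this mail, a back-of-the-envelope sketch - the figures are invented for illustration, not measured from Gerrit:

    # Little's law: L = lambda * W
    arrival_rate = 30.0     # patches arriving per day (lambda)
    time_in_review = 5.0    # average days a patch spends in the system (W)

    in_flight = arrival_rate * time_in_review
    print(in_flight)        # -> 150.0 patches sitting in review on average

    # stability check: reviewers must collectively keep up with arrivals
    reviews_per_core_per_day = 4.0
    min_cores = arrival_rate / reviews_per_core_per_day
    print(min_cores)        # -> 7.5, i.e. at least 8 active cores just to
                            #    stop the queue growing without bound

The same arithmetic applied per-queue is what would make bottlenecks in a network of queues like the one sketched above visible.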
[openstack-dev] [oslo] Paris mid-cycle sprint
Hey I had been thinking of going to the Paris sprint: https://wiki.openstack.org/wiki/Sprints/ParisJuno2014 But it only just occurred to me that we could have enough Oslo contributors in Europe to make it worthwhile for us to use the opportunity to get some Oslo stuff done together. For example, Victor (Stinner), Mehdi, Flavio, Victor (Sergeyev), Roman, or others ... perhaps some or all of you would be up for it? Julien will be there too, but will want to focus on Ceilometer I assume. I'll add myself to the wiki ... feel free to do so too. Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] A modest proposal to reduce reviewer load
On Thu, 2014-06-19 at 09:34 +0100, Matthew Booth wrote: On 19/06/14 08:32, Mark McLoughlin wrote: Hi Armando, On Tue, 2014-06-17 at 14:51 +0200, Armando M. wrote: I wonder what the turnaround of trivial patches actually is, I bet you it's very very small, and as Daniel said, the human burden is rather minimal (I would be more concerned about slowing them down in the gate, but I digress). I think that introducing a two-tier level for patch approval can only mitigate the problem, but I wonder if we'd need to go a lot further, and rather figure out a way to borrow concepts from queueing theory so that they can be applied in the context of Gerrit. For instance Little's law [1] says: The long-term average number of customers (in this context reviews) in a stable system L is equal to the long-term average effective arrival rate, λ, multiplied by the average time a customer spends in the system, W; or expressed algebraically: L = λW. L can be used to determine the number of core reviewers that a project will need at any given time, in order to meet a certain arrival rate and average time spent in the queue. If the number of core reviewers is a lot less than L then that core team is understaffed and will need to increase. If we figured out how to model and measure Gerrit as a queuing system, then we could improve its performance a lot more effectively; for instance, this idea of privileging trivial patches over longer patches has roots in a popular scheduling policy [3] for M/G/1 queues, but that does not really help aging of 'longer service time' patches and does not have a preemption mechanism built-in to avoid starvation. Just a crazy opinion... Armando [1] - http://en.wikipedia.org/wiki/Little's_law [2] - http://en.wikipedia.org/wiki/Shortest_job_first [3] - http://en.wikipedia.org/wiki/M/G/1_queue This isn't crazy at all. We do have a problem that surely could be studied and solved/improved by applying queueing theory or lessons from fields like lean manufacturing. Right now, we're simply applying our intuition and the little I've read about these sorts of problems is that your intuition can easily take you down the wrong path. There's a bunch of things that occur just glancing through those articles: - Do we have an unstable system? Would it be useful to have arrival and exit rate metrics to help highlight this? Over what time period would those rates need to be averaged to be useful? Daily, weekly, monthly, an entire release cycle? - What are we trying to optimize for? The length of time in the queue? The number of patches waiting in the queue? The response time to a new patch revision? - We have a single queue, with a bunch of service nodes with a wide variance between their service rates, very little in the way of scheduling policy, a huge rate of service nodes sending jobs back for rework, a cost associated with maintaining a job while it sits in the queue, the tendency for some jobs to disrupt many other jobs with merge conflicts ... not simple. - Is there any sort of natural limit in our queue size that makes the system stable - e.g. do people naturally just stop submitting patches at some point? My intuition on all of this lately is that we need some way to model and experiment with this queue, and I think we could make some interesting progress if we could turn it into a queueing network rather than a single, extremely complex queue. 
Say we had a front-end for gerrit which tracked which queue a patch is in, we could experiment with things like: - a triage queue, with non-cores signed up as triagers looking for obvious mistakes and choosing the next queue for a patch to enter into - queues having a small number of cores signed up as owners - e.g. high priority bugfix, API, scheduler, object conversion, libvirt driver, vmware driver, etc. - we'd allow for a large number of queues so that cores could aim for an inbox zero approach on individual queues, something that would probably help keep cores motivated. - we could apply different scheduling policies to each of the different queues - i.e. explicit guidance for cores about which patches they should pick off the queue next. - we could track metrics on individual queues as well as the whole network, identifying bottlenecks and properly recognizing which reviewers are doing a small number of difficult reviews versus those doing a high number of trivial reviews. - we could require some queues to feed into a final approval queue where some people are responsible for giving an approved patch a final sanity check - i.e. there would be a class of reviewer with good instincts who quickly churn
Re: [openstack-dev] [devstack] [zmq] [oslo.messaging] Running devstack with zeromq
On Thu, 2014-06-19 at 14:29 +0200, Mehdi Abaakouk wrote: Hi, Le 2014-06-19 00:30, Ben Nemec a écrit : On 06/18/2014 05:45 AM, Elena Ezhova wrote: So I wonder whether it is something the community is interested in and, if yes, are there any recommendations concerning possible implementation? I can't speak to the specific implementation, but if we're going to keep the zmq driver in oslo.messaging then IMHO it should be usable with devstack, so +1 to making that work. Currently the zmq driver have a really bad test coverage, the driver is 'I think' broken since a while. Bugs like [1] or [2] let me think that nobody can use it currently. [1] https://bugs.launchpad.net/oslo.messaging/+bug/1301723 [2] https://bugs.launchpad.net/oslo.messaging/+bug/1330460 Also, an oslo.messaging rule is a driver must not force to use a eventloop library, but this one heavily use eventlet. So only the eventlet executor can works with it, not the blocking one or any future executor. I guess if someone is interested in, the first step is to fix the zmq driver, remove eventlet stuffs and write unit tests for it, before trying integration, and raise bugs that should be catch by unit/functionnal testing. If nobody is interested in zmq, perhaps we should just drop/deprecated/mark_as_broken it. Yes, I agree with all of that. Unless the situation improves rapidly, I think we should mark it as deprecated in Juno and plan to remove it in K. That might seem like an overly rapid deprecation cycle, but it is currently broken and unusable in Icehouse ... so no-one can be using it in Icehouse. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [glance] Unifying configuration file
Hey On Tue, 2014-06-17 at 17:43 +0200, Julien Danjou wrote: On Tue, Jun 17 2014, Arnaud Legendre wrote: @ZhiYan: I don't like the idea of removing the sample configuration file(s) from the git repository. Many people do not want to have to checkout the entire codebase and tox every time they have to verify a variable name in a configuration file. I know many people who were really frustrated when they realized that the sample config file was gone from the Nova repo. However, I agree with the fact that it would be better if the sample was 100% accurate: so the way I would love to see this working is to generate the sample file every time there is a config change (this being totally automated (maybe at the gate level...)). You're a bit late on this. :) So what I did these last months (year?) in several projects, is to check at gate time the configuration file that is automatically generated against what's in the patches. That turned out to be a real problem because sometimes some options change in the external modules we rely on (e.g. keystone authtoken or oslo.messaging). In the end many projects (like Nova) disabled this check altogether, and therefore removed the generated configuration file from the git repository. For those that casually want to refer to the sample config, what would help is if there were Jenkins jobs to publish the generated sample config file somewhere. For people installing the software, it would probably be nice if pbr added 'python setup.py sample_config' or something. @Julien: I would be interested to understand the value that you see of having only one config file? At this point, I don't see why managing one file is more complicated than managing several files especially when they are organized by categories. Also, scrolling through the registry settings every time I want to modify an api setting seems to add some overhead. Because there's no way to automatically generate several configuration files each with its own set of options using oslo.config. I think that's a failing of oslo.config, though. Glance's layout of config files is useful and intuitive. Glance is (one of?) the last projects in OpenStack to manually write its sample configuration files, which are not up to date obviously. Neutron too, but not split out per-service. I don't find Neutron's config file layout as intuitive. So really this is mainly about following what every other project did over the last year(s). There's a balance here between what makes technical sense and what helps users. If Glance has support for generating a unified config file while also manually maintaining the split configs, I think that's a fine compromise. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [glance] Unifying configuration file
On Wed, 2014-06-18 at 09:29 -0400, Doug Hellmann wrote: On Wed, Jun 18, 2014 at 1:58 AM, Mark McLoughlin mar...@redhat.com wrote: Hey On Tue, 2014-06-17 at 17:43 +0200, Julien Danjou wrote: On Tue, Jun 17 2014, Arnaud Legendre wrote: @Julien: I would be interested to understand the value that you see of having only one config file? At this point, I don't see why managing one file is more complicated than managing several files especially when they are organized by categories. Also, scrolling through the registry settings every time I want to modify an api setting seem to add some overhead. Because there's no way to automatically generate several configuration files with each its own set of options using oslo.config. I think that's a failing of oslo.config, though. Glance's layout of config files is useful and intuitive. The config generator lets you specify the modules, libraries, and files to be used to generate a config file. It even has a way to specify which files to ignore. So I think we have everything we need in the config generator, but we need to run it more than once, with different inputs, to generate multiple files. Yep, except the magic way we troll through the code, loading modules, introspecting what config options were registered, etc. will likely make this a frustrating experience to get right. I took a little time to hack up a much more simple and explicit approach to config file generation and posted a draft here: https://review.openstack.org/100946 The docstring at the top of the file explains the approach: https://review.openstack.org/#/c/100946/1/oslo/config/generator.py Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
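For illustration, the explicit approach in that review boils down to listing, per output file, the option namespaces it should contain - something along these lines, where the file name and namespaces are hypothetical for Glance:

    # etc/oslo-config-generator/glance-api.conf (hypothetical)
    [DEFAULT]
    output_file = etc/glance-api.conf.sample
    namespace = glance.api
    namespace = glance.store
    namespace = oslo.messaging
    namespace = keystonemiddleware.auth_token

    # generated with something like:
    #   oslo-config-generator --config-file etc/oslo-config-generator/glance-api.conf

Running the generator once per file with a different namespace list is what would let Glance keep its split api/registry/cache layout while still having every sample fully generated.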
Re: [openstack-dev] An alternative approach to enforcing expected election behaviour
On Mon, 2014-06-16 at 10:56 +0100, Daniel P. Berrange wrote: On Mon, Jun 16, 2014 at 05:04:51AM -0400, Eoghan Glynn wrote: How about we rely instead on the values and attributes that actually make our community strong? Specifically: maturity, honesty, and a self-correcting nature. How about we simply require that each candidate for a TC or PTL election gives a simple undertaking in their self-nomination mail, along the lines of: I undertake to respect the election process, as required by the community code of conduct. I also undertake not to engage in campaign practices that the community has considered objectionable in the past, including but not limited to, unsolicited mail shots and private campaign events. If my behavior during this election period does not live up to those standards, please feel free to call me out on it on this mailing list and/or withhold your vote. I like this proposal because it focuses on the carrot rather than the stick, which is ultimately better for community cohesiveness IMHO. I like it too. A slight tweak of that would be to require candidates to sign the pledge publicly via an online form. We could invite the community as a whole to sign it too in order to have candidates' supporters covered. It is already part of our community ethos that we can call people out to publically debate / stand up justify any all issues affecting the project whether they be related to the code, architecture, or non-technical issues such as electioneering behaviour. We then rely on: (a) the self-policing nature of an honest, open community and: (b) the maturity and sound judgement within that community giving us the ability to quickly spot and disregard any frivolous reports of mis-behavior So no need for heavy-weight inquisitions, no need to interrupt the election process, no need for handing out of stiff penalties such as termination of membership. Before jumping headlong for a big stick to whack people with, I think I'd expect to see examples of problems we've actually faced (as opposed to vague hypotheticals), and a clear illustration that a self-policing approach to the community interaction failed to address them. I've not personally seen/experianced any problems that are so severe that they'd suggest we need the ability to kick someone out of the community for sending email ! Indeed. This discussion is happening in a vacuum for many people who do not know the details of the private emails and private campaign events which happened in the previous cycle. The only one I know of first hand was a private email where the recipients quickly responded saying the email was out of line and the original sender apologized profusely. People can make mistakes in good faith and if we can deal with it quickly and maturely as a community, all the better. In this example, the sender's apology could have bee followed up with look, here's our code of conduct; sign it now, respect it in the future, and let that be the end of the matter. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo][messaging] messaging vs. messagingv2
Hi Ihar, On Mon, 2014-06-16 at 15:28 +0200, Ihar Hrachyshka wrote: Hi all, I'm currently pushing Neutron to oslo.messaging, and while at it, a question popped up. So in oslo-rpc, we have the following notification drivers available: neutron.openstack.common.notifier.log_notifier neutron.openstack.common.notifier.no_op_notifier neutron.openstack.common.notifier.rpc_notifier2 neutron.openstack.common.notifier.rpc_notifier neutron.openstack.common.notifier.test_notifier And in oslo.messaging, we have: oslo.messaging.notify._impl_log:LogDriver oslo.messaging.notify._impl_noop:NoOpDriver oslo.messaging.notify._impl_messaging:MessagingV2Driver oslo.messaging.notify._impl_messaging:MessagingDriver oslo.messaging.notify._impl_test:TestDriver My understanding is that they map to each other as in [1]. So atm Neutron uses rpc_notifier from oslo-rpc, so I'm going to replace it with MessagingDriver. So far so good. So far so good, indeed. But then I've checked docstrings for MessagingDriver and MessagingV2Driver [2], and the following looks suspicious to me. For MessagingDriver, it's said: This driver should only be used in cases where there are existing consumers deployed which do not support the 2.0 message format. This sounds like MessagingDriver is somehow obsolete, and we want to use MessagingV2Driver unless forced to. But I don't get what those consumers are. Are these other projects that interact with us via messaging bus? Another weird thing is that it seems that no other project is actually using MessagingV2Driver (at least those that I've checked). Is it even running in the wild? The idea is that deployments should move over to the v2 on-the-wire format, but we've never made any great efforts for that to happen. Part of the issue here is that notifications are consumed by codebases outside of OpenStack and, so, changing the default to v2 would likely unnecessarily disrupt some people. The reason no-one has pushed very hard on a firm deprecation plan for the v1 format is that the v2 format doesn't yet offer a huge amount of advantages. Right now it just adds a '2.0' version number to the format. When we gain the ability to sign notification messages, this will only be available via the v2 format and that will encourage more focus on switching over fully. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
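For anyone following the mechanics of the switch being discussed above: with oslo.messaging the notification drivers are selected by entry point name rather than by class path, so the rpc_notifier replacement in neutron.conf would look something like the snippet below. This is only an illustration assuming the stock entry point names oslo.messaging registers; check your installed version's setup.cfg for the definitive list.

    [DEFAULT]
    # equivalent of the old rpc_notifier, v1 on-the-wire format
    notification_driver = messaging

    # or, to opt in to the 2.0 format once all consumers can handle it:
    # notification_driver = messagingv2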
Re: [openstack-dev] Gate proposal - drop Postgresql configurations in the gate
On Fri, 2014-06-13 at 07:31 -0400, Sean Dague wrote: On 06/13/2014 02:36 AM, Mark McLoughlin wrote: On Thu, 2014-06-12 at 22:10 -0400, Dan Prince wrote: On Thu, 2014-06-12 at 08:06 -0400, Sean Dague wrote: We're definitely deep into capacity issues, so it's going to be time to start making tougher decisions about things we decide aren't different enough to bother testing on every commit. In order to save resources why not combine some of the jobs in different ways. So for example instead of: check-tempest-dsvm-full check-tempest-dsvm-postgres-full Couldn't we just drop the postgres-full job and run one of the Neutron jobs w/ postgres instead? Or something similar, so long as at least one of the jobs which runs most of Tempest is using PostgreSQL I think we'd be mostly fine. Not shooting for 100% coverage for everything with our limited resource pool is fine, lets just do the best we can. Ditto for gate jobs (not check). I think that's what Clark was suggesting in: https://etherpad.openstack.org/p/juno-test-maxtrices Previously we've been testing Postgresql in the gate because it has a stricter interpretation of SQL than MySQL. And when we didn't test Postgresql it regressed. I know, I chased it for about 4 weeks in grizzly. However Monty brought up a good point at Summit, that MySQL has a strict mode. That should actually enforce the same strictness. My proposal is that we land this change to devstack - https://review.openstack.org/#/c/97442/ and backport it to past devstack branches. Then we drop the pg jobs, as the differences between the 2 configs should then be very minimal. All the *actual* failures we've seen between the 2 were completely about this strict SQL mode interpretation. I suppose I would like to see us keep it in the mix. Running SmokeStack for almost 3 years I found many an issue dealing w/ PostgreSQL. I ran it concurrently with many of the other jobs and I too had limited resources (much less than what we have in infra today). Would MySQL strict SQL mode catch stuff like this (old bugs, but still valid for this topic I think): https://bugs.launchpad.net/nova/+bug/948066 https://bugs.launchpad.net/nova/+bug/1003756 Having support for and testing against at least 2 databases helps keep our SQL queries and migrations cleaner... and is generally a good practice given we have abstractions which are meant to support this sort of thing anyway (so by all means let us test them!). Also, having compacted the Nova migrations 3 times now I found many issues by testing on multiple databases (MySQL and PostgreSQL). I'm quite certain our migrations would be worse off if we just tested against the single database. Certainly sounds like this testing is far beyond the "might one day be useful" level Sean talks about. The migration compaction is a good point. And I'm happy to see there were some bugs exposed as well. Here is where I remain stuck. We are now at a failure rate in which it's 3 days (minimum) to land a fix that decreases our failure rate at all. The way we are currently solving this is by effectively building manual zuul and taking smart humans in coordination to end run around our system. We've merged 18 fixes so far - https://etherpad.openstack.org/p/gatetriage-june2014 this way. Merging a fix this way is at least an order of magnitude more expensive on people time because of the analysis and coordination we need to go through to make sure these things are the right things to jump the queue. That effort, over 8 days, has gotten us down to *only* a 24hr merge delay. 
And there are no more smoking guns. What's left is a ton of subtle things. I've got ~ 30 patches outstanding right now (a bunch are things to clarify what's going on in the build runs especially in the fail scenarios). Every single one of them has been failed by Jenkins at least once. Almost every one was failed by a different unique issue. So I'd say at best we're 25% of the way towards solving this. That being said, because of the deep queues, people are just recheck grinding (or hitting the jackpot and landing something through that then fails a lot after landing). That leads to bugs like this: https://bugs.launchpad.net/heat/+bug/1306029 Which was seen early in the patch - https://review.openstack.org/#/c/97569/ Then kind of destroyed us completely for a day - http://status.openstack.org/elastic-recheck/ (it's the top graph). And, predictably, a week into a long gate queue everyone is now grumpy. The sniping between projects, and within projects in assigning blame starts to spike at about day 4 of these events. Everyone assumes someone else is to blame for these things. So there is real community impact when we get to these states. So, I'm kind of burnt out trying to figure out how to get us out of this. As I do take
Re: [openstack-dev] [oslo] versioning and releases
On Thu, 2014-06-12 at 12:09 +0200, Thierry Carrez wrote: Doug Hellmann wrote: On Tue, Jun 10, 2014 at 5:19 PM, Mark McLoughlin mar...@redhat.com wrote: On Tue, 2014-06-10 at 12:24 -0400, Doug Hellmann wrote: [...] Background: We have two types of oslo libraries. Libraries like oslo.config and oslo.messaging were created by extracting incubated code, updating the public API, and packaging it. Libraries like cliff and taskflow were created as standalone packages from the beginning, and later adopted by the oslo team to manage their development and maintenance. Incubated libraries have been released at the end of a release cycle, as with the rest of the integrated packages. Adopted libraries have historically been released as needed during their development. We would like to synchronize these so that all oslo libraries are officially released with the rest of the software created by OpenStack developers. Could you outline the benefits of syncing with the integrated release? Sure! http://lists.openstack.org/pipermail/openstack-dev/2012-November/003345.html :) Personally I see a few drawbacks to this approach: We dump the new version on consumers usually around RC time, which is generally a bad time to push a new version of a dependency and detect potential breakage. Consumers just seem to get the new version at the worst possible time. It also prevents us from spreading the work all over the cycle. For example it may have been more successful to have the oslo.messaging new release by milestone-1 to make sure it's adopted by projects in milestone-2 or milestone-3... rather than have it ready by milestone-3 and expect all projects to use it by consuming alphas during the cycle. Now if *all* projects were continuously consuming alpha versions, most of those drawbacks would go away. Yes, that's the plan. Those issues are acknowledged and we're reasonably confident the alpha versions plan will address them. [...] Patch Releases: Updates to existing library releases can be made from stable branches. Checking out stable/icehouse of oslo.config for example would allow a release 1.3.1. We don't have a formal policy about whether we will create patch releases, or whether applications are better off using the latest release of the library. Do we need one? I'm not sure we need one, but if we did I'd expect them to be aligned with stable releases. Right now, I think they'd just be as-needed - if there's enough backported to the stable branch to warrant a release, we just cut one. That's pretty much what I thought, too. We shouldn't need to worry about alphas for patch releases, since we won't add features. Yes, I think we can be pretty flexible about it. But to come back to my above remark... should it be stable/icehouse or stable/1.3? It's a branch for bugfix releases of the icehouse version of the library, so I think stable/icehouse makes sense. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo][messaging] Further improvements and refactoring
Hi, On Tue, 2014-06-10 at 15:47 +0400, Dina Belova wrote: Dims, No problem with creating the specs, we just want to understand if the community is OK with our suggestions in general :) If so, I'll create the appropriate specs and we'll discuss them :) Personally, I find it difficult to understand the proposals as currently described and how they address the performance problems you say you see. The specs process should help flesh out your ideas so they are more understandable. On the other hand, it's pretty difficult to have an abstract conversation about code re-factoring. So, some combination of proof-of-concept patches and specs will probably work best. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][ceilometer][glance][all] Loading clients from a CONF object
On Wed, 2014-06-11 at 16:57 +1200, Steve Baker wrote: On 11/06/14 15:07, Jamie Lennox wrote: Among the problems caused by the inconsistencies in the clients is that all the options that are required to create a client need to go into the config file of the service. This is a pain to configure from the server side and can result in missing options as servers fail to keep up. With the session object standardizing many of these options there is the intention to make the session be loadable directly from a CONF object. A spec has been proposed to nova-specs[1] to outline the problem and the approach in more detail. The TL;DR version is that I intend to collapse all the options to load a client down such that each client will have one ini section that looks vaguely like: [cinder] cafile = '/path/to/cas' certfile = 'path/to/cert' timeout = 5 auth_name = v2password username = 'user' password = 'pass' This list of options is then managed from keystoneclient, thus servers will automatically have access to new transport options, authentication mechanisms and security fixes as they become available. The point of this email is to make people aware of this effort and that if accepted into nova-specs the same pattern will eventually make it to your service (as clients get updated and manpower allows). The review containing the config option names is still open[2] so if you wish to comment on particulars, please take a look. Please leave a comment on the reviews or reply to this email with concerns or questions. Thanks Jamie [1] https://review.openstack.org/#/c/98955/ [2] https://review.openstack.org/#/c/95015/ Heat already needs to have configuration options for every client, and we've gone with the following pattern: http://git.openstack.org/cgit/openstack/heat/tree/etc/heat/heat.conf.sample#n612 Do you have any objection to aligning with what we already have? Specifically: [clients_clientname] ca_file=... cert_file=... key_file=... Sounds like there's a good case for an Oslo API for creating client objects from configuration. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
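To make the consuming side of this a bit more concrete, here is a rough sketch of what a service might do with a [cinder]-style section, assuming the session-loading helpers end up close to what the spec proposes. The helper names, the config file path and the commented-out client constructor are assumptions for illustration, not settled API.

    from keystoneclient import session
    from oslo.config import cfg

    CONF = cfg.CONF

    # Register the transport options (cafile, certfile, timeout, ...)
    # under the [cinder] group, then parse the service's config file.
    session.Session.register_conf_options(CONF, 'cinder')
    CONF(['--config-file', '/etc/nova/nova.conf'])

    # Build a Session from whatever the operator put in [cinder].
    sess = session.Session.load_from_conf_options(CONF, 'cinder')

    # A client library could then simply be handed the session, e.g.:
    # cinder = cinderclient.Client('2', session=sess)

The attraction of an Oslo-level API here is that every service would parse these sections identically, rather than each project growing its own slightly different option names and loading code.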
Re: [openstack-dev] use of the word certified
On Mon, 2014-06-09 at 20:14 -0400, Doug Hellmann wrote: On Mon, Jun 9, 2014 at 6:11 PM, Eoghan Glynn egl...@redhat.com wrote: Based on the discussion I'd like to propose these options: 1. Cinder-certified driver - This is an attempt to move the certification to the project level. 2. CI-tested driver - This is probably the most accurate, at least for what we're trying to achieve for Juno: Continuous Integration of Vendor-specific Drivers. Hi Ramy, Thanks for these constructive suggestions. The second option is certainly a very direct and specific reflection of what is actually involved in getting the Cinder project's imprimatur. I do like tested. I'd like to understand what the foundation is planning for certification as well, to know how big of an issue this really is. Even if they aren't going to certify drivers, I have heard discussions around training and possibly other areas so I would hate for us to introduce confusion by having different uses of that term in similar contexts. Mark, do you know who is working on that within the board or foundation? http://blogs.gnome.org/markmc/2014/05/17/may-11-openstack-foundation-board-meeting/ Boris Renski raised the possibility of the Foundation attaching the trademark to a verified, certified or tested status for drivers. It wasn't discussed at length because board members hadn't been briefed in advance, but I think it's safe to say there was a knee-jerk negative reaction from a number of members. This is in the context of the DriverLog report: http://stackalytics.com/report/driverlog http://www.mirantis.com/blog/cloud-drivers-openstack-driverlog-part-1-solving-driver-problem/ http://www.mirantis.com/blog/openstack-will-open-source-vendor-certifications/ AIUI the CI tested phrase was chosen in DriverLog to avoid the controversial area Boris describes in the last link above. I think that makes sense. Claiming this CI testing replaces more traditional certification programs is a sure way to bog potentially useful collaboration down in vendor politics. Avoiding dragging the project into those sort of politics is something I'm really keen on, and why I think the word certification is best avoided so we can focus on what we're actually trying to achieve. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [marconi] Reconsidering the unified API model
On Mon, 2014-06-09 at 19:31 +, Kurt Griffiths wrote: Lately we have been talking about writing drivers for traditional message brokers that will not be able to support the message feeds part of the API. I’ve started to think that having a huge part of the API that may or may not “work”, depending on how Marconi is deployed, is not a good story for users, esp. in light of the push to make different clouds more interoperable. Perhaps the first point to get super clear on is why drivers for traditional message brokers are needed. What problems would such drivers address? Who would the drivers help? Would the Marconi team recommend using any of those drivers for a production queuing service? Would the subset of Marconi's API which is implementable by these drivers really be useful for application developers? I'd like to understand that in more detail because I worry the Marconi team is being pushed into adding these drivers without truly believing they will be useful. And, if so, that would not be a sane context in which to make a serious architectural change. OTOH if there are real, valid use cases for these drivers, then understanding those would inform the architecture decision. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] use of the word certified
On Tue, 2014-06-10 at 14:06 +0100, Duncan Thomas wrote: On 10 June 2014 09:33, Mark McLoughlin mar...@redhat.com wrote: Avoiding dragging the project into those sort of politics is something I'm really keen on, and why I think the word certification is best avoided so we can focus on what we're actually trying to achieve. Avoiding those sorts of politics - 'XXX says it is a certified config, it doesn't work, cinder is junk' - is why I'd rather the cinder core team had a certification program, at least we've some control then and *other* people can't impose their idea of certification on us. I think politics happens, whether you will it or not, so a far more sensible stance is to play it out in advance. Exposing which configurations are actively tested is a perfectly sane thing to do. I don't see why you think calling this certification is necessary to achieve your goals. I don't know what you mean by others imposing their idea of certification. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] use of the word certified
On Tue, 2014-06-10 at 16:09 +0100, Duncan Thomas wrote: On 10 June 2014 15:07, Mark McLoughlin mar...@redhat.com wrote: Exposing which configurations are actively tested is a perfectly sane thing to do. I don't see why you think calling this certification is necessary to achieve your goals. What is certification except a formal way of saying 'we tested it'? At least when you test it enough to have some degree of confidence in your testing. That's *exactly* what certification means. I disagree. I think the word has substantially more connotations than simply 'this has been tested'. http://lists.openstack.org/pipermail/openstack-dev/2014-June/036963.html I don't know what you mean by others imposing their idea of certification. I mean that if some company or vendor starts claiming 'Product X is certified for use with cinder', On what basis would any vendor claim such certification? that is bad for the cinder core team, since we didn't define what got tested or to what degree. That sounds like you mean 'Storage technology X is certified for use with Vendor Y OpenStack'? i.e. that Vendor Y has certified the driver for use with their version of OpenStack, but the Cinder team has no influence over what that means in practice? Whether we like it or not, when something doesn't work in cinder, it is rare for people to blame the storage vendor in their complaints. 'Cinder is broken' is what we hear (and I've heard it, even though what they meant is 'my storage vendor hasn't tested or updated their driver in two releases', that isn't what they /said/). Presumably people are complaining about that driver not working with some specific downstream version of OpenStack, right? Not e.g. stable/icehouse devstack or something? i.e. even aside from the driver, we're already talking about something we as an upstream project don't control the quality of. Since cinder, and therefore cinder-core, is going to get the blame, I feel we should try to maintain some degree of control over the claims. I'm starting to see where you're coming from, but I fear this certification thing will make it even worse. Right now you can easily shrug off any responsibility for the quality of a third party driver or an untested in-tree driver. Sure, some people may have unreasonable expectations about such things, but you can't stop people being idiots. You can better communicate expectations, though, and that's excellent. But as soon as you certify that driver, cinder-core takes on a responsibility that I would think is unreasonable even if the driver was tested. 'But you said it's certified!' Is cinder-core really ready to take on responsibility for every issue users see with certified drivers and downstream OpenStack products? If we run our own minimal certification program, which is what we've started doing (we started with a script which did a test run and tried to require vendors to run it; that didn't work out well, so we're now requiring CI integration instead), then we at least have the option of saying 'You're running a non-certified product, go talk to your vendor' when dealing with the cases we have no control over. Vendors that don't follow the CI cert requirements eventually get their driver removed, that simple. What about issues with a certified driver? Don't talk to the vendor, talk to us instead? If it's an out-of-tree driver then we say talk to your vendor. If it's an in-tree driver, those actively maintaining the driver provide best effort community support like anything else. 
If it's an in-tree driver and isn't being actively maintained, and best effort community support isn't being provided, then we need a way to communicate that unmaintained status. The level of testing it receives is what we currently see as the most important aspect, but it's not the only aspect. If the user is actually using a distro or other downstream product rather than pure upstream, it's completely normal for upstream to say talk to your distro maintainers or product vendor. Upstream projects can only provide limited support for even motivated and clueful users, particularly when those users are actually using downstream variants of the project. It certainly makes sense to clarify that, but a certification program will actually raise the expectations users have about the level of support upstream will provide. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [marconi] Reconsidering the unified API model
On Tue, 2014-06-10 at 17:33 +, Janczuk, Tomasz wrote: From my perspective the key promise of Marconi is to provide a *multi-tenant*, *HTTP* based queuing system. Think an OpenStack equivalent of SQS or Azure Storage Queues. As far as I know there are no off-the-shelf message brokers out there that fit that bill. Note that when I say "multi-tenant" I don't mean just having the multi-tenancy concept reflected in the APIs. The key aspect of the multi-tenancy is security hardening against a variety of attacks absent in single-tenant broker deployments. For example, an authenticated DOS attack. Nicely described. Now why is there a desire to implement these requirements using traditional message brokers? And what Marconi API semantics are impossible to implement using traditional message brokers? Either those semantics are fundamental requirements for this API, or the requirement to have support for traditional message brokers is the fundamental requirement. We can't have it both ways. My suspicion is the API semantics are seen by the Marconi team as the fundamental requirement, and the support for message brokers is very much a secondary concern. If that's the case, perhaps just label those drivers as experimental and not recommended and allow them to return a 501 Not Implemented? Yes, it sucks for portability, but all you're doing is creating space for experimenting with backing Marconi with a message broker ... not actually recommending it for deployment. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [openstack-tc] use of the word certified
Hi John, On Fri, 2014-06-06 at 13:59 -0600, John Griffith wrote: On Fri, Jun 6, 2014 at 1:55 PM, John Griffith john.griff...@solidfire.com wrote: On Fri, Jun 6, 2014 at 1:23 PM, Mark McLoughlin mar...@redhat.com wrote: On Fri, 2014-06-06 at 13:29 -0400, Anita Kuno wrote: The issue I have with the word certify is that it requires someone or a group of someones to attest to something. The thing attested to is only as credible as the someone or the group of someones doing the attesting. We have no process, nor do I feel we want to have a process for evaluating the reliability of the somones or groups of someones doing the attesting. I think that having testing in place in line with other programs' testing of patches (third party ci) in cinder should be sufficient to address the underlying concern, namely reliability of opensource hooks to proprietary code and/or hardware. I would like the use of the word certificate and all its roots to no longer be used in OpenStack programs with regard to testing. This won't happen until we get some discussion and agreement on this, which I would like to have. Thanks for bringing this up Anita. I agree that certified driver or similar would suggest something other than what I think we mean. Can you expand on the above comment? In other words a bit more about what you mean. I think from the perspective of a number of people that participate in Cinder the intent is in fact to say. Maybe it would help clear some things up for folks that don't see why this has become a debatable issue. Fair question. I didn't elaborate initially because I thought Anita covered it pretty well. By running CI tests successfully, it is in fact a way of certifying that our device and driver are in fact 'certified' to function appropriately and provide the same level of API and behavioral compatibility as the default components, as demonstrated by running CI tests on each submitted patch. My view is that certification is an attestation that someone can take the certified combination of a driver and whatever vendor product it is associated with, and the combination will be fit for purpose in any of the configurations that it supports. To achieve anything close to that, we'd need to be explicit about what distros, deployment tools, OpenStack configurations and vendor configurations must be supported. And it would be fairly strange for us to do that considering the way OpenStack just ships tarballs currently rather than a fully deployable thing. Also AIUI certification implies some level of warranty or guarantee, which goes against the pretty clear language WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND in our license :) Basically, I think there's a world of difference between what's expected of a certification body and what a technical community like ours should IMHO be undertaking in terms of providing information about how functional and maintained drivers are. (To be clear, I love that we're trying to surface information about how well maintained and tested drivers are) Personally I believe the contesting of the phrases and terms is partly due to the fact that a number of organizations have their own certification programs and tests. I think that's great, and they in fact provide some form of certification that a device works in their environment and to their expectations. Also fair, and I should be careful to be clear about my Red Hat bias on this. I am speaking here with my upstream hat on - i.e. 
thinking about what's good for the project, not necessarily Red Hat - but I'm definitely influenced about the meaning of certification by knowing a little about Red Hat's product certification program. Doing this from a general OpenStack integration perspective doesn't seem all that different to me. For the record, my initial response to this was that I didn't have too much preference on what it was called (verification, certification etc etc), however there seems to be a large number of people (not product vendors for what it's worth) that feel differently. On Fri, Jun 6, 2014 at 1:23 PM, Mark McLoughlin mar...@redhat.com wrote
Re: [openstack-dev] [Glance][TC] Glance Functional API and Cross-project API Consistency
On Fri, 2014-05-30 at 18:22 +, Hemanth Makkapati wrote: Hello All, I'm writing to notify you of the approach the Glance community has decided to take for doing functional API. Also, I'm writing to solicit your feedback on this approach in the light of cross-project API consistency. At the Atlanta Summit, the Glance team has discussed introducing functional API in Glance so as to be able to expose operations/actions that do not naturally fit into the CRUD-style. A few approaches are proposed and discussed here. We have all converged on the approach to include 'action' and action type in the URL. For instance, 'POST /images/{image_id}/actions/{action_type}'. However, this is different from the way Nova does actions. Nova includes action type in the payload. For instance, 'POST /servers/{server_id}/action {type: action_type, ...}'. At this point, we hit a cross-project API consistency issue mentioned here (under the heading 'How to act on resource - cloud perform on resources'). Though we are differing from the way Nova does actions and hence adding another source of cross-project API inconsistency, we have a few reasons to believe that Glance's way is helpful in certain ways. The reasons are as following: 1. Discoverability of operations. It'll be easier to expose permitted actions through schemas or a JSON home document living at /images/{image_id}/actions/. 2. More conducive for rate-limiting. It'll be easier to rate-limit actions in different ways if the action type is available in the URL. 3. Makes more sense for functional actions that don't require a request body (e.g., image deactivation). At this point we are curious to see if the API conventions group believes this is a valid and reasonable approach. It's obviously preferable if new APIs follow conventions established by existing APIs, but I think you've laid out pretty compelling rationale for not following Nova's lead on this. The question is whether Nova should plan on adopting this approach in a future version of its API? Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
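Purely to illustrate the difference under discussion, here is a sketch of the two request shapes from a client's point of view. The endpoints, token and payload below are made up for the example, and the Nova-style body is simplified to the {type: action_type} shape described above rather than any real Nova action.

    import json
    import requests

    headers = {'X-Auth-Token': 'TOKEN', 'Content-Type': 'application/json'}

    # Glance-style: the action type lives in the URL, no body needed
    requests.post('http://glance.example.com/v2/images/IMAGE_ID/actions/deactivate',
                  headers=headers)

    # Nova-style: a single /action URL, with the action type in the payload
    requests.post('http://nova.example.com/v2/TENANT_ID/servers/SERVER_ID/action',
                  headers=headers,
                  data=json.dumps({'type': 'deactivate'}))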
Re: [openstack-dev] [Glance] [TC] Program Mission Statement and the Catalog
On Wed, 2014-06-04 at 18:03 -0700, Mark Washenberger wrote: Hi folks, I'd like to propose the Images program to adopt a mission statement [1] and then change it to reflect our new aspirations of acting as a Catalog that works with artifacts beyond just disk images [2]. Since the Glance mini summit early this year, momentum has been building significantly behind the catalog effort and I think it's time we recognize it officially, to ensure further growth can proceed and to clarify the interactions the Glance Catalog will have with other OpenStack projects. Please see the linked openstack/governance changes, and provide your feedback either in this thread, on the changes themselves, or in the next TC meeting when we get a chance to discuss. Thanks to Georgy Okrokvertskhov for coming up with the new mission statement. Just quoting the proposal here to make the idea slightly more accessible, perhaps triggering some discussion here: https://review.openstack.org/98002 Artifact Repository Service: codename: Glance mission: To provide services to store, browse, share, distribute, and manage artifacts consumable by OpenStack services in a unified manner. An artifact is any strongly-typed, versioned collection of document and bulk, unstructured data and is immutable once the artifact is published in the repository. Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [oslo] Mehdi Abaakouk added to oslo.messaging-core
Mehdi has been making great contributions and reviews on oslo.messaging for months now, so I've added him to oslo.messaging-core. Thank you for all your hard work Mehdi! Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] use of the word certified
On Fri, 2014-06-06 at 13:29 -0400, Anita Kuno wrote: The issue I have with the word certify is that it requires someone or a group of someones to attest to something. The thing attested to is only as credible as the someone or the group of someones doing the attesting. We have no process, nor do I feel we want to have a process for evaluating the reliability of the somones or groups of someones doing the attesting. I think that having testing in place in line with other programs' testing of patches (third party ci) in cinder should be sufficient to address the underlying concern, namely reliability of opensource hooks to proprietary code and/or hardware. I would like the use of the word certificate and all its roots to no longer be used in OpenStack programs with regard to testing. This won't happen until we get some discussion and agreement on this, which I would like to have. Thanks for bringing this up Anita. I agree that certified driver or similar would suggest something other than what I think we mean. And, for whatever it's worth, the topic did come up at a Foundation board meeting and some board members expressed similar concerns, although I guess that was more precisely about the prospect of the Foundation calling drivers certified. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [horizon][infra] Plan for the splitting of Horizon into two repositories
On Thu, 2014-05-29 at 15:29 -0400, Anita Kuno wrote: On 05/28/2014 08:54 AM, Radomir Dopieralski wrote: Hello, we plan to finally do the split in this cycle, and I started some preparations for that. I also started to prepare a detailed plan for the whole operation, as it seems to be a rather big endeavor. You can view and amend the plan at the etherpad at: https://etherpad.openstack.org/p/horizon-split-plan It's still a little vague, but I plan to gradually get it more detailed. All the points are up for discussion, if anybody has any good ideas or suggestions, or can help in any way, please don't hesitate to add to this document. We still don't have any dates or anything -- I suppose we will work that out soonish. Oh, and great thanks to all the people who have helped me so far with it, I wouldn't even dream about trying such a thing without you. Also thanks in advance to anybody who plans to help! I'd like to confirm that we are all aware that this patch creates 16 new repos under the administration of horizon-ptl and horizon-core: https://review.openstack.org/#/c/95716/ If I'm late to the party and the only one that this is news to, that is fine. Sixteen additional repos seems like a lot of additional reviews will be needed. One slightly odd thing about this is that these repos are managed by horizon-core, so presumably part of the Horizon program, but yet the repos are under the stackforge/ namespace. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [horizon][infra] Plan for the splitting of Horizon into two repositories
On Thu, 2014-06-05 at 11:19 +0200, Radomir Dopieralski wrote: On 06/05/2014 10:59 AM, Mark McLoughlin wrote: If I'm late to the party and the only one that this is news to, that is fine. Sixteen additional repos seems like a lot of additional reviews will be needed. One slightly odd thing about this is that these repos are managed by horizon-core, so presumably part of the Horizon program, but yet the repos are under the stackforge/ namespace. What would you propose instead? Keeping them in repositories external to OpenStack, on github or bitbucket sounds wrong. Getting them under openstack/ doesn't sound good either, as the projects they are packaging are not related to OpenStack. Have them be managed by someone else? Who? If they're to be part of the Horizon program, I'd say they should be under openstack/. If not, perhaps create a new team to manage them. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] status of quota class
On Wed, 2014-02-19 at 10:27 -0600, Kevin L. Mitchell wrote: On Wed, 2014-02-19 at 13:47 +0100, Mehdi Abaakouk wrote: But 'quota_class' is never set when a nova RequestContext is created. When I created quota classes, I envisioned the authentication component of the WSGI stack setting the quota_class on the RequestContext, but there was no corresponding concept in Keystone. We need some means of identifying groups of tenants. So my question, what is the plan to finish the 'quota class' feature ? I currently have no plan to work on that, and I am not aware of any such work. Just for reference, we discussed the fact that this code was unused two years ago: https://lists.launchpad.net/openstack/msg12200.html and I see Joe has now completed the process of removing it again: https://review.openstack.org/75535 https://review.openstack.org/91480 https://review.openstack.org/91699 https://review.openstack.org/91700 Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] custom gerrit dashboard - per project review inbox zero
On Fri, 2014-05-09 at 08:20 -0400, Sean Dague wrote: Based on some of my blog posts on gerrit queries, I've built and gotten integrated a custom inbox zero dashboard which is per project in gerrit. ex: https://review.openstack.org/#/projects/openstack/nova,dashboards/important-changes:review-inbox-dashboard (replace openstack/nova with the project of your choice). This provides 3 sections. = Needs Final +2 = This is code that has an existing +2, no negative code review feedback, and positive jenkins score. So it's mergable if you provide the final +2. (Gerrit Query: status:open NOT label:Code-Review=0,self label:Verified=1,jenkins NOT label:Code-Review=-1 label:Code-Review=2 NOT label:Workflow=-1 limit:50 ) = No negative feedback = Changes that have no negative code review feedback, and positive jenkins score. (Gerrit Query: status:open NOT label:Code-Review=0,self label:Verified=1,jenkins NOT label:Code-Review=-1 NOT label:Workflow=-1 limit:50 ) = Wayward changes = Changes that have no code review feedback at all (no one has looked at it), a positive jenkins score, and are older than 2 days. (Gerrit Query: status:open label:Verified=1,jenkins NOT label:Workflow=-1 NOT label:Code-Review=2 age:2d) In all cases it filters out patches that you've commented on in the most recent revision. So as you vote on these things they will disappear from your list. Hopefully people will find this dashboard also useful. Nicely done. Any reason you've included the stable branches - i.e. not restricted it to branch:master ? Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Timestamp formats in the REST API
On Tue, 2014-04-29 at 10:39 -0400, Doug Hellmann wrote: On Tue, Apr 29, 2014 at 9:48 AM, Mark McLoughlin mar...@redhat.com wrote: Hey In this patch: https://review.openstack.org/83681 by Ghanshyam Mann, we encountered an unusual situation where a timestamp in the returned XML looked like this: 2014-04-08 09:00:14.399708+00:00 What appeared to be unusual was that the timestamp had both sub-second time resolution and timezone information. It was felt that this wasn't a valid timestamp format and then there was some debate about how to 'fix' it: https://review.openstack.org/87563 Anyway, this led me down a bit of a rabbit hole, so I'm going to attempt to document some findings. Firstly, some definitions: - Python's datetime module talks about datetime objects being 'naive' or 'aware' https://docs.python.org/2.7/library/datetime.html A datetime object d is aware if d.tzinfo is not None and d.tzinfo.utcoffset(d) does not return None. If d.tzinfo is None, or if d.tzinfo is not None but d.tzinfo.utcoffset(d) returns None, d is naive. (Most people will have encountered this already, but I'm including it for completeness) - The ISO8601 time and date format specifies timestamps like this: 2014-04-29T11:37:00Z with many variations. One distinguishing aspect of the ISO8601 format is the 'T' separating date and time. RFC3339 is very closely related and serves as easily accessible documentation of the format: http://www.ietf.org/rfc/rfc3339.txt - The Python iso8601 library allows parsing this time format, but also allows subtle variations that don't conform to the standard like omitting the 'T' separator: import iso8601 iso8601.parse_date('2014-04-29 11:37:00Z') datetime.datetime(2014, 4, 29, 11, 37, tzinfo=<iso8601.iso8601.Utc object at 0x214b050>) Presumably this is for the pragmatic reason that when you stringify a datetime object, the resulting string uses ' ' as a separator: import datetime str(datetime.datetime(2014, 4, 29, 11, 37)) '2014-04-29 11:37:00' And now some observations on what's going on in Nova: - We don't store timezone information in the database, but all our timestamps are relative to UTC nonetheless. - The objects code automatically adds the UTC to naive datetime objects: if value.utcoffset() is None: value = value.replace(tzinfo=iso8601.iso8601.Utc()) so code that is ported to objects may now be using aware datetime objects where they were previously using naive objects. - Whether we store sub-second resolution timestamps in the database appears to be database specific. In my quick tests, we store that information in sqlite but not MySQL. - However, timestamps added by SQLAlchemy when you do e.g. save() do include sub-second information, so some DB API calls may return sub-second timestamps even when that information isn't stored in the database. In our REST APIs, you'll essentially see one of three time formats. I'm calling them 'isotime', 'strtime' and 'xmltime': - 'isotime' - this is the result from timeutils.isotime(). It includes timezone information (i.e. a 'Z' prefix) but not microseconds. You'll see this in places where we stringify the datetime objects in the API layer using isotime() before passing them to the JSON/XML serializers. - 'strtime' - this is the result from timeutils.strtime(). It doesn't include timezone information but does include decimal seconds. This is what jsonutils.dumps() uses when we're serializing API responses - 'xmltime' or 'str(datetime)' format - this is just what you get when you stringify a datetime using str(). 
If the datetime is tz aware or includes non-zero microseconds, then that information will be included in the result. This is a significant difference versus the other two formats, where it is clear whether tz and microsecond information is included in the string. But there are some caveats: - I don't know how significant it is these days, but timestamps will be serialized to strtime format when going over RPC, but won't be de-serialized on the remote end. This could lead to a situation where the API layer tries to stringify a strtime formatted string using timeutils.isotime(). (see below for a description of those formats) - In at least one place - e.g. the 'updated' timestamp for v2 extensions - we hardcode the timestamp as strings in the code and don't currently use one of the formats above. My conclusions from all that: 1) This sucks 2) At the very least, we should be clear in our API samples tests which of the three
[openstack-dev] [nova] Timestamp formats in the REST API
Hey In this patch: https://review.openstack.org/83681 by Ghanshyam Mann, we encountered an unusual situation where a timestamp in the returned XML looked like this: 2014-04-08 09:00:14.399708+00:00 What appeared to be unusual was that the timestamp had both sub-second time resolution and timezone information. It was felt that this wasn't a valid timestamp format and then there was some debate about how to 'fix' it: https://review.openstack.org/87563 Anyway, this led me down a bit of a rabbit hole, so I'm going to attempt to document some findings. Firstly, some definitions: - Python's datetime module talks about datetime objects being 'naive' or 'aware' https://docs.python.org/2.7/library/datetime.html A datetime object d is aware if d.tzinfo is not None and d.tzinfo.utcoffset(d) does not return None. If d.tzinfo is None, or if d.tzinfo is not None but d.tzinfo.utcoffset(d) returns None, d is naive. (Most people will have encountered this already, but I'm including it for completeness) - The ISO8601 time and date format specifies timestamps like this: 2014-04-29T11:37:00Z with many variations. One distinguishing aspect of the ISO8601 format is the 'T' separating date and time. RFC3339 is very closely related and serves as easily accessible documentation of the format: http://www.ietf.org/rfc/rfc3339.txt - The Python iso8601 library allows parsing this time format, but also allows subtle variations that don't conform to the standard like omitting the 'T' separator: import iso8601 iso8601.parse_date('2014-04-29 11:37:00Z') datetime.datetime(2014, 4, 29, 11, 37, tzinfo=<iso8601.iso8601.Utc object at 0x214b050>) Presumably this is for the pragmatic reason that when you stringify a datetime object, the resulting string uses ' ' as a separator: import datetime str(datetime.datetime(2014, 4, 29, 11, 37)) '2014-04-29 11:37:00' And now some observations on what's going on in Nova: - We don't store timezone information in the database, but all our timestamps are relative to UTC nonetheless. - The objects code automatically adds the UTC to naive datetime objects: if value.utcoffset() is None: value = value.replace(tzinfo=iso8601.iso8601.Utc()) so code that is ported to objects may now be using aware datetime objects where they were previously using naive objects. - Whether we store sub-second resolution timestamps in the database appears to be database specific. In my quick tests, we store that information in sqlite but not MySQL. - However, timestamps added by SQLAlchemy when you do e.g. save() do include sub-second information, so some DB API calls may return sub-second timestamps even when that information isn't stored in the database. In our REST APIs, you'll essentially see one of three time formats. I'm calling them 'isotime', 'strtime' and 'xmltime': - 'isotime' - this is the result from timeutils.isotime(). It includes timezone information (i.e. a 'Z' prefix) but not microseconds. You'll see this in places where we stringify the datetime objects in the API layer using isotime() before passing them to the JSON/XML serializers. - 'strtime' - this is the result from timeutils.strtime(). It doesn't include timezone information but does include decimal seconds. This is what jsonutils.dumps() uses when we're serializing API responses - 'xmltime' or 'str(datetime)' format - this is just what you get when you stringify a datetime using str(). If the datetime is tz aware or includes non-zero microseconds, then that information will be included in the result. 
This is a significant difference versus the other two formats, where it is clear whether tz and microsecond information is included in the string. But there are some caveats: - I don't know how significant it is these days, but timestamps will be serialized to strtime format when going over RPC, but won't be de-serialized on the remote end. This could lead to a situation where the API layer tries to stringify a strtime formatted string using timeutils.isotime(). (see below for a description of those formats) - In at least one place - e.g. the 'updated' timestamp for v2 extensions - we hardcode the timestamp as strings in the code and don't currently use one of the formats above. My conclusions from all that: 1) This sucks 2) At the very least, we should be clear in our API samples tests which of the three formats we expect - we should only change the format used in a given part of the API after considering any compatibility considerations 3) We should unify on a single format in the v3 API - IMHO, we should be explicit about use of the UTC timezone and we should avoid including
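As a quick illustration of the three shapes described above, using the timestamp from the review that started this thread. The exact strftime patterns below are mine, chosen only to make the differences visible; they aren't necessarily the literal implementation of timeutils.isotime()/strtime().

    import datetime
    import iso8601

    dt = datetime.datetime(2014, 4, 8, 9, 0, 14, 399708,
                           tzinfo=iso8601.iso8601.Utc())

    # 'isotime' shape: explicit UTC marker, no microseconds
    print(dt.strftime('%Y-%m-%dT%H:%M:%S') + 'Z')   # 2014-04-08T09:00:14Z

    # 'strtime' shape: microseconds, but no timezone marker
    print(dt.strftime('%Y-%m-%dT%H:%M:%S.%f'))      # 2014-04-08T09:00:14.399708

    # 'xmltime' / str(datetime): whatever the object happens to carry
    print(str(dt))                                  # 2014-04-08 09:00:14.399708+00:00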
Re: [openstack-dev] [nova] Timestamp formats in the REST API
On Tue, 2014-04-29 at 14:48 +0100, Mark McLoughlin wrote: My conclusions from all that: 1) This sucks 2) At the very least, we should be clear in our API samples tests which of the three formats we expect - we should only change the format used in a given part of the API after considering any compatibility considerations 3) We should unify on a single format in the v3 API - IMHO, we should be explicit about use of the UTC timezone and we should avoid including microseconds unless there's a clear use case. In other words, we should use the 'isotime' format. 4) The 'xmltime' format is just a dumb historical mistake and since XML support is now firmly out of favor, let's not waste time improving the timestamp situation in XML. 5) We should at least consider moving to a single format in the v2 (JSON) API. IMHO, moving from strtime to isotime for fields like created_at and updated_at would be highly unlikely to cause any real issues for API users. (Following up this email with some patches that I'll link to, but I want to link to this email from the patches themselves) See here: https://review.openstack.org/#/q/project:openstack/nova+topic:timestamp-format,n,z Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Oslo] [Ironic] Can we change rpc_thread_pool_size default value?
On Wed, 2014-04-23 at 07:25 +0100, Mark McLoughlin wrote: On Tue, 2014-04-22 at 15:54 -0700, Devananda van der Veen wrote: Hi! When a project is using oslo.messaging, how can we change our default rpc_thread_pool_size? --- Background Ironic has hit a bug where a flood of API requests can deplete the RPC worker pool on the other end and cause things to break in very bad ways. Apparently, nova-conductor hit something similar a while back too. There've been a few long discussions on IRC about it, tracked partially here: https://bugs.launchpad.net/ironic/+bug/1308680 tldr; a way we can fix this is to set the rpc_thread_pool_size very small (eg, 4) and keep our conductor.worker_pool size near its current value (eg, 64). I'd like these to be the default option values, rather than require every user to change the rpc_thread_pool_size in their local ironic.conf file. We're also about to switch from the RPC module in oslo-incubator to using the oslo.messaging library. Why are these related? Because it looks impossible for us to change the default for this option from within Ironic, because the option is registered when EventletExecutor is instantiated (rather than loaded). https://github.com/openstack/oslo.messaging/blob/master/oslo/messaging/_executors/impl_eventlet.py#L76 It may have been possible for Ironic to set its own default before oslo.messaging, but it wouldn't have been recommended because there's no explicit API for doing so. With oslo.messaging, we have a set_transport_defaults() which shows how we'd approach adding this capability. The question comes down to whether this really is a situation where we need per-application defaults or just that the current defaults are screwed up. If the latter, I'd much rather just change the defaults. History is always useful :) Soren added the threadpool with a default size of 1024: https://code.launchpad.net/~soren/nova/rpc-threadpool/+merge/49896 Johannes changed it back to 64: https://review.openstack.org/6792 Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Oslo] [Ironic] Can we change rpc_thread_pool_size default value?
On Tue, 2014-04-22 at 15:54 -0700, Devananda van der Veen wrote: Hi! When a project is using oslo.messaging, how can we change our default rpc_thread_pool_size? --- Background Ironic has hit a bug where a flood of API requests can deplete the RPC worker pool on the other end and cause things to break in very bad ways. Apparently, nova-conductor hit something similar a while back too. There've been a few long discussions on IRC about it, tracked partially here: https://bugs.launchpad.net/ironic/+bug/1308680 tldr; a way we can fix this is to set the rpc_thread_pool_size very small (eg, 4) and keep our conductor.worker_pool size near its current value (eg, 64). I'd like these to be the default option values, rather than require every user to change the rpc_thread_pool_size in their local ironic.conf file. We're also about to switch from the RPC module in oslo-incubator to using the oslo.messaging library. Why are these related? Because it looks impossible for us to change the default for this option from within Ironic, because the option is registered when EventletExecutor is instantiated (rather than loaded). https://github.com/openstack/oslo.messaging/blob/master/oslo/messaging/_executors/impl_eventlet.py#L76 It may have been possible for Ironic to set its own default before oslo.messaging, but it wouldn't have been recommended because there's no explicit API for doing so. With oslo.messaging, we have a set_transport_defaults() which shows how we'd approach adding this capability. The question comes down to whether this really is a situation where we need per-application defaults or just that the current defaults are screwed up. If the latter, I'd much rather just change the defaults. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
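For the sake of discussion, here is a sketch of what a parallel helper could look like if we did decide per-application defaults are warranted, modelled on the set_transport_defaults() pattern. The function name and the option list below are hypothetical; this API does not exist in oslo.messaging today.

    from oslo.config import cfg

    # Stand-in for the option list the eventlet executor registers;
    # rpc_thread_pool_size lives there today with a default of 64.
    _executor_opts = [cfg.IntOpt('rpc_thread_pool_size', default=64)]

    def set_executor_defaults(rpc_thread_pool_size):
        """Hypothetical oslo.messaging API for overriding executor defaults."""
        cfg.set_defaults(_executor_opts,
                         rpc_thread_pool_size=rpc_thread_pool_size)

    # which would let Ironic do, early in startup, before any transport is used:
    #   set_executor_defaults(rpc_thread_pool_size=4)

The key point of the pattern is that the application overrides the *default* via an explicit library API rather than poking at option objects it doesn't own, so operators can still override the value in ironic.conf.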
Re: [openstack-dev] [oslo] nominating Victor Stinner for the Oslo core reviewers team
On Mon, 2014-04-21 at 12:39 -0400, Doug Hellmann wrote: I propose that we add Victor Stinner (haypo on freenode) to the Oslo core reviewers team. Victor is a Python core contributor, and works on the development team at eNovance. He created trollius, a port of Python 3's tulip/asyncio module to Python 2, at least in part to enable a driver for oslo.messaging. He has been quite active with Python 3 porting work in Oslo and some other projects, and organized a sprint to work on the port at PyCon last week. The patches he has written for the python 3 work have all covered backwards-compatibility so that the code continues to work as before under python 2. Given his background, skills, and interest, I think he would be a good addition to the team. Sounds good to me! Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo] starting regular meetings
On Mon, 2014-04-14 at 14:53 -0400, Doug Hellmann wrote: Balancing Europe and Pacific TZs is going to be a challenge. I can't go at 1800 or 1900, myself, and those are pushing a little late in Europe anyway. How about 1600? http://www.timeanddate.com/worldclock/converted.html?iso=20140414T16p1=0p2=2133p3=195p4=224 We would need to move to another room, but that's not a big deal. Works for me. Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [OSSG][OSSN] OpenSSL Heartbleed vulnerability can lead to OpenStack compromise
On Thu, 2014-04-10 at 00:23 -0700, Nathan Kinder wrote: OpenSSL Heartbleed vulnerability can lead to OpenStack compromise --- ### Summary ### A vulnerability in OpenSSL can lead to leaking of confidential data protected by SSL/TLS in an OpenStack deployment. ### Affected Services / Software ### Grizzly, Havana, OpenSSL ### Discussion ### A vulnerability in OpenSSL code-named Heartbleed was recently discovered that allows remote attackers limited access to data in the memory of any service using OpenSSL to provide encryption for network communications. This can include key material used for SSL/TLS, which means that any confidential data that has been sent over SSL/TLS may be compromised. For full details, see the following website that describes this vulnerability in detail: http://heartbleed.com/ While OpenStack software itself is not directly affected, any deployment of OpenStack is very likely using OpenSSL to provide SSL/TLS functionality. ### Recommended Actions ### It is recommended that you immediately update OpenSSL software on the systems you use to run OpenStack services. Not sure if you want to mention it in this OSSN or consider doing it too, but clients are vulnerable to attack too. In most cases, you will want to upgrade to OpenSSL version 1.0.1g, though it is recommended that you review the exact affected version details on the Heartbleed website referenced above. After upgrading your OpenSSL software, you will need to restart any services that use the OpenSSL libraries. You can get a list of all processes that have the old version of OpenSSL loaded by running the following command: lsof | grep ssl | grep DEL Any processes shown by the above command will need to be restarted, or you can choose to restart your entire system if desired. In an OpenStack deployment, OpenSSL is commonly used to enable SSL/TLS protection for OpenStack API endpoints, SSL terminators, databases, message brokers, and Libvirt remote access. In addition to the native OpenStack services, some commonly used software that may need to be restarted includes: Apache HTTPD Libvirt MySQL Nginx PostgreSQL Pound Qpid RabbitMQ Stud It is also recommended that you treat your existing SSL/TLS keys as compromised and generate new keys. This includes keys used to enable SSL/TLS protection for OpenStack API endpoints, databases, message brokers, and libvirt remote access. Might be worth mentioning certificate revocation too. In addition, any confidential data such as credentials that have been sent over a SSL/TLS connection may have been compromised. It is recommended that cloud administrators change any passwords, tokens, or other credentials that may have been communicated over SSL/TLS. ### Contacts / References ### This OSSN : https://wiki.openstack.org/wiki/OSSN/OSSN-0012 OpenStack Security ML : openstack-secur...@lists.openstack.org OpenStack Security Group : https://launchpad.net/~openstack-ossg Heartbleed Website: http://heartbleed.com/ CVE: CVE-2014-0160 Very nicely done Nathan. Not really relevant to the OSSN, but perhaps people will find it interesting, I posted some thoughts on the wider fallout of heartbleed this morning: http://blogs.gnome.org/markmc/2014/04/10/heartbleed/ Thanks, Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [olso][neutron] proxying oslo.messaging from management network into tenant network/VMs
Hi,

On Wed, 2014-04-09 at 17:33 +0900, Isaku Yamahata wrote:
> Hello developers. As discussed many times so far[1], there are many
> projects that need to propagate RPC messages into VMs running on
> OpenStack. Neutron in my case.
>
> My idea is to relay RPC messages from the management network into the
> tenant network over a file-like object. By file-like object, I mean
> virtio-serial, unix domain socket, unix pipe and so on. I've written
> some code based on oslo.messaging[2][3] and documentation on use
> cases.[4][5] Only the file-like transport and message proxying would be
> in oslo.messaging; the agent-side code wouldn't be a part of
> oslo.messaging.
>
> use cases: ([5] for more figures)
>
> file-like object: virtio-serial, unix domain socket, unix pipe
>
> server - AMQP - agent in host - virtio-serial - guest agent in VM per VM
> server - AMQP - agent in host - unix socket/pipe - agent in tenant
> network - guest agent in VM
>
> So far there are security concerns about forwarding oslo.messaging from
> the management network into the tenant network. One approach is to
> allow only cast-RPC from the server to the guest agent in the VM, so
> that the guest agent only receives messages and can't send anything to
> servers. With a unix pipe, it's write-only for the server, read-only
> for the guest agent.
>
> Thoughts? comments?

Nice work. This is a pretty gnarly topic, but I think you're doing a good job thinking through a good solution here.

The advantage this has over Marconi is that it avoids relying on something which might not be commonplace in OpenStack deployments for a number of releases yet.

Using vmchannel/virtio-serial to talk to an oslo.messaging proxy server (which would have a configurable security policy) over a unix socket oslo.messaging transport, in order to allow limited bridging from the tenant network to the management network ... definitely sounds like a reasonable proposal.

Looking forward to your session at the summit! I also hope to look at your patches before then.

Thanks, Mark.

> Details of Neutron NFV use case[6]:
>
> Neutron services so far typically run agents in the host; the host
> agent receives RPCs from the neutron server, then it executes necessary
> operations. Sometimes the agent in the host issues RPCs to the neutron
> server periodically (e.g. status reports etc). It's desirable to make
> such services virtualized as Network Function Virtualization (NFV),
> i.e. make those features run in VMs. So it's quite a natural approach
> to propagate those RPC messages into agents in VMs.
>
> [1] https://wiki.openstack.org/wiki/UnifiedGuestAgent
> [2] https://review.openstack.org/#/c/77862/
> [3] https://review.openstack.org/#/c/77863/
> [4] https://blueprints.launchpad.net/oslo.messaging/+spec/message-proxy-server
> [5] https://wiki.openstack.org/wiki/Oslo/blueprints/message-proxy-server
> [6] https://blueprints.launchpad.net/neutron/+spec/adv-services-in-vms

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
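As an illustrative aside (this is not the proposed oslo.messaging file-like transport): the write-only/read-only property described above can be sketched with a plain unix pipe, where the host-side proxy holds only the write end and the guest agent holds only the read end. The method name and arguments below are made up for the example.

    # Minimal sketch of the "cast only" one-way channel discussed above,
    # using a plain unix pipe from the standard library.  It only illustrates
    # the security property: the server side can write, the guest-agent side
    # can only read, so the guest cannot send anything back to the server.
    import json
    import os

    read_fd, write_fd = os.pipe()

    def host_cast(method, **kwargs):
        # Host/proxy side: fire-and-forget, no reply expected or possible.
        msg = json.dumps({"method": method, "args": kwargs}) + "\n"
        os.write(write_fd, msg.encode("utf-8"))

    def guest_agent_poll():
        # Guest side: consume whatever the host has cast and dispatch it.
        for line in os.read(read_fd, 4096).decode("utf-8").splitlines():
            msg = json.loads(line)
            print("dispatching %s(%s)" % (msg["method"], msg["args"]))

    # Hypothetical example message; the method name and arguments are made
    # up purely for illustration.
    host_cast("port_update", port_id="1234", status="ACTIVE")
    guest_agent_poll()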
Re: [openstack-dev] [oslo] use of the oslo namespace package
On Mon, 2014-04-07 at 15:24 -0400, Doug Hellmann wrote:
> We can avoid adding to the problem by putting each new library in its
> own package. We still want the Oslo name attached for libraries that
> are really only meant to be used by OpenStack projects, and so we need
> a naming convention. I'm not entirely happy with the crammed together
> approach for oslotest and oslosphinx. At one point Dims and I talked
> about using a prefix oslo_ instead of just oslo, so we would have
> oslo_db, oslo_i18n, etc. That's also a bit ugly, though. Opinions?

Uggh :)

> Given the number of problems we have now (I help about 1 dev per week
> unbreak their system),

I've seen you do this - kudos on your patience.

> I think we should also consider renaming the existing libraries to not
> use the namespace package. That isn't a trivial change, since it will
> mean updating every consumer as well as the packaging done by distros.
> If we do decide to move them, I will need someone to help put together
> a migration plan. Does anyone want to volunteer to work on that?

One thing to note for any migration plan on this - we should use a new pip package name for the new version so people with e.g. oslo.config>=1.2.0 don't automatically get updated to a version which has the code in a different place. You would need to change to e.g. osloconfig>=1.4.0 instead.

> Before we make any changes, it would be good to know how bad this
> problem still is. Do developers still see issues on clean systems, or
> are all of the problems related to updating devstack boxes? Are people
> figuring out how to fix or work around the situation on their own? Can
> we make devstack more aggressive about deleting oslo libraries before
> re-installing them? Are there other changes we can make that would be
> less invasive?

I don't have any great insight, but hope we can figure something out. It's crazy to think that even though namespace packages appear to work pretty well initially, it might end up being so unworkable we would need to switch.

Mark.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
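Purely to make the consumer-side impact concrete, and assuming, only for illustration, that a renamed oslo.config would become importable as oslo_config (one of the naming options floated above, not a decision from this thread), a consumer could carry a small compatibility shim during any migration:

    # Hypothetical compatibility shim for a consumer of oslo.config during a
    # rename away from the namespace package.  The "oslo_config" module name
    # is an assumption made only for this sketch; the cfg.StrOpt / CONF
    # registration API shown is the existing oslo.config API.
    try:
        from oslo_config import cfg      # renamed, non-namespaced package
    except ImportError:
        from oslo.config import cfg      # current namespaced package

    opts = [cfg.StrOpt("example_opt", default="unset",
                       help="Hypothetical option used only for this sketch")]
    cfg.CONF.register_opts(opts)
    print(cfg.CONF.example_opt)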
[openstack-dev] oslo.messaging 1.3.0 released
Hi

oslo.messaging 1.3.0 is now available on pypi and should be available in our mirror shortly. Full release notes are available here:

http://docs.openstack.org/developer/oslo.messaging/

The master branch will soon be open for Juno targeted development and we'll publish 1.4.0aN beta releases from master before releasing 1.4.0 for the Juno release. A stable/icehouse branch will be created for important bugfixes that will be released as 1.3.N.

Thanks, Mark.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [depfreeze] [horizon] Exception request: python-keystoneclient>=0.7.0
On Thu, 2014-03-27 at 13:53 +0000, Julie Pichon wrote:
> Hi,
>
> I would like to request a depfreeze exception to bump up the keystone
> client requirement [1], in order to re-enable the ability for users to
> update their own password with Keystone v3 in Horizon in time for
> Icehouse [2]. This capability is requested by end-users quite often but
> had to be deactivated at the end of Havana due to some issues that are
> now resolved, thanks to the latest keystone client release.
>
> Since this is a library we control, hopefully this shouldn't cause too
> much trouble for packagers. Thank you for your consideration.
>
> Julie
>
> [1] https://review.openstack.org/#/c/83287/
> [2] https://review.openstack.org/#/c/59918/

IMHO, it's hard to imagine Icehouse requiring a more recent version of keystoneclient being a problem or risk for anyone.

Mark.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
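A small illustrative aside, not part of the review itself: once the minimum is raised to >=0.7.0, a deployer or packager can sanity-check an environment against it with the standard setuptools pkg_resources API. The check below is just a sketch.

    # Illustrative sketch only: verify that the installed
    # python-keystoneclient satisfies the proposed >=0.7.0 minimum before
    # relying on the re-enabled Keystone v3 password-change support.
    import pkg_resources

    try:
        pkg_resources.require("python-keystoneclient>=0.7.0")
        print("python-keystoneclient is new enough")
    except (pkg_resources.DistributionNotFound,
            pkg_resources.VersionConflict) as exc:
        print("requirement not satisfied: %s" % exc)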
Re: [openstack-dev] Multiple patches in one review
On Mon, 2014-03-24 at 10:49 -0400, Russell Bryant wrote:
> Gerrit support for a patch series could certainly be better.

There has long been talk of gerrit getting topic review functionality, whereby you could e.g. approve a whole series of patches from a topic view. See:

https://code.google.com/p/gerrit/issues/detail?id=51
https://groups.google.com/d/msg/repo-discuss/5oRra_tLKMA/rxwU7pPAQE8J

My understanding is there's a fork of gerrit out there with this functionality that some projects are using successfully.

Mark.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] OpenStack vs. SQLA 0.9
FYI, allowing 0.9 recently merged into openstack/requirements:

https://review.openstack.org/79817

This is a good example of how we should be linking gerrit and mailing list discussions together more. I don't think the gerrit review was linked in this thread nor was the mailing list discussion linked in the gerrit review.

Mark.

On Thu, 2014-03-13 at 22:45 -0700, Roman Podoliaka wrote:
Hi all, I think it's actually not that hard to fix the errors we have when using SQLAlchemy 0.9.x releases. I uploaded two changes to Nova to fix unit tests:
- https://review.openstack.org/#/c/80431/ (this one should also fix the Tempest test run error)
- https://review.openstack.org/#/c/80432/
Thanks, Roman

On Thu, Mar 13, 2014 at 7:41 PM, Thomas Goirand z...@debian.org wrote:
On 03/14/2014 02:06 AM, Sean Dague wrote:
On 03/13/2014 12:31 PM, Thomas Goirand wrote:
On 03/12/2014 07:07 PM, Sean Dague wrote:
Because of where we are in the freeze, I think this should wait until Juno opens to fix. Icehouse will only be compatible with SQLA 0.8, which I think is fine. I expect the rest of the issues can be addressed during Juno 1. -Sean

Sean, No, it's not fine for me. I'd like things to be fixed so we can move forward. Debian Sid has SQLA 0.9, and Jessie (the next Debian stable) will be released with SQLA 0.9 and with Icehouse, not Juno.

We're past freeze, and this requires deep changes in Nova DB to work. So it's not going to happen. Nova provably does not work with SQLA 0.9, as seen in Tempest tests. -Sean

It'd be nice if we considered more the fact that OpenStack, at some point, gets deployed on top of distributions... :/ Anyway, if we can't do it because of the freeze, then I will have to carry the patch in the Debian package. Nevertheless, someone will have to work and fix it. If you know how to help, it'd be very nice if you proposed a patch, even if we don't accept it before Juno opens.

Thomas Goirand (zigo)

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Marconi] Why is marconi a queue implementation vs a provisioning API?
On Thu, 2014-03-20 at 01:28 +0000, Joshua Harlow wrote:
> Proxying from yahoo's open source director (since he wasn't initially
> subscribed to this list, afaik he now is) on his behalf.
>
> From Gil Yehuda (Yahoo's Open Source director):
>
> I would urge you to avoid creating a dependency between Openstack code
> and any AGPL project, including MongoDB. MongoDB is licensed in a very
> strange manner that is prone to creating unintended licensing mistakes
> (a lawyer's dream). Indeed, MongoDB itself presents Apache licensed
> drivers - and thus technically, users of those drivers are not impacted
> by the AGPL terms. MongoDB Inc. is in the unique position to license
> their drivers this way (although they appear to violate the AGPL
> license) since MongoDB is not going to sue themselves for their own
> violation. However, others in the community who create MongoDB drivers
> are licensing those drivers under the Apache and MIT licenses - which
> does pose a problem. Why?
>
> The AGPL considers 'Corresponding Source' to be defined as "the source
> code for shared libraries and dynamically linked subprograms that the
> work is specifically designed to require, such as by intimate data
> communication or control flow between those subprograms and other parts
> of the work." Database drivers *are* work that is designed to require
> by intimate data communication or control flow between those
> subprograms and other parts of the work. So anyone using MongoDB with
> any other driver now invites an unknown -- that one court case, one
> judge, can read the license under its plain meaning and decide that
> AGPL terms apply as stated. We have no way to know how far they apply
> since this license has not been tested in court yet. Despite all the
> FAQs MongoDB puts on their site indicating they don't really mean to
> assert the license terms, normally when you provide a license, you mean
> those terms. If they did not mean those terms, they would not use this
> license. I hope they intended to do something good (to get
> contributions back without impacting applications using their database)
> but, even good intentions have unintended consequences.
>
> Companies with deep enough pockets to be lawsuit targets, and companies
> who want to be good open source citizens, face the problem that using
> MongoDB anywhere invites the future risk of legal catastrophe. A simple
> development change in an open source project can change the economics
> drastically. This is simply unsafe and unwise. OpenStack's ecosystem is
> fueled by the interests of many commercial ventures who wish to
> cooperate in the open source manner, but then leverage commercial
> opportunities they hope to create. I suggest that using MongoDB
> anywhere in this project will result in a loss of opportunity -- real
> or perceived, that would outweigh the benefits MongoDB itself provides.
>
> tl;dr version: If you want to use MongoDB in your company, that's your
> call. Please don't turn anyone who uses OpenStack components into
> unsuspecting MongoDB users. Instead, decouple the database from the
> project. It's not worth the legal risk, nor the impact on the
> Apache-ness of this project.

Thanks for that, Josh and Gil.

Rather than cross-posting, I think this MongoDB/AGPLv3 discussion should continue on the legal-discuss mailing list:

http://lists.openstack.org/pipermail/legal-discuss/2014-March/thread.html#174

Bear in mind that we (OpenStack, as a project and community) need to judge whether this is a credible concern or not.
If some users said they were only willing to deploy Apache licensed code in their organization, we would dismiss that notion pretty quickly. Is this AGPLv3 concern sufficiently credible that OpenStack needs to take it into account when making important decisions? That's what I'm hoping to get to in the legal-discuss thread. Mark. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev