Re: [openstack-dev] [Nova] [feature freeze exception] Move to oslo.db
I'm good with this one too, so that makes three if Joe is OK with this. @Josh -- can you please take a look at the TH failures? Thanks, Michael

On Wed, Sep 3, 2014 at 8:10 PM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: On 9/3/2014 5:08 PM, Andrey Kurilin wrote: Hi All! I'd like to ask for a feature freeze exception for porting nova to use oslo.db. This change not only removes 3k LOC, but also fixes 4 bugs (see the commit message for more details) and provides relevant, stable common db code. The main maintainers of oslo.db (Roman Podoliaka and Victor Sergeyev) are OK with this. Joe Gordon and Matt Riedemann have already signed up, so we need one more vote from a core developer. By the way, a lot of core projects have already been using oslo.db for a while: keystone, cinder, glance, ceilometer, ironic, heat, neutron and sahara. So migration to oslo.db won't produce any unexpected issues. Patch is here: https://review.openstack.org/#/c/101901/ -- Best regards, Andrey Kurilin.

Just re-iterating my agreement to sponsor this. I'm waiting for the latest patch set to pass Jenkins and for Roman to review after his comments from the previous patch set and -1. Otherwise I think this is nearly ready to go. The turbo-hipster failures on the change appear to be infra issues in t-h rather than problems with the code. -- Thanks, Matt Riedemann

-- Rackspace Australia
Re: [openstack-dev] Unexpected error in OpenStack Nova
Hi Hossein, openstack-dev is a development mailing list, focused on the future of OpenStack and the development thereof. I would recommend that you address your question (with appropriate debug log output) to the openstack-operators mailing list. Best regards, Jesse

On 3 September 2014 21:46, Hossein Zabolzadeh zabolza...@gmail.com wrote: Any idea?

On Wed, Sep 3, 2014 at 6:41 PM, Hossein Zabolzadeh zabolza...@gmail.com wrote: Hi, after the successful installation of both keystone and nova, I tried to execute the 'nova list' command with the following env variables (my deployment model is single-machine deployment): export OS_USERNAME=admin export OS_PASSWORD=... export OS_TENANT_NAME=service export OS_AUTH_URL=http://10.0.0.1:5000 But the following unknown error occurred: ERROR: attribute 'message' of 'exceptions.BaseException' objects (HTTP 300)
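For what it's worth, the HTTP 300 here is usually keystone's version-negotiation response: an unversioned OS_AUTH_URL makes the identity service answer 300 Multiple Choices with a list of its API versions, which the old client then fails to turn into a readable error. A quick way to confirm (a sketch, assuming the endpoint from the original post is reachable):

    import requests

    # The unversioned keystone root typically answers
    # "300 Multiple Choices" with a JSON list of API versions.
    resp = requests.get("http://10.0.0.1:5000")
    print(resp.status_code)   # 300
    print(resp.json())        # {"versions": {"values": [...]}}

    # Pointing clients at a versioned endpoint avoids the negotiation:
    #   export OS_AUTH_URL=http://10.0.0.1:5000/v2.0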
[openstack-dev] [Zaqar] Comments on the concerns that arose during the TC meeting
Greetings, Last Tuesday the TC held the first graduation review for Zaqar. During the meeting some concerns arose. I've listed those concerns below with some comments, hoping that it will help start a discussion before the next meeting. In addition, I've added some comments about the project stability at the bottom and an etherpad link pointing to a list of use cases for Zaqar.

# Concerns

- Concern on operational burden of requiring NoSQL deploy expertise to the mix of openstack operational skills

For those of you not familiar with Zaqar, it currently supports two NoSQL drivers - MongoDB and Redis - and those are the only two drivers it supports for now. This will require operators willing to use Zaqar to maintain a new (?) NoSQL technology in their system. Before expressing our thoughts on this matter, let me say that: 1. By removing the SQLAlchemy driver, we basically removed the chance for operators to use an already deployed OpenStack-technology 2. Zaqar won't be backed by any AMQP based messaging technology for now. Here's[0] a summary of the research the team (mostly done by Victoria) did during Juno 3. We (OpenStack) used to require Redis for the zmq matchmaker 4. We (OpenStack) also use memcached for caching and, as the oslo caching lib becomes available - or a wrapper on top of dogpile.cache - Redis may be used in place of memcached in more and more deployments. 5. Ceilometer's recommended storage driver is still MongoDB, although Ceilometer now has support for sqlalchemy. (Please correct me if I'm wrong).

That being said, it's obvious we already, to some extent, promote some NoSQL technologies. However, for the sake of the discussion, let's assume we don't. I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't keep avoiding these technologies. NoSQL technologies have been around for years and we should be prepared - including OpenStack operators - to support these technologies. Not every tool is good for all tasks - one of the reasons we removed the sqlalchemy driver in the first place - therefore it's impossible to keep a homogeneous environment for all services. With this, I'm not suggesting we ignore the risks and the extra burden this adds but, instead of attempting to avoid it completely by not evolving the stack of services we provide, we should probably work on defining a reasonable subset of NoSQL services we are OK with supporting. This will help make the burden smaller and it'll give operators the option to choose. [0] http://blog.flaper87.com/post/marconi-amqp-see-you-later/

- Concern on should we really reinvent a queue system rather than piggyback on one

As mentioned in the meeting on Tuesday, Zaqar is not reinventing message brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack flavor on top. [0] Some things that differentiate Zaqar from SQS are its capability for supporting different protocols without sacrificing multi-tenancy and the other intrinsic features it provides. Some protocols you may consider for Zaqar are: STOMP, MQTT. As far as the backend goes, Zaqar is not re-inventing it either. It sits on top of existing storage technologies that have proven to be fast and reliable for this task. The choice of using NoSQL technologies has a lot to do with this particular thing and the fact that Zaqar needs storage capable of scaling and replicating, with good support for failover. [0] https://wiki.openstack.org/wiki/Zaqar/Frequently_asked_questions#Is_Zaqar_a_provisioning_service_or_a_data_API.3F

- concern on dataplane vs.
controlplane, should we add more dataplane things in the integrated release

I'm really not sure I understand the arguments against dataplane services. What concerns do people have about these services? As far as I can tell, we already have several services - some in the lower layers - that provide a data plane API. For example: * keystone (service catalogs and tokens) * glance (image management) * swift (object storage) * ceilometer (metrics) * heat (provisioning) * barbican (key management) Are the concerns specific to Zaqar's dataplane API?

- concern on API v2 being already planned

At the meeting, we discussed Zaqar's API a bit and, more importantly, how stable it is. During that discussion I mentioned a hypothetical v2 of the API. I'd like to clarify that a v2 is not being planned for Kilo; what we would like to do is gather feedback from the community and from services consuming Zaqar about the existing API, and use that feedback to design a new version of the API if necessary. All this has yet to be discussed but, most importantly, we would first like to get more feedback from the community. We have already gotten some feedback, but it has been fairly limited because most people are waiting for us to graduate before kicking the tires. We do have some endpoints that will go away in the API v2 - getting messages by id,
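To ground the SQS comparison with something concrete, this is roughly what producing and consuming messages looks like against Zaqar's v1 HTTP API (a sketch based on the v1 docs of the era; the endpoint and token are placeholders):

    import json
    import uuid
    import requests

    BASE = "http://zaqar.example.com:8888"      # placeholder endpoint
    HDRS = {
        "Client-ID": str(uuid.uuid4()),         # v1 requires a client UUID
        "X-Auth-Token": "TOKEN",                # keystone token (placeholder)
        "Content-Type": "application/json",
    }

    # Post a message with a 5-minute TTL.
    requests.post(BASE + "/v1/queues/demo/messages", headers=HDRS,
                  data=json.dumps([{"ttl": 300, "body": {"event": "test"}}]))

    # Claim messages so other workers won't see them, then delete each
    # one after processing -- the claim/delete cycle is what gives
    # reliable, SQS-style delivery.
    claim = requests.post(BASE + "/v1/queues/demo/claims", headers=HDRS,
                          data=json.dumps({"ttl": 300, "grace": 60}))
    if claim.status_code == 201:                # 204 means queue was empty
        for msg in claim.json():
            requests.delete(BASE + msg["href"], headers=HDRS)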
Re: [openstack-dev] Unexpected error in OpenStack Nova
Hi Jesse, Thanks for your help. I'll continue my discussion on the other related mailing list.

On Thu, Sep 4, 2014 at 11:23 AM, Jesse Pretorius jesse.pretor...@gmail.com wrote: Hi Hossein, openstack-dev is a development mailing list, focused on the future of OpenStack and the development thereof. I would recommend that you address your question (with appropriate debug log output) to the openstack-operators mailing list. Best regards, Jesse

On 3 September 2014 21:46, Hossein Zabolzadeh zabolza...@gmail.com wrote: Any idea?

On Wed, Sep 3, 2014 at 6:41 PM, Hossein Zabolzadeh zabolza...@gmail.com wrote: Hi, after the successful installation of both keystone and nova, I tried to execute the 'nova list' command with the following env variables (my deployment model is single-machine deployment): export OS_USERNAME=admin export OS_PASSWORD=... export OS_TENANT_NAME=service export OS_AUTH_URL=http://10.0.0.1:5000 But the following unknown error occurred: ERROR: attribute 'message' of 'exceptions.BaseException' objects (HTTP 300)
Re: [openstack-dev] [Nova] [feature freeze exception] Move to oslo.db
Hey, Yep, I became aware of these this afternoon. The negative votes are due to a bad nodepool image. I've rebuilt them and am working on clearing the backlog. Sorry for the issues. Cheers, Josh Rackspace Australia

On 9/4/14 4:30 PM, Michael Still wrote: I'm good with this one too, so that makes three if Joe is OK with this. @Josh -- can you please take a look at the TH failures? Thanks, Michael

On Wed, Sep 3, 2014 at 8:10 PM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: On 9/3/2014 5:08 PM, Andrey Kurilin wrote: Hi All! I'd like to ask for a feature freeze exception for porting nova to use oslo.db. This change not only removes 3k LOC, but also fixes 4 bugs (see the commit message for more details) and provides relevant, stable common db code. The main maintainers of oslo.db (Roman Podoliaka and Victor Sergeyev) are OK with this. Joe Gordon and Matt Riedemann have already signed up, so we need one more vote from a core developer. By the way, a lot of core projects have already been using oslo.db for a while: keystone, cinder, glance, ceilometer, ironic, heat, neutron and sahara. So migration to oslo.db won't produce any unexpected issues. Patch is here: https://review.openstack.org/#/c/101901/ -- Best regards, Andrey Kurilin.

Just re-iterating my agreement to sponsor this. I'm waiting for the latest patch set to pass Jenkins and for Roman to review after his comments from the previous patch set and -1. Otherwise I think this is nearly ready to go. The turbo-hipster failures on the change appear to be infra issues in t-h rather than problems with the code. -- Thanks, Matt Riedemann
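For readers who haven't looked at the library: the port mostly replaces nova's private engine/session plumbing with oslo.db's EngineFacade. A minimal sketch of the consuming pattern (Juno-era oslo.db API; the connection string is a placeholder):

    from oslo.db.sqlalchemy import session as db_session

    # One facade per database; it owns the engine and the session maker.
    _FACADE = db_session.EngineFacade("sqlite:///:memory:")

    def get_session(**kwargs):
        return _FACADE.get_session(**kwargs)

    # Typical usage inside a DB API function:
    session = get_session()
    with session.begin():
        session.execute("SELECT 1")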
Re: [openstack-dev] [oslo] [infra] Alpha wheels for Python 3.x
On Wed, Sep 3, 2014 at 7:24 PM, Doug Hellmann d...@doughellmann.com wrote: On Sep 3, 2014, at 5:27 AM, Yuriy Taraday yorik@gmail.com wrote: On Tue, Sep 2, 2014 at 11:17 PM, Clark Boylan cboy...@sapwetik.org wrote: It has been pointed out to me that one case where it won't be so easy is oslo.messaging and its use of eventlet under python2. Messaging will almost certainly need python 2 and python 3 wheels to be separate. I think we should continue to use universal wheels where possible and only build python2 and python3 wheels in the special cases where necessary.

We can make eventlet an optional dependency of oslo.messaging (through setuptools' extras). In fact I don't quite understand the need for eventlet as a direct dependency there, since we can just write code that uses the threading library and it'll get monkey-patched if the consumer app wants to use eventlet.

There is code in the messaging library that makes calls directly into eventlet now, IIRC. It sounds like that could be changed, but that's something to consider for a future version.

Yes, I hope to see a unified threading/eventlet executor there (futures-based, I guess) some day.

The last time I looked at setuptools extras they were a documented but unimplemented specification. Has that changed?

According to the docs [1] it works in pip (and has been working in setuptools for ages), and according to bug [2], it has been working for a couple of years. [1] http://pip.readthedocs.org/en/latest/reference/pip_install.html#examples (#6) [2] https://github.com/pypa/pip/issues/7 -- Kind regards, Yuriy.
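For concreteness, declaring eventlet as an optional dependency through extras would look something like the following (a sketch; the extra's name, version pin, and package layout are made up):

    from setuptools import setup

    setup(
        name="oslo.messaging",
        version="0.0.0",
        packages=["oslo_messaging"],
        install_requires=["six"],          # unconditional dependencies
        extras_require={
            # pulled in only via: pip install oslo.messaging[eventlet]
            "eventlet": ["eventlet>=0.13"],
        },
    )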
Re: [openstack-dev] [oslo] [infra] Alpha wheels for Python 3.x
On Wed, Sep 3, 2014 at 8:21 PM, Doug Hellmann d...@doughellmann.com wrote: On Sep 3, 2014, at 11:57 AM, Clark Boylan cboy...@sapwetik.org wrote: On Wed, Sep 3, 2014, at 08:22 AM, Doug Hellmann wrote: On Sep 2, 2014, at 3:17 PM, Clark Boylan cboy...@sapwetik.org wrote: The setup.cfg classifiers should be able to do that for us, though PBR may need updating? We will also need to learn to upload potentially >1 wheel.

How do you see that working? We want all of the Oslo libraries to, eventually, support both python 2 and 3. How would we use the classifiers to tell when to build a universal wheel and when to build separate wheels?

The classifiers provide info on the versions of python we support. By default we can build a python2 wheel if only 2 is supported, build a python3 wheel if only 3 is supported, and build a universal wheel if both are supported. Then we can add a setup.cfg flag to override the universal wheel default and build both a python2 and a python3 wheel instead. Dstufft and mordred should probably comment on this idea before we implement anything.

OK. I'm not aware of any python-3-only projects, and the flag to override the universal wheel is the piece I was missing. I think there's already a setuptools flag related to whether or not we should build universal wheels, isn't there?

I think we should rely on the wheel.universal flag from setup.cfg if it's there. If it's set, we should always build universal wheels. If it's not set, we should look in the classifiers and build wheels for the Python versions that are mentioned there. -- Kind regards, Yuriy.
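The decision procedure sketched above is small enough to write down; an illustrative implementation (using Python 3's configparser, and not what pbr actually does) that honors an explicit [wheel] universal flag and otherwise falls back to the trove classifiers:

    import configparser

    def wheel_targets(path="setup.cfg"):
        cfg = configparser.ConfigParser()
        cfg.read(path)

        # Explicit override wins: [wheel] universal = 1
        if cfg.getboolean("wheel", "universal", fallback=False):
            return ["universal"]

        classifiers = cfg.get("metadata", "classifier", fallback="")
        py2 = "Programming Language :: Python :: 2" in classifiers
        py3 = "Programming Language :: Python :: 3" in classifiers
        if py2 and py3:
            return ["universal"]       # both supported -> one wheel
        return ["py2"] if py2 else ["py3"]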
Re: [openstack-dev] [oslo] [infra] Alpha wheels for Python 3.x
On Thu, Sep 4, 2014 at 4:47 AM, Jeremy Stanley fu...@yuggoth.org wrote: On 2014-09-03 13:27:55 +0400 (+0400), Yuriy Taraday wrote: [...] Maybe we should drop 3.3 already?

It's in progress. Search review.openstack.org for open changes in all projects with the topic py34. Shortly I'll also have some infra config changes up to switch python33 jobs out for python34, ready to drop once the j-3 milestone has been tagged and is finally behind us.

Great! Looking forward to purging python 3.3 from my system. -- Kind regards, Yuriy.
Re: [openstack-dev] [Ironic] (Non-)consistency of the Ironic hash ring implementation
On 09/04/2014 01:37 AM, Robert Collins wrote: On 4 September 2014 00:13, Eoghan Glynn egl...@redhat.com wrote: On 09/02/2014 11:33 PM, Robert Collins wrote: The implementation in ceilometer is very different to the Ironic one - are you saying the test you linked fails with Ironic, or that it fails with the ceilometer code today?

Disclaimer: in Ironic terms, node = conductor, key = host. The test I linked fails with the Ironic hash ring code (specifically the part that tests consistency). With 1000 keys being mapped to 10 nodes, when you add a node: - current ceilometer code remaps around 7% of the keys (~1/#nodes) - Ironic code remaps 90% of the keys

So just to underscore what Nejc is saying here ... The key point is the proportion of such baremetal-nodes that would end up being re-assigned when a new conductor is fired up.

That was 100% clear, but thanks for making sure. The question was getting a proper understanding of why it was happening in Ironic. The ceilometer hashring implementation is good, but it uses the same terms very differently (e.g. replicas for partitions) - I'm adapting the key fix back into Ironic - I'd like to see us converge on a single implementation, and making sure the Ironic one is suitable for ceilometer seems applicable here (since ceilometer seems to need less from the API),

I used the terms that are used in the original caching use-case, as described in [1], and which are used in the pypi lib as well [2]. With the correct approach, there aren't actually any partitions; 'replicas' actually denotes the number of times you hash a node onto the ring. As for nodes/keys, what's your suggestion? I've opened a bug[3], so you can add a Closes-Bug to your patch. [1] http://www.martinbroadhurst.com/Consistent-Hash-Ring.html [2] https://pypi.python.org/pypi/hash_ring [3] https://bugs.launchpad.net/ironic/+bug/1365334

If reassigning was cheap Ironic wouldn't have bothered having a hash ring :) -Rob
[openstack-dev] [nova][api] can not show the soft_deleted instance with nova list
Hi all: I found nova can not list the soft-deleted instances (neither v2 nor v3 shows them), but novaclient help suggests otherwise:

[tagett@stack-01 devstack]$ nova help restore usage: nova restore server Restore a soft-deleted server.

How can I restore a soft-deleted server? We can not list the soft-deleted instances:

[tagett@stack-01 devstack]$ nova list --all-tenants --status SOFT_DELETED ++--+++-+--+ | ID | Name | Status | Task State | Power State | Networks | ++--+++-+--+ ++--+++-+--+

But from the database we can see this status:

MariaDB [nova]> select hostname, vm_state from instances; | vm1 | soft-delete | | vm | soft-delete |

I checked the v3 code; it doesn't even support restore. How will this go? I found some discussion at https://review.openstack.org/#/c/35061/2 - any comments on that? Is this by design? Are we abandoning the restore operation? Besides, I have a WIP patch to have soft-deleted instances listed in the v3 API: https://review.openstack.org/#/c/118641/ -- Thanks, Eli (Li Yong) Qiao
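Until the API gap is settled, the same calls through python-novaclient look roughly like this (a sketch; the credentials are placeholders, and the empty SOFT_DELETED listing is exactly the bug being discussed):

    from novaclient.v1_1 import client

    nova = client.Client("admin", "password", "service",
                         "http://10.0.0.1:5000/v2.0")   # placeholder creds

    # Ask for soft-deleted instances across tenants; today this comes
    # back empty even though the DB shows vm_state = soft-delete.
    servers = nova.servers.list(search_opts={"all_tenants": 1,
                                             "status": "SOFT_DELETED"})

    # Restore would bring one back before the deferred-delete
    # reclaim interval expires.
    for server in servers:
        nova.servers.restore(server)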
[openstack-dev] [nova] [feature freeze exception] FFE for instance tags API extension
Hello. I'd like to ask for a feature freeze exception for the instance tags API extension: https://review.openstack.org/#/c/97168/ https://review.openstack.org/#/c/103553/ https://review.openstack.org/#/c/107712/ approved spec: https://review.openstack.org/#/c/91444/ The blueprint was approved, but its status was changed to Pending Approval because of FF: https://blueprints.launchpad.net/nova/+spec/tag-instances

The first of these patches has a +2 from Jay Pipes. The second was already approved by Jay Pipes and Matt Dietz. This set of patches was pretty close to merging when FF came. In most popular REST API interfaces, objects in the domain model can be tagged with zero or more simple strings. This feature will allow normal users to add, remove and list tags for an instance and to filter instances by tags. These changes will also make it possible to tag other nova objects in the future, because the Tag object introduced here is generic and any nova object with an id could be tagged by it. Please consider this feature to be a part of the Juno-3 release.
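For reviewers skimming the request, the user-facing surface is small; roughly the following calls, per the approved spec (shown with requests; the endpoint, token and UUID are placeholders and the paths are illustrative):

    import requests

    NOVA = "http://nova.example.com:8774/v2/TENANT_ID"   # placeholder
    HDRS = {"X-Auth-Token": "TOKEN"}                     # placeholder
    srv = "SERVER_UUID"

    # Add a tag, list tags, filter servers by tag, remove a tag.
    requests.put("%s/servers/%s/tags/production" % (NOVA, srv), headers=HDRS)
    tags = requests.get("%s/servers/%s/tags" % (NOVA, srv), headers=HDRS).json()
    hits = requests.get(NOVA + "/servers?tags=production", headers=HDRS).json()
    requests.delete("%s/servers/%s/tags/production" % (NOVA, srv), headers=HDRS)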
[openstack-dev] [OpenStack-dev][neutron] Some questions and opinions about the network model port.
Hello guys, port is a very important model concept for network programming, and I have some doubts about that concept now. Maybe it is still necessary to discuss the definition details of port and give some suggestions. For instance, when we connect a network to a router, neutron will create an interface for the router; actually it is implemented as a port. From the topology perspective, using the link concept is closer to a real description. There is a question here: does the network need a port to deploy control policy? I think so. We can treat a port as a property of a kind of network node. Without the node, a port will lose its significance. So, shall we emphasize the port as a property in the network model? Looking forward to everyone's opinions. Thanks, Shixing Liu
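As a concrete illustration of the router-interface-is-a-port point (python-neutronclient, with placeholder credentials and IDs):

    from neutronclient.v2_0 import client

    neutron = client.Client(username="admin", password="secret",
                            tenant_name="admin",
                            auth_url="http://10.0.0.1:5000/v2.0")

    # Attaching a subnet to a router implicitly creates a port ...
    neutron.add_interface_router("ROUTER_ID", {"subnet_id": "SUBNET_ID"})

    # ... which then shows up in the port list, owned by the router.
    for port in neutron.list_ports(device_id="ROUTER_ID")["ports"]:
        print(port["id"], port["device_owner"])  # network:router_interface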
Re: [openstack-dev] [Zaqar] Comments on the concerns that arose during the TC meeting
Excerpts from Flavio Percoco's message of 2014-09-04 00:08:47 -0700: Greetings, Last Tuesday the TC held the first graduation review for Zaqar. During the meeting some concerns arose. I've listed those concerns below with some comments, hoping that it will help start a discussion before the next meeting. In addition, I've added some comments about the project stability at the bottom and an etherpad link pointing to a list of use cases for Zaqar.

Hi Flavio. This was an interesting read. As somebody whose attention has recently been drawn to Zaqar, I am quite interested in seeing it graduate.

# Concerns - Concern on operational burden of requiring NoSQL deploy expertise to the mix of openstack operational skills

For those of you not familiar with Zaqar, it currently supports two NoSQL drivers - MongoDB and Redis - and those are the only two drivers it supports for now. This will require operators willing to use Zaqar to maintain a new (?) NoSQL technology in their system. Before expressing our thoughts on this matter, let me say that: 1. By removing the SQLAlchemy driver, we basically removed the chance for operators to use an already deployed OpenStack-technology 2. Zaqar won't be backed by any AMQP based messaging technology for now. Here's[0] a summary of the research the team (mostly done by Victoria) did during Juno 3. We (OpenStack) used to require Redis for the zmq matchmaker 4. We (OpenStack) also use memcached for caching and, as the oslo caching lib becomes available - or a wrapper on top of dogpile.cache - Redis may be used in place of memcached in more and more deployments. 5. Ceilometer's recommended storage driver is still MongoDB, although Ceilometer now has support for sqlalchemy. (Please correct me if I'm wrong).

That being said, it's obvious we already, to some extent, promote some NoSQL technologies. However, for the sake of the discussion, let's assume we don't. I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't keep avoiding these technologies. NoSQL technologies have been around for years and we should be prepared - including OpenStack operators - to support these technologies. Not every tool is good for all tasks - one of the reasons we removed the sqlalchemy driver in the first place - therefore it's impossible to keep a homogeneous environment for all services.

I wholeheartedly agree that non-traditional storage technologies that are becoming mainstream are good candidates for use cases where SQL-based storage gets in the way. I wish there wasn't so much FUD (warranted or not) about MongoDB, but that is the reality we live in.

With this, I'm not suggesting we ignore the risks and the extra burden this adds but, instead of attempting to avoid it completely by not evolving the stack of services we provide, we should probably work on defining a reasonable subset of NoSQL services we are OK with supporting. This will help make the burden smaller and it'll give operators the option to choose. [0] http://blog.flaper87.com/post/marconi-amqp-see-you-later/

- Concern on should we really reinvent a queue system rather than piggyback on one

As mentioned in the meeting on Tuesday, Zaqar is not reinventing message brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack flavor on top. [0]

I think Zaqar is more like SMTP and IMAP than AMQP. You're not really trying to connect two processes in real time. You're trying to do fully asynchronous messaging with fully randomized access to any message.
Perhaps somebody should explore whether the approaches taken by large scale IMAP providers could be applied to Zaqar. Anyway, I can't imagine writing a system to intentionally use the semantics of IMAP and SMTP. I'd be very interested in seeing actual use cases for it; apologies if those have been posted before.

Some things that differentiate Zaqar from SQS are its capability for supporting different protocols without sacrificing multi-tenancy and the other intrinsic features it provides. Some protocols you may consider for Zaqar are: STOMP, MQTT. As far as the backend goes, Zaqar is not re-inventing it either. It sits on top of existing storage technologies that have proven to be fast and reliable for this task. The choice of using NoSQL technologies has a lot to do with this particular thing and the fact that Zaqar needs storage capable of scaling and replicating, with good support for failover.

What's odd to me is that other systems like Cassandra and Riak are not being discussed. There are well-documented large-scale message storage systems on both, and neither is encumbered by the same licensing FUD as MongoDB. Anyway, again, if we look at this as a place to store and retrieve messages, and not as a queue, then talking about databases, instead of message brokers, makes a lot more sense.

- concern on the maturity of the NoSQL non-AGPL backend (Redis)
Re: [openstack-dev] [neutron] Status of Neutron at Juno-3
I didn't know that we could ask for FFE, so I'd like to ask (if still in time) for: https://blueprints.launchpad.net/neutron/+spec/agent-child-processes-status https://review.openstack.org/#/q/status:open+project:openstack/neutron+branch:master+topic:bp/agent-child-processes-status,n,z to get the ProcessMonitor implemented in the l3_agent and dhcp_agent at least. I believe the work is ready (I need to check the radvd respawn in the l3 agent). The ProcessMonitor class is already merged. Best regards, Miguel Ángel.

- Original Message - On Wed, Sep 3, 2014 at 10:19 AM, Mark McClain m...@mcclain.xyz wrote: On Sep 3, 2014, at 11:04 AM, Brian Haley brian.ha...@hp.com wrote: On 09/03/2014 08:17 AM, Kyle Mestery wrote: Given how deep the merge queue is (146 currently), we've effectively reached feature freeze in Neutron now (likely other projects as well). So this morning I'm going to go through and remove BPs from Juno which did not make the merge window. I'll also be putting temporary -2s in the patches to ensure they don't slip in as well. I'm looking at FFEs for the high priority items which are close but didn't quite make it: https://blueprints.launchpad.net/neutron/+spec/l3-high-availability https://blueprints.launchpad.net/neutron/+spec/add-ipset-to-security https://blueprints.launchpad.net/neutron/+spec/security-group-rules-for-devices-rpc-call-refactor

I guess I'll be the first to ask for an exception for a Medium since the code was originally completed in Icehouse: https://blueprints.launchpad.net/neutron/+spec/l3-metering-mgnt-ext The neutronclient-side code was committed in January, and the neutron side, https://review.openstack.org/#/c/70090, has had mostly positive reviews since then. I've really just spent the last week re-basing it as things moved along.

+1 for FFE. I think this is good community work that fell through the cracks.

I agree, and I've marked it as RC1 now. I'll sort through these with ttx on Friday and get more clarity on its official status. Thanks, Kyle

mark
Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno
On 2 September 2014 19:16, Michael Still mi...@stillhq.com wrote: We're soon to hit feature freeze, as discussed in Thierry's recent email. I'd like to outline the process for requesting a freeze exception: * your code must already be up for review * your blueprint must have an approved spec * you need three (3) sponsoring cores for an exception to be granted * exceptions must be granted before midnight, Friday this week (September 5) UTC * the exception is valid until midnight Friday next week (September 12) UTC when all exceptions expire

Sorry to top post on this, but I need to clarify this point: "your blueprint must have an approved spec". I have unapproved the *blueprints* as part of removing things from juno-3. The reason for this is that drivers control the approved status, but not the milestone. I have added a dated note at the base of each whiteboard, explaining what was happening to the blueprint. Yes, that all kinda sucks, but it's what we have right now. This is independent of the spec having been approved and merged into the specs repo. So we can still tell whether a spec got approved for juno by looking at the juno specs here: http://specs.openstack.org/openstack/nova-specs/ Thanks, John
Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno
On 2 September 2014 21:36, Dan Genin daniel.ge...@jhuapl.edu wrote: Just out of curiosity, what is the rationale behind upping the number of core sponsors for a feature freeze exception to 3 if only two +2s are required to merge? In Icehouse, IIRC, two core sponsors were deemed sufficient.

We tried having 2 cores in the past, and stuff still didn't get reviewed. Usually one of the sponsors had other things crop up that took priority, or just didn't get a chance to review it. The idea of 3 is that we can lose one person and still have enough people to merge the code. If that doesn't work out, then we will try something different next time. It was discussed in the nova-meeting around spec freeze time, and at the mid-cycle a tiny bit, unless I totally misremember that. Why do this at all? Well, we want cores to focus on bug reviews post-FF. So they need to find extra time to review any of the exceptions, hence the opt-in process. Thanks, John
Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno
-----Original Message----- From: Nikola Đipanov [mailto:ndipa...@redhat.com] Sent: 03 September 2014 10:50 To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno

<snip> I will follow up with a more detailed email about what I believe we are missing, once the FF settles and I have applied some soothing creme to my burnout wounds, but currently my sentiment is: Contributing features to Nova nowadays SUCKS!!1 (even as a core reviewer) We _have_ to change that! N.

While agreeing with your overall sentiment, what worries me a tad is the implied perception that contributing as a core should somehow be easier than as a mortal. While I might expect cores to produce better initial code, I thought the process and standards were intended to be a level playing field. Has anyone looked at the review bandwidth issue from the perspective of whether there has been a change in the amount of time cores now spend contributing vs. reviewing? Maybe there's an opportunity to get cores to mentor non-cores to do the code production, freeing up review cycles? Phil
Re: [openstack-dev] [Zaqar] Comments on the concerns that arose during the TC meeting
Hey Clint, Thanks for reading; some comments in-line:

On 09/04/2014 10:30 AM, Clint Byrum wrote: Excerpts from Flavio Percoco's message of 2014-09-04 00:08:47 -0700: [snip]

- Concern on should we really reinvent a queue system rather than piggyback on one

As mentioned in the meeting on Tuesday, Zaqar is not reinventing message brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack flavor on top. [0]

I think Zaqar is more like SMTP and IMAP than AMQP. You're not really trying to connect two processes in real time. You're trying to do fully asynchronous messaging with fully randomized access to any message. Perhaps somebody should explore whether the approaches taken by large scale IMAP providers could be applied to Zaqar. Anyway, I can't imagine writing a system to intentionally use the semantics of IMAP and SMTP. I'd be very interested in seeing actual use cases for it; apologies if those have been posted before.

Some things that differentiate Zaqar from SQS are its capability for supporting different protocols without sacrificing multi-tenancy and the other intrinsic features it provides. Some protocols you may consider for Zaqar are: STOMP, MQTT. As far as the backend goes, Zaqar is not re-inventing it either. It sits on top of existing storage technologies that have proven to be fast and reliable for this task. The choice of using NoSQL technologies has a lot to do with this particular thing and the fact that Zaqar needs storage capable of scaling and replicating, with good support for failover.

What's odd to me is that other systems like Cassandra and Riak are not being discussed. There are well-documented large-scale message storage systems on both, and neither is encumbered by the same licensing FUD as MongoDB.

FWIW, they both have been discussed. As far as Cassandra goes, we raised the red flag after reading this post[0]. The post itself may be obsolete already, but I don't think I have enough knowledge about Cassandra to actually figure this out. Some folks have come to us asking for a Cassandra driver and they were interested in contributing/working on one. I really hope that will happen someday, although it'll certainly happen as an external driver. Riak, on the other hand, was certainly a good candidate. What made us go with MongoDB and Redis is that they're both good for the job, they are both likely already deployed in OpenStack clouds, and we have enough knowledge to provide support and maintenance for both drivers. As a curious note, ElasticSearch and Swift have also been brought up several times as valid stores for Zaqar. I haven't thought about this thoroughly, but I think they'd do a good job. [0] http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets

Anyway, again, if we look at this as a place to store and retrieve messages, and not as a queue, then talking about databases, instead of message brokers, makes a lot more sense.

- concern on the maturity of the NoSQL non-AGPL backend (Redis)

The Redis backend just landed and I've been working on a gate job for it today. Although it hasn't been tested in production, if Zaqar graduates, it still has a full development cycle to be tested and improved before the first integrated release happens.

I'd be quite interested to see how it is expected to scale. From my very quick reading of the driver, it only supports a single redis server. No consistent hash ring or anything like that.

Indeed, support for redis-cluster is on our roadmap[0]. As of now, it can be scaled by using Zaqar pools.
You can create several pools of redis nodes and balance queues across them. The next series of benchmarks will be done on this new Redis driver. I hope those will be ready soon. [0] https://blueprints.launchpad.net/zaqar/+spec/redis-pool

# Use Cases

In addition to the aforementioned concerns and comments, I also would like to share an etherpad that contains some use cases that other integrated projects have for Zaqar[0]. The list is not exhaustive and it'll contain more information before the next meeting. [0] https://etherpad.openstack.org/p/zaqar-integrated-projects-use-cases

Just taking a look, there are two basic applications needed: 1) An inbox. Horizon wants to know when snapshots are done. Heat wants to know what happened during a stack action. Etc. 2) A user-focused message queue. Heat wants to push data to agents. Swift wants to synchronize processes when things happen. To me, #1 is Zaqar as it is today. #2 is the one that I worry may not be served best by bending #1 onto it.

Push semantics are being developed. We've had enough discussions that have helped prepare the ground for it. However, I believe both use cases could be covered by Zaqar as-is. Could you elaborate a bit more on #2? Especially on why you think Zaqar as-is can't serve this specific case? Also, feel free to add use-cases to that etherpad if
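To make the "Redis is a good fit for this workload" point concrete: even before pooling, Redis list operations give you an efficient FIFO with blocking consumers, which is roughly the primitive a driver builds on (a standalone redis-py sketch, not Zaqar's actual driver code):

    import json
    import redis

    r = redis.StrictRedis(host="localhost", port=6379)

    # Producer: LPUSH is O(1); the list acts as the queue.
    r.lpush("zaqar:demo", json.dumps({"event": "image.uploaded"}))

    # Consumer: BRPOP blocks until a message arrives (5 s timeout),
    # giving pop-with-wait semantics without polling.
    item = r.brpop("zaqar:demo", timeout=5)
    if item is not None:
        _queue, payload = item
        print(json.loads(payload))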
Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno
One final note: the specs referenced above didn't get approved until Spec Freeze, which seemed to leave me with less time to implement things. In fact, it seemed that a lot of specs didn't get approved until spec freeze. Perhaps if we had more staggered approval of specs, we'd have more staggered submission of patches, and thus less of a sudden influx of patches in the couple of weeks before feature proposal freeze.

Yeah, I think the specs were getting approved too late in the cycle; I was actually surprised at how far out the schedules were going in allowing things in and then allowing exceptions after that. Hopefully the ideas around priorities/slots/runways will help stagger some of this also. I think there is a problem with the pattern that seemed to emerge in June, where the J.1 period was taken up with spec review (a lot of good reviews happened early in that period, but the approvals kind of came in a lump at the end), meaning that the implementation work itself only seemed to really kick in during J.2 - and, not surprisingly given the complexity of some of the changes, ran late into J.3. We also, as previously noted, didn't do any prioritization between those specs that were approved - so it was always going to be a race to see who managed to get code up for review first.

It kind of feels to me as if the ideal model would be if we were doing spec review for K now (i.e. during the FF / stabilization period) so that we hit Paris with a lot of the input already registered and a clear idea of the range of things folks want to do. We shouldn't really have to ask for session suggestions for the summit - they should be something that can be extracted from the proposed specs (maybe we do voting across the specs or something like that). In that way the summit would be able to confirm the list of specs for K and the priority order. With the current state of the review queue maybe we can't quite hit this pattern for K, but it would be worth aspiring to for L? Phil
Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno
Sorry for another top post, but I like how Nikola has pulled this problem apart, and wanted to respond directly to his response.

On 3 September 2014 10:50, Nikola Đipanov ndipa...@redhat.com wrote: The reason many features including my own may not make the FF is not because there was not enough buy-in from the core team (let's be completely honest - I have 3+ other core members working for the same company that are by nature of things easier to convince), but because of any of the following:

* Crippling technical debt in some of the key parts of the code

+1 We have problems that need solving. One of the ideas behind the slots proposal is to encourage work on the urgent technical debt, before related features are even approved.

* that we have not been acknowledging as such for a long time

-1 We keep saying that's cool, but we have to fix/finish XXX first. But... we have been very bad at: * remembering that, and recording that * actually fixing those problems

* which leads to proposed code being arbitrarily delayed once it makes the glaring flaws in the underlying infra apparent

Sometimes we only spot this stuff in code reviews, where you throw up reading all the code around the change, and see all the extra complexity being added to a fragile bit of the code, and well, then you really don't want to be the person who clicks approve on that. We need to track this stuff better. Every time it happens, we should try to make a note to go back there and do more tidy-ups.

* and that the specs process has been completely and utterly useless in helping uncover (not that the process itself is useless, it is very useful for other things)

Yeah, it hasn't helped for this. I don't think we should do this, but I keep thinking about making specs two-step: * write a general direction doc * go write the code, maybe upload as WIP * write the documentation part of the spec * get docs merged before any code

I am almost positive we can turn this rather dire situation around easily in a matter of months, but we need to start doing it! It will not happen through pinning arbitrary numbers to arbitrary processes.

+1 This is ongoing, but there are some major things I feel we should stop and fix in kilo. ...and that will make getting features in much worse for a little while, but it will be much better on the other side.

I will follow up with a more detailed email about what I believe we are missing, once the FF settles and I have applied some soothing creme to my burnout wounds

Awesome, please catch up with jogo who was also trying to build this list. I would love to continue to contribute to that too. Might be worth moving into here: https://etherpad.openstack.org/p/kilo-nova-summit-topics The idea was/is to use that list to decide what fills up the majority of code slots in Juno.

but currently my sentiment is: Contributing features to Nova nowadays SUCKS!!1 (even as a core reviewer) We _have_ to change that!

Agreed. In addition, our bug list would suggest our users are seeing the impact of this technical debt.
My personal feeling is we also need to tidy up our testing debt too: * document major bits that are NOT tested, so users are clear * document what combinations and features we actually see tested upstream * support different levels of testing: on-demand+daily vs. every commit * make it easier for interested parties to own and maintain some testing * plan for removing the untested code paths in L * allow for untested code to enter the tree, as experimental, with the expectation it gets removed in the following release if not tested, and architected so that is possible (note this means supporting experimental APIs that can be ripped out at a later date).

We have started doing some of the above work. But I think we need to hold ALL code to the same standard. It seems it will take time to agree on that standard, but the above is an attempt to compromise between speed of innovation and stability. Thanks, John
Re: [openstack-dev] [neutron][IPv6] Neighbor Discovery for HA
Carl, Thanks a lot for your reply! If I understand correctly, in the VRRP case, keepalived will be responsible for sending out GARPs? By checking the code you provided, I can see all the _send_gratuitous_arp_packet calls are wrapped by an "if not is_ha" condition. Xu Han

On 09/04/2014 06:06 AM, Carl Baldwin wrote: It should be noted that send_arp_for_ha is a configuration option that preceded the more recent in-progress work to add VRRP-controlled HA to Neutron's router. The option was added, I believe, to cause the router to send (by default) 3 GARPs to the external gateway if the router was removed from one network node and added to another by some external script or manual intervention. It did not send anything on the internal network ports. VRRP is a different story and the code in review [1] sends GARPs on internal and external ports. Hope this helps avoid confusion in this discussion. Carl [1] https://review.openstack.org/#/c/70700/37/neutron/agent/l3_ha_agent.py

On Mon, Sep 1, 2014 at 8:52 PM, Xu Han Peng pengxu...@gmail.com wrote: Anthony, Thanks for your reply. If an HA method like VRRP is used for an IPv6 router, according to the VRRP RFC with IPv6 included, the servers should be auto-configured with the active router's LLA as the default route before the failover happens and should still retain that route after the failover. In other words, there should be no need to use two LLAs for the default route of a subnet unless load balancing is required. When the backup router becomes the master router, the backup router should be responsible for immediately sending out an unsolicited ND neighbor advertisement with the associated LLA (the previous master's LLA) to update the bridge learning state, and for sending out router advertisements with the same options as the previous master to maintain the route and bridge learning. This is shown in http://tools.ietf.org/html/rfc5798#section-4.1 and the actions the backup router should take after failover are documented here: http://tools.ietf.org/html/rfc5798#section-6.4.2. The need for immediate message sending and periodic message sending is documented here: http://tools.ietf.org/html/rfc5798#section-2.4 Since the keepalived manager support for L3 HA is merged: https://review.openstack.org/#/c/68142/43. And keepalived release 1.2.0 supports VRRP IPv6 features (http://www.keepalived.org/changelog.html, see Release 1.2.0 | VRRP IPv6 Release). I think we can check if keepalived can satisfy our requirement here and if that will cause any conflicts with RADVD. Thoughts? Xu Han

On 08/28/2014 10:11 PM, Veiga, Anthony wrote: Anthony and Robert, Thanks for your reply. I don't know if the arping is there for NAT, but I am pretty sure it's for the HA setup to broadcast the router's own change, since the arping is controlled by the send_arp_for_ha config. By checking the man page of arping, you can find that the arping -A we use in the code sends out an ARP REPLY instead of an ARP REQUEST. This is like saying "I am here" instead of "where are you". I didn't realize this either until Brian pointed this out in my code review below.

That's what I was trying to say earlier. Sending out the RA has the same effect. The RA says "I'm here, oh and I'm also a router" and should supersede the need for an unsolicited NA. The only thing to consider here is that RAs are sent from LLAs. If you're doing IPv6 HA, you'll need to have two gateway IPs for the RA of the standby to work. So far as I know, I think there's still a bug out on this since you can only have one gateway per subnet.
http://linux.die.net/man/8/arping https://review.openstack.org/#/c/114437/2/neutron/agent/l3_agent.py Thoughts? Xu Han

On 08/27/2014 10:01 PM, Veiga, Anthony wrote: Hi Xuhan, What I saw is that GARP is sent to the gateway port and also to the router ports, from a neutron router. I'm not sure why it's sent to the router ports (internal network). My understanding of the arping to the gateway port is that it is needed for proper NAT operation. Since we are not planning to support IPv6 NAT, this is not required/needed for IPv6 any more?

I agree that this is no longer necessary. There is an abandoned patch that disabled the arping for the IPv6 gateway port: https://review.openstack.org/#/c/77471/3/neutron/agent/l3_agent.py thanks, Robert

On 8/27/14, 1:03 AM, Xuhan Peng pengxu...@gmail.com wrote: As a follow-up action of yesterday's IPv6 sub-team meeting, I would like to start a discussion about how to support l3 agent HA when the IP version is IPv6. This problem is triggered by bug [1], where sending a gratuitous ARP packet for HA doesn't work for IPv6 subnet gateways. This is because neighbor discovery, instead of ARP, should be used for IPv6. After reading the comments on code review [2], my thinking on how to solve this problem comes down to how to send out a neighbor advertisement for IPv6 routers, just like sending an ARP reply for IPv4 routers. I searched for utilities which can do this and only found a utility called ndsend [3] as part of
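For reference, the unsolicited neighbor advertisement being discussed is also easy to emit from Python; a sketch with scapy (the address and MAC are placeholders, and this needs root):

    from scapy.all import IPv6, ICMPv6ND_NA, ICMPv6NDOptDstLLAddr, send

    lla = "fe80::f816:3eff:fe12:3456"   # router's link-local address (placeholder)
    mac = "fa:16:3e:12:34:56"           # its MAC address (placeholder)

    # Unsolicited NA to all-nodes: Router flag set, Solicited clear,
    # Override set, plus the target link-layer address option -- the
    # IPv6 analogue of the gratuitous ARP reply that arping -A sends.
    na = (IPv6(src=lla, dst="ff02::1")
          / ICMPv6ND_NA(tgt=lla, R=1, S=0, O=1)
          / ICMPv6NDOptDstLLAddr(lladdr=mac))
    send(na)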
[openstack-dev] [Glance][FFE] glance_store switch-over and random access to image data
Greetings, I'd like to request an FFE for 2 features I've been working on during Juno which, unfortunately, have been delayed for different reasons during this time.

The first feature is the switch-over to glance_store. Glance store, for those not familiar with it, is a library containing the code that used to live under `glance/store`. During Icehouse, this idea came up and I started working on it right away. By the time Icehouse was released, the library was not mature enough and we (glance-core) were a bit concerned about the risks of rushing this work. At the Juno summit, I led a session on this library where we discussed the current status and agreed on a path forward for this library and the other feature below. The library contains glance's old store code with very few changes to the API, in order to make it decent enough for a library. As you can see in the review[0], which has been around since June 17th, the amount of code changed is small. Once the rename of the library[1] happened, the gate tests started passing. This is to say that the risks related to the library itself are low. One bit that worries me is the alignment between the current glance_store library and the stores that still exist in glance. I believe some patches have landed that we need to port to this new library. However, I'm less worried about that because we can still backport them and release a new version of the library and it'll still be consumed by Glance - sorry if it seems I'm oversimplifying the issue.

The second feature that I'd like to get an FFE for is the random access to image data. This feature was approved and agreed on for Juno. Instead of implementing it in Glance directly, I decided - genuinely - to implement it on top of glance_store to avoid the re-implementation once the library was done. Unfortunately, due to the delays the library had, this patch got stuck in the review queue. The feature has been in the review[2] queue since Jun 25. The feature was implemented on top of the API v2, and users have to opt in to access random parts of the image data. This feature has to be backed by glance_store and it depends on the store's support for random access. I believe the risk related to this feature is low. Both blueprints[3][4] were discussed and agreed on, although the latter doesn't reflect it. Cheers, Flavio [0] https://review.openstack.org/#/c/100636/ [1] http://lists.openstack.org/pipermail/openstack-dev/2014-August/044203.html [2] https://review.openstack.org/#/c/103068/ [3] https://blueprints.launchpad.net/glance/+spec/create-store-package [4] https://blueprints.launchpad.net/glance/+spec/restartable-image-download -- @flaper87 Flavio Percoco
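On the second feature: random access to image data boils down to ranged reads of the v2 data endpoint. Assuming the opt-in surface is a standard HTTP Range header - an assumption on my part, the patch defines the actual mechanism - usage would look like:

    import requests

    GLANCE = "http://glance.example.com:9292"   # placeholder endpoint
    HDRS = {"X-Auth-Token": "TOKEN"}            # placeholder token
    image = "IMAGE_UUID"

    # Fetch only the first 4 KiB of the image data; a "206 Partial
    # Content" status means the server honored the ranged read.
    resp = requests.get("%s/v2/images/%s/file" % (GLANCE, image),
                        headers=dict(HDRS, Range="bytes=0-4095"))
    print(resp.status_code)   # 206 if ranged reads are supported
    chunk = resp.content      # a restartable download resumes this way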
Re: [openstack-dev] [rally][iperf] Benchmarking network performance
Hi Ajay, Thank you for your work on this. Could you please send your code for review? Here is the instruction: https://wiki.openstack.org/wiki/Rally/Develop#How_to_contribute Thanks. Best regards, Boris Pavlovic

On Wed, Sep 3, 2014 at 10:47 PM, Ajay Kalambur (akalambu) akala...@cisco.com wrote: Hi, I am looking into the following blueprint, which requires that network performance tests be done as part of a scenario. I plan to implement this using iperf, basically as a scenario which includes a client/server VM pair. The client then sends out TCP traffic using iperf to the server and the VM throughput is recorded. I have a WIP patch attached to this email. The patch has a dependency on the following 2 reviews: https://review.openstack.org/#/c/103306/ https://review.openstack.org/#/c/96300 On top of this it creates a new VM performance scenario and uses floating ips to access the VM, download iperf to the VM and then run throughput tests. The code will be made more modular, but this patch will give you a good idea of what's in store. We also need to handle the case next where no floating ip is available and we assume direct access; we need to have ssh to install the tool and drive the tests. Please look at the attached diff and let me know if overall the flow looks fine. If it does, I can make the code more modular and proceed. Note this is still work in progress. Ajay
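Wherever this ends up in Rally's plugin tree, the core of the measurement is just "ssh to the client VM over its floating ip and run iperf against the server VM"; a minimal sketch with paramiko (the IPs, user and key are placeholders):

    import re
    import paramiko

    def measure_throughput(client_fip, server_ip, user="cirros",
                           key_file="/home/stack/.ssh/id_rsa"):
        """Run iperf from the client VM against the server; return Mbits/sec."""
        ssh = paramiko.SSHClient()
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        ssh.connect(client_fip, username=user, key_filename=key_file)
        try:
            # 10-second TCP test; iperf prints e.g. "... 940 Mbits/sec"
            _, stdout, _ = ssh.exec_command("iperf -c %s -t 10" % server_ip)
            out = stdout.read().decode()
        finally:
            ssh.close()
        match = re.search(r"([\d.]+)\s*Mbits/sec", out)
        return float(match.group(1)) if match else None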
Re: [openstack-dev] [neutron] New meeting rotation starting next week
Kevin Benton wrote: How is the master list compiled into a calendar? Is it possible to use that same system to filter by project?

It's manual. I subscribe to the wiki page and reflect the changes in the Google Cal. It's painful and error-prone. If anyone wants to do it, I'm happy to give the keys and delegate the responsibility. But frankly, what we really need is this: http://lists.openstack.org/pipermail/openstack-infra/2013-December/000517.html There was a group of students working on it: http://git.openstack.org/cgit/openstack-infra/gerrit-powered-agenda/ Lance, any news on that? Should we reboot the project? -- Thierry Carrez (ttx)
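The linked proposal is essentially "keep the schedule as reviewable YAML and generate the calendar mechanically"; a toy version of that pipeline with PyYAML and the icalendar package (the YAML schema here is made up):

    from datetime import datetime

    import yaml
    from icalendar import Calendar, Event

    MEETINGS_YAML = """
    - name: Neutron team meeting
      day: MO
      utc_start: "2014-09-08 21:00"
    """

    cal = Calendar()
    for m in yaml.safe_load(MEETINGS_YAML):
        ev = Event()
        ev.add("summary", m["name"])
        ev.add("dtstart", datetime.strptime(m["utc_start"], "%Y-%m-%d %H:%M"))
        ev.add("rrule", {"freq": "weekly", "byday": m["day"]})  # repeats weekly
        cal.add_component(ev)

    with open("meetings.ics", "wb") as f:
        f.write(cal.to_ical())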
Re: [openstack-dev] [Ironic] (Non-)consistency of the Ironic hash ring implementation
On 4 September 2014 19:53, Nejc Saje ns...@redhat.com wrote: I used the terms that are used in the original caching use-case, as described in [1] and are used in the pypi lib as well[2]. With the correct approach, there aren't actually any partitions; 'replicas' actually denotes the number of times you hash a node onto the ring. As for nodes/keys, what's your suggestion?

So - we should change the Ironic terms then, I suspect (but let's check with Deva, who wrote the original code, where he got them from). The parameters we need to create a ring are: - how many fallback positions we use for data (currently referred to as replicas) - how many times we hash the servers hosting data into the ring (currently inferred via the hash_partition_exponent / server count) - the servers. And then we probe data items as we go.

The original paper isn't http://www.martinbroadhurst.com/Consistent-Hash-Ring.html - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.147.1879 is referenced by it, and that paper doesn't include the term replica count at all. In other systems like cassandra, 'replicas' generally refers to how many servers end up holding a copy of the data: Martin Broadhurst's paper uses 'replica' there in quite a different sense - I much prefer the Ironic use, which says how many servers will be operating on the data: it's externally relevant. I've no objection talking about keys, but 'node' is an API object in Ironic, so I'd rather we talk about hosts - or make it something clearly not node-like, such as 'bucket' (which the 1997 paper talks about in describing consistent hash functions).

So, proposal: - key - a stringifyable thing to be mapped to buckets - bucket - a worker/store that wants keys mapped to it - replicas - number of buckets a single key wants to be mapped to - partitions - number of total divisions of the hash space (power of 2 required)

I've opened a bug[3], so you can add a Closes-Bug to your patch.

Thanks! -Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud
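Using roughly those proposed terms, the consistency property under discussion fits in a few lines: hash each bucket onto the ring many times, map a key to the next bucket(s) clockwise, and adding a bucket should then remap only about 1/#buckets of the keys (an illustrative sketch, neither the Ironic nor the ceilometer code):

    import bisect
    import hashlib

    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    class HashRing(object):
        def __init__(self, buckets, replicas=1, hashes_per_bucket=100):
            self.replicas = replicas
            self.ring = sorted((_hash("%s-%d" % (b, i)), b)
                               for b in buckets
                               for i in range(hashes_per_bucket))
            self.positions = [pos for pos, _ in self.ring]

        def get_buckets(self, key):
            # Walk clockwise from the key's position until `replicas`
            # distinct buckets have been collected.
            i = bisect.bisect(self.positions, _hash(key))
            found = []
            while len(found) < self.replicas:
                bucket = self.ring[i % len(self.ring)][1]
                if bucket not in found:
                    found.append(bucket)
                i += 1
            return found

    keys = ["node-%d" % i for i in range(1000)]
    old = HashRing(["cond-%d" % i for i in range(10)])
    new = HashRing(["cond-%d" % i for i in range(11)])
    moved = sum(old.get_buckets(k) != new.get_buckets(k) for k in keys)
    print("remapped: %.1f%%" % (100.0 * moved / len(keys)))  # ~9%, i.e. ~1/11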
Re: [openstack-dev] [TripleO] Review metrics - what do we want to measure?
On 14/08/14 00:03, James Polley wrote: In recent history, we've been looking each week at stats from http://russellbryant.net/openstack-stats/tripleo-openreviews.html to get a gauge on how our review pipeline is tracking. The main stats we've been tracking have been the 'since the last revision without -1 or -2' figures. I've included some history at [1], but the summary is that our 3rd quartile has slipped from 13 days to 16 days over the last 4 weeks or so. Our 1st quartile is fairly steady lately, around 1 day (down from 4 a month ago) and the median is unchanged around 7 days. There was lots of discussion in our last meeting about what could be causing this[2]. However, the thing we wanted to bring to the list for discussion is: are we tracking the right metric? Should we be looking at something else to tell us how well our pipeline is performing? The meeting logs have quite a few suggestions about ways we could tweak the existing metrics, but if we're measuring the wrong thing that's not going to help. I think that what we are looking for is a metric that lets us know whether the majority of patches are getting feedback quickly. Maybe there's some other metric that would give us a good indication? Bring back auto abandon... Gerrit at one stage not so long ago used to auto abandon patches that had negative feedback and were over a week without activity; I believe this was removed when a gerrit upgrade gave core reviewers the ability to abandon other people's patches. This was the single best part of the entire process to keep things moving: o submitters were forced to keep patches current o reviewers were not looking at stale or already known broken patches o if something wasn't important it got abandoned and was never heard of again; if it was important it would be reopened o patch submitters were forced to engage with the reviewers quickly on negative feedback instead of leaving a patch sitting there indefinitely Here are the numbers I think we should be looking at: http://russellbryant.net/openstack-stats/tripleo-reviewers-30.txt Queue growth in the last 30 days: 72 (2.4/day) http://russellbryant.net/openstack-stats/tripleo-reviewers-90.txt Queue growth in the last 90 days: 132 (1.5/day) Obviously this isn't sustainable. Re-enabling auto abandon would ensure the majority of the patches we are looking at are current, of good quality, and not lost in a sea of -1's. How would people feel about turning it back on? Can it be done on a per-project basis? To make the whole process a little friendlier we could increase the time frame from 1 week to 2. Derek.
[1] Current stats, since the last revision without -1 or -2:
Average wait time: 10 days, 17 hours, 6 minutes
1st quartile wait time: 1 days, 1 hours, 36 minutes
Median wait time: 7 days, 5 hours, 33 minutes
3rd quartile wait time: 16 days, 8 hours, 16 minutes
At last week's meeting we had: 3rd quartile wait time: 15 days, 13 hours, 47 minutes
A week before that: 3rd quartile wait time: 13 days, 9 hours, 11 minutes
The week before that was the mid-cycle, but the week before that:
19:53:38 lifeless Stats since the last revision without -1 or -2 :
19:53:38 lifeless Average wait time: 10 days, 17 hours, 49 minutes
19:53:38 lifeless 1st quartile wait time: 4 days, 7 hours, 57 minutes
19:53:38 lifeless Median wait time: 7 days, 10 hours, 52 minutes
19:53:40 lifeless 3rd quartile wait time: 13 days, 13 hours, 25 minutes
[2] Some of the things suggested as potential causes of the long 3rd quartile times:
* We have a small number of really old reviews that have only positive scores but aren't being landed
* Some reviews get a -1 but then sit for a long time waiting for the author to reply
* We have some really old reviews that suddenly get revived after a long period being in WIP or abandoned, which reviewstats seems to miscount
* Reviewstats counts weekends, we don't (so a change that gets pushed at 5pm US Friday and gets reviewed at 9am Aus Monday would be seen by us as having no wait time, but by reviewstats as ~36 hours)
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
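The weekend effect called out in the last bullet of [2] is easy to quantify, and would be cheap to fold into whatever metric the team settles on. A small sketch computing wait-time quartiles that skip Saturday and Sunday hours (the input samples and the simple lower-interpolation percentile are illustrative):

from datetime import datetime, timedelta


def business_hours(start, end):
    # Hours between two datetimes, skipping Saturdays and Sundays.
    total = timedelta()
    cur = start
    while cur < end:
        midnight = (cur + timedelta(days=1)).replace(
            hour=0, minute=0, second=0, microsecond=0)
        nxt = min(end, midnight)
        if cur.weekday() < 5:  # Mon=0 .. Fri=4
            total += nxt - cur
        cur = nxt
    return total.total_seconds() / 3600.0


def quartiles(values):
    values = sorted(values)
    pick = lambda q: values[int(q * (len(values) - 1))]
    return pick(0.25), pick(0.5), pick(0.75)


waits = [business_hours(pushed, reviewed) for pushed, reviewed in [
    (datetime(2014, 8, 29, 17, 0), datetime(2014, 9, 1, 9, 0)),
    (datetime(2014, 8, 26, 10, 0), datetime(2014, 8, 28, 10, 0)),
    (datetime(2014, 8, 20, 8, 0), datetime(2014, 9, 1, 12, 0)),
]]
# The Friday-5pm-to-Monday-9am sample counts as 16 business hours
# rather than 64 wall-clock hours.
print(quartiles(waits))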
Re: [openstack-dev] [Ironic] (Non-)consistency of the Ironic hash ring implementation
The implementation in ceilometer is very different to the Ironic one - are you saying the test you linked fails with Ironic, or that it fails with the ceilometer code today? Disclaimer: in Ironic terms, node = conductor, key = host The test I linked fails with the Ironic hash ring code (specifically the part that tests consistency). With 1000 keys being mapped to 10 nodes, when you add a node: - current ceilometer code remaps around 7% of the keys (~1/#nodes) - Ironic code remaps 90% of the keys So just to underscore what Nejc is saying here... The key point is the proportion of such baremetal nodes that would end up being re-assigned when a new conductor is fired up. That was 100% clear, but thanks for making sure. The question was getting a proper understanding of why it was happening in Ironic. The ceilometer hashring implementation is good, but it uses the same terms very differently (e.g. replicas for partitions) - I'm adapting the key fix back into Ironic. I'd like to see us converge on a single implementation, and making sure the Ironic one is suitable for ceilometer seems applicable here (since ceilometer seems to need less from the API). Absolutely +1 on converging on a single implementation. That was our intent on the ceilometer side from the get-go, to promote a single implementation to oslo that both projects could share. This turned out not to be possible in the short term when the non-consistent aspect of the Ironic implementation was discovered by Nejc, with the juno-3 deadline looming. However for kilo, we would definitely be interested in leveraging a best-of-breed implementation from oslo. If reassigning was cheap Ironic wouldn't have bothered having a hash ring :) Fair enough, I was just allowing for the possibility that avoidance of needless re-mapping wasn't as high a priority on the Ironic side as it was for ceilometer. Cheers, Eoghan ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
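To make the ~7% vs 90% numbers concrete, here is a minimal consistent hash ring written from scratch for illustration (it is neither project's actual code). Each bucket is hashed onto the ring many times - 'replicas' in the caching-paper sense - and a key maps to the first bucket point at or after its own hash; with a correct ring, adding an eleventh bucket moves only about 1/11 of the keys:

import bisect
import hashlib


def _hash(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing(object):
    def __init__(self, buckets, points_per_bucket=100):
        # Hash each bucket onto the ring many times so the key space
        # is divided reasonably evenly between buckets.
        self._ring = sorted(
            (_hash("%s-%d" % (b, i)), b)
            for b in buckets for i in range(points_per_bucket))
        self._positions = [pos for pos, _ in self._ring]

    def get_bucket(self, key):
        idx = bisect.bisect(self._positions, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


keys = ["key-%d" % i for i in range(1000)]
old = HashRing(["node-%d" % i for i in range(10)])
new = HashRing(["node-%d" % i for i in range(11)])
moved = sum(old.get_bucket(k) != new.get_bucket(k) for k in keys)
print("%.1f%% of keys remapped" % (100.0 * moved / len(keys)))  # ~9%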
[openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
Position statement == Over the past year I've increasingly come to the conclusion that Nova is heading for (or probably already at) a major crisis. If steps are not taken to avert this, the project is likely to lose a non-trivial amount of talent, both regular code contributors and core team members. That includes myself. This is not good for Nova's long term health and so should be of concern to anyone involved in Nova and OpenStack. For those who don't want to read the whole mail, the executive summary is that the nova-core team is an unfixable bottleneck in our development process with our current project structure. The only way I see to remove the bottleneck is to split the virt drivers out of tree and let them all have their own core teams in their area of code, leaving current nova core to focus on all the common code outside the virt driver impls. I nonetheless urge people to read the whole mail. Background information == I see many factors coming together to form the crisis: - Burn out of core team members from over work - Difficulty bringing new talent into the core team - Long delay in getting code reviewed and merged - Marginalization of code areas which aren't popular - Increasing size of nova code through new drivers - Exclusion of developers without corporate backing Each item on its own may not seem too bad, but combined they add up to a big problem. Core team burn out -- Having been involved in Nova for several dev cycles now, it is clear that the backlog of code up for review never goes away. Even intensive code review efforts at various points in the dev cycle make only a small impact on the backlog. This has a pretty significant impact on core team members, as their work is never done. At best, the dial is sometimes set to 10, instead of 11. Many people, myself included, have built tools to help deal with the reviews in a more efficient manner than plain gerrit allows for. These certainly help, but they can't ever solve the problem on their own - just make it slightly more bearable. And this is not even considering that core team members might have useful contributions to make in ways beyond just code review. Ultimately the workload is just too high to sustain the levels of review required, so core team members will eventually burn out (as they have done many times already). Even if one person attempts to take the initiative to heavily invest in review of certain features, it is often to no avail. Unless a second dedicated core reviewer can be found to 'tag team', it is hard for one person to make a difference. The end result is that a patch is +2d and then sits idle for weeks or more until a merge conflict requires it to be reposted, at which point even that one +2 is lost. This is a pretty demotivating outcome for both reviewers and the patch contributor. New core team talent -- It can't escape attention that the Nova core team does not grow in size very often. When Nova was younger and its code base was smaller, it was easier for contributors to get onto core because the base level of knowledge required was that much smaller. To get onto core today requires a major investment in learning Nova over a year or more. Even people who potentially have the latent skills may not have the time available to invest in learning the entirety of Nova. With the number of reviews proposed to Nova, the core team should probably be at least double its current size[1]. There is plenty of expertise in the project as a whole but it is typically focused into specific areas of the codebase.
There is nowhere we can find 20 more people with broad knowledge of the codebase who could be promoted even over the next year, let alone today. This is ignoring that many existing members of core are relatively inactive due to burnout and so need replacing. That means we really need another 25-30 people for core. That's not going to happen. Code review delays -- The obvious result of having too much work for too few reviewers is that code contributors face major delays in getting their work reviewed and merged. From personal experience, during Juno, I've probably spent 1 week in aggregate on actual code development vs 8 weeks waiting on code review. You have to constantly be on alert for review comments because unless you can respond quickly (and repost) while you still have the attention of the reviewer, they may not look again for days/weeks. The length of time to get work merged serves as a demotivator to actually do work in the first place. I've personally avoided doing a lot of the code refactoring and cleanup work that would improve the maintainability of the libvirt driver in the long term, because I can't face the battle to get it reviewed and merged. Other people have told me much the same. It is not uncommon to see changes that have been pending for 2 dev cycles, not because the code was bad but because
Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting
On Thu, 4 Sep 2014, Flavio Percoco wrote: Thanks for writing this up, interesting read. 5. Ceilometer's recommended storage driver is still MongoDB, although Ceilometer has now support for sqlalchemy. (Please correct me if I'm wrong). For sake of reference: Yes, MongoDB is currently the recommended store and yes, sqlalchemy support is present. Until recently only sqlalchemy support was tested in the gate. Two big changes being developed in Juno related to storage: * Improved read and write performance in the sqlalchemy setup. * time series storage and Gnocchi: https://julien.danjou.info/blog/2014/openstack-ceilometer-the-gnocchi-experiment I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't keep avoiding these technologies. NoSQL technologies have been around for years and we should be prepared - including OpenStack operators - to support these technologies. Not every tool is good for all tasks - one of the reasons we removed the sqlalchemy driver in the first place - therefore it's impossible to keep an homogeneous environment for all services. +1. Ain't that the truth. As mentioned in the meeting on Tuesday, Zaqar is not reinventing message brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack flavor on top. [0] In my efforts to track this stuff I remain confused on the points in these two questions: https://wiki.openstack.org/wiki/Zaqar/Frequently_asked_questions#How_does_Zaqar_compare_to_oslo.messaging.3F https://wiki.openstack.org/wiki/Zaqar/Frequently_asked_questions#Is_Zaqar_an_under-cloud_or_an_over-cloud_service.3F What or where is the boundary between Zaqar and existing messaging infrastructure? Not just in terms of technology but also use cases? The answers above suggest it's not super solid on the use case side, notably: In addition, several projects have expressed interest in integrating with Zaqar in order to surface events... Instead of Zaqar doing what it does and instead of oslo.messaging abstracting RPC, why isn't the end goal a multi-tenant, multi-protocol event pool? Wouldn't that have the most flexibility in terms of ecosystem and scalability? In addition to the aforementioned concerns and comments, I also would like to share an etherpad that contains some use cases that other integrated projects have for Zaqar[0]. The list is not exhaustive and it'll contain more information before the next meeting. [0] https://etherpad.openstack.org/p/zaqar-integrated-projects-use-cases For these, what is Zaqar providing that oslo.messaging (and its still extant antecedents) does not? I'm not asking to naysay Zaqar, but to understand more clearly what's going on. My interest here comes from a general interest in how events and notifications are handled throughout OpenStack. Thanks. -- Chris Dent tw:@anticdent freenode:cdent https://tank.peermore.com/tanks/cdent ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Fuel] Goals for 5.1.1 and 6.0
Thanks, Dmitry. Let's get a short status on these items during the Fuel Weekly Meeting today [1]. [1] https://etherpad.openstack.org/p/fuel-weekly-meeting-agenda On Wed, Sep 3, 2014 at 7:52 PM, Dmitry Pyzhov dpyz...@mirantis.com wrote: Feature blockers: Versioning https://blueprints.launchpad.net/fuel/+spec/nailgun-versioning for REST API https://blueprints.launchpad.net/fuel/+spec/nailgun-versioning-api, UI, serialization https://blueprints.launchpad.net/fuel/+spec/nailgun-versioning-rpc Ongoing activities: Nailgun plugins https://blueprints.launchpad.net/fuel/+spec/nailgun-plugins Stability and Reliability: Docs for serialization data Docs for REST API data https://blueprints.launchpad.net/fuel/+spec/documentation-on-rest-api-input-output Nailgun unit tests restructure Image based provisioning https://blueprints.launchpad.net/fuel/+spec/image-based-provisioning Granular deployment https://blueprints.launchpad.net/fuel/+spec/granular-deployment-based-on-tasks Artifact-based build system Power management Fencing https://blueprints.launchpad.net/fuel/+spec/ha-fencing Features: Advanced networking https://blueprints.launchpad.net/fuel/+spec/advanced-networking (blocked by Multi L2 support) Some of these items will not fit in 6.0, I guess. But we should work on them now. On Thu, Aug 28, 2014 at 4:26 PM, Mike Scherbakov mscherba...@mirantis.com wrote: Hi Fuelers, while we are busy with the last bugs which block us from releasing 5.1, we need to start thinking about upcoming releases. Some of you have already started POCs, some - specs, and I see discussions in ML and IRC. From an overall strategy perspective, the focus for 6.0 is: - OpenStack Juno release - Certify 100-node deployment. In terms of OpenStack, if not possible for Juno, let's do it for Icehouse - Send anonymous stats about deployment (deployment modes, features used, etc.) - Stability and Reliability Let's take a little break and think, as a first order of business, about the features, sustaining items and bugs which block us from releasing either 5.1.1 or 6.0. We have to start creating blueprints (and moving them to the 6.0 milestone) and make sure there are critical bugs assigned to the appropriate milestone, if there are any. Examples which come to my mind immediately: - Use a service token to auth in Keystone for upgrades (affects 5.1.1), instead of a plain admin login / pass. Otherwise it affects security, as the user has to keep the password in plain text - Decrease upgrade tarball size Please come up with blueprint and LP bug links, and a short explanation of why it's a blocker for upcoming releases. Thanks, -- Mike Scherbakov #mihgen ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Mike Scherbakov #mihgen ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] FFE request v2-on-v3-api
Hi, I'd like to request a FFE for 4 changesets from the v2-on-v3-api blueprint: https://review.openstack.org/#/c/113814/ https://review.openstack.org/#/c/115515/ https://review.openstack.org/#/c/115576/ https://review.openstack.org/#/c/11/ They have all already been approved and were in the gate for a while but just didn't quite make it through in time. So they shouldn't put any load on reviewers. Sponsoring cores: Kenichi Ohmichi John Garbutt Me Regards, Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ironic] (Non-)consistency of the Ironic hash ring implementation
On 09/04/2014 11:51 AM, Robert Collins wrote: On 4 September 2014 19:53, Nejc Saje ns...@redhat.com wrote: I used the terms that are used in the original caching use-case, as described in [1] and are used in the pypi lib as well[2]. With the correct approach, there aren't actually any partitions; 'replicas' actually denotes the number of times you hash a node onto the ring. As for nodes/keys, what's your suggestion? So - we should change the Ironic terms then, I suspect (but let's check with Deva, who wrote the original code, where he got them from). The parameters we need to create a ring are: - how many fallback positions we use for data (currently referred to as replicas) - how many times we hash the servers hosting data into the ring (currently inferred via the hash_partition_exponent / server count) - the servers and then we probe data items as we go. The original paper isn't http://www.martinbroadhurst.com/Consistent-Hash-Ring.html - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.147.1879 is referenced by it, and that paper doesn't include the term replica count at all. In other systems like Cassandra, replicas generally refers to how many servers end up holding a copy of the data; Martin Broadhurst's paper uses replica there in quite a different sense - I much prefer the Ironic use, which says how many servers will be operating on the data: it's externally relevant. It doesn't contain that term precisely, but it does talk about replicating the buckets. What about using a descriptive name for this parameter, like 'distribution_quality', where the higher the value, the higher the distribution evenness (and the higher the memory usage)? I've no objection to talking about keys, but 'node' is an API object in Ironic, so I'd rather we talk about hosts - or make it something clearly not node-like, such as 'bucket' (which the 1997 paper talks about in describing consistent hash functions). So proposal: - key - a stringifyable thing to be mapped to buckets What about using the term 'item' from the original paper as well? - bucket - a worker/store that wants keys mapped to it - replicas - number of buckets a single key wants to be mapped to Can we keep this as an Ironic-internal parameter? It doesn't really affect the hash ring: if you want multiple buckets for your item, you just continue your journey along the ring and keep returning new buckets. Check out how the pypi lib does it: https://github.com/Doist/hash_ring/blob/master/hash_ring/ring.py#L119 - partitions - number of total divisions of the hash space (power of 2 required) I don't think there are any divisions of the hash space in the correct implementation, are there? I think that in the current Ironic implementation this tweaks the distribution quality, just like the 'replicas' parameter in the Ceilometer implementation. Cheers, Nejc I've opened a bug[3], so you can add a Closes-Bug to your patch. Thanks! -Rob ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
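To illustrate Nejc's point that the replica count doesn't have to live in the ring itself: in the pypi-style approach, mapping a key to N buckets just means continuing around the ring past the first hit and collecting distinct buckets. A self-contained sketch of that walk (illustrative, not either project's code):

import bisect
import hashlib


def _hash(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def get_buckets(ring, positions, key, n):
    # Walk the ring from the key's position, collecting up to n distinct
    # buckets; the ring itself knows nothing about the replica count.
    start = bisect.bisect(positions, _hash(key))
    found = []
    for i in range(len(ring)):
        bucket = ring[(start + i) % len(ring)][1]
        if bucket not in found:
            found.append(bucket)
        if len(found) == n:
            break
    return found


# Ring built as in the earlier sketch: each bucket hashed onto the ring
# many times, sorted by position.
buckets = ["conductor-%d" % i for i in range(5)]
ring = sorted((_hash("%s-%d" % (b, i)), b)
              for b in buckets for i in range(100))
positions = [pos for pos, _ in ring]
print(get_buckets(ring, positions, "node-abc", n=2))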
Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno
On 09/04/2014 02:07 AM, Joe Gordon wrote: On Wed, Sep 3, 2014 at 2:50 AM, Nikola Đipanov ndipa...@redhat.com wrote: On 09/02/2014 09:23 PM, Michael Still wrote: On Tue, Sep 2, 2014 at 1:40 PM, Nikola Đipanov ndipa...@redhat.com wrote: On 09/02/2014 08:16 PM, Michael Still wrote: Hi. We're soon to hit feature freeze, as discussed in Thierry's recent email. I'd like to outline the process for requesting a freeze exception: * your code must already be up for review * your blueprint must have an approved spec * you need three (3) sponsoring cores for an exception to be granted Can core reviewers who have features up for review have this number lowered to two (2) sponsoring cores, as they in reality then need four (4) cores (since they themselves are one (1) core but cannot really vote), making it an order of magnitude more difficult for them to hit this checkbox? That's a lot of numbers in that there paragraph. Let me re-phrase your question... Can a core sponsor an exception they themselves propose? I don't have a problem with someone doing that, but you need to remember that does reduce the number of people who have agreed to review the code for that exception. Michael has correctly picked up on a hint of snark in my email, so let me explain where I was going with that: The reason many features including my own may not make the FF is not because there was not enough buy-in from the core team (let's be completely honest - I have 3+ other core members working for the same company that are by the nature of things easier to convince), but because of any of the following: I find the statement about having multiple cores at the same company very concerning. To quote Mark McLoughlin, It is assumed that all core team members are wearing their upstream hat and aren't there merely to represent their employers' interests [0]. Your statement appears to be in direct conflict with Mark's idea of what a core reviewer is, an idea that IMHO is one of the basic tenets of OpenStack development. This is of course taking my words completely out of context - I was making a point of how arbitrary changing the number of reviewers needed is, and how it completely misses the real issues IMHO. I have no interest in continuing this particular debate further, and would appreciate it if people could refrain from resorting to such straw-man type arguments, as it can be very damaging to the overall level of conversation we need to maintain. [0] http://lists.openstack.org/pipermail/openstack-dev/2013-July/012073.html * Crippling technical debt in some of the key parts of the code * that we have not been acknowledging as such for a long time * which leads to proposed code being arbitrarily delayed once it makes the glaring flaws in the underlying infra apparent * and that the specs process has been completely and utterly useless in helping uncover (not that the process itself is useless, it is very useful for other things) I am almost positive we can turn this rather dire situation around easily in a matter of months, but we need to start doing it! It will not happen through pinning arbitrary numbers to arbitrary processes. Nova is big and complex enough that I don't think any one person is able to identify what we need to work on to make things better. That is one of the reasons why I have the project priorities patch [1] up. I would like to see nova as a team discuss and come up with what we think we need to focus on to get us back on track.
[1] https://review.openstack.org/#/c/112733/ Yes - I was thinking along similar lines to what you propose on that patch; too bad if the above sentence came across as implying I had some kind of cowboy one-man crusade in mind :) that is totally not what I meant. We need strong consensus on what is important for the project, and we need hands behind that (both hackers and reviewers). Having a good chunk of core devs not actually writing critical bits of code is a bad sign IMHO. I have some additions to your list of priorities which I will add as comments on the review above (with some other comments of my own), and we can discuss from there - sorry I missed this! I will likely do that instead of spamming further with another email, as the baseline seems sufficiently similar to where I stand. I will follow up with a more detailed email about what I believe we are missing, once the FF settles and I have applied some soothing creme to my burnout wounds, but currently my sentiment is: Contributing features to Nova nowadays SUCKS!!1 (even as a core reviewer) We _have_ to change that! Yes, I can agree with you on
[openstack-dev] [Nova][FFE] Feature freeze exception for virt-driver-numa-placement
Hi team, I am requesting the exception for the feature in the subject (find the spec at [1] and outstanding changes at [2]). Some reasons why we may want to grant it: First of all, all patches were approved in time and just lost the gate race. Rejecting it makes little sense really, as it has been commented on by a good chunk of the core team, most of the invasive stuff (db migrations for example) has already merged, and the few parts that may seem contentious have either been discussed and agreed upon [3], or can easily be addressed in subsequent bug fixes. It would be very beneficial to merge it so that we actually get real testing on the feature ASAP (scheduling features are not tested in the gate, so we need to rely on downstream/3rd party/user testing for those). Thanks, Nikola [1] http://git.openstack.org/cgit/openstack/nova-specs/tree/specs/juno/virt-driver-numa-placement.rst [2] https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/virt-driver-numa-placement,n,z [3] https://review.openstack.org/#/c/111782/ ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno
On Thu, Sep 04, 2014 at 09:05:57AM +0000, Day, Phil wrote: -Original Message- From: Nikola Đipanov [mailto:ndipa...@redhat.com] Sent: 03 September 2014 10:50 To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno [snip] I will follow up with a more detailed email about what I believe we are missing, once the FF settles and I have applied some soothing creme to my burnout wounds, but currently my sentiment is: Contributing features to Nova nowadays SUCKS!!1 (even as a core reviewer) We _have_ to change that! [snip] Has anyone looked at the review bandwidth issue from the perspective of whether there has been a change in the amount of time cores now spend contributing vs reviewing? I've certainly spent more time reviewing code in the last 2 dev cycles, not least because I need something to do while waiting for my own code submissions to get reviewed and merged (which feels like it is taking longer and longer). Despite the huge efforts in review we're barely denting the flow, and are having to get ever better at saying no to proposed features to cope. Maybe there's an opportunity to get cores to mentor non-cores to do the code production, freeing up review cycles? As a core dev I want to feel that I'm still able to do valuable code submission myself, while also doing the important code review work. IOW, I don't want to end up with the core team job requiring 100% of time to be spent on review cycles, as from my POV that ends up with little to no job satisfaction. Cores need to be able to maintain a balance between doing review and being able to scratch the itch in their own areas of coding interest. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova][FFE] Feature freeze exception for virt-driver-numa-placement
On Thu, Sep 04, 2014 at 01:58:58PM +0200, Nikola Đipanov wrote: Hi team, I am requesting the exception for the feature in the subject (find the spec at [1] and outstanding changes at [2]). Some reasons why we may want to grant it: First of all, all patches were approved in time and just lost the gate race. Rejecting it makes little sense really, as it has been commented on by a good chunk of the core team, most of the invasive stuff (db migrations for example) has already merged, and the few parts that may seem contentious have either been discussed and agreed upon [3], or can easily be addressed in subsequent bug fixes. It would be very beneficial to merge it so that we actually get real testing on the feature ASAP (scheduling features are not tested in the gate, so we need to rely on downstream/3rd party/user testing for those). I think this NUMA work is a very important step forward for Nova in general, which will benefit our entire userbase of KVM deployments, and be especially useful to the NFV user group's needs. As such, I'll be one sponsor for the FFE. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno
On 09/04/2014 03:36 AM, Dean Troyer wrote: On Wed, Sep 3, 2014 at 7:07 PM, Joe Gordon joe.gord...@gmail.com wrote: On Wed, Sep 3, 2014 at 2:50 AM, Nikola Đipanov ndipa...@redhat.com wrote: The reason many features including my own may not make the FF is not because there was not enough buy-in from the core team (let's be completely honest - I have 3+ other core members working for the same company that are by the nature of things easier to convince), but because of any of the following: I find the statement about having multiple cores at the same company very concerning. To quote Mark McLoughlin, It is assumed that all core team members are wearing their upstream hat and aren't there merely to represent their employers' interests [0]. Your statement appears to be in direct conflict with Mark's idea of what a core reviewer is, an idea that IMHO is one of the basic tenets of OpenStack development. FWIW I read Nikola's 'by nature of things' statement to be more of a representation of the higher-bandwidth communication and relationships with co-workers rather than for the company. I hope my reading is not wrong. Thanks for not reading too much into that sentence - yes, this is quite close to what I meant, and I used it to make a point of how I think we are focusing on the wrong thing (as already mentioned in the direct response to Joe). N. I know a while back some of the things I was trying to land in multiple projects really benefited from having both the relationships and high-bandwidth communication with 4 PTLs, three of whom were in the same room at the time. There is a perception problem - exactly what Mark also wrote about - when that happens off-line, and I think it is our responsibility (those advocating the reviews, and those responding to them) to note the outcome of those discussions on the record somewhere, IMO preferably in Gerrit. dt -- Dean Troyer dtro...@gmail.com ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting
On 09/04/2014 03:08 AM, Flavio Percoco wrote: [snip] I've been one of the consistent voices concerned about a hard requirement on adding NoSQL into the mix. So I'll explain that thinking a bit more. I feel like when the TC makes an integration decision, previously this has been about evaluating the project applying for integration, and if they met some specific criteria they were told about some time in the past. I think that's the wrong approach. It's a locally optimized approach that fails to ask the more interesting question: is OpenStack better as a whole if this is a mandatory component of OpenStack? Better being defined as technically better (more features, less janky code workarounds, less unexpected behavior from the stack). Better from the sense of easier or harder to run an actual cloud by our Operators (taking into account what kinds of moving parts they are now expected to manage). Better from the sense of a better user experience in interacting with OpenStack as a whole.
Better from a sense that the OpenStack release will experience fewer bugs, fewer unexpected cross-project interactions, and a greater overall feel of consistency, so that the OpenStack API feels like one thing. https://dague.net/2014/08/26/openstack-as-layers/ One of the interesting qualities of Layers 1 and 2 is that they all follow an AMQP + RDBMS pattern (excepting swift). You can have a very effective IaaS out of that stack. They are the things that you can provide pretty solid integration testing on (and if you look at where everything stood before the new TC mandates on testing / upgrade, that was basically what was getting integration tested). (Also note, I'll accept Barbican is probably in the wrong layer, and should be a Layer 2 service.) While large shops can afford to have a dedicated team to figure out how to make mongo or redis HA, provide monitoring, and have a DR plan for when a hurricane requires them to flip datacenters, that basically means OpenStack heads further down the path of being only for the big folks. I don't want OpenStack to be only for the big folks, I want OpenStack to be for folks of all sizes. I really do want all the local small colleges around here to have OpenStack clouds, because it's something that people believe they can do and manage. I know the people that work in these places; they all come out to the LUG I run.
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
Hi Daniel, Thanks for putting together such a thoughtful piece - I probably need to re-read it a few times to take in everything you're saying, but a couple of thoughts that did occur to me: - I can see how this could help where a change is fully contained within a virt driver, but I wonder how many of those there really are? Of the things that I've seen go through recently nearly all also seem to touch the compute manager in some way, and a lot (like the NUMA changes) also have impacts on the scheduler. Isn't it going to make it harder to get any of those changes in if they have to be co-ordinated across two or more repos? - I think you hit the nail on the head in terms of the scope of Nova and how few people probably really understand all of it, but given the amount of trust that goes with being a core wouldn't it also be possible to make people cores on the understanding that they will only approve code in the areas they are expert in? It kind of feels that this happens to a large extent already; for example I don't see Chris or Ken'ichi taking on work outside of the API layer. It kind of feels as if, given a small amount of trust, we could have additional core reviewers focused on specific parts of the system without having to split up the code base, if that's where the problem is. Phil -Original Message- From: Daniel P. Berrange [mailto:berra...@redhat.com] Sent: 04 September 2014 11:24 To: OpenStack Development Subject: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers [snip]
Re: [openstack-dev] [nova] FFE request v2-on-v3-api
On 09/04/2014 07:34 AM, Christopher Yeoh wrote: Hi, I'd like to request a FFE for 4 changesets from the v2-on-v3-api blueprint: https://review.openstack.org/#/c/113814/ https://review.openstack.org/#/c/115515/ https://review.openstack.org/#/c/115576/ https://review.openstack.org/#/c/11/ They have all already been approved and were in the gate for a while but just didn't quite make it through in time. So they shouldn't put any load on reviewers. Sponsoring cores: Kenichi Ohmichi John Garbutt Me Sign me up as a sponsor as well. I think the scope is highly constrained here, and risk to the rest of the project is low. -Sean -- Sean Dague http://dague.net ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova][FFE] Feature freeze exception for virt-driver-numa-placement
On 09/04/2014 07:58 AM, Nikola Đipanov wrote: Hi team, I am requesting the exception for the feature in the subject (find the spec at [1] and outstanding changes at [2]). Some reasons why we may want to grant it: First of all, all patches were approved in time and just lost the gate race. Rejecting it makes little sense really, as it has been commented on by a good chunk of the core team, most of the invasive stuff (db migrations for example) has already merged, and the few parts that may seem contentious have either been discussed and agreed upon [3], or can easily be addressed in subsequent bug fixes. It would be very beneficial to merge it so that we actually get real testing on the feature ASAP (scheduling features are not tested in the gate, so we need to rely on downstream/3rd party/user testing for those). This statement bugs me. It seems kind of backwards to say we should merge a thing that we don't have a good upstream test plan on and put it in a release so that the testing will happen only in the downstream case. Anyway, not enough to -1 it, but enough to at least say something. -Sean -- Sean Dague http://dague.net ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] FFE request serial-ports
Hello, I would like to request a FFE for 4 changesets to complete the blueprint serial-ports. Topic on gerrit: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/serial-ports,n,z Blueprint on launchpad.net: https://blueprints.launchpad.net/nova/+spec/serial-ports They have already been approved but didn't get enough time to be merged by the gate. Sponsored by: Daniel Berrange Nikola Dipanov s. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] FFE request serial-ports
On Thu, Sep 04, 2014 at 02:42:11PM +0200, Sahid Orentino Ferdjaoui wrote: Hello, I would like to request a FFE for 4 changesets to complete the blueprint serial-ports. Topic on gerrit: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/serial-ports,n,z Blueprint on launchpad.net: https://blueprints.launchpad.net/nova/+spec/serial-ports They have already been approved but didn't get enough time to be merged by the gate. Sponsored by: Daniel Berrange Nikola Dipanov ACK, this has my blessing. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting
On 09/04/2014 01:15 PM, Chris Dent wrote: On Thu, 4 Sep 2014, Flavio Percoco wrote: Thanks for writing this up, interesting read. Thank you for your feedback :) Some comments in-line. 5. Ceilometer's recommended storage driver is still MongoDB, although Ceilometer has now support for sqlalchemy. (Please correct me if I'm wrong). For sake of reference: Yes, MongoDB is currently the recommended store and yes, sqlalchemy support is present. Until recently only sqlalchemy support was tested in the gate. Two big changes being developed in Juno related to storage: * Improved read and write performance in the sqlalchemy setup. * time series storage and Gnocchi: https://julien.danjou.info/blog/2014/openstack-ceilometer-the-gnocchi-experiment Awesome, thanks for clarifying this. [snip] As mentioned in the meeting on Tuesday, Zaqar is not reinventing message brokers. Zaqar provides a service akin to SQS from AWS with an OpenStack flavor on top. [0] In my efforts to track this stuff I remain confused on the points in these two questions: https://wiki.openstack.org/wiki/Zaqar/Frequently_asked_questions#How_does_Zaqar_compare_to_oslo.messaging.3F https://wiki.openstack.org/wiki/Zaqar/Frequently_asked_questions#Is_Zaqar_an_under-cloud_or_an_over-cloud_service.3F What or where is the boundary between Zaqar and existing messaging infrastructure? Not just in terms of technology but also use cases? The answers above suggest it's not super solid on the use case side, notably: In addition, several projects have expressed interest in integrating with Zaqar in order to surface events... Instead of Zaqar doing what it does and instead of oslo.messaging abstracting RPC, why isn't the end goal a multi-tenant, multi-protocol event pool? Wouldn't that have the most flexibility in terms of ecosystem and scalability? If we put both features, multi-tenancy and multi-protocol, aside for a bit, we can simplify Zaqar's goal down to a messaging service for the cloud. I believe this is exactly where the line between Zaqar and other *queuing* technologies should be drawn. Zaqar is, in the end, a messaging service designed for the cloud, whereas existing queuing technologies were not designed for it. By cloud I don't mean performance, scalability or anything like that; I'm talking about providing a service that end-users of the cloud can consume. The fact that Zaqar is also ideal for the under-cloud is a plus. The service has been designed to provide a set of messaging features that perfectly serve use cases in both the under-cloud and over-cloud. If we add to that a multi-protocol transport layer with support for multi-tenancy, you'll get a queuing service that fits the needs of cloud providers and covers a broader set of use cases like, say, IoT. I forgot to add this link[0] to my previous email. Does the overview of the service, the key features and scope help clear things up a bit? Please let me know if they don't. I'm happy to provide more info if needed. [0] https://wiki.openstack.org/wiki/Zaqar#Overview In addition to the aforementioned concerns and comments, I also would like to share an etherpad that contains some use cases that other integrated projects have for Zaqar[0]. The list is not exhaustive and it'll contain more information before the next meeting. [0] https://etherpad.openstack.org/p/zaqar-integrated-projects-use-cases For these, what is Zaqar providing that oslo.messaging (and its still extant antecedents) does not? I'm not asking to naysay Zaqar, but to understand more clearly what's going on.
My interest here comes from a general interest in how events and notifications are handled throughout OpenStack. One of the reasons you would want to use Zaqar instead of oslo.messaging for, say, guest agents is that you don't want guest agents talking to your main messaging layer. Zaqar helps guest agents communicate with the main service in a more secure, authenticated and isolated way. If you were going to do that with oslo.messaging, you'd need to have separate virtual_hosts, exchanges and probably even users. These things cannot be easily configured without manual intervention. With Zaqar you can easily rely on your deployed cloud services - Keystone, Barbican and Zaqar, for example - to achieve such isolation and security. There are also other aspects of relying on the main messaging infrastructure that are worrisome for the use cases mentioned in that etherpad. For example, using OpenStack's main rabbitmq instance to communicate with guest agents would increase the workload on the infrastructure, which would require a better scaling strategy for it. I hope the above clears up your doubts. Thanks a lot for your feedback; it's useful to keep the discussion going and it helps everyone keep re-evaluating the goals and scope of the project. I hope other folks from the team will also chime in and share their thoughts. Cheers, Flavio --
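To make the guest-agent pattern concrete, a rough sketch of both sides talking through a Zaqar queue over plain HTTP follows. The endpoint, queue name and payloads are invented, and while the paths and headers follow the Zaqar v1 REST API as documented around this time, treat the details as assumptions to be checked against the current docs:

import json
import uuid

import requests

ZAQAR = "http://zaqar.example.com:8888/v1"   # assumed endpoint
HEADERS = {
    "Client-ID": str(uuid.uuid4()),          # required by the v1 API
    "X-Auth-Token": "KEYSTONE_TOKEN_HERE",
    "Content-Type": "application/json",
}


def post_command(queue, command):
    # Control-plane side: enqueue a command for the guest agent.
    requests.post("%s/queues/%s/messages" % (ZAQAR, queue),
                  headers=HEADERS,
                  data=json.dumps([{"ttl": 300, "body": command}]))


def claim_commands(queue):
    # Agent side: claim messages so no other worker processes them.
    resp = requests.post("%s/queues/%s/claims" % (ZAQAR, queue),
                         headers=HEADERS,
                         data=json.dumps({"ttl": 60, "grace": 30}))
    return resp.json() if resp.status_code == 201 else []


post_command("guest-agent-42", {"action": "create_user", "name": "bob"})
for msg in claim_commands("guest-agent-42"):
    print(msg["body"])

The point of the pattern is that the agent only ever holds a tenant-scoped keystone token for one queue, instead of credentials to the operator's message broker.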
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On Thu, Sep 04, 2014 at 12:14:39PM +0000, Day, Phil wrote: Hi Daniel, Thanks for putting together such a thoughtful piece - I probably need to re-read it a few times to take in everything you're saying, but a couple of thoughts that did occur to me: - I can see how this could help where a change is fully contained within a virt driver, but I wonder how many of those there really are? Of the things that I've seen go through recently nearly all also seem to touch the compute manager in some way, and a lot (like the NUMA changes) also have impacts on the scheduler. Isn't it going to make it harder to get any of those changes in if they have to be co-ordinated across two or more repos? Actually, in my experience of reviewing code this past cycle or two, I see a fairly significant portion of code that is entirely within the scope of a virt driver. I'm also seeing that people are refraining from actually doing changes to the virt drivers because of the burden of getting code past review, so what we see today is probably not even representative of the potential. There are certainly some high profile exceptions such as the NUMA work, or the new serial console work, where you're going to cross the repos. In such work we already try to break patches into isolated pieces, so the stuff touching common code is a separate commit from the stuff touching virt code. This is generally good practice to be encouraging. So, yes, it would need coordination across the repos to get the full work submitted, but I don't think that burden is unduly large compared to current practice. We do in fact already see this need for co-ordination in other ways. For example, API changes have parts that affect python-novaclient, and perhaps horizon too. Storage network changes often cross Neutron / Cinder and Nova. If we can reduce the burden on nova-core, the stuff going into the common codebase should stand more chance of getting review too. So overall yes, this is a valid point, but I'm not particularly concerned about the negative impacts of it, because we're already dealing with them today to a large extent. - I think you hit the nail on the head in terms of the scope of Nova and how few people probably really understand all of it, but given the amount of trust that goes with being a core wouldn't it also be possible to make people cores on the understanding that they will only approve code in the areas they are expert in? It kind of feels that this happens to a large extent already, for example I don't see Chris or Ken'ichi taking on work outside of the API layer. It kind of feels as if, given a small amount of trust, we could have additional core reviewers focused on specific parts of the system without having to split up the code base, if that's where the problem is. Yes, you are right that it happens to some extent, but I think it is quite a big jump to effectively scale up that amount of trust to a team that realistically would need to be 40+ people in size. Also this isn't solely about review bandwidth. One of the things I raised was that there are certain standards required for being part of nova, such as CI testing. If you can't meet that, you're forced into a sub-optimal development practice compared to the rest of nova, where you are out of tree and subject to being broken by Nova changes at any time, which is what Docker and Ironic have been facing.
Separate repos will also facilitate more targeted application of our testing resources, so vmware repo changes wouldn't need to suffer false failures from libvirt tempest jobs, and similarly vmware CI could be made gating for vmware without causing libvirt code to suffer instability. -Original Message- From: Daniel P. Berrange [mailto:berra...@redhat.com] Sent: 04 September 2014 11:24 To: OpenStack Development Subject: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers [snip]
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
Like I mentioned before, I think the only way out of the Nova death spiral is to split code and give control over it to smaller dedicated review teams. This is one way to do it. Thanks Dan for pulling this together :) A couple comments inline: Daniel P. Berrange wrote: [...] This is a crisis. A large crisis. In fact, if you got a moment, it's a twelve-storey crisis with a magnificent entrance hall, carpeting throughout, 24-hour portage, and an enormous sign on the roof, saying 'This Is a Large Crisis'. A large crisis requires a large plan. [...] I totally agree. We need a plan now, because we can't go through another cycle without a solution in sight. [...] This has quite a few implications for the way development would operate. - The Nova core team at least, would be voluntarily giving up a big amount of responsibility over the evolution of virt drivers. Due to human nature, people are not good at giving up power, so this may be painful to swallow. Realistically current nova core are not experts in most of the virt drivers to start with, and more importantly, we clearly do not have sufficient time to do a good job of review with everything submitted. Much of the current need for core review of virt drivers is to prevent the mis-use of a poorly defined virt driver API...which can be mitigated - See later point(s) - Nova core would/should not have automatic +2 over the virt driver repositories since it is unreasonable to assume they have the suitable domain knowledge for all virt drivers out there. People would of course be able to be members of multiple core teams. For example John G would naturally be nova-core and nova-xen-core. I would aim for nova-core and nova-libvirt-core, and so on. I do not want any +2 responsibility over VMware/Hyper-V/Docker drivers since they're not my area of expertise - I only look at them today because they have no other nova-core representation. - Not sure if it implies the Nova PTL would be solely focused on Nova common. e.g. would there continue to be one PTL over all virt driver implementation projects, or would each project have its own PTL. Maybe this is irrelevant if a Czars approach is chosen by virt driver projects for their work. I'd be inclined to say that a single PTL should stay as a figurehead to represent all the virt driver projects, acting as a point of contact to ensure we keep communication / co-operation between the drivers in sync. [...] At this point it may look like our current structure (programs, one PTL, single core teams...) prevents us from implementing that solution. I just want to say that in OpenStack, organizational structure reflects how we work, not the other way around. If we need to reorganize official project structure to work in smarter and long-term healthy ways, that's a really small price to pay. -- Thierry Carrez (ttx) ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] FFE request serial-ports
On 09/04/2014 01:42 PM, Sahid Orentino Ferdjaoui wrote: Hello, I would like to request a FFE for 4 changesets to complete the blueprint serial-ports. Topic on gerrit: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/serial-ports,n,z Blueprint on launchpad.net: https://blueprints.launchpad.net/nova/+spec/serial-ports They have already been approved but didn't get enough time to be merged by the gate. Sponsored by: Daniel Berrange Nikola Dipanov I'll sponsor this too, as I originally reviewed the set and approved it ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno
2014-09-03 20:31 GMT+09:00 Gary Kotton gkot...@vmware.com: On 9/3/14, 12:50 PM, Nikola Đipanov ndipa...@redhat.com wrote: On 09/02/2014 09:23 PM, Michael Still wrote: On Tue, Sep 2, 2014 at 1:40 PM, Nikola Đipanov ndipa...@redhat.com wrote: On 09/02/2014 08:16 PM, Michael Still wrote: Hi. We're soon to hit feature freeze, as discussed in Thierry's recent email. I'd like to outline the process for requesting a freeze exception: * your code must already be up for review * your blueprint must have an approved spec * you need three (3) sponsoring cores for an exception to be granted Can core reviewers who have features up for review have this number lowered to two (2) sponsoring cores, as they in reality then need four (4) cores (since they themselves are one (1) core but cannot really vote) making it an order of magnitude more difficult for them to hit this checkbox? That's a lot of numbers in that there paragraph. Let me re-phrase your question... Can a core sponsor an exception they themselves propose? I don't have a problem with someone doing that, but you need to remember that does reduce the number of people who have agreed to review the code for that exception. Michael has correctly picked up on a hint of snark in my email, so let me explain where I was going with that: The reason many features including my own may not make the FF is not because there was not enough buy-in from the core team (let's be completely honest - I have 3+ other core members working for the same company that are by nature of things easier to convince), but because of any of the following: * Crippling technical debt in some of the key parts of the code * that we have not been acknowledging as such for a long time * which leads to proposed code being arbitrarily delayed once it makes the glaring flaws in the underlying infra apparent * and that the specs process has been completely and utterly useless in helping uncover (not that the process itself is useless, it is very useful for other things) I am almost positive we can turn this rather dire situation around easily in a matter of months, but we need to start doing it! It will not happen through pinning arbitrary numbers to arbitrary processes. I will follow up with a more detailed email about what I believe we are missing, once the FF settles and I have applied some soothing creme to my burnout wounds, but currently my sentiment is: Contributing features to Nova nowadays SUCKS!!1 (even as a core reviewer) We _have_ to change that! +1 Sadly what you have written above is true. The current process does not encourage new developers in Nova. I really think that we need to work on improving our community. I really think that maybe we should sit as a community at the summit and talk about this. That is an important point. I also have a feeling similar to what many people have said. I have a patch series which has been in progress since 2013-03-22, and some patches were again not merged in Juno-3 because of review bandwidth. When I started this work as a new contributor, I could not have imagined I would need this much time for it. Since then, through code reviews, I have sometimes felt an imbalance between patches. Some patches are very easy, like fixing a typo or removing an unused method. On the other hand, some patches are very difficult, like framework changes which affect long-lived features. However, we require two +2s for all patches, so even easy patches need a lot of review time. I think most new contributors post easy patches as a first step, but they might be feeling frustrated now. 
I think the number of good patches merged is more important than the number of code reviews. Could we consider a single +2 being enough to merge patches, case by case? Thanks Ken'ichi Ohmichi ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting
On 09/04/2014 02:14 PM, Sean Dague wrote: On 09/04/2014 03:08 AM, Flavio Percoco wrote: Greetings, Last Tuesday the TC held the first graduation review for Zaqar. During the meeting some concerns arose. I've listed those concerns below with some comments hoping that it will help starting a discussion before the next meeting. In addition, I've added some comments about the project stability at the bottom and an etherpad link pointing to a list of use cases for Zaqar. # Concerns - Concern on operational burden of requiring NoSQL deploy expertise to the mix of openstack operational skills For those of you not familiar with Zaqar, it currently supports 2 nosql drivers - MongoDB and Redis - and those are the only 2 drivers it supports for now. This will require operators willing to use Zaqar to maintain a new (?) NoSQL technology in their system. Before expressing our thoughts on this matter, let me say that: 1. By removing the SQLAlchemy driver, we basically removed the chance for operators to use an already deployed OpenStack-technology 2. Zaqar won't be backed by any AMQP based messaging technology for now. Here's[0] a summary of the research the team (mostly done by Victoria) did during Juno 3. We (OpenStack) used to require Redis for the zmq matchmaker 4. We (OpenStack) also use memcached for caching and as the oslo caching lib becomes available - or a wrapper on top of dogpile.cache - Redis may be used in place of memcached in more and more deployments. 5. Ceilometer's recommended storage driver is still MongoDB, although Ceilometer has now support for sqlalchemy. (Please correct me if I'm wrong). That being said, it's obvious we already, to some extent, promote some NoSQL technologies. However, for the sake of the discussion, lets assume we don't. I truly believe, with my OpenStack (not Zaqar's) hat on, that we can't keep avoiding these technologies. NoSQL technologies have been around for years and we should be prepared - including OpenStack operators - to support these technologies. Not every tool is good for all tasks - one of the reasons we removed the sqlalchemy driver in the first place - therefore it's impossible to keep an homogeneous environment for all services. With this, I'm not suggesting to ignore the risks and the extra burden this adds but, instead of attempting to avoid it completely by not evolving the stack of services we provide, we should probably work on defining a reasonable subset of NoSQL services we are OK with supporting. This will help making the burden smaller and it'll give operators the option to choose. [0] http://blog.flaper87.com/post/marconi-amqp-see-you-later/ I've been one of the consistent voices concerned about a hard requirement on adding NoSQL into the mix. So I'll explain that thinking a bit more. I feel like when the TC makes an integration decision previously this has been about evaluating the project applying for integration, and if they met some specific criteria they were told about some time in the past. I think that's the wrong approach. It's a locally optimized approach that fails to ask the more interesting question. Is OpenStack better as a whole if this is a mandatory component of OpenStack? Better being defined as technically better (more features, less janky code work arounds, less unexpected behavior from the stack). Better from the sense of easier or harder to run an actual cloud by our Operators (taking into account what kinds of moving parts they are now expected to manage). 
Better from the sense of a better user experience in interacting with OpenStack as a whole. Better from a sense that the OpenStack release will experience fewer bugs, fewer unexpected cross-project interactions, and a greater overall feel of consistency so that the OpenStack API feels like one thing. https://dague.net/2014/08/26/openstack-as-layers/ One of the interesting qualities of Layers 1 & 2 is that they all follow an AMQP + RDBMS pattern (excepting swift). You can have a very effective IaaS out of that stack. They are the things that you can provide pretty solid integration testing on (and if you look at where everything stood before the new TC mandates on testing / upgrade that was basically what was getting integration tested). (Also note, I'll accept Barbican is probably in the wrong layer, and should be a Layer 2 service.) While large shops can afford to have a dedicated team to figure out how to make mongo or redis HA, provide monitoring, have a DR plan for when a hurricane requires them to flip datacenters, that basically means OpenStack heads further down the path of being 'only for the big folks'. I don't want OpenStack to be only for the big folks, I want OpenStack to be for folks of all sizes. I really do want to have all the local small colleges around here have OpenStack clouds, because it's something that people believe they can do and manage. I know the
Re: [openstack-dev] [Nova][FFE] Feature freeze exception for virt-driver-numa-placement
On 09/04/2014 12:58 PM, Nikola Đipanov wrote: Hi team, I am requesting the exception for the feature from the subject (find specs at [1] and outstanding changes at [2]). Some reasons why we may want to grant it: First of all, all patches have been approved in time and just lost the gate race. Rejecting it makes little sense really, as it has been commented on by a good chunk of the core team, most of the invasive stuff (db migrations for example) has already merged, and the few parts that may seem contentious have either been discussed and agreed upon [3], or can easily be addressed in subsequent bug fixes. It would be very beneficial to merge it so that we actually get real testing on the feature ASAP (scheduling features are not tested in the gate so we need to rely on downstream/3rd party/user testing for those). Thanks, Nikola [1] http://git.openstack.org/cgit/openstack/nova-specs/tree/specs/juno/virt-driver-numa-placement.rst [2] https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/virt-driver-numa-placement,n,z [3] https://review.openstack.org/#/c/111782/ I'll sponsor this too, and I've already reviewed this set a few times ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
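For readers less familiar with how this feature is consumed once merged: the blueprint drives guest NUMA placement from flavor extra specs. A minimal sketch of the idea using python-novaclient follows; it is illustrative only - the credentials and endpoint are placeholders, and the exact extra-spec keys are defined by the spec in [1], of which hw:numa_nodes is the simplest.

    # Hedged sketch: ask for a guest split across 2 virtual NUMA nodes
    # by setting a flavor extra spec (per the virt-driver-numa-placement
    # spec). Auth values below are placeholders.
    from novaclient import client

    nova = client.Client("2", "admin", "secret", "admin",
                         "http://keystone.example.com:5000/v2.0")
    flavor = nova.flavors.find(name="m1.large")
    flavor.set_keys({"hw:numa_nodes": "2"})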
Re: [openstack-dev] [Nova][FFE] Feature freeze exception for virt-driver-numa-placement
On 09/04/2014 02:31 PM, Sean Dague wrote: On 09/04/2014 07:58 AM, Nikola Đipanov wrote: Hi team, I am requesting the exception for the feature from the subject (find specs at [1] and outstanding changes at [2]). Some reasons why we may want to grant it: First of all, all patches have been approved in time and just lost the gate race. Rejecting it makes little sense really, as it has been commented on by a good chunk of the core team, most of the invasive stuff (db migrations for example) has already merged, and the few parts that may seem contentious have either been discussed and agreed upon [3], or can easily be addressed in subsequent bug fixes. It would be very beneficial to merge it so that we actually get real testing on the feature ASAP (scheduling features are not tested in the gate so we need to rely on downstream/3rd party/user testing for those). This statement bugs me. It seems kind of backwards to say we should merge a thing that we don't have a good upstream test plan on and put it in a release so that the testing will happen only in the downstream case. The objective reality is that many other things have not had upstream testing for a long time (anything that requires more than 1 compute node in Nova for example, and any scheduling feature - as I mention clearly above), so I'm not sure how that is backwards from any reasonable point of view. Thanks to folks using them, they are still kept working and bugs get fixed. Getting features into the hands of users is extremely important... Anyway, not enough to -1 it, but enough to at least say something. .. but I do not want to get into the discussion about software testing here, not the place really. However, I do think it is very harmful to respond to an FFE request with such blanket statements and generalizations, if only for the message it sends to the contributors (that we really care more about upholding our own myths as a community than about users and features). N. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Kilo Cycle Goals Exercise
On 09/03/2014 11:37 AM, Joe Gordon wrote: As you all know, there have recently been several very active discussions around how to improve assorted aspects of our development process. One idea that was brought up is to come up with a list of cycle goals/project priorities for Kilo [0]. To that end, I would like to propose an exercise as discussed in the TC meeting yesterday [1]: Have anyone interested (especially TC members) come up with a list of what they think the project-wide Kilo cycle goals should be and post them on this thread by end of day Wednesday, September 10th. After which time we can begin discussing the results. The goal of this exercise is to help us see if our individual world views align with the greater community, and to get the ball rolling on a larger discussion of where as a project we should be focusing more time. best, Joe Gordon [0] http://lists.openstack.org/pipermail/openstack-dev/2014-August/041929.html [1] http://eavesdrop.openstack.org/meetings/tc/2014/tc.2014-09-02-20.04.log.html Here is my top 5 list: 1. Functional Testing in Integrated projects The justification for this is here - http://lists.openstack.org/pipermail/openstack-dev/2014-July/041057.html. We need projects to take more ownership of their functional testing so that by the time we get to integration testing we're not exposing really fundamental bugs like being unable to handle 2 requests at the same time. For Kilo: I think we can and should be able to make progress on this on all integrated projects, as well as the python clients (which are basically untested and often very broken). 2. Consistency in southbound interfaces (Logging first) Logging and notifications are south bound interfaces from OpenStack providing information to people, or machines, about what is going on. There is also a 3rd proposed south bound with osprofiler. For Kilo: I think it's reasonable to complete the logging standards and implement them. I expect notifications (which haven't quite kicked off) are going to take 2 cycles. I'd honestly *really* love to see a unification path for all the southbound parts, logging, osprofiler, notifications, because there is quite a bit of overlap in the instrumentation/annotation inside the main code for all of these. 3. API micro version path forward We have Cinder v2, Glance v2, Keystone v3. We've had them for a long time. When we started the Juno cycle Nova used *none* of them. And with good reason, as the path forward was actually pretty bumpy. Nova has been trying to create a v3 for 3 cycles, and that effort collapsed under its own weight. I think major API revisions in OpenStack are not actually possible any more, as there is too much inertia on existing interfaces. How to sanely and gradually evolve the OpenStack API is tremendously important, especially as a bunch of new projects are popping up that implement parts of it. We have the beginnings of a plan here in Nova, which now just needs a bunch of heavy lifting. For Kilo: A working microversion stack in at least one OpenStack service. Nova is probably closest, though Mark McClain wants to also take a spin on this in Neutron. I think if we could come up with a model that worked in both of those projects, we'd pick up some steam in making this long term approach across all of OpenStack. (A minimal sketch of what the header-based negotiation could look like follows after this mail.) 4. 
Post merge testing As explained here - http://lists.openstack.org/pipermail/openstack-dev/2014-July/041057.html we could probably get a lot more bang for our buck if we had a smaller # of integration configurations in the pre merge gate, and a much more expansive set of post merge jobs. For Kilo: I think this could be implemented; it probably needs more hands than it has right now. 5. Consistent OpenStack python SDK / clients I think the client projects being inside the server programs has not served us well, especially as the # of servers has expanded. We as a project need to figure out how to get the SDK / unified client effort moving forward faster. For Kilo: I'm not sure how close to done we could take this, but this needs to become a larger overall push for the project as a whole, as I think our user-exposed interface here is inhibiting adoption. -Sean -- Sean Dague http://dague.net ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
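To make point 3 above a bit more concrete, here is a minimal sketch of what header-based microversion negotiation could look like from a client's perspective. The header name and version value are assumptions based on the proposal under discussion at the time, not a settled Nova interface.

    # Hedged sketch of client-side microversion negotiation; the header
    # name below is an assumption, not a finalized interface.
    import requests

    resp = requests.get(
        "http://nova-api.example.com:8774/v2.1/servers",
        headers={
            "X-Auth-Token": "...",
            # Ask for a specific minor version; a server whose supported
            # min/max range excludes it would reject the request.
            "X-OpenStack-Nova-API-Version": "2.3",
        })
    # The response echoes the version actually applied, so callers can
    # detect older deployments and degrade gracefully.
    print(resp.headers.get("X-OpenStack-Nova-API-Version"))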
Re: [openstack-dev] [nova] FFE request serial-ports
On 09/04/2014 02:42 PM, Sahid Orentino Ferdjaoui wrote: Hello, I would like to request a FFE for 4 changesets to complete the blueprint serial-ports. Topic on gerrit: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/serial-ports,n,z Blueprint on launchpad.net: https://blueprints.launchpad.net/nova/+spec/serial-ports They have already been approved but didn't get enough time to be merged by the gate. Sponsored by: Daniel Berrange Nikola Dipanov This is also one of the ones that simply lost the gate race in the end, and I've reviewed several iterations of it, so +1 from me. N. s. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron] Status of Neutron at Juno-3
On Thu, Sep 4, 2014 at 3:38 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: I didn't know that we could ask for FFE, so I'd like to ask (if still in time) for: https://blueprints.launchpad.net/neutron/+spec/agent-child-processes-status https://review.openstack.org/#/q/status:open+project:openstack/neutron+branch:master+topic:bp/agent-child-processes-status,n,z To get the ProcessMonitor implemented in the l3_agent and dhcp_agent at least. I believe the work is ready (I need to check the radvd respawn in the l3 agent). The ProcessMonitor class is already merged. The two remaining patches for this BP are about 65 and 200 LOC, so this is a relatively small change. In addition, since the initial patches merged in Juno-3, adding the code to monitor and restart the agents in the next two patches makes some sense. I'll add this to the list of BPs to discuss with ttx tomorrow. Thanks, Kyle Best regards, Miguel Ángel. - Original Message - On Wed, Sep 3, 2014 at 10:19 AM, Mark McClain m...@mcclain.xyz wrote: On Sep 3, 2014, at 11:04 AM, Brian Haley brian.ha...@hp.com wrote: On 09/03/2014 08:17 AM, Kyle Mestery wrote: Given how deep the merge queue is (146 currently), we've effectively reached feature freeze in Neutron now (likely other projects as well). So this morning I'm going to go through and remove BPs from Juno which did not make the merge window. I'll also be putting temporary -2s in the patches to ensure they don't slip in as well. I'm looking at FFEs for the high priority items which are close but didn't quite make it: https://blueprints.launchpad.net/neutron/+spec/l3-high-availability https://blueprints.launchpad.net/neutron/+spec/add-ipset-to-security https://blueprints.launchpad.net/neutron/+spec/security-group-rules-for-devices-rpc-call-refactor I guess I'll be the first to ask for an exception for a Medium since the code was originally completed in Icehouse: https://blueprints.launchpad.net/neutron/+spec/l3-metering-mgnt-ext The neutronclient-side code was committed in January, and the neutron side, https://review.openstack.org/#/c/70090, has had mostly positive reviews since then. I've really just spent the last week re-basing it as things moved along. +1 for FFE. I think this is good community work that fell through the cracks. I agree, and I've marked it as RC1 now. I'll sort through these with ttx on Friday and get more clarity on its official status. Thanks, Kyle mark ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
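For context on what the remaining patches add, the mechanism boils down to a watchdog over the child processes the agents spawn (dnsmasq, radvd, ...). A toy sketch of the idea - illustrative only, and not Neutron's actual ProcessMonitor API:

    # Minimal child-process watchdog: poll each managed child and
    # respawn it if it has exited. Illustrative only.
    import subprocess
    import time

    def watch(commands, interval=5):
        # e.g. watch([("dnsmasq", "--keep-in-foreground")])
        procs = {cmd: subprocess.Popen(cmd) for cmd in commands}
        while True:
            for cmd, proc in procs.items():
                if proc.poll() is not None:  # the child has exited
                    procs[cmd] = subprocess.Popen(cmd)  # respawn it
            time.sleep(interval)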
[openstack-dev] [vmware][nova][FFE] vmware-spawn-refactor
I'd like to request a FFE for the remaining changes from vmware-spawn-refactor. They are: https://review.openstack.org/#/c/109754/ https://review.openstack.org/#/c/109755/ https://review.openstack.org/#/c/114817/ https://review.openstack.org/#/c/117467/ https://review.openstack.org/#/c/117283/ https://review.openstack.org/#/c/98322/ All but the last had +A, and were in the gate at the time it was closed. The last had not yet been approved, but is ready for core review. It has recently had some orthogonal changes split out to simplify it considerably. It is largely a code motion patch, and has been given +1 by VMware CI multiple times. Matt -- Matthew Booth Red Hat Engineering, Virtualisation Team Phone: +442070094448 (UK) GPG ID: D33C3490 GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490 ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] FFE request v2-on-v3-api
2014-09-04 20:34 GMT+09:00 Christopher Yeoh cbky...@gmail.com: Hi, I'd like to request a FFE for 4 changesets from the v2-on-v3-api blueprint: https://review.openstack.org/#/c/113814/ https://review.openstack.org/#/c/115515/ https://review.openstack.org/#/c/115576/ https://review.openstack.org/#/c/11/ They have all already been approved and were in the gate for a while but just didn't quite make it through in time. So they shouldn't put any load on reviewers. Sponsoring cores: Kenichi Ohmichi John Garbutt Me Yeah, I am happy to support this work. Thanks Ken'ichi Ohmichi ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Treating notifications as a contract (CADF)
Yesterday, we had a great conversation with Matt Rutkowski from IBM, one of the authors of the CADF spec. I was having a disconnect on what CADF offers and got it clarified. My assumption was CADF was a set of transformation/extraction rules for taking data from existing data structures and defining them as well-known things. For example, CADF needs to know who sent this notification. I thought CADF would give us a means to point at an existing data structure and say that's where you find it. But I was wrong. CADF is a full-on schema/data structure of its own. It would be a fork-lift replacement for our existing notifications. However, if your service hasn't really adopted notifications yet (green field) or you can handle a fork-lift replacement, CADF is a good option. There are a few gotchas though. If you have required data that is outside of the CADF spec, it would need to go in the attachment section of the notification and that still needs a separate schema to define it. Matt's team is very receptive to extending the spec to include these special cases though. Anyway, I've written up all the options (as I see them) [1] with the advantages/disadvantages of each approach. It's just a strawman, so bend/spindle/mutilate. Look forward to feedback! -S [1] https://wiki.openstack.org/wiki/NotificationsAndCADF On 9/3/2014 12:30 PM, Sandy Walsh wrote: On 9/3/2014 11:32 AM, Chris Dent wrote: On Wed, 3 Sep 2014, Sandy Walsh wrote: We're chatting with IBM about CADF and getting down to specifics on their applicability to notifications. Once I get StackTach.v3 into production I'm keen to get started on revisiting the notification format and oslo.messaging support for notifications. Perhaps a hangout for those keenly interested in doing something about this? That seems like a good idea. I'd like to be a part of that. Unfortunately I won't be at the summit but would like to contribute what I can before and after. I took some notes on this a few weeks ago and extracted what seemed to be the two main threads or ideas that were revealed by the conversation that happened in this thread: * At the micro level have versioned schema for notifications such that one end can declare I am sending version X of notification foo.bar.Y and the other end can effectively deal. Yes, that's table-stakes I think. Putting structure around the payload section. Beyond type and version we should be able to attach meta information like public/private visibility and perhaps hints for external mapping (this trait maps to that trait in CADF, for example). * At the macro level standardize a packaging or envelope of all notifications so that they can be consumed by very similar code. That is: constrain the notifications in some way so we can also constrain the consumer code. That's the intention of what we have now. The top level traits are standard, the payload is open. We really only require: message_id, timestamp and event_type. For auditing we need to cover Who, What, When, Where, Why, OnWhat, OnWhere, FromWhere. These ideas serve two different purposes: One is to ensure that existing notification use cases are satisfied with robustness and provide a contract between two endpoints. The other is to allow a fecund notification environment that allows and enables many participants. Good goals. When Producer and Consumer know what to expect, things are good ... I know to find the Instance ID here. 
When the consumer wants to deal with a notification as a generic object, things get tricky (where do I find the instance ID in the payload? what is the image type? is this an error notification?). Basically, how do we define the principal artifacts for each service and grant the consumer easy/consistent access to them? (like the 7-W's above) I'd really like to find a way to solve that problem. Is that a good summary? What did I leave out or get wrong? Great start! Let's keep it simple and do-able. We should also review the oslo.messaging notification api ... I've got some concerns we've lost our way there. -S ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
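To ground the envelope-vs-payload distinction above, here is a sketch of a versioned notification carrying the required standard traits Sandy lists (message_id, timestamp, event_type). Everything beyond those three fields is an illustrative assumption, not an agreed oslo.messaging or CADF format:

    # Hedged sketch: a stable envelope (the contract) wrapping a
    # schema-versioned payload. Field names beyond the three required
    # traits are illustrative only.
    import datetime
    import uuid

    notification = {
        # Required, stable envelope traits:
        "message_id": str(uuid.uuid4()),
        "timestamp": datetime.datetime.utcnow().isoformat(),
        "event_type": "compute.instance.create.end",
        # A version so consumers know which payload schema applies:
        "payload_version": "1.0",
        # Free-form payload; the who/what/where audit traits would live
        # here or be mapped out to CADF.
        "payload": {
            "instance_id": "...",
            "tenant_id": "...",
        },
    }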
Re: [openstack-dev] [Nova][FFE] Feature freeze exception for virt-driver-numa-placement
On Thu, Sep 04, 2014 at 03:07:24PM +0200, Nikola Đipanov wrote: On 09/04/2014 02:31 PM, Sean Dague wrote: On 09/04/2014 07:58 AM, Nikola Đipanov wrote: Hi team, I am requesting the exception for the feature from the subject (find specs at [1] and outstanding changes at [2]). Some reasons why we may want to grant it: First of all, all patches have been approved in time and just lost the gate race. Rejecting it makes little sense really, as it has been commented on by a good chunk of the core team, most of the invasive stuff (db migrations for example) has already merged, and the few parts that may seem contentious have either been discussed and agreed upon [3], or can easily be addressed in subsequent bug fixes. It would be very beneficial to merge it so that we actually get real testing on the feature ASAP (scheduling features are not tested in the gate so we need to rely on downstream/3rd party/user testing for those). This statement bugs me. It seems kind of backwards to say we should merge a thing that we don't have a good upstream test plan on and put it in a release so that the testing will happen only in the downstream case. The objective reality is that many other things have not had upstream testing for a long time (anything that requires more than 1 compute node in Nova for example, and any scheduling feature - as I mention clearly above), so I'm not sure how that is backwards from any reasonable point of view. More critically, with the NUMA feature, AFAIK, there is no public cloud in existence which exposes NUMA to the guest. So unless someone is willing to pay for 100s of bare metal servers to run tempest on, I don't know of any infrastructure on which we can test NUMA today. Of course once we include NUMA features in Nova and release Nova, then the Rackspace and/or HP clouds will be in a position to start considering how and when they might expose NUMA features for instances they host. So by including it in Nova today, we would be helping move towards a future where we will be able to run tempest against NUMA features. Blocking NUMA from Nova for lack of automated testing will leave us trapped in a chicken and egg scenario, potentially forever. That's not in anyone's best interests IMHO Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] [feature freeze exception] FFE for libvirt-start-lxc-from-block-devices
Hello, I would like to ask for an exception for the libvirt-start-lxc-from-block-devices feature. It has previously been pushed from Icehouse to Juno. The spec [1] has been approved. One of the patches is a bug fix. Another patch had already been approved but failed in the gate. All patches have a +2 from Daniel Berrange. The list of the remaining patches is in [2]. [1] https://review.openstack.org/#/c/88062 [2] https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/libvirt-start-lxc-from-block-devices,n,z Thank you, Vladik ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] [feature freeze exception] FFE for libvirt-start-lxc-from-block-devices
On Thu, Sep 04, 2014 at 03:22:14PM +0200, Vladik Romanovsky wrote: Hello, I would like to ask for an extension for libvirt-start-lxc-from-block-devices feature. It has been previously pushed from Ice house to Juno. The spec [1] has been approved. One of the patches is a bug fix. Another patch has been already approved and failed in the gate. All patches has a +2 from Daniel Berrange. The list of the remaining patches are in [2]. [1] https://review.openstack.org/#/c/88062 [2] https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/libvirt-start-lxc-from-block-devices,n,z The first two patches there are really both just bug fixes, so should not be -2'd at all right now. The last patch is sufficiently trivial that I'm happy to sponsor FFE. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno
On 9/4/2014 4:21 AM, Day, Phil wrote: One final note: the specs referenced above didn't get approved until Spec Freeze, which seemed to leave me with less time to implement things. In fact, it seemed that a lot of specs didn't get approved until spec freeze. Perhaps if we had more staggered approval of specs, we'd have more staggered submission of patches, and thus less of a sudden influx of patches in the couple weeks before feature proposal freeze. Yeah, I think the specs were getting approved too late in the cycle. I was actually surprised at how far out the schedules were going in allowing things in and then allowing exceptions after that. Hopefully the ideas around priorities/slots/runways will help stagger some of this also. I think there is a problem with the pattern that seemed to emerge in June where the J.1 period was taken up with spec review (a lot of good reviews happened early in that period, but the approvals kind of came in a lump at the end), meaning that the implementation work itself only seemed to really kick in during J.2 - and, not surprisingly given the complexity of some of the changes, ran late into J.3. We also, as previously noted, didn't do any prioritization between those specs that were approved - so it was always going to be a race to see who managed to get code up for review first. It kind of feels to me as if the ideal model would be if we were doing spec review for K now (i.e. during the FF / stabilization period) so that we hit Paris with a lot of the input already registered and a clear idea of the range of things folks want to do. We shouldn't really have to ask for session suggestions for the summit - they should be something that can be extracted from the proposed specs (maybe we do voting across the specs or something like that). In that way the summit would be able to confirm the list of specs for K and the priority order. With the current state of the review queue maybe we can't quite hit this pattern for K, but it would be worth aspiring to for L? Phil ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev I like the idea of having our ducks somewhat in a row for the summit so we can hash out details in design sessions on high-priority specs and reserve time for figuring out what the priorities are. I think that would go a long way in fixing some of the frustrations in the other thread about the mid-cycle meetups being the place where blueprint issues are hashed out rather than the summit, and the design sessions at the summit not feeling productive. But as noted, there is also a feeling right now of focusing on Juno to get that out the door before anyone starts getting distracted with reviewing Kilo specs. And I suppose once Juno is finished no one is going to want to talk about Kilo for awhile due to burnout. -- Thanks, Matt Riedemann ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova][FFE] Feature freeze exception for virt-driver-numa-placement
On 09/04/2014 09:21 AM, Daniel P. Berrange wrote: On Thu, Sep 04, 2014 at 03:07:24PM +0200, Nikola Đipanov wrote: On 09/04/2014 02:31 PM, Sean Dague wrote: On 09/04/2014 07:58 AM, Nikola Đipanov wrote: Hi team, I am requesting the exception for the feature from the subject (find specs at [1] and outstanding changes at [2]). Some reasons why we may want to grant it: First of all, all patches have been approved in time and just lost the gate race. Rejecting it makes little sense really, as it has been commented on by a good chunk of the core team, most of the invasive stuff (db migrations for example) has already merged, and the few parts that may seem contentious have either been discussed and agreed upon [3], or can easily be addressed in subsequent bug fixes. It would be very beneficial to merge it so that we actually get real testing on the feature ASAP (scheduling features are not tested in the gate so we need to rely on downstream/3rd party/user testing for those). This statement bugs me. It seems kind of backwards to say we should merge a thing that we don't have a good upstream test plan on and put it in a release so that the testing will happen only in the downstream case. The objective reality is that many other things have not had upstream testing for a long time (anything that requires more than 1 compute node in Nova for example, and any scheduling feature - as I mention clearly above), so I'm not sure how that is backwards from any reasonable point of view. More critically, with the NUMA feature, AFAIK, there is no public cloud in existence which exposes NUMA to the guest. So unless someone is willing to pay for 100s of bare metal servers to run tempest on, I don't know of any infrastructure on which we can test NUMA today. Of course once we include NUMA features in Nova and release Nova, then the Rackspace and/or HP clouds will be in a position to start considering how and when they might expose NUMA features for instances they host. So by including it in Nova today, we would be helping move towards a future where we will be able to run tempest against NUMA features. Blocking NUMA from Nova for lack of automated testing will leave us trapped in a chicken and egg scenario, potentially forever. That's not in anyone's best interests IMHO The spec specifically calls out the scheduler piece being the part that probably most needs to be tested, especially at large scales here. Those pieces don't need Tempest to test them; they need more solid functional tests around the scheduler under those circumstances. There are interesting (and not all that difficult) ways to do this given the resources we have, which don't seem to be getting explored, which is my concern. -Sean -- Sean Dague http://dague.net ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
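As an illustration of the kind of test meant here - exercising scheduling logic against synthetic host state, with no Tempest run or real hardware - consider this hedged sketch; the class and function names are stand-ins, not Nova's actual test API:

    # Toy functional test for a scheduler filter: feed it many fake
    # hosts and assert on the outcome. Names are illustrative.
    class FakeHostState(object):
        def __init__(self, free_ram_mb):
            self.free_ram_mb = free_ram_mb

    def ram_filter_passes(host, requested_mb):
        # Stand-in for a real filter's host_passes() logic.
        return host.free_ram_mb >= requested_mb

    def test_filter_handles_many_hosts():
        hosts = [FakeHostState(free_ram_mb=i % 4096) for i in range(10000)]
        survivors = [h for h in hosts if ram_filter_passes(h, 2048)]
        assert survivors  # some hosts qualify
        assert all(h.free_ram_mb >= 2048 for h in survivors)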
[openstack-dev] [Nova][FFE] v3-api-schema
Hi, I'd like to request an FFE for the v3-api-schema patches. The list is the following: https://review.openstack.org/#/c/67428/ https://review.openstack.org/#/c/103437/ https://review.openstack.org/#/c/103436/ https://review.openstack.org/#/c/66783/ One of them has already been approved, but is blocked from merging by a temporary -2. The others have each gotten one +2 on the current patch set. This work will make the v2.1 API more robust. Thanks Ken'ichi Ohmichi ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
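For readers unfamiliar with the blueprint, these patches attach JSON-Schema definitions to API methods so request bodies are validated up front and rejected with a clear 400 instead of failing deep in the stack. A minimal sketch of the pattern - the schema shown is illustrative, not the exact one proposed:

    # Hedged sketch of JSON-Schema request validation; the schema below
    # is illustrative only.
    import jsonschema

    flavor_create_schema = {
        "type": "object",
        "properties": {
            "flavor": {
                "type": "object",
                "properties": {
                    "name": {"type": "string", "minLength": 1},
                    "ram": {"type": "integer", "minimum": 1},
                },
                "required": ["name", "ram"],
            },
        },
        "required": ["flavor"],
    }

    body = {"flavor": {"name": "tiny", "ram": 0}}
    try:
        jsonschema.validate(body, flavor_create_schema)
    except jsonschema.ValidationError as e:
        # The API layer would turn this into a 400 with this message.
        print("bad request: %s" % e.message)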
Re: [openstack-dev] [Nova][FFE] v3-api-schema
On 09/04/2014 09:30 AM, Ken'ichi Ohmichi wrote: Hi, I'd like to request an FFE for the v3-api-schema patches. The list is the following: https://review.openstack.org/#/c/67428/ https://review.openstack.org/#/c/103437/ https://review.openstack.org/#/c/103436/ https://review.openstack.org/#/c/66783/ One of them has already been approved, but is blocked from merging by a temporary -2. The others have each gotten one +2 on the current patch set. This work will make the v2.1 API more robust. Thanks Ken'ichi Ohmichi ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev Happy to co-sponsor these; they have very minimal risk to the rest of Nova. I just went and reviewed the patches and added my +2 to them, so they are ready to merge should the FFE be approved. -Sean -- Sean Dague http://dague.net ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
Hi, I do not think that Nova is in a death spiral. I just think that the current way of working is strangling the project. I do not understand why we need to split drivers out of the core project. Why not have the ability to provide 'core review' status to people for reviewing those parts of the code? We have enough talented people in OpenStack to be able to write a driver above gerrit to enable that. Fragmenting the project will be very unhealthy. For what it is worth, having a release date at the end of a vacation is really bad. Look at the numbers: http://stackalytics.com/report/contribution/nova-group/30 Thanks Gary On 9/4/14, 3:59 PM, Thierry Carrez thie...@openstack.org wrote: Like I mentioned before, I think the only way out of the Nova death spiral is to split code and give control over it to smaller dedicated review teams. This is one way to do it. Thanks Dan for pulling this together :) A couple comments inline: Daniel P. Berrange wrote: [...] This is a crisis. A large crisis. In fact, if you got a moment, it's a twelve-storey crisis with a magnificent entrance hall, carpeting throughout, 24-hour portage, and an enormous sign on the roof, saying 'This Is a Large Crisis'. A large crisis requires a large plan. [...] I totally agree. We need a plan now, because we can't go through another cycle without a solution in sight. [...] This has quite a few implications for the way development would operate. - The Nova core team at least, would be voluntarily giving up a big amount of responsibility over the evolution of virt drivers. Due to human nature, people are not good at giving up power, so this may be painful to swallow. Realistically current nova core are not experts in most of the virt drivers to start with, and more importantly, we clearly do not have sufficient time to do a good job of review with everything submitted. Much of the current need for core review of virt drivers is to prevent the mis-use of a poorly defined virt driver API...which can be mitigated - See later point(s) - Nova core would/should not have automatic +2 over the virt driver repositories since it is unreasonable to assume they have the suitable domain knowledge for all virt drivers out there. People would of course be able to be members of multiple core teams. For example John G would naturally be nova-core and nova-xen-core. I would aim for nova-core and nova-libvirt-core, and so on. I do not want any +2 responsibility over VMware/Hyper-V/Docker drivers since they're not my area of expertise - I only look at them today because they have no other nova-core representation. - Not sure if it implies the Nova PTL would be solely focused on Nova common. e.g. would there continue to be one PTL over all virt driver implementation projects, or would each project have its own PTL. Maybe this is irrelevant if a Czars approach is chosen by virt driver projects for their work. I'd be inclined to say that a single PTL should stay as a figurehead to represent all the virt driver projects, acting as a point of contact to ensure we keep communication / co-operation between the drivers in sync. [...] At this point it may look like our current structure (programs, one PTL, single core teams...) prevents us from implementing that solution. I just want to say that in OpenStack, organizational structure reflects how we work, not the other way around. If we need to reorganize official project structure to work in smarter and long-term healthy ways, that's a really small price to pay. 
-- Thierry Carrez (ttx) ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [vmware][nova][FFE] vmware-spawn-refactor
On Thu, Sep 04, 2014 at 02:09:26PM +0100, Matthew Booth wrote: I'd like to request a FFE for the remaining changes from vmware-spawn-refactor. They are: https://review.openstack.org/#/c/109754/ https://review.openstack.org/#/c/109755/ https://review.openstack.org/#/c/114817/ https://review.openstack.org/#/c/117467/ https://review.openstack.org/#/c/117283/ https://review.openstack.org/#/c/98322/ All but the last had +A, and were in the gate at the time it was closed. The last had not yet been approved, but is ready for core review. It has recently had some orthogonal changes split out to simplify it considerably. It is largely a code motion patch, and has been given +1 by VMware CI multiple times. They're all internal to the VMware driver and have multiple ACKs from VMware maintainers as well as core, so they don't require extra review time. So I think it is a reasonable request. ACK, I'll sponsor it. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova][FFE] Feature freeze exception for virt-driver-numa-placement
On 9/4/14, 4:30 PM, Sean Dague s...@dague.net wrote: On 09/04/2014 09:21 AM, Daniel P. Berrange wrote: On Thu, Sep 04, 2014 at 03:07:24PM +0200, Nikola Đipanov wrote: On 09/04/2014 02:31 PM, Sean Dague wrote: On 09/04/2014 07:58 AM, Nikola Đipanov wrote: Hi team, I am requesting the exception for the feature from the subject (find specs at [1] and outstanding changes at [2]). Some reasons why we may want to grant it: First of all, all patches have been approved in time and just lost the gate race. Rejecting it makes little sense really, as it has been commented on by a good chunk of the core team, most of the invasive stuff (db migrations for example) has already merged, and the few parts that may seem contentious have either been discussed and agreed upon [3], or can easily be addressed in subsequent bug fixes. It would be very beneficial to merge it so that we actually get real testing on the feature ASAP (scheduling features are not tested in the gate so we need to rely on downstream/3rd party/user testing for those). This statement bugs me. It seems kind of backwards to say we should merge a thing that we don't have a good upstream test plan on and put it in a release so that the testing will happen only in the downstream case. The objective reality is that many other things have not had upstream testing for a long time (anything that requires more than 1 compute node in Nova for example, and any scheduling feature - as I mention clearly above), so I'm not sure how that is backwards from any reasonable point of view. More critically, with the NUMA feature, AFAIK, there is no public cloud in existence which exposes NUMA to the guest. So unless someone is willing to pay for 100s of bare metal servers to run tempest on, I don't know of any infrastructure on which we can test NUMA today. Of course once we include NUMA features in Nova and release Nova, then the Rackspace and/or HP clouds will be in a position to start considering how and when they might expose NUMA features for instances they host. So by including it in Nova today, we would be helping move towards a future where we will be able to run tempest against NUMA features. Blocking NUMA from Nova for lack of automated testing will leave us trapped in a chicken and egg scenario, potentially forever. That's not in anyone's best interests IMHO The spec specifically calls out the scheduler piece being the part that probably most needs to be tested, especially at large scales here. Those pieces don't need Tempest to test them; they need more solid functional tests around the scheduler under those circumstances. There are interesting (and not all that difficult) ways to do this given the resources we have, which don't seem to be getting explored, which is my concern. I share your concern with this feature. I stated it on review https://review.openstack.org/#/c/115007/ in PS 16. I think that we have well-known scheduling issues and these will be accentuated by a feature like this. My feeling is that this feature and the PCI feature are both going to be problematic at scale. My reservation is that even when the feature is not enabled, a lot of unnecessary data will be passed between hosts and the scheduler (this is why we should have gone with extensible resources - but that is opening a can of worms). Having said that, I think that Nova needs features like this. I am in favor of moving ahead with this for a number of reasons: 1. The filter is not enabled by default 2. We can fix things moving forwards So I am +1 on this. 
If we can document that it is 'experimental' or 'use at your own risk' then I am +2. But given that the admin needs to configure the filter, she/he knows it is at their own risk. A luta continua -Sean -- Sean Dague http://dague.net ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
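For reference, the opt-in Gary describes amounts to adding the new filter to the scheduler's filter list in nova.conf. A hedged sketch - the surrounding filter list is illustrative and will vary per deployment:

    [DEFAULT]
    # Illustrative list; NUMATopologyFilter is the new opt-in piece.
    scheduler_default_filters = RetryFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter,NUMATopologyFilter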
Re: [openstack-dev] [TripleO] Review metrics - what do we want to measure?
On 2014-09-04 11:01:55 +0100 (+0100), Derek Higgins wrote: [...] How would people feel about turning [auto-abandon] back on? A lot of reviewers (myself among them) feel auto-abandon was a cold and emotionless way to provide feedback on a change. Especially on high-change-volume projects where core reviewers may at times get sucked into triaging other problems for long enough that the auto-abandoner kills lots of legitimate changes (possibly from new contributors who will get even more disgusted by this than by the silence itself and walk away indefinitely with the impression that we really aren't a welcoming development community at all). Can it be done on a per-project basis? It can, by running your own... but again it seems far better for core reviewers to decide if a change has potential or needs to be abandoned--that way there's an accountable human making that deliberate choice rather than the review team hiding behind an automated process so that no one is to blame for hurt feelings besides the infra operators who are enforcing this draconian measure for you. To make the whole process a little friendlier we could increase the time frame from 1 week to 2. <snark>How about just automatically abandoning any new change as soon as it's published? If the contributor really feels it's important they'll unabandon it.</snark> -- Jeremy Stanley ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
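On the per-project question: any team can run its own sweep without infra turning the global auto-abandoner back on. A hedged sketch using Gerrit's REST API to list changes untouched for the friendlier two-week window (the project name is an example; whether to abandon anything remains a human's call):

    # List open changes idle for 2+ weeks so a human can decide.
    import json
    import requests

    resp = requests.get(
        "https://review.openstack.org/changes/",
        params={"q": "project:openstack/tripleo-incubator "
                     "status:open age:2w"})
    # Gerrit prefixes JSON responses with )]}' to defeat XSSI.
    for change in json.loads(resp.text[4:]):
        print("%s %s" % (change["_number"], change["subject"]))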
Re: [openstack-dev] [vmware][nova][FFE] vmware-spawn-refactor
On 04/09/14 14:46, Daniel P. Berrange wrote: On Thu, Sep 04, 2014 at 02:09:26PM +0100, Matthew Booth wrote: I'd like to request a FFE for the remaining changes from vmware-spawn-refactor. They are: https://review.openstack.org/#/c/109754/ https://review.openstack.org/#/c/109755/ https://review.openstack.org/#/c/114817/ https://review.openstack.org/#/c/117467/ https://review.openstack.org/#/c/117283/ https://review.openstack.org/#/c/98322/ All but the last had +A, and were in the gate at the time it was closed. The last had not yet been approved, but is ready for core review. It has recently had some orthogonal changes split out to simplify it considerably. It is largely a code motion patch, and has been given +1 by VMware CI multiple times. They're all internal to the VMWare driver, have multiple ACKs from VMWare maintainers as well as core, so don't require extra review time. So I think it is reasonable request. ACK, I'll sponsor it. Thanks, Dan. John Garbutt has also said he'll sponsor the previously approved patches, so that's 2. Matt -- Matthew Booth Red Hat Engineering, Virtualisation Team Phone: +442070094448 (UK) GPG ID: D33C3490 GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490 ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] Design Summit reloaded
Hi everyone, I've been thinking about what changes we can bring to the Design Summit format to make it more productive. I've heard the feedback from the mid-cycle meetups and would like to apply some of those ideas for Paris, within the constraints we have (already booked space and time). Here is something we could do: Day 1. Cross-project sessions / incubated projects / other projects I think that worked well last time. 3 parallel rooms where we can address top cross-project questions, discuss the results of the various experiments we conducted during Juno. Don't hesitate to schedule 2 slots for discussions, so that we have time to get to the bottom of those issues. Incubated projects (and maybe other projects, if space allows) occupy the remaining space on day 1, and could occupy pods on the other days. Day 2 and Day 3. Scheduled sessions for various programs That's our traditional scheduled space. We'll have 33% fewer slots available. So, rather than trying to cover the whole scope, the idea would be to focus those sessions on specific issues which really require face-to-face discussion (which can't be solved on the ML or using spec discussion) *or* require a lot of user feedback. That way, appearing in the general schedule is very helpful. This will require us to be a lot stricter on what we accept there and what we don't -- we won't have space for courtesy sessions anymore, and traditional/unnecessary sessions (like my traditional release schedule one) should just move to the mailing-list. Day 4. Contributors meetups On the last day, we could try to split the space so that we can conduct parallel midcycle-meetup-like contributors gatherings, with no time boundaries and an open agenda. Large projects could get a full day, smaller projects would get half a day (but could continue the discussion in a local bar). Ideally that meetup would end with some alignment on release goals, but the idea is to make the best of that time together to solve the issues you have. Friday would finish with the design summit feedback session, for those who are still around. I think this proposal makes the best use of our setup: discuss clear cross-project issues, address key specific topics which need face-to-face time and broader attendance, then try to replicate the success of midcycle meetup-like open unscheduled time to discuss whatever is hot at this point. There are still details to work out (is it possible to split the space, should we use the usual design summit CFP website to organize the scheduled time...), but I would first like to have your feedback on this format. Also if you have alternative proposals that would make better use of our 4 days, let me know. Apologies for jumping on this thread late. I'm all for the idea of accommodating a more fluid form of project-specific discussion, with the schedule emerging in a dynamic way. But one aspect of the proposed summit redesign that isn't fully clear to me is the cross-over between the new Contributors meetups and the Project pods that we tried out for the first time in Atlanta. That seemed, to me at least, to be a very useful experiment. In fact, "parallel midcycle-meetup-like contributors gatherings, with no time boundaries and an open agenda" sounds like quite a good description of how some projects used their pods in ATL. 
The advantages of the pods approach, in my mind, included: * no requirement for reducing the number of design sessions slots, as the pod time ran in parallel with the design session tracks of other projects * depending on where in the week the project track occurred, the pod time could include a chunk of scene-setting/preparation discussion *in advance of* the more structured design sessions * on a related theme, the pods did not rely on the graveyard shift at the backend of the summit when folks tend to hit their Friday afternoon brain-full state Am I missing some compelling advantage of moving all these emergent project-specific meetups to the Friday? Cheers, Eoghan ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [vmware][nova][FFE] vmware-spawn-refactor
On 09/04/2014 03:46 PM, Daniel P. Berrange wrote: On Thu, Sep 04, 2014 at 02:09:26PM +0100, Matthew Booth wrote: I'd like to request an FFE for the remaining changes from vmware-spawn-refactor. They are: https://review.openstack.org/#/c/109754/ https://review.openstack.org/#/c/109755/ https://review.openstack.org/#/c/114817/ https://review.openstack.org/#/c/117467/ https://review.openstack.org/#/c/117283/ https://review.openstack.org/#/c/98322/ All but the last had +A, and were in the gate at the time it was closed. The last had not yet been approved, but is ready for core review. It has recently had some orthogonal changes split out to simplify it considerably. It is largely a code motion patch, and has been given +1 by VMware CI multiple times. They're all internal to the VMWare driver, have multiple ACKs from VMWare maintainers as well as core, so don't require extra review time. So I think it is a reasonable request. ACK, I'll sponsor it. +1 here - I've already looked at a number of those. N. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Nova] [FFE] alternative request for v2-on-v3-api
Hi, I'd like to request an FFE for v2.1 API patches. This request is different from Christopher's one. His request is for the approved patches, but this is for some patches which are not approved yet. https://review.openstack.org/#/c/113169/ : flavor-manage API https://review.openstack.org/#/c/114979/ : quota-sets API https://review.openstack.org/#/c/115197/ : security_groups API I think these APIs are widely used and important, so I'd like to test the v2.1 API with them together in the RC phase. Two of them have gotten one +2 on each PS and the other one has gotten one +1. Thanks Ken'ichi Ohmichi ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] [glance] do NOT ever sort requirements.txt
On 09/03/2014 09:09 PM, Clark Boylan wrote: On Wed, Sep 3, 2014, at 11:51 AM, Kuvaja, Erno wrote: -Original Message- From: Sean Dague [mailto:s...@dague.net] Sent: 03 September 2014 13:37 To: OpenStack Development Mailing List (not for usage questions) Subject: [openstack-dev] [all] [glance] do NOT ever sort requirements.txt I'm not sure why people keep showing up with sort requirements patches like - https://review.openstack.org/#/c/76817/6, however, they do. All of these need to be -2ed with prejudice. requirements.txt is not a declarative interface. The order is important as pip processes it in the order it is. Changing the order has impacts on the overall integration which can cause wedges later. So please stop. -Sean -- Sean Dague http://dague.net Hi Sean & all, Could you please open this up a little bit? What are we afraid of breaking regarding the order of these requirements? I tried to go through the pip documentation but I could not find a reason for the specific order of the lines, though there were references to keeping the order. I'm now assuming one thing here as I do not know if that's the case. None of the packages enables/disables functionality depending on what has been installed on the system before, but they have their own dependencies to provide those. Based on this assumption I can think of only one scenario causing us issues. That is us abusing the example in point 2 of https://pip.pypa.io/en/latest/user_guide.html#requirements-files meaning: we install package X depending on package Y>=1.0,<2.0 before installing package Z depending on Y>=1.0, to ensure that we get package Y<2.0 without pinning package Y in our requirements.txt. I certainly hope that this is not the case, as depending on a 3rd party package to provide us a specific version of a dependency would be extremely stupid. Other than that I really don't know how the order could cause us issues, but I would be really happy to learn something new today if that is the case or if my assumption went wrong. Best Regards, Erno (jokke_) Kuvaja ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev The issue is described in the bug that Josh linked (https://github.com/pypa/pip/issues/988). Basically pip doesn't do dependency resolution in a way that lets you treat requirements as order independent. For that to be the case pip would have to evaluate all dependencies together, then install the intersection of those dependencies. Instead it iterates over the list(s) in order and evaluates each dependency as it is found. Your example basically describes where this breaks. You can both depend on the same dependency at different versions and pip will install a version that satisfies only one of the dependencies and not the other, leading to a failed install. However I think a more common case is that openstack will pin a dependency and say Y>=1.0,<2.0 and the X dependency will say Y>=1.0. If the X dependency comes first you get version 2.5, which is not valid for your specification of Y>=1.0,<2.0, and pip fails. You fix this by listing Y before the X dependency that would otherwise install Y with less restrictive boundaries. Another example of a slightly different failure would be hacking, flake8, pep8, and pyflakes. Hacking installs a specific version of flake8, pep8, and pyflakes so that we do static lint checking with consistent checks each release. 
If you sort this list alphabetically instead of allowing hacking to install its deps, flake8 will come first and you can get a different version of pep8. Different versions of pep8 check different things and now the gate has broken. The most problematic thing is you can't count on your dependencies not breaking you if they come first (because they are evaluated first). So in cases where we know order is important (hacking and pbr and probably a handful of others) we should be listing them as early as possible in the requirements. So, is there a specific order to look out for? AFAIU requirements should have pbr as the first requirement and test-requirements should have hacking as the first one. Is there anything else? What's the best place to document this? Andreas -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
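To make the failure mode Clark describes concrete, here is a minimal sketch of the ordering pitfall in a requirements.txt (package names and version bounds are invented for illustration; the real pins live in the global requirements repository):

    # Broken when sorted alphabetically: pip evaluates lines in order,
    # so X is installed first and pulls in the newest Y (say 2.5); the
    # later, stricter line then conflicts and pip fails.
    X               # X itself declares: Y>=1.0
    Y>=1.0,<2.0

    # Working order: satisfy Y under the strict bound first, so that
    # when X is evaluated its looser Y>=1.0 is already met by Y 1.x.
    Y>=1.0,<2.0
    X

The same reasoning puts pbr at the top of requirements.txt and hacking at the top of test-requirements.txt, ahead of anything that would pull in flake8, pep8 or pyflakes with looser bounds.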
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
Basically +1 with what Daniel is saying (note that, as mentioned, a side effect of our effort to split out the scheduler will help but not solve this problem). My only question is about the need to separate out each virt driver into a separate project, wouldn't you accomplish a lot of the benefit by creating a single virt project that includes all of the drivers? I wouldn't necessarily expect a VMware guy to understand the specifics of the HyperV implementation but both people should understand what a virt driver does, how it interfaces to Nova and they should be able to intelligently review each other's code. -- Don Dugger Censeo Toto nos in Kansa esse decisse. - D. Gale Ph: 303/443-3786 -Original Message- From: Daniel P. Berrange [mailto:berra...@redhat.com] Sent: Thursday, September 4, 2014 4:24 AM To: OpenStack Development Subject: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers Position statement == Over the past year I've increasingly come to the conclusion that Nova is heading for (or probably already at) a major crisis. If steps are not taken to avert this, the project is likely to lose a non-trivial amount of talent, both regular code contributors and core team members. That includes myself. This is not good for Nova's long term health and so should be of concern to anyone involved in Nova and OpenStack. For those who don't want to read the whole mail, the executive summary is that the nova-core team is an unfixable bottleneck in our development process with our current project structure. The only way I see to remove the bottleneck is to split the virt drivers out of tree and let them all have their own core teams in their area of code, leaving current nova core to focus on all the common code outside the virt driver impls. I none the less urge people to read the whole mail. Background information == I see many factors coming together to form the crisis - Burn out of core team members from over work - Difficulty bringing new talent into the core team - Long delay in getting code reviewed and merged - Marginalization of code areas which aren't popular - Increasing size of nova code through new drivers - Exclusion of developers without corporate backing Each item on its own may not seem too bad, but combined they add up to a big problem. Core team burn out -- Having been involved in Nova for several dev cycles now, it is clear that the backlog of code up for review never goes away. Even intensive code review efforts at various points in the dev cycle make only a small impact on the backlog. This has a pretty significant impact on core team members, as their work is never done. At best, the dial is sometimes set to 10, instead of 11. Many people, myself included, have built tools to help deal with the reviews in a more efficient manner than plain gerrit allows for. These certainly help, but they can't ever solve the problem on their own - just make it slightly more bearable. And this is not even considering that core team members might have useful contributions to make in ways beyond just code review. Ultimately the workload is just too high to sustain the levels of review required, so core team members will eventually burn out (as they have done many times already). Even if one person attempts to take the initiative to heavily invest in review of certain features it is often to no avail. Unless a second dedicated core reviewer can be found to 'tag team' it is hard for one person to make a difference. 
The end result is that a patch is +2d and then sits idle for weeks or more until a merge conflict requires it to be reposted at which point even that one +2 is lost. This is a pretty demotivating outcome for both reviewers and the patch contributor. New core team talent It can't escape attention that the Nova core team does not grow in size very often. When Nova was younger and its code base was smaller, it was easier for contributors to get onto core because the base level of knowledge required was that much smaller. To get onto core today requires a major investment in learning Nova over a year or more. Even people who potentially have the latent skills may not have the time available to invest in learning the entirety of Nova. With the number of reviews proposed to Nova, the core team should probably be at least double its current size[1]. There is plenty of expertise in the project as a whole but it is typically focused into specific areas of the codebase. There is nowhere we can find 20 more people with broad knowledge of the codebase who could be promoted even over the next year, let alone today. This is ignoring that many existing members of core are relatively inactive due to burnout and so need replacing. That means we really need another 25-30 people for core. That's not going to happen. Code review delays
Re: [openstack-dev] [Nova] [feature freeze exception] Move to oslo.db
On Wed, Sep 3, 2014 at 11:30 PM, Michael Still mi...@stillhq.com wrote: I'm good with this one too, so that makes three if Joe is ok with this. I am ok with this, I hope the move to oslo.db will fix a few bugs for us and the nova patch to review isn't too bad. @Josh -- can you please take a look at the TH failures? Thanks, Michael On Wed, Sep 3, 2014 at 8:10 PM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: On 9/3/2014 5:08 PM, Andrey Kurilin wrote: Hi All! I'd like to ask for a feature freeze exception for porting nova to use oslo.db. This change not only removes 3k LOC, but fixes 4 bugs(see commit message for more details) and provides relevant, stable common db code. Main maintainers of oslo.db(Roman Podoliaka and Victor Sergeyev) are OK with this. Joe Gordon and Matt Riedemann are already signing up, so we need one more vote from Core developer. By the way a lot of core projects are using already oslo.db for a while: keystone, cinder, glance, ceilometer, ironic, heat, neutron and sahara. So migration to oslo.db won’t produce any unexpected issues. Patch is here: https://review.openstack.org/#/c/101901/ -- Best regards, Andrey Kurilin. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev Just re-iterating my agreement to sponsor this. I'm waiting for the latest patch set to pass Jenkins and for Roman to review after his comments from the previous patch set and -1. Otherwise I think this is nearly ready to go. The turbo-hipster failures on the change appear to be infra issues in t-h rather than problems with the code. -- Thanks, Matt Riedemann ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Rackspace Australia ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] WARNING: upcoming dependency change to oslotest
Next week the Oslo team will be releasing a new version of oslotest that replaces its use of the “mox” library with “mox3”. This will allow us to prepare a packaged version of oslotest that works on both python 2 and 3, which is necessary for porting some of the other Oslo libraries as well as applications which are trying to use Oslo and support python 3. mox3 has the same API as mox, so if your test suite uses oslotest.moxstubout you shouldn’t notice any difference. If you are using oslotest but also import mox directly in some of your test modules and do not have an explicit dependency on mox, your tests will break. There are two ways to fix them: change them to use the moxstubout module to get a mox instance, or add mox to your test-requirements.txt list. The first solution, using moxstubout from oslotest, is preferred because it means your test suite is one step closer to being python 3 ready. However, updating test-requirements.txt may be a less invasive change and so it might be more expedient to use that approach for now. Doug ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
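For anyone taking the preferred route, the change looks roughly like this (a sketch only -- the MoxStubout fixture attributes shown here are assumed from the module's stated purpose, so double-check them against your oslotest version):

    # Before: test modules did "import mox" and built mox.Mox() directly.
    # After: get the mox instance from the oslotest fixture, which is
    # backed by mox3 and therefore works on both python 2 and 3.
    import os

    from oslotest import base
    from oslotest import moxstubout


    class ExampleTest(base.BaseTestCase):

        def setUp(self):
            super(ExampleTest, self).setUp()
            fixture = self.useFixture(moxstubout.MoxStubout())
            self.mox = fixture.mox        # replaces mox.Mox()
            self.stubs = fixture.stubs    # replaces stubout.StubOutForTesting()

        def test_stubbed_call(self):
            # The usual mox record/replay flow is unchanged.
            self.mox.StubOutWithMock(os.path, 'exists')
            os.path.exists('/no/such/path').AndReturn(True)
            self.mox.ReplayAll()
            self.assertTrue(os.path.exists('/no/such/path'))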
Re: [openstack-dev] [qa][all][Heat] Packaging of functional tests
On 08/29/2014 05:15 PM, Zane Bitter wrote: On 29/08/14 14:27, Jay Pipes wrote: On 08/26/2014 10:14 AM, Zane Bitter wrote: Steve Baker has started the process of moving Heat tests out of the Tempest repository and into the Heat repository, and we're looking for some guidance on how they should be packaged in a consistent way. Apparently there are a few projects already packaging functional tests in the package projectname.tests.functional (alongside projectname.tests.unit for the unit tests). That strikes me as odd in our context, because while the unit tests run against the code in the package in which they are embedded, the functional tests run against some entirely different code - whatever OpenStack cloud you give it the auth URL and credentials for. So these tests run from the outside, just like their ancestors in Tempest do. There's all kinds of potential confusion here for users and packagers. None of it is fatal and all of it can be worked around, but if we refrain from doing the thing that makes zero conceptual sense then there will be no problem to work around :) I suspect from reading the previous thread about In-tree functional test vision that we may actually be dealing with three categories of test here rather than two: * Unit tests that run against the package they are embedded in * Functional tests that run against the package they are embedded in * Integration tests that run against a specified cloud i.e. the tests we are now trying to add to Heat might be qualitatively different from the projectname.tests.functional suites that already exist in a few projects. Perhaps someone from Neutron and/or Swift can confirm? I'd like to propose that tests of the third type get their own top-level package with a name of the form projectname-integrationtests (second choice: projectname-tempest on the principle that they're essentially plugins for Tempest). How would people feel about standardising that across OpenStack? By its nature, Heat is one of the only projects that would have integration tests of this nature. For Nova, there are some functional tests in nova/tests/integrated/ (yeah, badly named, I know) that are tests of the REST API endpoints and running service daemons (the things that are RPC endpoints), with a bunch of stuff faked out (like RPC comms, image services, authentication and the hypervisor layer itself). So, the integrated tests in Nova are really not testing integration with other projects, but rather integration of the subsystems and processes inside Nova. I'd support a policy that true integration tests -- tests that test the interaction between multiple real OpenStack service endpoints -- be left entirely to Tempest. Functional tests that test interaction between daemons and processes internal to a project should go into /$project/tests/functional/. For Heat, I believe tests that rely on faked-out other OpenStack services but stress the interaction between internal Heat daemons/processes should be in /heat/tests/functional/ and any tests that rely on working, real OpenStack service endpoints should be in Tempest. Well, the problem with that is that last time I checked there was exactly one Heat scenario test in Tempest because tempest-core doesn't have the bandwidth to merge all (any?) of the other ones folks submitted. So we're moving them to openstack/heat for the pure practical reason that it's the only way to get test coverage at all, rather than concerns about overloading the gate or theories about the best venue for cross-project integration testing. 
Hmm, speaking of passive aggressivity... Where can I see a discussion of the Heat integration tests with Tempest QA folks? If you give me some background on what efforts have been made already and what is remaining to be reviewed/merged/worked on, then I can try to get some resources dedicated to helping here. I would greatly prefer just having a single source of integration testing in OpenStack, versus going back to the bad ol' days of everybody under the sun rewriting their own. Note that I'm not talking about functional testing here, just the integration testing... Best, -jay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
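For concreteness, the naming Zane proposes would produce a tree along these lines (a sketch only; the unit/functional split already exists in some projects, and only the top-level integration package name is the new part):

    heat/
        tests/
            unit/           # runs against the code in this repo
            functional/     # runs against Heat daemons, with other
                            # OpenStack services faked out
    heat_integrationtests/  # Tempest-style: runs from the outside
                            # against whatever cloud the auth URL and
                            # credentials point at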
Re: [openstack-dev] [Nova][FFE] Feature freeze exception for virt-driver-numa-placement
On 4 September 2014 14:07, Nikola Đipanov ndipa...@redhat.com wrote: On 09/04/2014 02:31 PM, Sean Dague wrote: On 09/04/2014 07:58 AM, Nikola Đipanov wrote: Hi team, I am requesting the exception for the feature from the subject (find specs at [1] and outstanding changes at [2]). Some reasons why we may want to grant it: First of all, all patches have been approved in time and just lost the gate race. Rejecting it makes little sense really, as it has been commented on by a good chunk of the core team, most of the invasive stuff (db migrations for example) has already merged, and the few parts that may seem contentious have either been discussed and agreed upon [3], or can easily be addressed in subsequent bug fixes. It would be very beneficial to merge it so that we actually get real testing on the feature ASAP (scheduling features are not tested in the gate so we need to rely on downstream/3rd party/user testing for those). This statement bugs me. It seems kind of backwards to say we should merge a thing that we don't have a good upstream test plan on and put it in a release so that the testing will happen only in the downstream case. The objective reality is that many other things have not had upstream testing for a long time (anything that requires more than 1 compute node in Nova for example, and any scheduling feature - as I mention clearly above), so not sure how that is backwards from any reasonable point of view. Thanks to folks using them, they are still kept working and bugs get fixed. Getting features into the hands of users is extremely important... Anyway, not enough to -1 it, but enough to at least say something. .. but I do not want to get into the discussion about software testing here, not the place really. However, I do think it is very harmful to respond to an FFE request with such blanket statements and generalizations, if only for the message it sends to the contributors (that we really care more about upholding our own myths as a community than users and features). I believe you brought this up as one of your justifications for the FFE. When I read your statement it does sound as though you want to put experimental code in at the final release. I am sure that is not what you had in mind, but I am sure you can also understand Sean's point of view. His point is clear and pertinent to your request. As the person responsible for Nova in HP, I will be interested to see how it operates in practice. I can assure you we will do extensive testing on it before it goes into the wild and we will not put it into practice if we are not happy. Paul Paul Murray Nova Technical Lead, HP Cloud +44 117 312 9309 Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12 1HN Registered No: 690597 England. The contents of this message and any attachments to it are confidential and may be legally privileged. If you have received this message in error, you should delete it from your system immediately and advise the sender. To any recipient of this message within HP, unless otherwise stated you should consider this message and attachments as HP CONFIDENTIAL. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] requesting an FFE for SRIOV
Hi, The main sr-iov patches have gone through lots of code reviews, manual rebasing, etc. Now we have some critical refactoring work on the existing infra to get it ready. All the code for refactoring and sr-iov is up for review. https://blueprints.launchpad.net/nova/+spec/pci-passthrough-sriov thanks, Robert ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [sahara] integration tests in python-saharaclient
Yes, I wrote them. I use them all the time -- no typo that I know of. They are great for spinning up a cluster and running EDP jobs. They may need some polish, but the point is to test the whole chain of operations from the CLI. This is contrary to what most OpenStack projects traditionally do -- most CLI testing is only transformation testing, that is, it tests the output of CLI commands in Tempest but does not test any kind of integration from the CLI. Different communities, however, will have different requirements. At Red Hat, for instance, many of our customers rely heavily on the command line, and our testing includes integration tests from the CLI as the entry point. We want this kind of testing. In fact, in the Icehouse release I found a bug by running the CLI integration tests. There was a mismatch between the CLI and Sahara. These tests are not run in CI currently; however, when/if we end up with more horsepower in CI, they should be. They should not be deleted. Best, Trevor On Wed, 2014-09-03 at 14:58 -0700, Andrew Lazarev wrote: Hi team, Today I've realized that we have some tests called 'integration' in python-saharaclient. Also I've found out that Jenkins doesn't use them and they can't be run starting from April because of typo in tox.ini. Does anyone know what these tests are? Does anyone mind if I delete them since we don't use them anyway? Thanks, Andrew. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [sahara] integration tests in python-saharaclient
by the way, what typo? Trev On Wed, 2014-09-03 at 14:58 -0700, Andrew Lazarev wrote: Hi team, Today I've realized that we have some tests called 'integration' in python-saharaclient. Also I've found out that Jenkins doesn't use them and they can't be run starting from April because of typo in tox.ini. Does anyone know what these tests are? Does anyone mind if I delete them since we don't use them anyway? Thanks, Andrew. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting
Sean Dague wrote: [...] So, honestly, I'll probably remain -1 on the final integration vote, not because Zaqar is bad, but because I'm feeling more firmly that for OpenStack to not leave the small deployers behind we need to redefine the tightly integrated piece of OpenStack to basically the Layer 1 & 2 parts of my diagram, and consider the rest of the layers exciting parts of our ecosystem that more advanced users may choose to deploy to meet their needs. Smaller tent, big ecosystem, easier on ramp. I realize that largely means Zaqar would be caught up in a definition discussion outside of its control, and that's kind of unfortunate, as Flavio and team have been doing a bang up job of late. But we need to stop considering integration as the end game of all interesting software in the OpenStack ecosystem, and I think it's better to have that conversation sooner rather than later. I think it's pretty clear at this point that: (1) we need to have a discussion about layers (base nucleus, optional extra services at the very least) and the level of support we grant to each -- the current binary approach is not working very well (2) If we accept Zaqar next week, it's pretty clear it would not fall in the base nucleus layer but more in an optional extra services layer, together with at the very least Trove and Sahara There are two ways of doing this: follow Sean's approach and -1 integration (and have zaqar apply to that optional layer when we create it), or +1 integration now (and have zaqar follow whichever other integrated projects we place in that layer when we create it). I'm still hesitating on the best approach. I think they yield the same end result, but the -1 approach seems to be a bit more unfair, since it would be purely for reasons we don't (yet) apply to currently-integrated projects... -- Thierry Carrez (ttx) ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo.db]A proposal for DB read/write separation
On 09/02/2014 07:15 AM, Duncan Thomas wrote: On 11 August 2014 19:26, Jay Pipes jaypi...@gmail.com wrote: The above does not really make sense for MySQL Galera/PXC clusters *if only Galera nodes are used in the cluster*. Since Galera is synchronously replicated, there's no real point in segregating writers from readers, IMO. Better to just spread the write AND read load equally among all Galera cluster nodes. Unfortunately it is possible to get bitten by the difference between 'synchronous' and 'virtually synchronous' in practice. Not in my experience. The thing that has bitten me in practice is Galera's lack of support for SELECT FOR UPDATE, which is used extensively in some of the OpenStack projects. Instead of taking a write-intent lock on one or more record gaps (which is what InnoDB does in the case of a SELECT FOR UPDATE on a local node), Galera happily replicates DML statements to all other nodes in the cluster. If two of those nodes attempt to modify the same row or rows in a table, then the working set replication will fail to certify, which results in a certification timeout, which is then converted to an InnoDB deadlock error. It's the difference between hanging around waiting on a local node for the transaction that called SELECT FOR UPDATE to complete and release the write-intent locks on a set of table rows versus hanging around waiting for the InnoDB deadlock/lock timeout to bubble up from the working set replication certification (which typically is longer than the time taken to lock the rows in a single transaction, and therefore causes thundering herd issues with the conductor attempting to retry stuff due to the use of the @retry_on_deadlock decorator which is so commonly used everywhere). FWIW, I've cc'd a real expert on the matter. Peter, feel free to clarify, contradict, or just ignore me :) Best, -jay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
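A rough sketch of the pattern Jay describes, for readers who haven't hit it (illustrative only -- FixedIp and retry_on_deadlock here are stand-ins, not the actual Nova or oslo.db code):

    import functools
    import time

    from sqlalchemy.exc import OperationalError


    def retry_on_deadlock(func, max_retries=5):
        # Stand-in for the decorator mentioned above: re-run the whole
        # transaction when the DB reports a deadlock, which is also how
        # a failed Galera certification surfaces to the client.
        @functools.wraps(func)
        def wrapped(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except OperationalError as exc:
                    if 'deadlock' not in str(exc).lower() \
                            or attempt == max_retries - 1:
                        raise
                    # A real implementation must also roll back the
                    # session before retrying; the crude backoff here
                    # leaves the thundering herd risk Jay mentions.
                    time.sleep(0.1 * (attempt + 1))
        return wrapped


    @retry_on_deadlock
    def allocate_fixed_ip(session, instance_uuid):
        # On a single InnoDB node the FOR UPDATE blocks other writers
        # on these rows until commit. On Galera nothing blocks: a
        # conflicting write on another node only fails at certification
        # time, surfacing as the deadlock error retried above.
        ip = (session.query(FixedIp)          # FixedIp: hypothetical model
                     .filter_by(allocated=False)
                     .with_for_update()
                     .first())
        ip.allocated = True
        ip.instance_uuid = instance_uuid
        session.commit()
        return ip.address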
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
My only question is about the need to separate out each virt driver into a separate project, wouldn't you accomplish a lot of the benefit by creating a single virt project that includes all of the drivers? I don't think there's particularly a *point* to having all drivers in one repo. Part of code review is looking for code gotchas, but part of code review is looking for subtle issues that are caused by the very nature of the driver. A HyperV core reviewing a libvirt change should certainly be able to provide the former, but most likely cannot provide the latter to a sufficient degree (if he or she can, then he or she should be a libvirt core as well). A strong +1 to Dan's proposal. I think this would also make it easier for non-core reviewers to get started reviewing, without having a specialized tool setup. Best Regards, Solly Ross P.S. This is a crisis. A large crisis. In fact, if you got a moment, it's a twelve-storey crisis with a magnificent entrance hall, carpeting throughout, 24-hour portage, and an enormous sign on the roof, saying 'This Is a Large Crisis'. A large crisis requires a large plan. Ha! - Original Message - From: Donald D Dugger donald.d.dug...@intel.com To: Daniel P. Berrange berra...@redhat.com, OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Thursday, September 4, 2014 10:33:27 AM Subject: Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers Basically +1 with what Daniel is saying (note that, as mentioned, a side effect of our effort to split out the scheduler will help but not solve this problem). My only question is about the need to separate out each virt driver into a separate project, wouldn't you accomplish a lot of the benefit by creating a single virt project that includes all of the drivers? I wouldn't necessarily expect a VMware guy to understand the specifics of the HyperV implementation but both people should understand what a virt driver does, how it interfaces to Nova and they should be able to intelligently review each other's code. -- Don Dugger Censeo Toto nos in Kansa esse decisse. - D. Gale Ph: 303/443-3786 -Original Message- From: Daniel P. Berrange [mailto:berra...@redhat.com] Sent: Thursday, September 4, 2014 4:24 AM To: OpenStack Development Subject: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers Position statement == Over the past year I've increasingly come to the conclusion that Nova is heading for (or probably already at) a major crisis. If steps are not taken to avert this, the project is likely to lose a non-trivial amount of talent, both regular code contributors and core team members. That includes myself. This is not good for Nova's long term health and so should be of concern to anyone involved in Nova and OpenStack. For those who don't want to read the whole mail, the executive summary is that the nova-core team is an unfixable bottleneck in our development process with our current project structure. The only way I see to remove the bottleneck is to split the virt drivers out of tree and let them all have their own core teams in their area of code, leaving current nova core to focus on all the common code outside the virt driver impls. I none the less urge people to read the whole mail. 
Background information == I see many factors coming together to form the crisis - Burn out of core team members from over work - Difficulty bringing new talent into the core team - Long delay in getting code reviewed and merged - Marginalization of code areas which aren't popular - Increasing size of nova code through new drivers - Exclusion of developers without corporate backing Each item on its own may not seem too bad, but combined they add up to a big problem. Core team burn out -- Having been involved in Nova for several dev cycles now, it is clear that the backlog of code up for review never goes away. Even intensive code review efforts at various points in the dev cycle make only a small impact on the backlog. This has a pretty significant impact on core team members, as their work is never done. At best, the dial is sometimes set to 10, instead of 11. Many people, myself included, have built tools to help deal with the reviews in a more efficient manner than plain gerrit allows for. These certainly help, but they can't ever solve the problem on their own - just make it slightly more bearable. And this is not even considering that core team members might have useful contributions to make in ways beyond just code review. Ultimately the workload is just too high to sustain the levels of review required, so core team members will eventually burn out (as they have done many times already). Even if one person attempts to take the initiative to
Re: [openstack-dev] Treating notifications as a contract (CADF)
On 9/04/2014 Sandy Walsh wrote: Yesterday, we had a great conversation with Matt Rutkowski from IBM, one of the authors of the CADF spec. I was having a disconnect on what CADF offers and got it clarified. My assumption was CADF was a set of transformation/extraction rules for taking data from existing data structures and defining them as well-known things. For example, CADF needs to know who sent this notification. I thought CADF would give us a means to point at an existing data structure and say that's where you find it. But I was wrong. CADF is a full-on schema/data structure of its own. It would be a fork-lift replacement for our existing notifications. This was my aha as well, following a similar discussion with Matt and team, but also note that they've articulated an approach for bolt-on changes that would enable CADF content in existing pipelines. (https://wiki.openstack.org/wiki/Ceilometer/blueprints/support-standard-audit-formats) However, if your service hasn't really adopted notifications yet (green field) or you can handle a fork-lift replacement, CADF is a good option. There are a few gotchas though. If you have required data that is outside of the CADF spec, it would need to go in the attachment section of the notification and that still needs a separate schema to define it. Matt's team is very receptive to extending the spec to include these special cases though. Agreed that Matt's team was very willing to extend, but I still wonder about having to migrate appended data from its pre-approval location to its permanent location, depending on the speed of the CADF standard update. Anyway, I've written up all the options (as I see them) [1] with the advantages/disadvantages of each approach. It's just a strawman, so bend/spindle/mutilate. Cool...will add comments there. Look forward to feedback! -S [1] https://wiki.openstack.org/wiki/NotificationsAndCADF On 9/3/2014 12:30 PM, Sandy Walsh wrote: On 9/3/2014 11:32 AM, Chris Dent wrote: On Wed, 3 Sep 2014, Sandy Walsh wrote: We're chatting with IBM about CADF and getting down to specifics on their applicability to notifications. Once I get StackTach.v3 into production I'm keen to get started on revisiting the notification format and oslo.messaging support for notifications. Perhaps a hangout for those keenly interested in doing something about this? That seems like a good idea. I'd like to be a part of that. I would, too, and I would suggest that much of the Ceilometer team would. Unfortunately I won't be at summit but would like to contribute what I can before and after. I took some notes on this a few weeks ago and extracted what seemed to be the two main threads or ideas that were revealed by the conversation that happened in this thread: * At the micro level have versioned schema for notifications such that one end can declare "I am sending version X of notification foo.bar.Y" and the other end can effectively deal. Yes, that's table-stakes I think. Putting structure around the payload section. Beyond type and version we should be able to attach meta information like public/private visibility and perhaps hints for external mapping (this trait -> that trait in CADF, for example). * At the macro level standardize a packaging or envelope of all notifications so that they can be consumed by very similar code. That is: constrain the notifications in some way so we can also constrain the consumer code. That's the intention of what we have now. The top level traits are standard, the payload is open. 
We really only require: message_id, timestamp and event_type. For auditing we need to cover Who, What, When, Where, Why, OnWhat, OnWhere, FromWhere. To wit, I think we've made good progress in this by defining what the minimum content is for PaaS service notifications and getting agreement around https://review.openstack.org/#/c/113396/11/doc/source/format.rst for the Juno release. It's been driven by many of these same questions but is fairly narrow in scope; it defines a minimum set of content, but doesn't tackle the question of structure (beyond trait typing). The timing seems right to dig deeper. These ideas serve two different purposes: One is to ensure that existing notification use cases are satisfied with robustness and provide a contract between two endpoints. The other is to allow a fecund notification environment that allows and enables many participants. Good goals. When Producer and Consumer know what to expect, things are good ... I know to find the Instance ID here. When the consumer wants to deal with a notification as a generic object, things get tricky (find the instance ID in the payload; what is the image type?; is this an error notification?) Basically, how do we define the principal artifacts for
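To make the two levels concrete, here is a sketch of the shape under discussion (the values are invented; only message_id, timestamp and event_type are the required traits listed above):

    import datetime
    import uuid

    notification = {
        # The standardized envelope: the only required top-level traits.
        'message_id': str(uuid.uuid4()),
        'timestamp': datetime.datetime.utcnow().isoformat(),
        'event_type': 'compute.instance.create.end',
        # Common but optional envelope trait.
        'publisher_id': 'compute.host-0001',
        # The open, per-event section -- the part the thread wants to
        # version ("version X of notification foo.bar.Y") and annotate
        # with visibility and mapping hints (this trait -> CADF trait).
        'payload': {
            'instance_id': 'hypothetical-uuid',
            'state': 'active',
        },
    }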
[openstack-dev] [sahara] team meeting Sep 4 1800 UTC
Hi folks, We'll be having the Sahara team meeting as usual in #openstack-meeting-alt channel. Agenda: https://wiki.openstack.org/wiki/Meetings/SaharaAgenda#Next_meetings http://www.timeanddate.com/worldclock/fixedtime.html?msg=Sahara+Meeting&iso=20140904T18 -- Sincerely yours, Sergey Lukjanov Sahara Technical Lead (OpenStack Data Processing) Principal Software Engineer Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO] Review metrics - what do we want to measure?
On 04/09/14 14:54, Jeremy Stanley wrote: On 2014-09-04 11:01:55 +0100 (+0100), Derek Higgins wrote: [...] How would people feel about turning [auto-abandon] back on? A lot of reviewers (myself among them) feel auto-abandon was a cold and emotionless way to provide feedback on a change. Especially on high-change-volume projects where core reviewers may at times get sucked into triaging other problems for long enough that the auto-abandoner kills lots of legitimate changes (possibly from new contributors who will get even more disgusted by this than the silence itself and walk away indefinitely with the impression that we really aren't a welcoming development community at all). Ok, I see how this may be unwelcoming to a new contributor, a feeling that could be justified in some cases. Any established contributor should (I know I did when it was enforced) see it as part of the process. Perhaps we exempt new users? On the other hand I'm not talking about abandoning a change because there was silence for a fixed period of time, I'm talking about abandoning it because it got negative feedback and it wasn't addressed either through discussion or a new patch. I have no problem if we push the inactivity period out to a month or more, I just think there needs to be a cutoff at some stage. Can it be done on a per project basis? It can, by running your own... but again it seems far better for core reviewers to decide if a change has potential or needs to be abandoned--that way there's an accountable human making that deliberate choice rather than the review team hiding behind an automated process so that no one is to blame for hurt feelings besides the infra operators who are enforcing this draconian measure for you. There are plenty of examples of places where we have automated processes in the community (some of which may hurt feelings) in order to take load off specific individuals or the community in general. In fact automating processes in places where people don't scale or are bottlenecks seems to be a common theme. We automate CI and give people negative feedback. We expire bugs in some projects that are Incomplete and are 60 days inactive. I really don't see this as the review team hiding behind an automated process. A patch got negative feedback and we're automating the process to prompt the submitter to deal with it. It may be more friendly if it was a 2-step process: 1. (after a few days of inactivity) Add a comment saying you got negative feedback, with suggestions of how to proceed and information that the review will be auto-abandoned if nothing is done in X number of days. 2. Auto-abandon the patch, with as much information as possible on how to reopen if needed. To make the whole process a little friendlier we could increase the time frame from 1 week to 2. <snark>How about just automatically abandoning any new change as soon as it's published, and if the contributor really feels it's important they'll unabandon it.</snark> ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
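For what it's worth, the policy being described maps fairly directly onto a Gerrit search, so a human (or a bot posting the step-1 warning) could list the candidates before anything is abandoned. Something like the following (exact operator support depends on the Gerrit version):

    status:open age:4w label:Code-Review<=-1

i.e. open changes that still carry a negative review and have seen no update for four weeks.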
Re: [openstack-dev] [sahara] integration tests in python-saharaclient
As for the sahara-ci, I don't think that we'll have enough free resources on it to run one more set of tests. So, waiting for more 3rd party CIs :) On Thu, Sep 4, 2014 at 6:58 PM, Trevor McKay tmc...@redhat.com wrote: by the way, what typo? Trev On Wed, 2014-09-03 at 14:58 -0700, Andrew Lazarev wrote: Hi team, Today I've realized that we have some tests called 'integration' in python-saharaclient. Also I've found out that Jenkins doesn't use them and they can't be run starting from April because of typo in tox.ini. Does anyone know what these tests are? Does anyone mind if I delete them since we don't use them anyway? Thanks, Andrew. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Sincerely yours, Sergey Lukjanov Sahara Technical Lead (OpenStack Data Processing) Principal Software Engineer Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting
On 09/04/2014 04:59 PM, Thierry Carrez wrote: Sean Dague wrote: [...] So, honestly, I'll probably remain -1 on the final integration vote, not because Zaqar is bad, but because I'm feeling more firmly that for OpenStack to not leave the small deployers behind we need to redefine the tightly integrated piece of OpenStack to basically the Layer 1 & 2 parts of my diagram, and consider the rest of the layers exciting parts of our ecosystem that more advanced users may choose to deploy to meet their needs. Smaller tent, big ecosystem, easier on ramp. I realize that largely means Zaqar would be caught up in a definition discussion outside of its control, and that's kind of unfortunate, as Flavio and team have been doing a bang up job of late. But we need to stop considering integration as the end game of all interesting software in the OpenStack ecosystem, and I think it's better to have that conversation sooner rather than later. I think it's pretty clear at this point that: (1) we need to have a discussion about layers (base nucleus, optional extra services at the very least) and the level of support we grant to each -- the current binary approach is not working very well (2) If we accept Zaqar next week, it's pretty clear it would not fall in the base nucleus layer but more in an optional extra services layer, together with at the very least Trove and Sahara There are two ways of doing this: follow Sean's approach and -1 integration (and have zaqar apply to that optional layer when we create it), or +1 integration now (and have zaqar follow whichever other integrated projects we place in that layer when we create it). As I mentioned in my reply to Sean's email, I believe +1 integration is the correct thing to do. I know it's hard to believe that I'm saying this with my OpenStack hat on and not Zaqar's, but that's the truth. I truly believe we can't stop OpenStack's growth on this. We'll manage these growth details later on as we've done so far. Growing is as important as managing the growth. Though, in this case we're not growing without any clue of what will happen. We've a well-known path that all integrated projects have followed and, in this specific case, Zaqar is following. Re-evaluating projects is something that has happened - and should happen - every once in a while. Once we have a place for these optional services, we will have to re-evaluate all the integrated projects and move those that fit into that category. I'm still hesitating on the best approach. I think they yield the same end result, but the -1 approach seems to be a bit more unfair, since it would be purely for reasons we don't (yet) apply to currently-integrated projects... +1 Cheers, Flavio -- @flaper87 Flavio Percoco ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On 04/09/2014 15:36, Gary Kotton wrote: Hi, I do not think that Nova is in a death spiral. I just think that the current way of working is strangling the project. I do not understand why we need to split drivers out of the core project. Why not have the ability to provide 'core review' status to people for reviewing those parts of the code? We have enough talented people in OpenStack to be able to write a driver above gerrit to enable that. Fragmenting the project will be very unhealthy. For what it is worth, having a release date at the end of a vacation is really bad. Look at the numbers: http://stackalytics.com/report/contribution/nova-group/30 Thanks Gary From my perspective, the raw number of reviews should not be the only metric for saying if someone is good for being a core. Indeed, it's quite easy to provide some comments on cosmetics, but if you look at why patches are getting a -1 from a core, that's mostly because of a more important design issue or because they go against another current effort. Also, I can note that Stackalytics metrics are *really* different from other tools like http://russellbryant.net/openstack-stats/nova-reviewers-30.txt As a non-core person, I can just say that a core person must at least be there during Nova meetings and voice their opinions, provide some help with the gate status, look at bugs, give feedback to newcomers, etc., and not just click on -1 or +1. Here, the problem is that the core team is not scalable: I don't want to provide examples from governments, but just adding more people is often not the solution. Instead, delegating to subteams seems like an intermediate solution, as it could let the core team only approve and leave the subteam's half-cores to review the iterations until they consider the patch good enough to be merged. Of course, nova cores could still bypass half-cores, as they have the whole knowledge of Nova, or they could disapprove what the half-cores agreed on, but that would free a lot of time for cores without giving them more bureaucracy. I really like Dan's proposal of splitting code into different repos with separate teams and a single PTL (that's exactly the difference between a Program and a Project), but as it requires some prework, I'm just thinking of allocating half-cores as a short-term solution until all the bits are sorted out. And yes, there is urgency; I have also felt the pain. -Sylvain On 9/4/14, 3:59 PM, Thierry Carrez thie...@openstack.org wrote: Like I mentioned before, I think the only way out of the Nova death spiral is to split code and give control over it to smaller dedicated review teams. This is one way to do it. Thanks Dan for pulling this together :) A couple comments inline: Daniel P. Berrange wrote: [...] This is a crisis. A large crisis. In fact, if you got a moment, it's a twelve-storey crisis with a magnificent entrance hall, carpeting throughout, 24-hour portage, and an enormous sign on the roof, saying 'This Is a Large Crisis'. A large crisis requires a large plan. [...] I totally agree. We need a plan now, because we can't go through another cycle without a solution in sight. [...] This has quite a few implications for the way development would operate. - The Nova core team, at least, would be voluntarily giving up a big amount of responsibility over the evolution of virt drivers. Due to human nature, people are not good at giving up power, so this may be painful to swallow. 
Realistically current nova core are not experts in most of the virt drivers to start with, and more importantly, we clearly do not have sufficient time to do a good job of review with everything submitted. Much of the current need for core review of virt drivers is to prevent the mis-use of a poorly defined virt driver API...which can be mitigated - see later point(s) - Nova core would/should not have automatic +2 over the virt driver repositories since it is unreasonable to assume they have the suitable domain knowledge for all virt drivers out there. People would of course be able to be members of multiple core teams. For example John G would naturally be nova-core and nova-xen-core. I would aim for nova-core and nova-libvirt-core, and so on. I do not want any +2 responsibility over VMWare/HyperV/Docker drivers since they're not my area of expertise - I only look at them today because they have no other nova-core representation. - Not sure if it implies the Nova PTL would be solely focused on Nova common. e.g. would there continue to be one PTL over all virt driver implementation projects, or would each project have its own PTL. Maybe this is irrelevant if a Czars approach is chosen by virt driver projects for their work. I'd be inclined to say that a single PTL should stay as a figurehead to represent all the virt driver projects, acting as a point of
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On 9/4/2014 9:57 AM, Daniel P. Berrange wrote: On Thu, Sep 04, 2014 at 02:33:27PM +0000, Dugger, Donald D wrote: Basically +1 with what Daniel is saying (note that, as mentioned, a side effect of our effort to split out the scheduler will help but not solve this problem). Thanks for taking the time to read and give feedback My only question is about the need to separate out each virt driver into a separate project, wouldn't you accomplish a lot of the benefit by creating a single virt project that includes all of the drivers? I wouldn't necessarily expect a VMware guy to understand the specifics of the HyperV implementation but both people should understand what a virt driver does, how it interfaces to Nova and they should be able to intelligently review each other's code. A single repo for virt drivers would have all the same costs of separating from nova common, but with fewer of the benefits of separate repos per driver. IOW, if we're going to split the virt drivers out from the nova common, then we should go all the way. I think separate driver repos are fairly compelling for a number of reasons besides just core team size. As mentioned elsewhere it allows better targeting of CI test jobs. ie a VMware CI job can be easily made gating for only VMware code changes. So VMWare CI instability won't affect libvirt code submissions, and libvirt CI instability won't affect VMware code submissions. Separate repos means that people starting off a new driver (like Ironic or Docker) would not have to immediately meet the same very high quality testing bar that existing drivers do. They can evolve at their own pace and not have to then undergo the disruption of jumping from their initial repo to the 'official' repo. Finally, I would like each driver's team to be isolated from each other in terms of code review capacity planning as far as practical - ie the libvirt team should be able to accept as many libvirt features as they can handle without being concerned that they'll reduce what vmware is able to accept (though changes involving the nova common code would obviously still contend). Position statement == Over the past year I've increasingly come to the conclusion that Nova is heading for (or probably already at) a major crisis. If steps are not taken to avert this, the project is likely to lose a non-trivial amount of talent, both regular code contributors and core team members. That includes myself. This is not good for Nova's long term health and so should be of concern to anyone involved in Nova and OpenStack. For those who don't want to read the whole mail, the executive summary is that the nova-core team is an unfixable bottleneck in our development process with our current project structure. The only way I see to remove the bottleneck is to split the virt drivers out of tree and let them all have their own core teams in their area of code, leaving current nova core to focus on all the common code outside the virt driver impls. I none the less urge people to read the whole mail. Background information == I see many factors coming together to form the crisis - Burn out of core team members from over work - Difficulty bringing new talent into the core team - Long delay in getting code reviewed and merged - Marginalization of code areas which aren't popular - Increasing size of nova code through new drivers - Exclusion of developers without corporate backing Each item on its own may not seem too bad, but combined they add up to a big problem. 
Core team burn out -- Having been involved in Nova for several dev cycles now, it is clear that the backlog of code up for review never goes away. Even intensive code review efforts at various points in the dev cycle make only a small impact on the backlog. This has a pretty significant impact on core team members, as their work is never done. At best, the dial is sometimes set to 10, instead of 11. Many people, myself included, have built tools to help deal with the reviews in a more efficient manner than plain gerrit allows for. These certainly help, but they can't ever solve the problem on their own - just make it slightly more bearable. And this is not even considering that core team members might have useful contributions to make in ways beyond just code review. Ultimately the workload is just too high to sustain the levels of review required, so core team members will eventually burn out (as they have done many times already). Even if one person attempts to take the initiative to heavily invest in review of certain features it is often to no avail. Unless a second dedicated core reviewer can be found to 'tag team' it is hard for one person to make a difference. The end result is that a patch is +2d and then sits idle for weeks or more until a merge conflict requires it to be reposted at which point even that one +2 is lost. This is a pretty
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On 04/09/2014 17:00, Solly Ross wrote:

My only question is about the need to separate out each virt driver into a separate project; wouldn't you accomplish a lot of the benefit by creating a single virt project that includes all of the drivers?

I don't think there's particularly a *point* to having all drivers in one repo. Part of code review is looking for code gotchas, but part of code review is looking for subtle issues that are caused by the very nature of the driver. A HyperV core reviewing a libvirt change should certainly be able to provide the former, but most likely cannot provide the latter to a sufficient degree (if he or she can, then he or she should be a libvirt core as well).

A strong +1 to Dan's proposal. I think this would also make it easier for non-core reviewers to get started reviewing, without having a specialized tool setup.

As I said previously, I'm also giving a +1 to this proposal. That said, as I think it will take at least one iteration to get this done (look at the scheduler split and how long we've been working on it), I also think we need a short-term solution like the one proposed by Thierry, ie. what I call half-cores - people who help review a code area and free up time for cores, who then just approve instead of following each iteration.

-Sylvain

Best Regards, Solly Ross

P.S. This is a crisis. A large crisis. In fact, if you got a moment, it's a twelve-storey crisis with a magnificent entrance hall, carpeting throughout, 24-hour portage, and an enormous sign on the roof, saying 'This Is a Large Crisis'. A large crisis requires a large plan. Ha!

- Original Message -
From: Donald D Dugger donald.d.dug...@intel.com
To: Daniel P. Berrange berra...@redhat.com, OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Sent: Thursday, September 4, 2014 10:33:27 AM
Subject: Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers

Basically +1 with what Daniel is saying (note that, as mentioned, a side effect of our effort to split out the scheduler will help but not solve this problem).

My only question is about the need to separate out each virt driver into a separate project; wouldn't you accomplish a lot of the benefit by creating a single virt project that includes all of the drivers? I wouldn't necessarily expect a VMware guy to understand the specifics of the HyperV implementation, but both people should understand what a virt driver does and how it interfaces to Nova, and they should be able to intelligently review each other's code.

--
Don Dugger
Censeo Toto nos in Kansa esse decisse. - D. Gale
Ph: 303/443-3786

-Original Message-
From: Daniel P. Berrange [mailto:berra...@redhat.com]
Sent: Thursday, September 4, 2014 4:24 AM
To: OpenStack Development
Subject: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers

Position statement
==

Over the past year I've increasingly come to the conclusion that Nova is heading for (or is probably already at) a major crisis. If steps are not taken to avert this, the project is likely to lose a non-trivial amount of talent, both regular code contributors and core team members. That includes myself. This is not good for Nova's long term health and so should be of concern to anyone involved in Nova and OpenStack.

For those who don't want to read the whole mail, the executive summary is that the nova-core team is an unfixable bottleneck in our development process with our current project structure.
The only way I see to remove the bottleneck is to split the virt drivers out of tree and let them all have their own core teams in their area of code, leaving current nova core to focus on all the common code outside the virt driver impls. I nonetheless urge people to read the whole mail.

Background information
==

I see many factors coming together to form the crisis

- Burn out of core team members from over work
- Difficulty bringing new talent into the core team
- Long delay in getting code reviewed and merged
- Marginalization of code areas which aren't popular
- Increasing size of nova code through new drivers
- Exclusion of developers without corporate backing

Each item on its own may not seem too bad, but combined they add up to a big problem.

Core team burn out
--

Having been involved in Nova for several dev cycles now, it is clear that the backlog of code up for review never goes away. Even intensive code review efforts at various points in the dev cycle make only a small impact on the backlog. This has a pretty significant impact on core team members, as their work is never done. At best, the dial is sometimes set to 10, instead of 11.

Many people, myself included, have built tools to help deal with the reviews in a more efficient manner than plain gerrit allows for. These certainly help, but they can't ever solve the problem on their own - they just make it slightly more bearable.
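Those dashboard-style tools are simple in spirit: most boil down to asking Gerrit's REST API for the changes that are slipping through the cracks. A minimal sketch of such a helper follows, assuming only Gerrit's standard query API; the two-week staleness threshold and the target project are arbitrary choices for illustration:

    # Minimal sketch of a review-backlog helper against Gerrit's REST
    # API. The "age:2w" threshold and the project are illustrative only.
    import json

    import requests

    GERRIT = "https://review.openstack.org"

    def stalled_plus_twos(project="openstack/nova"):
        """List open changes with a +2 that haven't moved in two weeks."""
        query = "status:open project:%s label:Code-Review=2 age:2w" % project
        resp = requests.get("%s/changes/" % GERRIT, params={"q": query})
        resp.raise_for_status()
        # Gerrit prefixes all JSON responses with ")]}'" to defeat XSSI.
        changes = json.loads(resp.text[4:])
        return [(c["_number"], c["subject"], c["updated"]) for c in changes]

    if __name__ == "__main__":
        for number, subject, updated in stalled_plus_twos():
            print("%-8s %s (last updated %s)" % (number, subject, updated))

Helpful as such dashboards are, they only redistribute the backlog; as the mail above argues, they cannot shrink it.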
Re: [openstack-dev] [nova] requesting an FFE for SRIOV
The main sr-iov patches have gone through lots of code reviews, manual rebasing, etc. Now we have some critical refactoring work on the existing infra to get it ready. All the code for refactoring and sr-iov is up for review.

I've been doing a lot of work on this recently, and plan to see it through if possible. So, I'll be a sponsor. In the meeting russellb said he would as well. I think he's tied up today, so I'm proxying him in here :)

--Dan
[openstack-dev] [FFE] requesting FFE for LVM ephemeral storage encryption
I would like to request a feature freeze exception for LVM ephemeral storage encryption[1], whose spec[2] was approved early in the Juno release cycle. This feature provides security for data at rest on compute nodes. The proposed feature protects user data from disclosure due to disk block reuse and improper storage media disposal, among other threats, and also eliminates the need to sanitize LVM volumes. The feature is crucial to data security in OpenStack, as explained in the OpenStack Security Guide[3], and benefits cloud users and operators regardless of their industry and scale.

The feature was first submitted for review on August 6, 2013, and two of the three patches implementing it were merged in Icehouse[4,5]. The remaining patch has had approval from a core reviewer for most of the Icehouse and Juno development cycles. The code is well vetted and ready to be merged.

The main concern about accepting this feature pertains to key management. In particular, it uses Barbican to avoid storing keys on the compute host, and Barbican at present has no gate testing. However, the risk of regression in case of failure to integrate Barbican is minimal, because the feature interacts with the key manager through an *existing* abstract keymgr interface, i.e., it has no *explicit* dependence on Barbican. Moreover, the feature provides some measure of security even with the existing place-holder key manager, for example against the disk block reuse attack.

For all of the above reasons I request a feature freeze exception for LVM ephemeral storage encryption.

Best regards,
Dan

1. https://review.openstack.org/#/c/40467/
2. https://blueprints.launchpad.net/nova/+spec/lvm-ephemeral-storage-encryption
3. http://docs.openstack.org/security-guide/content/
4. https://review.openstack.org/#/c/60621/
5. https://review.openstack.org/#/c/61544/
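For readers unfamiliar with the interface Dan refers to: nova's key manager abstraction is deliberately small, which is what makes the Barbican risk containable. A simplified sketch of its shape follows; the real class lives in nova/keymgr/key_mgr.py and passes a request context plus extra keyword arguments on every call, so the signatures here are abridged for illustration:

    # Abridged sketch of nova's abstract key manager interface; see
    # nova/keymgr/key_mgr.py for the real class. Signatures simplified.
    import abc

    class KeyManager(abc.ABC):
        """Contract the ephemeral encryption code programs against."""

        @abc.abstractmethod
        def create_key(self, ctxt, **kwargs):
            """Create a new key; return its UUID."""

        @abc.abstractmethod
        def store_key(self, ctxt, key, **kwargs):
            """Persist an existing key; return its UUID."""

        @abc.abstractmethod
        def get_key(self, ctxt, key_id, **kwargs):
            """Fetch the key identified by key_id."""

        @abc.abstractmethod
        def delete_key(self, ctxt, key_id, **kwargs):
            """Destroy the key identified by key_id."""

Because the encryption code calls only these methods, the place-holder ConfKeyManager (which hands back a single fixed key from configuration) and a future Barbican-backed manager are interchangeable drop-ins.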
Re: [openstack-dev] [Nova][FFE] Feature freeze exception for virt-driver-numa-placement
On 09/04/2014 04:51 PM, Murray, Paul (HP Cloud) wrote:

On 4 September 2014 14:07, Nikola Đipanov ndipa...@redhat.com wrote:

On 09/04/2014 02:31 PM, Sean Dague wrote:

On 09/04/2014 07:58 AM, Nikola Đipanov wrote:

Hi team, I am requesting the exception for the feature from the subject (find the spec at [1] and outstanding changes at [2]). Some reasons why we may want to grant it:

First of all, all patches have been approved in time and just lost the gate race. Rejecting it makes little sense really, as it has been commented on by a good chunk of the core team, most of the invasive stuff (db migrations, for example) has already merged, and the few parts that may seem contentious have either been discussed and agreed upon [3], or can easily be addressed in subsequent bug fixes.

It would be very beneficial to merge it so that we actually get real testing on the feature ASAP (scheduling features are not tested in the gate, so we need to rely on downstream/3rd party/user testing for those).

This statement bugs me. It seems kind of backwards to say we should merge a thing that we don't have a good upstream test plan for, and put it in a release so that the testing happens only downstream.

The objective reality is that many other things have not had upstream testing for a long time (anything that requires more than one compute node in Nova, for example, and any scheduling feature - as I mention clearly above), so I'm not sure how that is backwards from any reasonable point of view. Thanks to folks using them, they are still kept working and bugs get fixed. Getting features into the hands of users is extremely important...

Anyway, not enough to -1 it, but enough to at least say something.

... but I do not want to get into the discussion about software testing here; this is not the place really. However, I do think it is very harmful to respond to an FFE request with such blanket statements and generalizations, if only for the message it sends to the contributors (that we really care more about upholding our own myths as a community than about users and features).

I believe you brought this up as one of your justifications for the FFE. When I read your statement it does sound as though you want to put experimental code in at the final release. I am sure that is not what you had in mind, but I am also sure you can understand Sean's point of view. His point is clear and pertinent to your request.

As the person responsible for Nova in HP I will be interested to see how it operates in practice. I can assure you we will do extensive testing on it before it goes into the wild, and we will not put it into practice if we are not happy.

That is awesome and we as a project are lucky to have that! I would not want things put into practice that users can't use or that they see huge flaws with. I can't help but read this as you being OK with the feature going ahead, though :).

N.

Paul

Paul Murray
Nova Technical Lead, HP Cloud
+44 117 312 9309
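For anyone planning the kind of downstream testing Paul describes, the feature is driven through the hw:numa_* flavor extra specs defined in the spec. A minimal smoke test with python-novaclient might look like the sketch below; the credentials, endpoint, flavor name, and topology values are all placeholder examples:

    # Placeholder smoke test for virt-driver-numa-placement using
    # python-novaclient. Credentials/endpoint/flavor are invented; the
    # hw:numa_* extra spec keys are the ones defined in the spec.
    from novaclient import client

    nova = client.Client('2', 'admin', 'secret', 'demo',
                         'http://keystone.example.com:5000/v2.0')

    flavor = nova.flavors.find(name='m1.large')
    flavor.set_keys({
        'hw:numa_nodes': '2',     # expose two guest NUMA nodes
        'hw:numa_cpus.0': '0,1',  # vCPUs 0-1 live on node 0
        'hw:numa_cpus.1': '2,3',  # vCPUs 2-3 live on node 1
        'hw:numa_mem.0': '2048',  # MB of guest RAM on node 0
        'hw:numa_mem.1': '2048',  # MB of guest RAM on node 1
    })
    # Instances booted from this flavor should now be scheduled and
    # their guest topology built according to the requested layout.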
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On Thu, Sep 04, 2014 at 10:18:04AM -0500, Matt Riedemann wrote:

- Changes submitted to nova common code would trigger running of CI tests against the external virt drivers. Each virt driver core team would decide whether they want their driver to be tested upon Nova common changes. I expect that all would choose to be included to the same extent that they are today, so the level of validation of nova code would remain at least at the current level. I don't want to reduce the amount of code testing here, since that's contrary to the direction we're taking wrt testing.

- Changes submitted to virt drivers would trigger running of the CI tests that are applicable, eg changes to the libvirt driver repo would not involve running database migration tests, since all database code is isolated in nova. libvirt changes would not trigger vmware, xenserver, ironic, etc CI systems. Virt driver changes should see fewer false positives in the tests as a result, and those that do occur should be more explicitly related to the code being proposed; eg a change to vmware is not going to trigger a tempest run that uses libvirt, so non-deterministic failures in libvirt will no longer plague vmware developers' reviews. This would also make it possible for VMware CI to be made gating for changes to the VMware virt driver repository, without negatively impacting other virt drivers. So this change should increase testing quality for non-libvirt virt drivers and reduce the pain of false failures for everyone.

[snip]

Even if we split the virt drivers out, libvirt would still be the default in the Tempest gate runs, right?

Yes, what I'm calling the nova common repository would still need to have a tempest job that was gating on at least one virt driver as a sanity check. As mentioned above, I'd pretty much expect that all current tempest jobs for nova common code would continue unchanged. IOW, a libvirt job would still be gating, and there'd still be a number of 3rd party CIs for other virt drivers non-gating too. The only change in testing jobs would be wrt the new git repos for the individual virt drivers. Those would be running only jobs directly related to the code in those repos, ie vmware is tested by a vmware CI job and libvirt is tested by a libvirt CI job.

Regards, Daniel

--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
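The job-targeting argument is easy to see in miniature. Purely as an illustration - the repo and job names below are invented, and the real mechanism would be openstack-infra's Zuul project configuration rather than Python - per-repo CI replaces one repo that must run the union of every job with a narrow mapping per repo:

    # Purely illustrative: how per-repo CI targeting narrows the job
    # set. Repo and job names are invented for the example.
    JOBS_BY_REPO = {
        'nova': [                  # common code keeps full validation
            'unit-tests',
            'db-migrations',
            'tempest-libvirt',     # sanity gate with one virt driver
        ],
        'nova-virt-libvirt': [
            'unit-tests',
            'tempest-libvirt',     # no vmware/xenserver jobs triggered
        ],
        'nova-virt-vmware': [
            'unit-tests',
            'tempest-vmware-ci',   # 3rd party CI, gating only here
        ],
    }

    def jobs_for_change(repo):
        """Return the CI jobs a proposed change to 'repo' must pass."""
        return JOBS_BY_REPO.get(repo, ['unit-tests'])

Under such a scheme a flaky tempest-libvirt job simply never appears in a vmware change's job list, which is the whole point of the proposal.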
Re: [openstack-dev] [sahara] integration tests in python-saharaclient
Trevor, by the way, what typo? https://review.openstack.org/#/c/118903/

Andrew.

On Thu, Sep 4, 2014 at 7:58 AM, Trevor McKay tmc...@redhat.com wrote:

by the way, what typo?

Trev

On Wed, 2014-09-03 at 14:58 -0700, Andrew Lazarev wrote:

Hi team, Today I've realized that we have some tests called 'integration' in python-saharaclient. I've also found out that Jenkins doesn't use them and that they haven't been runnable since April because of a typo in tox.ini. Does anyone know what these tests are? Does anyone mind if I delete them, since we don't use them anyway?

Thanks, Andrew.
[openstack-dev] [Tripleo] Release Report
1. os-apply-config: release 0.1.19 -- 0.1.20
   -- https://pypi.python.org/pypi/os-apply-config/0.1.20
   -- http://tarballs.openstack.org/os-apply-config/os-apply-config-0.1.20.tar.gz
2. os-refresh-config: no changes, 0.1.7
3. os-collect-config: release 0.1.27 -- 0.1.28
   -- https://pypi.python.org/pypi/os-collect-config/0.1.28
   -- http://tarballs.openstack.org/os-collect-config/os-collect-config-0.1.28.tar.gz
4. os-cloud-config: release 0.1.7 -- 0.1.8
   -- https://pypi.python.org/pypi/os-cloud-config/0.1.8
   -- http://tarballs.openstack.org/os-cloud-config/os-cloud-config-0.1.8.tar.gz
5. diskimage-builder: release 0.1.28 -- 0.1.29
   -- https://pypi.python.org/pypi/diskimage-builder/0.1.29
   -- http://tarballs.openstack.org/diskimage-builder/diskimage-builder-0.1.29.tar.gz
6. dib-utils: release 0.0.5 -- 0.0.6
   -- https://pypi.python.org/pypi/dib-utils/0.0.6
   -- http://tarballs.openstack.org/dib-utils/dib-utils-0.0.6.tar.gz
7. tripleo-heat-templates: release 0.7.4 -- 0.7.5
   -- https://pypi.python.org/pypi/tripleo-heat-templates/0.7.5
   -- http://tarballs.openstack.org/tripleo-heat-templates/tripleo-heat-templates-0.7.5.tar.gz
8. tripleo-image-elements: release 0.8.4 -- 0.8.5
   -- https://pypi.python.org/pypi/tripleo-image-elements/0.8.5
   -- http://tarballs.openstack.org/tripleo-image-elements/tripleo-image-elements-0.8.5.tar.gz
9. tuskar: release 0.4.9 -- 0.4.10
   -- https://pypi.python.org/pypi/tuskar/0.4.10
   -- http://tarballs.openstack.org/tuskar/tuskar-0.4.10.tar.gz
10. python-tuskarclient: release 0.1.9 -- 0.1.10
   -- https://pypi.python.org/pypi/python-tuskarclient/0.1.10
   -- http://tarballs.openstack.org/python-tuskarclient/python-tuskarclient-0.1.10.tar.gz
[openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
+1 I very much agree with Dan's proposal.

I am concerned about the difficulties we will face merging patches that spread across various regions: manager, conductor, scheduler, etc. However, I think this is a small price to pay for having more focused teams. IMO we will still have to pay it the moment the scheduler is separated out.

Regards, Vladik
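For readers who have not worked in the layer this thread proposes to split out: a virt driver is an implementation of nova's ComputeDriver contract, and that contract is the boundary each per-driver core team would own. A heavily abridged sketch follows; the real class in nova/virt/driver.py has many more methods, and its methods take many more parameters than shown here:

    # Heavily abridged sketch of the virt driver boundary; see
    # nova/virt/driver.py for the real ComputeDriver class.
    import abc

    class ComputeDriver(abc.ABC):
        """Contract every virt driver (libvirt, vmware, hyperv, ...) fills in."""

        @abc.abstractmethod
        def init_host(self, host):
            """One-time setup when the nova-compute service starts."""

        @abc.abstractmethod
        def spawn(self, context, instance, image_meta, network_info,
                  block_device_info=None):
            """Create and boot a new guest for the given instance."""

        @abc.abstractmethod
        def destroy(self, context, instance, network_info):
            """Tear down a guest and its local resources."""

        @abc.abstractmethod
        def get_info(self, instance):
            """Report the guest's current power state."""

        @abc.abstractmethod
        def get_available_resource(self, nodename):
            """Describe this host's capacity for the scheduler."""

Everything above this interface (API, DB, conductor, scheduler) would stay with nova-core; everything behind it would move to the per-driver repos and teams.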