[openstack-dev] Taking a break..

2014-10-22 Thread Chris Behrens
Hey all,

Just wanted to drop a quick note to say that I decided to leave Rackspace to 
pursue another opportunity. My last day was last Friday. I won’t have much time 
for OpenStack, but I’m going to continue to hang out in the channels. Having 
been involved in the project since day 1, I’m going to find it difficult to 
fully walk away. I really don’t know how much I’ll continue to stay involved. I 
am completely burned out on nova. However, I’d really like to see versioned 
objects broken out into oslo and Ironic synced with nova’s object advancements. 
So, if I work on anything, it’ll probably be related to that.

Cells will be left in a lot of capable hands. I have shared some thoughts with 
people on how I think we can proceed to make it ‘the way’ in nova. I’m going to 
work on documenting some of this in an etherpad so the thoughts aren’t lost.

Anyway, it’s been fun… the project has grown like crazy! Keep on trucking... 
And while I won’t be active much, don’t be afraid to ping me!

- Chris




Re: [openstack-dev] Taking a break..

2014-10-22 Thread Chris Behrens
Thanks, everyone, for the nice comments. Replies to Dan below:

On Oct 22, 2014, at 10:52 AM, Dan Smith d...@danplanet.com wrote:

 I won’t have much time for OpenStack, but I’m going to continue to
 hang out in the channels.
 
 Nope, sorry, veto.

I'm the only Core in this project, so I'm sorry: You do not have -2 rights. :)

 Some options to explain your way out:
 
 1. Oops, I forgot it wasn't April
 2. I have a sick sense of humor; I'm getting help for it
 3. I've come to my senses after a brief break from reality
 
 Seriously, I don't recall a gerrit review for this terrible plan...

This could be arranged. #2 is certainly true (I'm only on step 1 of the 10 step 
program) even though this one is not a joke. :)

 
 Well, I for one am really sorry to see you go. I'd be lying if I said I
 hope that your next opportunity leaves you daydreaming about going back
 to OpenStack before too long. However, if not, good luck!

At the moment, I'm looking forward to new frustrating problems to solve. We'll 
see what happens. :)

Have fun in Paris. And remember this French: Où est la salle de bain? (Where is the bathroom?)

- Chris

 
 --Dan
 


Re: [openstack-dev] [Nova] Nominating Jay Pipes for nova-core

2014-07-30 Thread Chris Behrens
+1

On Jul 30, 2014, at 2:02 PM, Michael Still mi...@stillhq.com wrote:

 Greetings,
 
 I would like to nominate Jay Pipes for the nova-core team.
 
 Jay has been involved with nova for a long time now.  He's previously
 been a nova core, as well as a glance core (and PTL). He's been around
 so long that there are probably other types of core status I have
 missed.
 
 Please respond with +1s or any concerns.
 
 References:
 
  https://review.openstack.org/#/q/owner:%22jay+pipes%22+status:open,n,z
 
  https://review.openstack.org/#/q/reviewer:%22jay+pipes%22,n,z
 
  http://stackalytics.com/?module=nova-group&user_id=jaypipes
 
 As a reminder, we use the voting process outlined at
 https://wiki.openstack.org/wiki/Nova/CoreTeam to add members to our
 core team.
 
 Thanks,
 Michael
 
 -- 
 Rackspace Australia
 


Re: [openstack-dev] [nova][qa] proposal for moving forward on cells/tempest testing

2014-07-14 Thread Chris Behrens

On Jul 14, 2014, at 10:44 AM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote:

 Today we only gate on exercises in devstack for cells testing coverage in the 
 gate-devstack-dsvm-cells job.
 
 The cells tempest non-voting job was moving to the experimental queue here 
 [1] since it doesn't work with a lot of the compute API tests.
 
 I think we all agreed to tar and feather comstud if he didn't get Tempest 
 working (read: passing) with cells enabled in Juno.
 
 The first part of this is just figuring out where we sit with what's failing 
 in Tempest (in the check-tempest-dsvm-cells-full job).
 
 I'd like to propose that we do the following to get the ball rolling:
 
 1. Add an option to tempest.conf under the compute-feature-enabled section to 
 toggle cells and then use that option to skip tests that we know will fail in 
 cells, e.g. security group tests.

I think I was told Tempest could infer cells from the devstack config or something? 
I don't know the right way to do this.

But, I'm basically +1 to all 3 of these. I think we just skip the broken tests 
for now and iterate on unskipping things one by one.
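To make item 1 concrete, here is a rough sketch of the sort of toggle and skip being 
proposed. The 'cells' flag does not exist yet, and Tempest's real config plumbing is 
more involved than this standalone oslo.config example:

    import testtools
    from oslo.config import cfg

    CONF = cfg.CONF
    CONF.register_opts(
        [cfg.BoolOpt('cells',
                     default=False,
                     help='Does the deployed cloud run nova cells?')],
        group='compute_feature_enabled')


    class SecurityGroupsTest(testtools.TestCase):

        # Skip tests known to break with cells, per item 1; each skip would
        # reference the bug opened for it, per item 2.
        @testtools.skipIf(CONF.compute_feature_enabled.cells,
                          'Security groups do not work with cells yet')
        def test_create_security_group(self):
            pass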

- Chris


 
 2. Open bugs for all of the tests we're skipping so we can track closing 
 those down, assuming they aren't already reported. [2]
 
 3. Once the known failures are being skipped, we can move 
 check-tempest-dsvm-cells-full out of the experimental queue.  I'm not 
 proposing that it'd be voting right away, I think we have to see it burn in 
 for awhile first.
 
 With at least this plan we should be able to move forward on identifying 
 issues and getting some idea for how much of Tempest doesn't work with cells 
 and the effort involved in making it work.
 
 Thoughts? If there aren't any objections, I said I'd work on the qa-spec and 
 can start doing the grunt-work of opening bugs and skipping tests.
 
 [1] https://review.openstack.org/#/c/87982/
 [2] https://bugs.launchpad.net/nova/+bugs?field.tag=cells+
 
 -- 
 
 Thanks,
 
 Matt Riedemann
 
 


Re: [openstack-dev] [oslo] Asyncio and oslo.messaging

2014-07-07 Thread Chris Behrens

On Jul 7, 2014, at 11:11 AM, Angus Salkeld angus.salk...@rackspace.com wrote:

 
 On 03/07/14 05:30, Mark McLoughlin wrote:
 Hey
 
 This is an attempt to summarize a really useful discussion that Victor,
 Flavio and I have been having today. At the bottom are some background
 links - basically what I have open in my browser right now thinking
 through all of this.
 
 We're attempting to take baby-steps towards moving completely from
 eventlet to asyncio/trollius. The thinking is for Ceilometer to be the
 first victim.
 
 Has this been widely agreed on? It seems to me like we are mixing two
 issues:

Right. Does someone have a pointer to where this was decided?

- Chris





Re: [openstack-dev] [nova] should we have a stale data indication in nova list/show?

2014-06-24 Thread Chris Behrens
I don't think we should be flipping states for instances on a potentially 
downed compute. We definitely should not set an instance to ERROR. I think a 
time associated with the last power state check might be nice and be good 
enough.

- Chris

 On Jun 24, 2014, at 5:17 PM, Joe Gordon joe.gord...@gmail.com wrote:
 
 
 
 
 On Tue, Jun 24, 2014 at 5:12 PM, Joe Gordon joe.gord...@gmail.com wrote:
 
 
 
 On Tue, Jun 24, 2014 at 4:16 PM, Ahmed RAHAL ara...@iweb.com wrote:
 On 2014-06-24 17:38, Joe Gordon wrote:
 
  On Jun 24, 2014 2:31 PM, Russell Bryant rbry...@redhat.com wrote:
 
   There be dragons here.  Just because Nova doesn't see the node reporting
   in, doesn't mean the VMs aren't actually still running.  I think this
   needs to be left to logic outside of Nova.
  
   For example, if your deployment monitoring really does think the host is
   down, you want to make sure it's *completely* dead before taking further
   action such as evacuating the host.  You certainly don't want to risk
   having the VM running on two different hosts.  This is just a business I
   don't think Nova should be getting in to.
 
 I agree nova shouldn't take any actions. But I don't think leaving an
 instance as 'active' is right either. I was thinking we move the instance to
 an error state (maybe an unknown state would be more accurate) and let the
 user deal with it, versus just letting the user deal with everything.
 Since nova knows something *may* be wrong, shouldn't we convey that to
 the user? (I'm not 100% sure we should myself.)
 
 I saw compute nodes going down, from a management perspective (say, 
 nova-compute disappeared), but VMs were just fine. Reporting on the state 
 may be misleading. The 'unknown' state would fit, but nothing lets us 
 presume the VMs are non-functional or impacted.
 
 Nothing lets us presume the opposite, either. We don't know if the instance 
 is still up.
  
 
 As far as an operator is concerned, a compute node not responding is 
 reason enough to check the situation.
 
 To go further on other comments related to customer feedback, there are 
 many reasons a customer may think his VM is down, so showing him 'useful 
 information' in some cases will only trigger more anxiety.
 Besides, people will start hammering the API to check 'state' instead of 
 using proper monitoring.
 But, state is already reported if the customer shuts down a VM, so ...
 
 Currently, compute node state reporting is done by the nova-compute 
 process itself, reporting back with a time stamp to the database (through 
 conductor, if I recall correctly). It's more like a watchdog than a reporting 
 system.
 For VMs (assuming we find it useful) the same kind of process could occur: 
 nova-compute reporting back all states with time stamps for all the VMs it 
 hosts. This should then be optional, as I already sense scaling/performance 
 issues here (ceilometer, anyone?).
 
 Finally, assuming the customer had access to this 'unknown' state 
 information, what would he be able to do with it? Usually he has no lever 
 to 'evacuate' or 'recover' the VM. All he could do is spawn another 
 instance to replace the lost one. But only if the VM really is currently 
 unavailable, which is information he must get from other sources.
 
 If I were a user, and my instance went to an 'UNKNOWN' state, I would check 
 if it's still operating, and if not, delete it and start another instance.
 
 The alternative is how things work today: if a nova-compute goes down we 
 don't change any instance states, and the user is responsible for making sure 
 their instance is still operating even if the instance is set to ACTIVE.
  
  
 
 So, I see how the state reporting could be useful information, but am not 
 sure that the nova status field is the right place for it.
 
 Ahmed.
 
 


Re: [openstack-dev] [Nova] Nominating Ken'ichi Ohmichi for nova-core

2014-06-14 Thread Chris Behrens
+1

 On Jun 13, 2014, at 3:40 PM, Michael Still mi...@stillhq.com wrote:
 
 Greetings,
 
 I would like to nominate Ken'ichi Ohmichi for the nova-core team.
 
 Ken'ichi has been involved with nova for a long time now.  His reviews
 on API changes are excellent, and he's been part of the team that has
 driven the new API work we've seen in recent cycles forward. Ken'ichi
 has also been reviewing other parts of the code base, and I think his
 reviews are detailed and helpful.
 
 Please respond with +1s or any concerns.
 
 References:
 
  
 https://review.openstack.org/#/q/owner:ken1ohmichi%2540gmail.com+status:open,n,z
 
  https://review.openstack.org/#/q/reviewer:ken1ohmichi%2540gmail.com,n,z
 
  http://www.stackalytics.com/?module=nova-group&user_id=oomichi
 
 As a reminder, we use the voting process outlined at
 https://wiki.openstack.org/wiki/Nova/CoreTeam to add members to our
 core team.
 
 Thanks,
 Michael
 
 -- 
 Rackspace Australia
 


Re: [openstack-dev] [nova] Proposal: remove the server groups feature

2014-04-25 Thread Chris Behrens

On Apr 25, 2014, at 2:15 PM, Jay Pipes jaypi...@gmail.com wrote:

 Hi Stackers,
 
 When recently digging in to the new server group v3 API extension
 introduced in Icehouse, I was struck with a bit of cognitive dissonance
 that I can't seem to shake. While I understand and support the idea
 behind the feature (affinity and anti-affinity scheduling hints), I
 can't help but feel the implementation is half-baked and results in a
 very awkward user experience.

I agree with all you said about this.

 Proposal
 
 
 I propose to scrap the server groups API entirely and replace it with a
 simpler way to accomplish the same basic thing.
 
 Create two new options to nova boot:
 
 --near-tag TAG
 and
 --not-near-tag TAG
 
 The first would tell the scheduler to place the new VM near other VMs
 having a particular tag. The latter would tell the scheduler to place
 the new VM *not* near other VMs with a particular tag.
 
 What is a tag? Well, currently, since the Compute API doesn't have a
 concept of a single string tag, the tag could be a key=value pair that
 would be matched against the server extra properties.

You can actually already achieve this behavior… although with a little more 
work. There’s the Affinity filter, which allows you to specify a 
same_host/different_host scheduler hint where you explicitly specify the 
instance UUIDs you want (the extra work is having to know the instance UUIDs).
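For reference, the existing hint looks roughly like this through the 
python-novaclient of the time, assuming the SameHostFilter/DifferentHostFilter 
from nova's affinity filters are enabled; credentials and UUIDs below are 
placeholders:

    from novaclient.v1_1 import client

    nova = client.Client('user', 'password', 'tenant',
                         'http://keystone.example.com:5000/v2.0/')

    # Ask the scheduler to land the new server on the same host as an
    # existing instance. The "extra work": you must already know that
    # instance's UUID (different_host is the anti-affinity analogue).
    nova.servers.create(
        name='affine-server',
        image='11111111-2222-3333-4444-555555555555',
        flavor='1',
        scheduler_hints={'same_host': ['66666666-7777-8888-9999-000000000000']})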

But yeah, I think this makes more sense to me.

- Chris





Re: [openstack-dev] [Openstack] [nova] Havana - Icehouse upgrades with cells

2014-04-24 Thread Chris Behrens


On Apr 23, 2014, at 6:36 PM, Sam Morrison sorri...@gmail.com wrote:

 Yeah I’m not sure what’s going on, I removed my hacks and tried it using the 
 conductor rpcapi service and got what I think is a recursive call in 
 nova-conductor.
 
 Added more details to https://bugs.launchpad.net/nova/+bug/1308805
 
 I’m thinking there may be something missing in the stable/havana branch or 
 else cells is doing something different when it comes to objects.
 I don’t think it is a cells issue though; debugging it, it seems like it 
 just can’t backport a 1.13 object to 1.9.
 
 Cheers,
 Sam

Oh. You know, it turns out that conductor API bug you found… was really not a 
real bug, I don’t think. The only thing that can backport is the conductor 
service, if the conductor service has been upgraded. I.e., ‘use_local’ would 
never ever work, because it was the local service that didn’t understand the 
new object version to begin with. So trying to use_local would still not 
understand the new version. Make sense? (This should probably be made to fail 
gracefully, however. :)

And yeah, I think what you have going on now when you’re actually using the 
conductor is that conductor is getting a request to backport, but it doesn’t 
know how to backport… so it’s kicking it to itself to backport, and infinite 
recursion occurs. Do you happen to have use_local=False in your nova-conductor 
nova.conf? That would cause nova-conductor to RPC to itself to try to backport, 
hehe. Again, we should probably have some graceful failing here in some way: 
1) nova-conductor should probably always force use_local=True, and 2) the 
LocalAPI should probably just implement object_backport() such that it raises a 
nice error.

So, does your nova-conductor not have object version 1.13? As I was trying to 
get at in a previous reply, I think the only way this can possibly work is that 
you have Icehouse nova-conductor running in ALL cells.

- Chris





Re: [openstack-dev] [Openstack] [nova] Havana - Icehouse upgrades with cells

2014-04-24 Thread Chris Behrens

On Apr 24, 2014, at 6:10 AM, Sam Morrison sorri...@gmail.com wrote:

 Hmm I may have but I’ve just done another test with everything set to 
 use_local=False except nova-conductor where use_local=True
 I also reverted that change I put though as mentioned above and I still get 
 an infinite loop. Can’t really figure out what is going on here. 
 Conductor is trying to talk to conductor and use_local definitely equals True.
 (this is all with havana conductor btw)

Interesting

 
 So, does your nova-conductor not have object version 1.13? As I was trying 
 to get at in a previous reply, I think the only way this can possibly work 
 is that you have Icehouse nova-conductor running in ALL cells.
 
 OK so in my compute cell I am now running an Icehouse conductor. Everything 
 else is Havana including the DB version.
 
 This actually seems to make all the things that didn’t work now work. However 
 it also means that the thing that did work (booting an instance) no longer 
 works.
 This is an easy fix and just requires nova-conductor to call the run_instance 
 scheduler rpcapi method with version 2.9 as opposed to the Icehouse version 3.0.
 I don’t think anything has changed here, so this might be an easy fix that 
 could be pushed upstream. It just needs to change the scheduler rpcapi to be 
 aware of what version it can use.
 I changed the upgrade_levels scheduler=havana but that wasn’t handled by the 
 scheduler rpcapi and just gave a version not new enough exception.
 
 I think I’m making progress…..

Cool. So, what is tested upstream is upgrading everything except nova-compute. 
You could try upgrading nova-scheduler as well. Although, I didn’t think we had 
any build path going through conductor yet. Do you happen to have a traceback 
from that? (Curious what the call path looks like)

- Chris


 
 Sam
 



Re: [openstack-dev] [Oslo] [Ironic] Can we change rpc_thread_pool_size default value?

2014-04-23 Thread Chris Behrens
Fwiw, we've seen this with nova-scheduler as well. I think the default pool 
size is too large in general. The problem that I've seen stems from the fact 
that DB calls all block and you can easily get a stack of 64 workers all 
waiting to do DB calls. And it happens to work out such that none of the rpc 
pool threads return before all run their DB calls. This is compounded by the 
explicit yield we have for every DB call in nova.  Anyway, this means that all 
of the workers are tied up for quite a while. Since nova casts to the 
scheduler, it doesn't impact the API much. But if you were waiting on an RPC 
response, you could be waiting a while.

Ironic does a lot of RPC calls. I don't think we know the exact behavior in 
Ironic, but I'm assuming it's something similar. If all rpc pool threads are 
essentially stuck until roughly the same time, you end up with API hangs. But 
we're also seeing periodic task run delays as well. It must be getting stuck 
behind a lot of the rpc worker threads such that lowering the number of threads 
helps considerably.

Given that DB calls all block the process right now, there's really not much 
advantage to a larger pool size. 64 is too much, IMO. It would make more sense 
if there were more I/O that could be parallelized.
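A toy illustration of the effect, not nova code: time.sleep() stands in for a 
blocking DB driver call, and nothing is monkey-patched.

    import time

    import eventlet


    def fake_db_call():
        # A blocking call never yields to the eventlet hub, so no other
        # greenthread in the process can run while it is in progress.
        time.sleep(0.1)


    pool = eventlet.GreenPool(64)
    start = time.time()
    for _ in range(64):
        pool.spawn_n(fake_db_call)
    pool.waitall()

    # Prints roughly 6.4 seconds: the 64 "workers" bought no concurrency at
    # all, they just let 64 requests pile up behind blocking calls at once.
    print('elapsed: %.1fs' % (time.time() - start))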

That didn't answer your question. I've been meaning to ask the same one since 
we discovered this. :)

- Chris

 On Apr 22, 2014, at 3:54 PM, Devananda van der Veen devananda@gmail.com 
 wrote:
 
 Hi!
 
 When a project is using oslo.messaging, how can we change our default 
 rpc_thread_pool_size?
 
 ---
 Background
 
 Ironic has hit a bug where a flood of API requests can deplete the RPC worker 
 pool on the other end and cause things to break in very bad ways. Apparently, 
 nova-conductor hit something similar a while back too. There've been a few 
 long discussions on IRC about it, tracked partially here:
   https://bugs.launchpad.net/ironic/+bug/1308680
 
 tl;dr: a way we can fix this is to set the rpc_thread_pool_size very small 
 (e.g., 4) and keep our conductor.worker_pool size near its current value (e.g., 
 64). I'd like these to be the default option values, rather than require 
 every user to change the rpc_thread_pool_size in their local ironic.conf file.
 
 We're also about to switch from the RPC module in oslo-incubator to using the 
 oslo.messaging library.
 
 Why are these related? Because it looks impossible for us to change the 
 default for this option from within Ironic, because the option is registered 
 when EventletExecutor is instantiated (rather than loaded).
 
 https://github.com/openstack/oslo.messaging/blob/master/oslo/messaging/_executors/impl_eventlet.py#L76
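A minimal sketch of the timing problem being described, assuming oslo.config's 
standard set_default() behaviour:

    from oslo.config import cfg

    CONF = cfg.CONF

    try:
        # What a consuming project would like to do at startup...
        CONF.set_default('rpc_thread_pool_size', 4)
    except cfg.NoSuchOptError:
        # ...but the option only gets registered when oslo.messaging's
        # EventletExecutor is instantiated, so at this point it does not
        # exist yet and the new default cannot be applied.
        pass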
 
 
 Thanks,
 Devananda


Re: [openstack-dev] [Openstack] [nova] Havana - Icehouse upgrades with cells

2014-04-22 Thread Chris Behrens

On Apr 19, 2014, at 11:08 PM, Sam Morrison sorri...@gmail.com wrote:

 Thanks for the info Chris, I’ve actually managed to get things working. 
 Haven’t tested everything fully but seems to be working pretty good.
 
 On 19 Apr 2014, at 7:26 am, Chris Behrens cbehr...@codestud.com wrote:
 
 The problem here is that Havana is not going to know how to backport the 
 Icehouse object, even if it had the conductor methods to do so… unless you’re 
 running the Icehouse conductor. But yes, your nova-computes would also need 
 the code to understand to hit conductor to do the backport, which we must 
 not have in Havana?
 
 OK, this conductor API method was actually backported to Havana; it kept its 
 1.62 version for the method, but in the Havana conductor manager it is set to 1.58.
 That is easily fixed, but then it gets worse. I may be missing something, but 
 the object_backport method doesn’t work at all and, looking at the signature, 
 never worked?
 I’ve raised a bug: https://bugs.launchpad.net/nova/+bug/1308805

(CCing openstack-dev and Dan Smith)

That looked wrong to me as well, and then I talked with Dan Smith and he 
reminded me the RPC deserializer would turn that primitive into an object on 
the conductor side. The primitive there is the full primitive we use to wrap 
the object with the versioning information, etc.

Does your backport happen to not pass the full object primitive? Or is it maybe 
missing the object RPC deserializer on conductor? (I would think that would 
have to be set in Havana.)  nova/service.py would have:

    serializer = objects_base.NovaObjectSerializer()

    self.rpcserver = rpc.get_server(target, endpoints, serializer)
    self.rpcserver.start()

I’m guessing that’s there… so I would think maybe the object_backport call you 
have is not passing the full primitive.
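For anyone following along, the "full primitive" is the versioned envelope the 
object serializer puts on the wire; roughly the following, with the field values 
shown here purely illustrative:

    # What Instance.obj_to_primitive() roughly produces. It is this whole
    # envelope -- not just the inner 'data' dict -- that object_backport()
    # needs, because the version information lives in the wrapper.
    primitive = {
        'nova_object.name': 'Instance',
        'nova_object.namespace': 'nova',
        'nova_object.version': '1.13',
        'nova_object.data': {
            'uuid': '9a7b5d30-0000-0000-0000-000000000000',
            'vm_state': 'active',
            # ...remaining instance fields...
        },
        'nova_object.changes': ['vm_state'],
    }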

I don’t have the time to peek at your code on github right this second, but 
maybe later. :)

- Chris


 
 This also means that if you don’t want your computes on Icehouse yet, you 
 must actually be using nova-conductor and not use_local=True for it. (I saw 
 the patch go up to fix the objects use of conductor API… so I’m guessing you 
 must be using local right now?)
 
 Yeah we still haven’t moved to use conductor so if you also don’t use 
 conductor you’ll need the simple fix at bug: 
 https://bugs.launchpad.net/nova/+bug/1308811
 
 So, I think an upgrade process could be:
 
 1) Backport the ‘object backport’ code into Havana.
 2) Set up *Icehouse* nova-conductor in your child cells and use_local=False 
 on your nova-computes
 3) Restart your nova-computes.
 4) Update *all* nova-cells processes (in all cells) to Icehouse. You can 
 keep use_local=False on these, but you’ll need that object conductor API 
 patch.
 
 At this point you’d have all nova-cells and all nova-conductors on Icehouse 
 and everything else on Havana. If the Havana computes are able to talk to 
 the Icehouse conductors, they should be able to backport any newer object 
 versions. Same with nova-cells receiving older objects from nova-api. It 
 should be able to backport them.
 
 After this, you should be able to upgrade nova-api… and then probably 
 upgrade your nova-computes on a cell-by-cell basis.
 
 I don’t *think* nova-scheduler is getting objects yet, especially if you’re 
 somehow magically able to get builds to work in what you tested so far. :) 
 But if it is, you may find that you need to insert an upgrade of your 
 nova-schedulers to Icehouse between steps 3 and 4 above…or maybe just after 
 #4… so that it can backport objects, also.
 
 I still doubt this will work 100%… but I dunno. :)  And I could be missing 
 something… but… I wonder if that makes sense?
 
 What I have is an Icehouse API cell and a Havana compute cell and Havana 
 compute nodes with the following changes:
 
 Change the method signature of attach_volume to match Icehouse; the 
 additional arguments are optional and don’t seem to break things if you 
 ignore them.
 https://bugs.launchpad.net/nova/+bug/1308846
 
 Needed a small fix for unlocking, there is a race condition that I have a fix 
 for but haven’t pushed up.
 
 Then I hacked up a fix for object back porting.
 The code is at 
 https://github.com/NeCTAR-RC/nova/commits/nectar/havana-icehouse-compat
 The last three commits are the fixes needed. 
 I still need to push up the unlocking one and also a minor fix for metadata 
 syncing with deleting and notifications.
 
 Would love to get the object back porting stuff fixed properly from someone 
 who knows how all the object stuff works.
 
 Cheers,
 Sam



Re: [openstack-dev] [Ironic] Should we adopt a blueprint design process

2014-04-17 Thread Chris Behrens
+1

 On Apr 17, 2014, at 12:27 PM, Russell Haering russellhaer...@gmail.com 
 wrote:
 
 Completely agree.
 
 We're spending too much time discussing features after they're implemented, 
 which makes contribution more difficult for everyone. Forcing an explicit 
 design+review process, using the same tools as we use for coding+review seems 
 like a great idea. If it doesn't work we can iterate.
 
 
 On Thu, Apr 17, 2014 at 11:01 AM, Kyle Mestery mest...@noironetworks.com 
 wrote:
 On Thu, Apr 17, 2014 at 12:11 PM, Devananda van der Veen
 devananda@gmail.com wrote:
  Hi all,
 
  The discussion of blueprint review has come up recently for several 
  reasons,
  not the least of which is that I haven't yet reviewed many of the 
  blueprints
  that have been filed recently.
 
  My biggest issue with launchpad blueprints is that they do not provide a
  usable interface for design iteration prior to writing code. Between the
  whiteboard section, wikis, and etherpads, we have muddled through a few
  designs (namely cinder and ceilometer integration) with accuracy, but the
  vast majority of BPs are basically reviewed after they're implemented. This
  seems to be a widespread objection to launchpad blueprints within the
  OpenStack community, which others are trying to solve. Having now looked at
  what Nova is doing with the nova-specs repo, and considering that TripleO 
  is
  also moving to that format for blueprint submission, and considering that 
  we
  have a very good "review things in gerrit" culture in the Ironic community
  already, I think it would be a very positive change.
 
  For reference, here is the Nova discussion thread:
  http://lists.openstack.org/pipermail/openstack-dev/2014-March/029232.html
 
  and the specs repo BP template:
  https://github.com/openstack/nova-specs/blob/master/specs/template.rst
 
  So, I would like us to begin using this development process over the course
  of Juno. We have a lot of BPs up right now that are light on details, and,
  rather than iterate on each of them in launchpad, I would like to propose
  that:
  * we create an ironic-specs repo, based on Nova's format, before the summit
  * I will begin reviewing BPs leading up to the summit, focusing on features
  that were originally targeted to Icehouse and didn't make it, or are
  obviously achievable for J1
  * we'll probably discuss blueprints and milestones at the summit, and will
  probably adjust targets
  * after the summit, for any BP not targeted to J1, we require blueprint
  proposals to go through the spec review process before merging any
  associated code.
 
  Cores and interested parties, please reply to this thread with your
  opinions.
 
 I think this is a great idea Devananda. The Neutron community has
 moved to this model for Juno as well, and people have been very
 positive so far.
 
 Thanks,
 Kyle
 
  --
  Devananda
 


[openstack-dev] oslo removal of use_tpool conf option

2014-04-17 Thread Chris Behrens

I’m going to try to not lose my cool here, but I’m extremely upset by this.

In December, oslo apparently removed the code for ‘use_tpool’ which allows you 
to run DB calls in Threads because it was ‘eventlet specific’. I noticed this 
when a review was posted to nova to add the option within nova itself:

https://review.openstack.org/#/c/59760/
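For context, a rough sketch of what the option enabled; the DB call below is just 
a stand-in, while the real code wrapped nova's DB API with eventlet's thread pool:

    import time

    from eventlet import tpool


    def blocking_db_call(instance_uuid):
        # Stand-in for a blocking sqlalchemy/DB-driver call.
        time.sleep(0.1)
        return {'uuid': instance_uuid, 'power_state': 1}


    # With use_tpool enabled, DB calls were dispatched to a real OS thread via
    # eventlet's tpool, so the calling greenthread yields and the rest of the
    # service keeps running instead of blocking on the database.
    result = tpool.execute(blocking_db_call, 'fake-uuid')
    print(result)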

I objected to this and asked (more demanded) for this to be added back into 
oslo. It was not. What I did not realize when I was reviewing this nova patch, 
was that nova had already synced oslo’s change. And now we’ve released Icehouse 
with a conf option missing that existed in Havana. Whatever projects were using 
oslo’s DB API code have had this option disappear (unless an alternative was 
merged). Maybe it’s only nova.. I don’t know.

Some sort of process broke down here.  nova uses oslo.  And oslo removed 
something nova uses without deprecating or merging an alternative into nova 
first. How I believe this should have worked:

1) All projects using oslo’s DB API code should have merged an alternative 
first.
2) Remove code from oslo.
3) Then sync oslo.

What do we do now? I guess we’ll have to back port the removed code into nova. 
I don’t know about other projects.

NOTE: Very few people are probably using this, because it doesn’t work without 
a patched eventlet. However, Rackspace happens to be one that does. And anyone 
waiting on a new eventlet to be released such that they could use this with 
Icehouse is currently out of luck.

- Chris




Re: [openstack-dev] oslo removal of use_tpool conf option

2014-04-17 Thread Chris Behrens

On Apr 17, 2014, at 4:26 PM, Joshua Harlow harlo...@yahoo-inc.com wrote:

 Just an honest question (no negativity intended I swear!).
 
 If a configuration option exists and only works with a patched eventlet why 
 is that option an option to begin with? (I understand the reason for the 
 patch, don't get me wrong).
 

Right, it’s a valid question. This feature has existed one way or another in 
nova for quite a while. Initially the implementation in nova was wrong. I did 
not know that eventlet was also broken at the time, although I discovered it in 
the process of fixing nova’s code. I chose to leave the feature because it’s 
something that we absolutely need long term, unless you really want to live 
with DB calls blocking the whole process. I know I don’t. Unfortunately the bug 
in eventlet is out of our control. (I made an attempt at fixing it, but it’s 
not 100%. Eventlet folks currently have an alternative up that may or may not 
work… but certainly is not in a release yet.)  We have an outstanding bug on 
our side to track this, also.

The below is comparing apples/oranges for me.

- Chris


 Most users would not be able to use such a configuration since they do not 
 have this patched eventlet (I assume a newer version of eventlet someday in 
 the future will have this patch integrated in it?), so although I understand 
 the frustration around this, I don't understand why it would be an option in 
 the first place. As an aside, if the only way to use this option is via a 
 non-standard eventlet, then how is this option tested in the community, aka 
 outside of said company?
 
 An example:
 
 If Yahoo has some patched kernel A that requires an XYZ config turned on in 
 OpenStack, and the only way to take advantage of kernel A is with XYZ config 
 'on', then it seems like that’s a Yahoo-only patch that is not testable and 
 usable for others. Even if patched kernel A is somewhere on github, it's 
 still IMHO not something that should be an option in the community (anyone can 
 throw stuff up on github and then say I need XYZ config to use it).
 
 To me non-standard patches that require XYZ config in openstack shouldn't be 
 part of the standard openstack, no matter the company. If patch A is in the 
 mainline kernel (or other mainline library), then sure it's fair game.
 
 -Josh
 
 From: Chris Behrens cbehr...@codestud.com
 Reply-To: OpenStack Development Mailing List (not for usage questions) 
 openstack-dev@lists.openstack.org
 Date: Thursday, April 17, 2014 at 3:20 PM
 To: OpenStack Development Mailing List openstack-dev@lists.openstack.org
 Subject: [openstack-dev] oslo removal of use_tpool conf option
 
 
 I’m going to try to not lose my cool here, but I’m extremely upset by this.
 
 In December, oslo apparently removed the code for ‘use_tpool’ which allows 
 you to run DB calls in Threads because it was ‘eventlet specific’. I noticed 
 this when a review was posted to nova to add the option within nova itself:
 
 https://review.openstack.org/#/c/59760/
 
 I objected to this and asked (more demanded) for this to be added back into 
 oslo. It was not. What I did not realize when I was reviewing this nova 
 patch, was that nova had already synced oslo’s change. And now we’ve 
 released Icehouse with a conf option missing that existed in Havana. 
 Whatever projects were using oslo’s DB API code have had this option 
 disappear (unless an alternative was merged). Maybe it’s only nova.. I don’t 
 know.
 
 Some sort of process broke down here.  nova uses oslo.  And oslo removed 
 something nova uses without deprecating or merging an alternative into nova 
 first. How I believe this should have worked:
 
 1) All projects using oslo’s DB API code should have merged an alternative 
 first.
 2) Remove code from oslo.
 3) Then sync oslo.
 
 What do we do now? I guess we’ll have to back port the removed code into 
 nova. I don’t know about other projects.
 
 NOTE: Very few people are probably using this, because it doesn’t work 
 without a patched eventlet. However, Rackspace happens to be one that does. 
 And anyone waiting on a new eventlet to be released such that they could use 
 this with Icehouse is currently out of luck.
 
 - Chris
 
 



Re: [openstack-dev] [Nova] Thoughts from the PTL

2014-04-14 Thread Chris Behrens

On Apr 13, 2014, at 9:58 PM, Michael Still mi...@stillhq.com wrote:

 First off, thanks for electing me as the Nova PTL for Juno. I find the

First off, congrats!

 * a mid cycle meetup. I think the Icehouse meetup was a great success,
 and I'd like to see us do this again in Juno. I'd also like to get the
 location and venue nailed down as early as possible, so that people
 who have complex travel approval processes have a chance to get travel
 sorted out. I think its pretty much a foregone conclusion this meetup
 will be somewhere in the continental US. If you're interested in
 hosting a meetup in approximately August, please mail me privately so
 we can chat.

I think one of the outcomes from the first one was that we should try to do it 
earlier. Feature freeze would be somewhere around the first week of September. I’d 
like to see us do it the last week of July at the latest, I think. That is 
still ‘approximately August’, I guess. :)

Thoughts?

- Chris


Re: [openstack-dev] Dropping or weakening the 'only import modules' style guideline - H302

2014-04-09 Thread Chris Behrens

On Apr 9, 2014, at 12:50 PM, Dan Smith d...@danplanet.com wrote:

 So I'm a soft -1 on dropping it from hacking.
 
 Me too.
 
 from testtools import matchers
 ...
 
 Or = matchers.Or
 LessThan = matchers.LessThan
 ...
 
 This is the right way to do it, IMHO, if you have something like
 matchers.Or that needs to be treated like part of the syntax. Otherwise,
 module-only imports massively improve the ability to find where
 something comes from.

+1

My eyes bleed when I open up a Python script and find a million imports for 
individual functions and classes.

- Chris





Re: [openstack-dev] Rolling upgrades in icehouse

2014-03-24 Thread Chris Behrens

On Mar 24, 2014, at 12:31 PM, Tim Bell tim.b...@cern.ch wrote:

 
 How does this interact with cells? Can the cell API instances be upgraded 
 independently of the cells themselves?
 
 My ideal use case would be
 
 - It would be possible to upgrade one of the cells (such as a QA environment) 
 before the cell API nodes
 - Cells can be upgraded one-by-one as needed by stability/functionality
 - API cells can be upgraded during this process ... i.e. midway, before the 
 most critical cells are migrated
 
 Is this approach envisaged?

That would be my goal long term, but I’m not sure it’ll work right now. :)  We 
did try to take care in making sure that the cells manager is backwards 
compatible. I think all messages going DOWN to the child cell from the API will 
work. However, what I could possibly see as broken is messages coming from a 
child cell back up to the API cell. I believe we changed instance updates to 
pass objects back up…  The objects will fail to deserialize right now in the 
API cell, because it could get a newer version and not know how to deal with 
it. If we added support to make nova-cells always redirect via conductor, it 
could actually down-dev the object, but that has performance implications 
because of all of the DB updates the API nova-cells does. There are a number of 
things that I think cells doesn’t pass as objects yet, either, which could be a 
problem.

So, in other words, I think the answer right now is there really is no great 
upgrade plan wrt cells other than just taking a hit and doing everything at 
once. I’d love to fix that, as I think it should work as you describe some day. 
We have work to do to make sure we’re actually passing objects everywhere.. and 
then need to think about how we can get the API cell to be able to deserialize 
newer object versions.

- Chris




Re: [openstack-dev] [nova] nova-compute not re-establishing connectivity after controller switchover

2014-03-24 Thread Chris Behrens
Do you have some sort of network device like a firewall between your compute 
and rabbit, or did you fail over from one rabbit to another? The only cases where 
I've seen this happen are when the compute-side OS doesn't detect a closed 
connection for various reasons. I'm on my phone and didn't check your logs, but 
thought I'd throw it out there. If the OS (Linux) doesn't know the connection 
is dead, then obviously the userland software will not, either. You can 
netstat on both sides of the connection to see if something is out of whack.

 On Mar 24, 2014, at 10:40 AM, Chris Friesen chris.frie...@windriver.com 
 wrote:
 
 On 03/24/2014 11:31 AM, Chris Friesen wrote:
 
 It looks like we're raising
 
 RecoverableConnectionError: connection already closed
 
 down in /usr/lib64/python2.7/site-packages/amqp/abstract_channel.py, but
 nothing handles it.
 
 It looks like the most likely place that should be handling it is
 nova.openstack.common.rpc.impl_kombu.Connection.ensure().
 
 
 In the current oslo.messaging code the ensure() routine explicitly
 handles connection errors (which RecoverableConnectionError is) and
 socket timeouts--the ensure() routine in Havana doesn't do this.
 
 I misread the code, ensure() in Havana does in fact monitor socket timeouts, 
 but it doesn't handle connection errors.
 
 It looks like support for handling connection errors was added to 
 oslo.messaging just recently in git commit 0400cbf.  The git commit comment 
 talks about clustered rabbit nodes and mirrored queues which doesn't apply to 
 our scenario, but I suspect it would probably fix the problem that we're 
 seeing as well.
 
 Chris
 


Re: [openstack-dev] [nova] An analysis of code review in Nova

2014-03-22 Thread Chris Behrens
I'd like to get spawn broken up sooner rather than later, personally. It has 
the additional benefit of enabling better orchestration of builds from 
conductor, etc.

On Mar 14, 2014, at 3:58 PM, Dan Smith d...@danplanet.com wrote:

 Just to answer this point, despite the review latency, please don't be
 tempted to think one big change will get in quicker than a series of
 little, easy to review, changes. All changes are not equal. A large
 change often scares me away to easier to review patches.
 
 Seems like, for Juno-1, it would be worth cancelling all non-urgent
 bug fixes, and doing the refactoring we need.
 
 I think the aim here should be better (and easier to understand) unit
 test coverage. Thats a great way to drive good code structure.
 
 Review latency will be directly affected by how good the refactoring
 changes are staged. If they are small, on-topic and easy to validate,
 they will go quickly. They should be linearized unless there are some
 places where multiple sequences of changes make sense (i.e. refactoring
 a single file that results in no changes required to others).
 
 As John says, if it's just a big change everything patch, or a ton of
 smaller ones that don't fit a plan or process, then it will be slow and
 painful (for everyone).
 
 +1 sounds like a good first step is to move to oslo.vmware
 
 I'm not sure whether I think that refactoring spawn would be better done
 first or second. My gut tells me that doing spawn first would mean that
 we could more easily validate the oslo refactors because (a) spawn is
 impossible to follow right now and (b) refactoring it to smaller methods
 should be fairly easy. The tests for spawn are equally hard to follow
 and refactoring it first would yield a bunch of more unit-y tests that
 would help us follow the oslo refactoring.
 
 However, it sounds like the osloificastion has maybe already started and
 that refactoring spawn will have to take a backseat to that.
 
 --Dan
 


Re: [openstack-dev] [nova] Backwards incompatible API changes

2014-03-21 Thread Chris Behrens

FWIW, I’m fine with any of the options posted. But I’m curious about the 
precedent that reverting would create. It essentially sounds like if we 
release a version with an API bug, the bug is no longer a bug in the API and 
the bug becomes a bug in the documentation. The only way to ‘fix’ the API then 
would be to rev it. Is that an accurate representation, and is that desirable? 
Or do we just say we take these on a case-by-case basis?

- Chris


On Mar 21, 2014, at 10:34 AM, David Kranz dkr...@redhat.com wrote:

 On 03/21/2014 05:04 AM, Christopher Yeoh wrote:
 On Thu, 20 Mar 2014 15:45:11 -0700
 Dan Smith d...@danplanet.com wrote:
 I know that our primary delivery mechanism is releases right now, and
 so if we decide to revert before this gets into a release, that's
 cool. However, I think we need to be looking at CD as a very important
 use-case and I don't want to leave those folks out in the cold.
 
 I don't want to cause issues for the CD people, but perhaps it won't be
 too disruptive for them (some direct feedback would be handy). The
 initial backwards incompatible change did not result in any bug reports
 coming back to us at all. If there were lots of users using it I think
 we could have expected some complaints as they would have had to adapt
 their programs to no longer manually add the flavor access (otherwise
 that would fail). It is of course possible that new programs written in
 the meantime would rely on the new behaviour.
 
 I think (please correct me if I'm wrong) the public CD clouds don't
 expose that part of API to their users so the fallout could be quite
 limited. Some opinions from those who do CD for private clouds would be
 very useful. I'll send an email to openstack-operators asking what
 people there believe the impact would be but at the moment I'm thinking
 that revert is the way we should go.
 
 Could we consider a middle road? What if we made the extension
 silently tolerate an add-myself operation to a flavor, (potentially
 only) right after create? Yes, that's another change, but it means
 that old clients (like horizon) will continue to work, and new
 clients (which expect to automatically get access) will continue to
 work. We can document in the release notes that we made the change to
 match our docs, and that anyone that *depends* on the (admittedly
 weird) behavior of the old broken extension, where a user doesn't
 retain access to flavors they create, may need to tweak their client
 to remove themselves after create.
 My concern is that we'd be digging ourselves an even deeper hole with
 that approach. That for some reason we don't really understand at the
 moment, people have programs which rely on adding flavor access to a
 tenant which is already on the access list being rejected rather than
 silently accepted. And I'm not sure its the behavior from flavor access
 that we actually want.
 
 But we certainly don't want to end up in the situation of trying to
 work out how to rollback two backwards incompatible API changes.
 
 Chris
 Nope.  IMO we should just accept that an incompatible change was made that 
 should not have been, revert it, and move on. I hope that saying our code 
 base is going to support CD does not mean that any incompatible change that 
 slips through our very limited gate cannot be reverted. October was a while 
 back but I'm not sure what principle we would use to draw the line. I am also 
 not sure why this is phrased as a CD vs. not issue. Are the *users* of a 
 system that happens to be managed using CD thought to be more tolerant of 
 their code breaking?
 
 Perhaps it would be a good time to review 
 https://wiki.openstack.org/wiki/Governance/Approved/APIStability and the 
 details of https://wiki.openstack.org/wiki/APIChangeGuidelines to make sure 
 they still reflect the will of the TC and our community.
 
 -David


Re: [openstack-dev] Constructive Conversations

2014-03-18 Thread Chris Behrens

On Mar 18, 2014, at 11:57 AM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote:

 […]
 Not to detract from what you're saying, but this is 'meh' to me. My company 
 has some different kind of values thing every 6 months it seems and maybe 
 it's just me but I never really pay attention to any of it.  I think I have 
 to put something on my annual goals/results about it, but it's just fluffy 
 wording.
 
 To me this is a self-policing community, if someone is being a dick, the 
 others should call them on it, or the PTL for the project should stand up 
 against it and set the tone for the community and culture his project wants 
 to have.  That's been my experience at least.
 
 Maybe some people would find codifying this helpful, but there are already 
 lots of wikis and things that people can't remember on a daily basis so 
 adding another probably isn't going to help the problem. Bullies don't tend 
 to care about codes, but if people stand up against them in public they 
 should be outcast.

I agree with the goals and sentiment of Kurt’s message. But, just to add a 
little to Matt’s reply: Let’s face it. Everyone has a bad day now and then. 
It’s easier for some people to lose their cool over others. Nothing’s going to 
change that.

- Chris



Re: [openstack-dev] [nova] Some thoughts on the nova-specs design process

2014-03-16 Thread Chris Behrens

On Mar 16, 2014, at 7:58 PM, Michael Still mi...@stillhq.com wrote:

 Hi.
 
 So I've written a blueprint for nova for Juno, and uploaded it to
 nova-specs (https://review.openstack.org/#/c/80865/). That got me
 thinking about what this process might look like, and this is what I
 came up with:
 
 * create a launchpad blueprint
 * you write a proposal in the nova-specs repo
 * add the blueprint to the commit message of the design proposal, and
 send the design proposal off for review
 * advertise the existence of the design proposal to relevant stake
 holders (other people who hack on that bit of the code, operators
 mailing list if relevant, etc)
 * when the proposal is approved, it merges into the nova-specs git
 repo and nova-drivers then mark the launchpad blueprint as approved
 * off you go with development as normal
 
 This has the advantage that there's always a launchpad blueprint, and
 that the spec review is associated with that blueprint. That way
 someone who finds the launchpad blueprint but wants to see the actual
 design proposal can easily do so because it is linked as an addressed
 by review on the blueprint.
 
 Thoughts?

Makes sense to me.

- Chris




Re: [openstack-dev] [nova] RFC - using Gerrit for Nova Blueprint review approval

2014-03-06 Thread Chris Behrens

On Mar 6, 2014, at 11:09 AM, Russell Bryant rbry...@redhat.com wrote:
[…]
 I think a dedicated git repo for this makes sense.
 openstack/nova-blueprints or something, or openstack/nova-proposals if
 we want to be a bit less tied to launchpad terminology.

+1 to this whole idea.. and we definitely should have a dedicated repo for 
this. I’m indifferent to its name. :)  Either one of those works for me.

- Chris




Re: [openstack-dev] [nova] Thought exercise for a V2 only world

2014-03-04 Thread Chris Behrens

On Mar 4, 2014, at 4:09 AM, Sean Dague s...@dague.net wrote:

 On 03/04/2014 01:14 AM, Chris Behrens wrote:
 […]
 I don’t think I have an answer, but I’m going to throw out some of my random 
 thoughts about extensions in general. They might influence a longer term 
 decision. But I’m also curious if I’m the only one that feels this way:
 
 I tend to feel like extensions should start outside of nova and any other 
 code needed to support the extension should be implemented by using hooks in 
 nova. The modules implementing the hook code should be shipped with the 
 extension. If hooks don’t exist where needed, they should be created in 
 trunk. I like hooks. Of course, there’s probably such a thing as too many 
 hooks, so… hmm… :)  Anyway, this addresses another annoyance of mine whereby 
 code for extensions is mixed in all over the place. Is it really an 
 extension if all of the supporting code is in ‘core nova’?
 
 That said, I then think that the only extensions shipped with nova are 
 really ones we deem “optional core API components”. “optional” and “core” 
 are probably oxymorons in this context, but I’m just going to go with it. 
 There would be some sort of process by which we let extensions “graduate” 
 into nova.
 
 Like I said, this is not really an answer. But if we had such a model, I 
 wonder if it turns “deprecating extensions” into something more like 
 “deprecating part of the API”… something less likely to happen. Extensions 
 that aren’t used would more likely just never graduate into nova.
 
 So this approach actually really concerns me, because what it says is
 that we should be optimizing Nova for out of tree changes to the API
 which are vendor specific. Which I think is completely the wrong
 direction. Because in that world you'll never be able to move between
 Nova installations. What's worse is you'll get multiple people
 implementing the same feature out of tree, slightly differently.

Right. And I have an internal conflict because I also tend to agree with what 
you’re saying. :) But I think that if we have API extensions at all, we have 
your issue of “never being able to move”. Well, maybe not “never”, because at 
least they’d be easy to “turn on” if they are in nova. But I think for the 
random API extension that only 1 person ever wants to enable, there’s your same 
problem. This is somewhat off-topic, but I just don’t want a ton of bloat in 
nova for something few people use.

 
 I 100% agree the current extensions approach is problematic. It's used
 as a way to circumvent the idea of a stable API (mostly with "oh, it's
 an extension, we need this feature right now, and it's not part of core
 so we don't need to give the same guarantees").

Yeah, totally..  that’s bad.

 
 So realistically I want to march us towards a place where we stop doing
 that. Nova out of the box should have all the knobs that anyone needs to
 build these kinds of features on top of. If not, we should fix that. It
 shouldn't be optional.

Agree, although I’m not sure if I’m reading this correctly as it sounds like 
you want the knobs that you said above concern you. I want some sort of 
balance. There’s extensions I think absolutely should be part of nova as 
optional features… but I don’t want everything. :)

- Chris



Re: [openstack-dev] [nova] Thought exercise for a V2 only world

2014-03-04 Thread Chris Behrens

On Mar 4, 2014, at 11:14 AM, Sean Dague s...@dague.net wrote:

 
 I want to give the knobs to the users. If we thought it was important
 enough to review and test in Nova, then we made a judgement call that
 people should have access to it.

Oh, I see. But, I don’t agree, certainly not for every single knob. It’s less 
of an issue in the private cloud world, but when you start offering this as a 
service, not everything is appropriate to enable.

- Chris



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Thought exercise for a V2 only world

2014-03-03 Thread Chris Behrens

On Mar 3, 2014, at 9:23 PM, Joe Gordon joe.gord...@gmail.com wrote:

 Hi All,
 
 here's a case worth exploring in a v2 only world ... what about some
 extension we really think is dead and should go away?  can we ever
 remove it? In the past we have said backwards compatibility means no
 we cannot remove any extensions, if we adopt the v2 only notion of
 backwards compatibility is this still true?

I don’t think I have an answer, but I’m going to throw out some of my random 
thoughts about extensions in general. They might influence a longer term 
decision. But I’m also curious if I’m the only one that feels this way:

I tend to feel like extensions should start outside of nova and any other code 
needed to support the extension should be implemented by using hooks in nova. 
The modules implementing the hook code should be shipped with the extension. If 
hooks don’t exist where needed, they should be created in trunk. I like hooks. 
Of course, there’s probably such a thing as too many hooks, so… hmm… :)  
Anyway, this addresses another annoyance of mine whereby code for extensions is 
mixed in all over the place. Is it really an extension if all of the supporting 
code is in ‘core nova’?

That said, I then think that the only extensions shipped with nova are really 
ones we deem “optional core API components”. “optional” and “core” are probably 
oxymorons in this context, but I’m just going to go with it. There would be 
some sort of process by which we let extensions “graduate” into nova.

Like I said, this is not really an answer. But if we had such a model, I wonder 
if it turns “deprecating extensions” into something more like “deprecating part 
of the API”… something less likely to happen. Extensions that aren’t used would 
more likely just never graduate into nova.

- Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Future of the Nova API

2014-02-26 Thread Chris Behrens

This thread is many messages deep now and I’m busy with a conference this week, 
but I wanted to carry over my opinion from the other “v3 API in Icehouse” 
thread and add a little to it.

Bumping versions is painful. v2 is going to need to live for “a long time” to 
create the least amount of pain. I would think that at least anyone running a 
decent sized Public Cloud would agree, if not anyone just running any sort of 
decent sized cloud. I don’t think there’s a compelling enough reason to 
deprecate v2 and cause havoc with what we currently have in v3. I’d like us to 
spend more time on the proposed “tasks” changes. And I think we need more time 
to figure out if we’re doing versioning in the correct way. If we’ve got it 
wrong, a v3 doesn’t fix the problem and we’ll just be causing more havoc with a 
v4.

- Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Future of the Nova API

2014-02-26 Thread Chris Behrens
Again, just another quick response, but if we can find a way to merge v2 into 
the current v3 code, so that we don't have dual maintenance, that would be 
really nice.

 On Feb 26, 2014, at 5:15 PM, Christopher Yeoh cbky...@gmail.com wrote:
 
 On Wed, 26 Feb 2014 16:04:38 -0600
 Chris Behrens cbehr...@codestud.com wrote:
 
 This thread is many messages deep now and I’m busy with a conference
 this week, but I wanted to carry over my opinion from the other “v3
 API in Icehouse” thread and add a little to it.
 
 Bumping versions is painful. v2 is going to need to live for “a long
 time” to create the least amount of pain. I would think that at least
 anyone running a decent sized Public Cloud would agree, if not anyone
 just running any sort of decent sized cloud. I don’t think there’s a
 compelling enough reason to deprecate v2 and cause havoc with what we
 currently have in v3. I’d like us to spend more time on the proposed
 “tasks” changes. And I think we need more time to figure out if we’re
 doing versioning in the correct way. If we’ve got it wrong, a v3
 doesn’t fix the problem and we’ll just be causing more havoc with a
 v4.
 
 So I guess I agree tasks is something we should develop further and
 that makes significant non backwards compatible changes to the API -
 which is the major reason why we delayed V3. And its really important
 that we get those changes right so we don't need a v4.
 
 However, keeping V3 experimental indefinitely doesn't actually remove
 the dual maintenance burden. The only way to do that is eventually
 remove either the V2 or V3 version or do the suggested backport. 
 
 We've pretty well established that starting a fresh v3 API is a
 multi cycle effort. If we remove the V3 api code in Juno and then
 start working on a new major version bump at a later date at say L or
 M it'll be another multi cycle effort which I doubt would be
 feasible, especially with people knowing there is the real risk at the
 end that it'll just get thrown away. 
 
 And the alternative of not removing V3 leaves the extra maintenance
 burden. So whilst I agree with making sure we get it right but I'm
 wondering exactly what you mean by taking more time to figure out
 what we're doing - is it removing the V3 API code and just coping
 with extra maintenance burden? Or removing it and then trying to do
 a big multi cycle effort again a few cycles down the track?
 
 Chris
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] v3 API in Icehouse

2014-02-19 Thread Chris Behrens
+1. I'd like to leave it experimental as well. I think the task work is 
important to the future of nova-api and I'd like to make sure we're not rushing 
anything. We're going to need to live with old API versions for a long time, so 
it's important that we get it right. I'm also not convinced there's a 
compelling enough reason for one to move to v3 as it is. Extension versioning 
is important, but I'm not sure it can't be backported to v2 in the meantime.

- Chris

 On Feb 19, 2014, at 9:36 AM, Russell Bryant rbry...@redhat.com wrote:
 
 Greetings,
 
 The v3 API effort has been going for a few release cycles now.  As we
 approach the Icehouse release, we are faced with the following question:
 Is it time to mark v3 stable?
 
 My opinion is that I think we need to leave v3 marked as experimental
 for Icehouse.
 
 There are a number of reasons for this:
 
 1) Discussions about the v2 and v3 APIs at the in-person Nova meetup
 last week made me come to the realization that v2 won't be going away
 *any* time soon.  In some cases, users have long term API support
 expectations (perhaps based on experience with EC2).  In the best case,
 we have to get all of the SDKs updated to the new API, and then get to
 the point where everyone is using a new enough version of all of these
 SDKs to use the new API.  I don't think that's going to be quick.
 
 We really don't want to be in a situation where we're having to force
 any sort of migration to a new API.  The new API should be compelling
 enough that everyone *wants* to migrate to it.  If that's not the case,
 we haven't done our job.
 
 2) There's actually quite a bit still left on the existing v3 todo list.
 We have some notes here:
 
 https://etherpad.openstack.org/p/NovaV3APIDoneCriteria
 
 One thing is nova-network support.  Since nova-network is still not
 deprecated, we certainly can't deprecate the v2 API without nova-network
 support in v3.  We removed it from v3 assuming nova-network would be
 deprecated in time.
 
 Another issue is that we discussed the tasks API as the big new API
 feature we would include in v3.  Unfortunately, it's not going to be
 complete for Icehouse.  It's possible we may have some initial parts
 merged, but it's much smaller scope than what we originally envisioned.
 Without this, I honestly worry that there's not quite enough compelling
 functionality yet to encourage a lot of people to migrate.
 
 3) v3 has taken a lot more time and a lot more effort than anyone
 thought.  This makes it even more important that we're not going to need
 a v4 any time soon.  Due to various things still not quite wrapped up,
 I'm just not confident enough that what we have is something we all feel
 is Nova's API of the future.
 
 
 Let's all take some time to reflect on what has happened with v3 so far
 and what it means for how we should move forward.  We can regroup for Juno.
 
 Finally, I would like to thank everyone who has helped with the effort
 so far.  Many hours have been put in to code and reviews for this.  I
 would like to specifically thank Christopher Yeoh for his work here.
 Chris has done an *enormous* amount of work on this and deserves credit
 for it.  He has taken on a task much bigger than anyone anticipated.
 Thanks, Chris!
 
 -- 
 Russell Bryant
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Sent the first batch of invitations to Atlanta's Summit

2014-02-19 Thread Chris Behrens

On Jan 28, 2014, at 12:45 PM, Stefano Maffulli stef...@openstack.org wrote:

 A few minutes ago we sent the first batch of invites to people who
 contributed to any of the official OpenStack programs[1] from 00:00 UTC
 on April 4, 2014 (Grizzly release day) until present.

Something tells me that this date is not correct? :)

- Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Asynchrounous programming: replace eventlet with asyncio

2014-02-07 Thread Chris Behrens

On Feb 7, 2014, at 8:21 AM, Jesse Noller jesse.nol...@rackspace.com wrote:

 It seems that baking concurrency models into the individual clients / 
 services adds some opinionated choices that may not scale, or fit the needs 
 of a large-scale deployment. This is one of the things looking at the client 
 tools I’ve noticed - don’t dictate a concurrency backend, treat it as 
 producer/consumer/message passing and you end up with something that can 
 potentially scale out a lot more. 

I agree, and I think we should do this with our own clients. However, on the 
service side, there are a lot of 3rd party modules that would need the support 
as well. libvirt, xenapi, pyamqp, qpid, kombu (sits on pyamqp), etc, come to 
mind as the top possibilities.

I was also going to change direction in this reply and say that we should back 
up and come up with a basic set of requirements. In this thread, I think I’ve 
only seen arguments against various technology choices without a clear list of 
our requirements. Since Chuck has posted in the meantime, I’m going to start 
(what I view) should be some of our requirements in reply to him.

- Chris



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Asynchrounous programming: replace eventlet with asyncio

2014-02-07 Thread Chris Behrens
I want to address some of Chuck’s post, but I think we should come up with a 
list of requirements. Replies to Chuck inline, and then some requirements below:

On Feb 7, 2014, at 8:38 AM, Chuck Thier cth...@gmail.com wrote:

 Concurrency is hard, let's blame the tools!
 
 Any lib that we use in python is going to have a set of trade-offs.  Looking 
 at a couple of the options on the table:
 
 1.  Threads:  Great! code doesn't have to change too much, but now that code 
 *will* be preempted at any time, so now we have to worry about locking and we 
 have even more race conditions that are difficult to debug.

Yes. I mean, as was pointed out earlier in this thread, there are also some 
gotchas when using eventlet, but there are a lot of cases that you 100% know 
will not result in a context switch. We’ve been able to avoid locks for this 
reason. (Although I also feel like if there’s cases where locking would be 
necessary when using Threads, we should look at how we can re-factor to avoid 
them. It tends to mean we’re sharing too much globally.)
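
To illustrate the kind of thing I mean, here's a trivial made-up example (not
real nova code): under eventlet this check-then-set needs no lock, because
there is no yield point between the check and the assignment, but under real
preemptive threads it would be a race.

# purely illustrative; the names are invented
_reservations = {}

def reserve(key, value):
    if key in _reservations:        # no I/O between this check...
        return False
    _reservations[key] = value      # ...and this set, so no greenthread switch
    return True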

Besides the locking issue, our current model of creating a million greenthreads 
would not work well if we simply converted them to Threads. Our processes are 
already using way too much memory as it is (a separate issue that needs 
investigation). This becomes even worse if we only support async by using 
worker processes, as was suggested and commented on earlier in this thread.

 
 2.  Asyncio:  Explicit FTW!  Except now that big list of dependencies has to 
 also support the same form of explicit concurrency.  This is a trade-off that 
 twisted makes as well.  Any library that might block has to have a separate 
 library made for it.
 
 We could dig deeper, but hopefully you see what I mean.  Changing tools may 
 solve one problem, but at the same time introduce a different set of problems.

Yeah, exactly what I was trying to point out last night in my quick reply 
before bed. :)  This should really be amended to say ‘not monkey-patching’ 
instead of ‘asyncio’. I realized that as soon as I hit Send last night. An 
implementation that would monkey patch and use asyncio underneath doesn’t have 
this issue.

 
 I think the biggest issue with using Eventlet is that developers want to 
 treat it like magic, and you can't do that.  If you are monkey patching the 
 world, then you are doing it wrong.  How about we take a moment to learn how 
 to use the tools we have effectively, rather than just blaming them.  Many 
 projects have managed to use Eventlet effectively (including some in 
 Openstack).

In general, I agree with the ‘monkey patching the world’ statement. Except that 
tests are exempt from that argument. ;) But it may be a necessary evil.

 
 Eventlet isn't perfect, but it has gotten us quite a ways.  If you do choose 
 to use another library, please make sure you are trading for the right set of 
 problems.
 

Which is what leads me to wanting us to get a list of our requirements before 
we make any decisions.

1) Socket/fifo/pipe I/O cannot block ‘other work’.
2) Currently executing code that has potential to block for long periods of 
time need the ability to easily yield for ‘other work’ to be done. This 
statement is general, but I’m thinking about file I/O here. For example, if a 
block of code needs to copy a large file, it needs to be able to yield now and 
then.
3) Semaphores/locks/etc cannot block ‘other work’ that is not trying to acquire 
the same lock.
4) OS calls such as ‘wait’ or ‘waitpid’ need to not block ‘other work’.
5) The solution needs to perform reasonably well.
6) The solution needs to be reasonably resource efficient.
7) The solution needs to fulfill the above requirements even when using 3rd 
party modules.
8) Clients and libraries that we produce need to support the above in a way 
that arbitrary implementations could be used.

I’m debating whether File I/O in #2 should be combined with #1 such that #1 
becomes ‘any I/O’. I might only be separating File I/O out by thinking about 
possible implementations. And I’ve probably missed something.

Anyway, I have opinions on what does and doesn’t satisfy the above, but I’ll 
reply separately. :)

- Chris



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Asynchrounous programming: replace eventlet with asyncio

2014-02-07 Thread Chris Behrens

On Feb 7, 2014, at 2:59 PM, Victor Stinner victor.stin...@enovance.com wrote:

 I don't see why external libraries should be modified. Only the few libraries 
 sending HTTP queries and requests to the database should handle asyncio. 
 Dummy 
 example: the iso8601 module used to parse time doesn't need to be aware of 
 asyncio.

When talking to libvirt, we don't want to block. When we're waiting on rabbit 
or qpid, we don't want to block. When we talk to XenAPI, we don't want to 
block. These are all 3rd party modules. We'd have to convert these all to work 
via a Thread pool, or we would have to monkey patch them like we do today.
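
To make that concrete, a rough sketch of the two options (not nova's actual
code; treat the helper as made up):

import eventlet
eventlet.monkey_patch()   # option 1: patch socket/thread/time globally

from eventlet import tpool

def lookup_domain(libvirt_conn, name):
    # option 2: push a known-blocking C-library call into a real OS thread,
    # so the greenthreads keep running while it waits
    return tpool.execute(libvirt_conn.lookupByName, name)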

- Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Asynchrounous programming: replace eventlet with asyncio

2014-02-06 Thread Chris Behrens

On Feb 6, 2014, at 11:07 PM, Joshua Harlow harlo...@yahoo-inc.com wrote:

 +1
 
 To give an example as to why eventlet implicit monkey patch the world isn't 
 especially great (although it's what we are currently using throughout 
 openstack).
 
 The way I think about how it works is to think about what libraries that a 
 single piece of code calls and how it is very hard to predict whether that 
 code will trigger a implicit switch (conceptually similar to a context 
 switch).

Conversely, switching to asyncio means that every single module call that would 
have blocked before monkey patching… will now block. What is worse? :)

- Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [keystone][nova] Re: Hierarchicical Multitenancy Discussion

2014-02-05 Thread Chris Behrens

Hi Vish,

I’m jumping in slightly late on this, but I also have an interest in this. I’m 
going to preface this by saying that I have not read this whole thread yet, so 
I apologize if I repeat things, say anything that is addressed by previous 
posts, or doesn’t jive with what you’re looking for. :) But what you describe 
below sounds like exactly a use case I’d come up with.

Essentially I want another level above project_id. Depending on the exact use 
case, you could name it ‘wholesale_id’ or ‘reseller_id’...and yeah, ‘org_id’ 
fits in with your example. :) I think that I had decided I’d call it ‘domain’ 
to be more generic, especially after seeing keystone had a domain concept.

Your idea below (prefixing the project_id) is exactly one way I thought of 
doing this to be least intrusive. I, however, thought that this would not be 
efficient. So, I was thinking about proposing that we add ‘domain’ to all of 
our models. But that limits your hierarchy and I don’t necessarily like that. 
:)  So I think that if the queries are truly indexed as you say below, you have 
a pretty good approach. The one issue that comes into mind is that if there’s 
any chance of collision. For example, if project ids (or orgs) could contain a 
‘.’, then ‘.’ as a delimiter won’t work.

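Something like this is what I'm picturing for the query side (made-up helper,
not actual nova code); the LIKE is anchored at the start of the string, so it
can still use an index:

from sqlalchemy import or_

def scope_query_to_project(query, model, project_id):
    # matches 'orga' itself plus anything under 'orga.'; this breaks down if
    # a project id is allowed to contain the '.' delimiter itself
    return query.filter(or_(model.project_id == project_id,
                            model.project_id.like(project_id + '.%')))
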
My requirements could be summed up pretty well by thinking of this as ‘virtual 
clouds within a cloud’. Deploy a single cloud infrastructure that could look 
like many multiple clouds. ‘domain’ would be the key into each different 
virtual cloud. Accessing one virtual cloud doesn’t reveal any details about 
another virtual cloud.

What this means is:

1) domain ‘a’ cannot see instances (or resources in general) in domain ‘b’. It 
doesn’t matter if domain ‘a’ and domain ‘b’ share the same tenant ID. If you 
act with the API on behalf of domain ‘a’, you cannot see your instances in 
domain ‘b’.
2) Flavors per domain. domain ‘a’ can have different flavors than domain ‘b’.
3) Images per domain. domain ‘a’ could see different images than domain ‘b’.
4) Quotas and quota limits per domain. your instances in domain ‘a’ don’t count 
against quotas in domain ‘b’.
5) Go as far as using different config values depending on what domain you’re 
using. This one is fun. :)

etc.

I’m not sure if you were looking to go that far or not. :) But I think that our 
ideas are close enough, if not exact, that we can achieve both of our goals 
with the same implementation.

I’d love to be involved with this. I am not sure that I currently have the time 
to help with implementation, however.

- Chris



On Feb 3, 2014, at 1:58 PM, Vishvananda Ishaya vishvana...@gmail.com wrote:

 Hello Again!
 
 At the meeting last week we discussed some options around getting true 
 multitenancy in nova. The use case that we are trying to support can be 
 described as follows:
 
 Martha, the owner of ProductionIT provides it services to multiple 
 Enterprise clients. She would like to offer cloud services to Joe at 
 WidgetMaster, and Sam at SuperDevShop. Joe is a Development Manager for 
 WidgetMaster and he has multiple QA and Development teams with many users. 
 Joe needs the ability to create users, projects, and quotas, as well as the 
 ability to list and delete resources across WidgetMaster. Martha needs to be 
 able to set the quotas for both WidgetMaster and SuperDevShop; manage users, 
 projects, and objects across the entire system; and set quotas for the client 
 companies as a whole. She also needs to ensure that Joe can't see or mess 
 with anything owned by Sam.
 
 As per the plan I outlined in the meeting I have implemented a 
 Proof-of-Concept that would allow me to see what changes were required in 
 nova to get scoped tenancy working. I used a simple approach of faking out 
 hierarchy by prepending the id of the larger scope to the id of the smaller 
 scope. Keystone uses uuids internally, but for ease of explanation I will 
 pretend like it is using the name. I think we can all agree that 
 ‘orga.projecta’ is more readable than 
 ‘b04f9ea01a9944ac903526885a2666dec45674c5c2c6463dad3c0cb9d7b8a6d8’.
 
 The code basically creates the following five projects:
 
 orga
 orga.projecta
 orga.projectb
 orgb
 orgb.projecta
 
 I then modified nova to replace everywhere where it searches or limits policy 
 by project_id to do a prefix match. This means that someone using project 
 ‘orga’ should be able to list/delete instances in orga, orga.projecta, and 
 orga.projectb.
 
 You can find the code here:
 
  
 https://github.com/vishvananda/devstack/commit/10f727ce39ef4275b613201ae1ec7655bd79dd5f
  
 https://github.com/vishvananda/nova/commit/ae4de19560b0a3718efaffb6c205c7a3c372412f
 
 Keeping in mind that this is a prototype, but I’m hoping to come to some kind 
 of consensus as to whether this is a reasonable approach. I’ve compiled a 
 list of pros and cons.
 
 Pros:
 
  * Very easy to understand
  * Minimal changes to nova
  * Good performance in db (prefix matching uses indexes)

Re: [openstack-dev] [Nova][Scheduler] Will the Scheuler use Nova Objects?

2014-02-05 Thread Chris Behrens

On Jan 30, 2014, at 5:55 AM, Andrew Laski andrew.la...@rackspace.com wrote:

 I'm of the opinion that the scheduler should use objects, for all the reasons 
 that Nova uses objects, but that they should not be Nova objects.  Ultimately 
 what the scheduler needs is a concept of capacity, allocations, and locality 
 of resources.  But the way those are modeled doesn't need to be tied to how 
 Nova does it, and once the scope expands to include Cinder it may quickly 
 turn out to be limiting to hold onto Nova objects.

+2! 
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [keystone][nova] Re: Hierarchicical Multitenancy Discussion

2014-02-05 Thread Chris Behrens

On Feb 5, 2014, at 9:13 AM, Tiwari, Arvind arvind.tiw...@hp.com wrote:

 Hi Chris,
 
 Looking at your requirements, seems my solution (see attached email) is 
 pretty much aligned. What I am trying to propose is
 
 1. One root domain as owner of virtual cloud. Logically linked to n leaf 
 domains. 
 2. All leaf domains falls under admin boundary of virtual cloud owner.
 3. No sharing of resources at project level, that will keep the authorization 
 model simple.
 4. No sharing of resources at domain level either.
 5. Hierarchy or admin boundary will be totally governed by roles. 
 
 This way we can setup a true virtual cloud/Reseller/wholesale model.
 
 Thoughts?

Yeah, sounds the same, although we should clarify what 'resources' means (I 
used the term without completely clarifying it as well :). For example, a 
physical host is a resource, but I fully intend for it to be shared in that it 
will run VMs for multiple domains. So, by resources, I mean things like 
instances, images, networks, although I would also want the flexibility to be 
able to share images/networks between domains.

Here's my larger thought process which led me to these features/requirements:

Within a large company, you will find that you need to provide many discrete 
clouds to different organizations within the company. Each organization 
potentially has different requirements when it comes to flavors, images, 
networks, and even config options. The only current option is to set up 'x' 
completely separate openstack installs. This can be completely cost 
ineffective. Instead of doing this, I want to build 1 big cloud. The benefits 
are:

1) You don't have 'x' groups maintaining 'y' platforms. This results in saving 
time and saving money on people.
2) Creating a new cloud for a new organization takes seconds.
3) You can have a huge cost savings on hardware as it is all shared.

and so forth.

And yes, this exact same model is what Service Providers should want if they 
intend to Resell/Co-brand, etc.

- Chris


 
 Thanks,
 Arvind
 
 -Original Message-
 From: Chris Behrens [mailto:cbehr...@codestud.com] 
 Sent: Wednesday, February 05, 2014 1:27 AM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [keystone][nova] Re: Hierarchicical Multitenancy 
 Discussion
 
 
 Hi Vish,
 
 I'm jumping in slightly late on this, but I also have an interest in this. 
 I'm going to preface this by saying that I have not read this whole thread 
 yet, so I apologize if I repeat things, say anything that is addressed by 
 previous posts, or doesn't jive with what you're looking for. :) But what you 
 describe below sounds like exactly a use case I'd come up with.
 
 Essentially I want another level above project_id. Depending on the exact use 
 case, you could name it 'wholesale_id' or 'reseller_id'...and yeah, 'org_id' 
 fits in with your example. :) I think that I had decided I'd call it 'domain' 
 to be more generic, especially after seeing keystone had a domain concept.
 
 Your idea below (prefixing the project_id) is exactly one way I thought of 
 doing this to be least intrusive. I, however, thought that this would not be 
 efficient. So, I was thinking about proposing that we add 'domain' to all of 
 our models. But that limits your hierarchy and I don't necessarily like that. 
 :)  So I think that if the queries are truly indexed as you say below, you 
 have a pretty good approach. The one issue that comes into mind is that if 
 there's any chance of collision. For example, if project ids (or orgs) could 
 contain a '.', then '.' as a delimiter won't work.
 
 My requirements could be summed up pretty well by thinking of this as 
 'virtual clouds within a cloud'. Deploy a single cloud infrastructure that 
 could look like many multiple clouds. 'domain' would be the key into each 
 different virtual cloud. Accessing one virtual cloud doesn't reveal any 
 details about another virtual cloud.
 
 What this means is:
 
 1) domain 'a' cannot see instances (or resources in general) in domain 'b'. 
 It doesn't matter if domain 'a' and domain 'b' share the same tenant ID. If 
 you act with the API on behalf of domain 'a', you cannot see your instances 
 in domain 'b'.
 2) Flavors per domain. domain 'a' can have different flavors than domain 'b'.
 3) Images per domain. domain 'a' could see different images than domain 'b'.
 4) Quotas and quota limits per domain. your instances in domain 'a' don't 
 count against quotas in domain 'b'.
 5) Go as far as using different config values depending on what domain you're 
 using. This one is fun. :)
 
 etc.
 
 I'm not sure if you were looking to go that far or not. :) But I think that 
 our ideas are close enough, if not exact, that we can achieve both of our 
 goals with the same implementation.
 
 I'd love to be involved with this. I am not sure that I currently have the 
 time to help with implementation, however.
 
 - Chris
 
 
 
 On Feb 3, 2014, at 1:58 PM

Re: [openstack-dev] [keystone][nova] Re: Hierarchicical Multitenancy Discussion

2014-02-05 Thread Chris Behrens

On Feb 5, 2014, at 3:38 AM, Vishvananda Ishaya vishvana...@gmail.com wrote:

 
 On Feb 5, 2014, at 12:27 AM, Chris Behrens cbehr...@codestud.com wrote:
 
 1) domain ‘a’ cannot see instances (or resources in general) in domain ‘b’. 
 It doesn’t matter if domain ‘a’ and domain ‘b’ share the same tenant ID. If 
 you act with the API on behalf of domain ‘a’, you cannot see your instances 
 in domain ‘b’.
 2) Flavors per domain. domain ‘a’ can have different flavors than domain ‘b’.
 
 I hadn’t thought of this one, but we do have per-project flavors so I think 
 this could work in a project hierarchy world. We might have to rethink the 
 idea of global flavors and just stick them in the top-level project. That way 
 the flavors could be removed. The flavor list would have to be composed by 
 matching all parent projects. It might make sense to have an option for 
 flavors to be “hidden in sub projects somehow as well. In other words if 
 orgb wants to delete a flavor from the global list they could do it by hiding 
 the flavor.
 
 Definitely some things to be thought about here.

Yeah, it's completely do-able in some way. The per-project flavors is a good 
start.

 
 3) Images per domain. domain ‘a’ could see different images than domain ‘b’.
 
 Yes this would require similar hierarchical support in glance.

Yup :)

 
 4) Quotas and quota limits per domain. your instances in domain ‘a’ don’t 
 count against quotas in domain ‘b’.
 
 Yes we’ve talked about quotas for sure. This is definitely needed.

Also: not really related to this, but if we're making considerable quota 
changes, I would also like to see the option for separate quotas _per flavor_, 
even. :)

 
 5) Go as far as using different config values depending on what domain 
 you’re using. This one is fun. :)
 
 Curious for some examples here.

With the idea that I want to be able to provide multiple virtual clouds within 
1 big cloud, these virtual clouds may desire different config options. I'll 
pick one that could make sense:

# When set, compute API will consider duplicate hostnames
# invalid within the specified scope, regardless of case.
# Should be empty, project or global. (string value)
#osapi_compute_unique_server_name_scope=

This is the first one that popped into my mind for some reason, and it turns 
out that this is actually a more complicated example than I was originally 
intending. I left it here, because there might be a potential issue with this 
config option when using 'org.tenant' as project_id. Ignoring that, let's say 
this config option had a way to say "I don't want duplicate hostnames within my 
organization at all", "I don't want any single tenant in my organization to 
have duplicate hostnames", or "I don't care at all about duplicate hostnames". 
Ideally each organization could have its own config for this.
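
Purely hypothetical sketch of what I mean by a per-domain override (nothing
like this exists in nova today; all names invented):

DOMAIN_CONFIG_OVERRIDES = {
    'orga': {'osapi_compute_unique_server_name_scope': 'project'},
    'orgb': {'osapi_compute_unique_server_name_scope': ''},
}

def get_domain_option(domain, name, default):
    # fall back to the global config value when a domain has no override
    return DOMAIN_CONFIG_OVERRIDES.get(domain, {}).get(name, default)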

 I'd love to be involved with this. I am not sure that I currently have the time to help with 
 implementation, however.
 
 Come to the meeting on friday! 1600 UTC

I meant to hit the first one. :-/   I'll try to hit it this week.

- Chris



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Asynchrounous programming: replace eventlet with asyncio

2014-02-04 Thread Chris Behrens
Hi,

Interesting thread. I have been working on a side project that is a 
gevent/eventlet replacement [1] that focuses on thread-safety and performance. 
This came about because of an outstanding bug we have with eventlet not being 
Thread safe. (We cannot safely enable thread pooling for DB calls so that they 
will not block.) Unfortunately, I tried to fix the issue while maintaining 
similar performance but haven’t been completely successful. This led me to 
believe that it was reasonable to work on an alternative micro-thread 
implementation on top of greenlet.

So, I admit that this might be somewhat of a biased opinion [2], but I think 
that using a micro-thread implementation is useful. If not for any other 
reason, the resulting code is very clean and easy to read. It allows you to 
write code ‘the normal way’. If you have any sort of experience with real 
threading, it’s really easy to understand.

Regardless of direction, I would like to see an oslo abstraction so that we can 
easily switch out the underlying implementation, potentially even making the 
choice a config option. I think that means that even if we move to asyncio, our 
abstraction layer provides something that looks like microthreads. I think that 
it’s maybe the only common ground that makes sense, and it addresses my 
concerns above regarding readability and ease of use.
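
As a toy sketch of the kind of abstraction I mean (all names invented, only an
eventlet backend shown), callers would only ever see spawn()/sleep(), and a
config option would pick the backend underneath:

def _load_backend(name):
    if name == 'eventlet':
        import eventlet
        return eventlet.spawn, eventlet.sleep
    raise ValueError('unknown async backend: %s' % name)

spawn, sleep = _load_backend('eventlet')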

- Chris

[1] I haven’t made the code public yet, but will shortly. Mostly I was 
concerned that it looked like a pile of garbage. :) But it’s at a point that 
this isn’t a concern anymore.
[2] I really don’t care if my side project is used w/ OpenStack or not, despite 
thinking we’d do so. It will have usefulness to others outside of OpenStack, 
even if only for the 80-90% gains in performance that it seems to have compared 
to eventlet. Most importantly, it has just been fun.


On Feb 4, 2014, at 12:38 PM, victor stinner victor.stin...@enovance.com wrote:

 Kevin Conway wrote:
 Switching our async IO management from eventlet to asyncio would not be a
 trivial task. Tell me when I'm wrong, but it would require that we
 completely change our programming model from typical, function-call based
 programming to use generator-iterators for everything.
 
 My proposition is to put asyncio on top of greenlet using the greenio 
 project. So the current code can be leaved unchanged (it will continue to 
 eventlet) if you don't want to modify it. New code may use asyncio API 
 instead of greenlet/eventlet API, but the code will still be executed by 
 greenlet. Or you may have different implementations of the same feature, one 
 for eventlet and another for asyncio.
 
 For example, the Oslo Messaging project has an abstraction of the 
 asynchronous framework called executor. So you can use a blocking executor, 
 eventlet, trollius or something else. Today, a patch was proposed by Joshua 
 Harlow (*) to support concurrent.futures to use a pool of thread. I don't 
 know yet how asyncio can be integrated in other projects. I'm just starting 
 with Oslo Messaging :-)
 
 The abstraction layer may be moved from Oslo Messaging to Oslo Incubator, so 
 other projects can reuse it.
 
 (*) Start adding a futures executor based executor, 
 https://review.openstack.org/#/c/70914/
 
 Victor
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] [Ironic] mid-cycle meetup?

2014-01-28 Thread Chris Behrens
I’d be interested in this.  While I have not provided any contributions to 
Ironic thus far, I’m beginning to look at it for some things.  I am local to 
the bay area, so Sunnyvale is a convenient location for me as well. :)

- Chris


On Jan 24, 2014, at 5:30 PM, Devananda van der Veen devananda@gmail.com 
wrote:

 On Fri, Jan 24, 2014 at 2:03 PM, Robert Collins robe...@robertcollins.net 
 wrote:
 This was meant to go to -dev, not -operators. Doh.
 
 
 -- Forwarded message --
 From: Robert Collins robe...@robertcollins.net
 Date: 24 January 2014 08:47
 Subject: [TripleO] mid-cycle meetup?
 To: openstack-operat...@lists.openstack.org
 openstack-operat...@lists.openstack.org
 
 
 Hi, sorry for proposing this at *cough* the mid-way point [christmas
 shutdown got in the way of internal acks...], but who would come if
 there was a mid-cycle meetup? I'm thinking the HP sunnyvale office as
 a venue.
 
 -Rob
 
 
 Hi!
 
 I'd like to co-locate the Ironic midcycle meetup, as there's a lot of overlap 
 between our team's needs and facilitating that collaboration will be good. 
 I've added the [Ironic] tag to the subject to pull in folks who may be 
 filtering on this project specifically. Please keep us in the loop!
 
 Sunnyvale is easy for me, so I'll definitely be there.
 
 Cheers,
 Deva
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Cells] compute api and objects

2013-12-10 Thread Chris Behrens
On Dec 9, 2013, at 2:58 PM, Sam Morrison sorri...@gmail.com wrote:

 Hi,
 
 I’m trying to fix up some cells issues related to objects. Do all compute api 
 methods take objects now?
 cells is still sending DB objects for most methods (except start and stop) 
 and I know there are more than that.
 
 Eg. I know lock/unlock, shelve/unshelve take objects, I assume there are 
 others if not all methods now?

I don't think all of them do.  As the compute API methods were changing, we 
were changing the cells code at the same time to not use the generic 
'call_compute_api_method' RPC call.

It's possible some got missed, however.  And in fact, it does look like this is 
the case.  The shelve calls appear to be example of where things were 
converted, but the cells code was forgotten.  :-/

We'll want to implement new RPC calls in nova/cells/rpcapi that are compatible 
with the compute_rpcapi calls that are normally used.  And then add the 
appropriate code in nova/cells/manager.py and nova/cells/messaging.py.

I can help fix this all up.  I guess we'll want to find and file bugs for all 
of these.  It appears you've got a bug filed for unlock… (lock would also be 
broken, I would think).

- Chris



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Proposal to re-add Dan Prince to nova-core

2013-11-26 Thread Chris Behrens
+1

On Nov 26, 2013, at 11:32 AM, Russell Bryant rbry...@redhat.com wrote:

 Greetings,
 
 I would like to propose that we re-add Dan Prince to the nova-core
 review team.
 
 Dan Prince has been involved with Nova since early in OpenStack's
 history (Bexar timeframe).  He was a member of the nova-core review team
 from May 2011 to June 2013.  He has since picked back up with nova
 reviews [1].  We always say that when people leave nova-core, we would
 love to have them back if they are able to commit the time in the
 future.  I think this is a good example of that.
 
 Please respond with +1s or any concerns.
 
 Thanks,
 
 [1] http://russellbryant.net/openstack-stats/nova-reviewers-30.txt
 
 -- 
 Russell Bryant
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Thoughs please on how to address a problem with mutliple deletes leading to a nova-compute thread pool problem

2013-10-25 Thread Chris Behrens

On Oct 25, 2013, at 3:46 AM, Day, Phil philip@hp.com wrote:

 Hi Folks,
 
 We're very occasionally seeing problems where a thread processing a create 
 hangs (and we've seen this when talking to Cinder and Glance).  Whilst those issues 
 need to be hunted down in their own rights, they do show up what seems to me 
 to be a weakness in the processing of delete requests that I'd like to get 
 some feedback on.
 
 Delete is the one operation that is allowed regardless of the Instance state 
 (since it's a one-way operation, and users should always be able to free up 
 their quota).   However when we get a create thread hung in one of these 
 states, the delete requests when they hit the manager will also block as they 
 are synchronized on the uuid.   Because the user making the delete request 
 doesn't see anything happen they tend to submit more delete requests.   The 
 Service is still up, so these go to the compute manager as well, and 
 eventually all of the threads will be waiting for the lock, and the compute 
 manager will stop consuming new messages.
 
 The problem isn't limited to deletes - although in most cases the change of 
 state in the API means that you have to keep making different calls to get 
 past the state checker logic to do it with an instance stuck in another 
 state.   Users also seem to be more impatient with deletes, as they are 
 trying to free up quota for other things. 
 
 So while I know that we should never get a thread into a hung state into the 
 first place, I was wondering about one of the following approaches to address 
 just the delete case:
 
 i) Change the delete call on the manager so it doesn't wait for the uuid 
 lock.  Deletes should be coded so that they work regardless of the state of 
 the VM, and other actions should be able to cope with a delete being 
 performed from under them.  There is of course no guarantee that the delete 
 itself won't block as well. 
 

Agree.  I've argued for a long time that our code should be able to handle the 
instance disappearing.  We do have a number of places where we catch 
InstanceNotFound to handle this already.
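
The pattern is roughly this (sketch only; the two helpers are invented, but
InstanceNotFound is the real exception):

from nova import exception

def do_something_with_instance(context, instance_uuid):
    try:
        instance = _get_instance(context, instance_uuid)   # invented helper
        _perform_action(context, instance)                 # invented helper
    except exception.InstanceNotFound:
        # a concurrent delete got there first -- treat the work as done
        return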


 ii) Record in the API server that a delete has been started (maybe enough to 
 use the task state being set to DELETEING in the API if we're sure this 
 doesn't get cleared), and add a periodic task in the compute manager to check 
 for and delete instances that are in a DELETING state for more than some 
 timeout. Then the API, knowing that the delete will be processes eventually 
 can just no-op any further delete requests.

We already set to DELETING in the API (unless I'm mistaken -- but I looked at 
this recently).  However, instead of dropping duplicate deletes, I say they 
should still be sent/handled.  Any delete code should be able to handle if 
another delete is occurring at the same time, IMO…  much like how you say other 
methods should be able to handle an instance disappearing from underneath.  If 
a compute goes down while 'deleting', a 2nd delete later should still be able 
to function locally.  Same thing if the message to compute happens to be lost.

 
 iii) Add some hook into the ServiceGroup API so that the timer could depend 
 on getting a free thread from the compute manager pool (ie run some no-op 
 task) - so that of there are no free threads then the service becomes down. 
 That would (eventually) stop the scheduler from sending new requests to it, 
 and make deleted be processed in the API server but won't of course help with 
 commands for other instances on the same host.

This seems kinda hacky to me.

 
 iv) Move away from having a general topic and thread pool for all requests, 
 and start a listener on an instance specific topic for each running instance 
 on a host (leaving the general topic and pool just for creates and other 
 non-instance calls like the hypervisor API).   Then a blocked task would only 
 affect request for a specific instance.
 

I don't like this one when thinking about scale.  1 million instances == 1 million more queues.

 I'm tending towards ii) as a simple and pragmatic solution in the near term, 
 although I like both iii) and iv) as being both generally good enhancements - 
 but iv) in particular feels like a pretty seismic change.

I vote for both i) and ii) at minimum.

- Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Swift] Havana Release Notes Known Issues is talking about Nova (Re: [Openstack] OpenStack 2013.2 (Havana) is released !)

2013-10-19 Thread Chris Behrens
I may have put that in the wrong spot.  Oops.

 On Oct 18, 2013, at 11:11 PM, Akihiro Motoki amot...@gmail.com wrote:
 
 Hi Thierry, John,
 
 In Havana release notes, Swift known issues section is talking about
 Nova Cells issue. Could you confirm?
 https://wiki.openstack.org/wiki/ReleaseNotes/Havana#Known_Issues
 
 Thanks,
 Akihiro
 
 On Thu, Oct 17, 2013 at 11:23 PM, Thierry Carrez thie...@openstack.org 
 wrote:
 Hello everyone,
 
 It is my great pleasure to announce the final release of OpenStack
 2013.2. It marks the end of the Havana 6-month-long development cycle,
 which saw the addition of two integrated components (Ceilometer and
 Heat), the completion of more than 400 feature blueprints and the fixing
 of more than 3000 reported bugs !
 
 You can find source tarballs for each integrated project, together with
 lists of features and bugfixes, at:
 
 OpenStack Compute:https://launchpad.net/nova/havana/2013.2
 OpenStack Object Storage: https://launchpad.net/swift/havana/1.10.0
 OpenStack Image Service:  https://launchpad.net/glance/havana/2013.2
 OpenStack Networking: https://launchpad.net/neutron/havana/2013.2
 OpenStack Block Storage:  https://launchpad.net/cinder/havana/2013.2
 OpenStack Identity:   https://launchpad.net/keystone/havana/2013.2
 OpenStack Dashboard:  https://launchpad.net/horizon/havana/2013.2
 OpenStack Metering:   https://launchpad.net/ceilometer/havana/2013.2
 OpenStack Orchestration:  https://launchpad.net/heat/havana/2013.2
 
 The Havana Release Notes contain an overview of the key features, as
 well as upgrade notes and current lists of known issues. You can access
 them at: https://wiki.openstack.org/wiki/ReleaseNotes/Havana
 
 In 19 days, our community will gather in Hong-Kong for the OpenStack
 Summit: 4 days of conference to discuss all things OpenStack and a
 Design Summit to plan the next 6-month development cycle, codenamed
 Icehouse. It's not too late to join us there, see
 http://www.openstack.org/summit/openstack-summit-hong-kong-2013/ for
 more details.
 
 Congratulations to everyone who contributed to this development cycle
 and participated in making this awesome release possible !
 
 --
 Thierry Carrez (ttx)
 
 ___
 Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
 Post to : openst...@lists.openstack.org
 Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Swift] Havana Release Notes Known Issues is talking about Nova (Re: [Openstack] OpenStack 2013.2 (Havana) is released !)

2013-10-19 Thread Chris Behrens
Ah, I know what happened.  This is corrected now.

- Chris

On Oct 19, 2013, at 12:27 AM, Chris Behrens cbehr...@codestud.com wrote:

 I may have put that in the wrong spot.  Oops.
 
 On Oct 18, 2013, at 11:11 PM, Akihiro Motoki amot...@gmail.com wrote:
 
 Hi Thierry, John,
 
 In Havana release notes, Swift known issues section is talking about
 Nova Cells issue. Could you confirm?
 https://wiki.openstack.org/wiki/ReleaseNotes/Havana#Known_Issues
 
 Thanks,
 Akihiro
 
 On Thu, Oct 17, 2013 at 11:23 PM, Thierry Carrez thie...@openstack.org 
 wrote:
 Hello everyone,
 
 It is my great pleasure to announce the final release of OpenStack
 2013.2. It marks the end of the Havana 6-month-long development cycle,
 which saw the addition of two integrated components (Ceilometer and
 Heat), the completion of more than 400 feature blueprints and the fixing
 of more than 3000 reported bugs !
 
 You can find source tarballs for each integrated project, together with
 lists of features and bugfixes, at:
 
 OpenStack Compute:https://launchpad.net/nova/havana/2013.2
 OpenStack Object Storage: https://launchpad.net/swift/havana/1.10.0
 OpenStack Image Service:  https://launchpad.net/glance/havana/2013.2
 OpenStack Networking: https://launchpad.net/neutron/havana/2013.2
 OpenStack Block Storage:  https://launchpad.net/cinder/havana/2013.2
 OpenStack Identity:   https://launchpad.net/keystone/havana/2013.2
 OpenStack Dashboard:  https://launchpad.net/horizon/havana/2013.2
 OpenStack Metering:   https://launchpad.net/ceilometer/havana/2013.2
 OpenStack Orchestration:  https://launchpad.net/heat/havana/2013.2
 
 The Havana Release Notes contain an overview of the key features, as
 well as upgrade notes and current lists of known issues. You can access
 them at: https://wiki.openstack.org/wiki/ReleaseNotes/Havana
 
 In 19 days, our community will gather in Hong-Kong for the OpenStack
 Summit: 4 days of conference to discuss all things OpenStack and a
 Design Summit to plan the next 6-month development cycle, codenamed
 Icehouse. It's not too late to join us there, see
 http://www.openstack.org/summit/openstack-summit-hong-kong-2013/ for
 more details.
 
 Congratulations to everyone who contributed to this development cycle
 and participated in making this awesome release possible !
 
 --
 Thierry Carrez (ttx)
 
 ___
 Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
 Post to : openst...@lists.openstack.org
 Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] TC candidacy

2013-10-09 Thread Chris Behrens
Hi all,

I'd like to announce my candidacy for a seat on the OpenStack
Technical Committee.

- General background -

I have over 15 years of experience designing and building distributed
systems.  I am currently a Principal Engineer at Rackspace, where
I have been for a little over 3 years now.  Most of my time at
Rackspace has been spent working on OpenStack as both a developer
and a technical leader.  My first week at Rackspace was spent at
the very first OpenStack Design Summit in Austin where the project
was announced.

Prior to working at Rackspace, I held various roles over 14 years
at Concentric Network Corporation/XO Communications including Senior
Software Architect and eventually Director of Engineering.  My main
focus there was on an award winning web/email hosting platform which
we'd built to be extremely scalable and fault tolerant.  While my
name is not on this patent, I was heavily involved with the development
and design that led to US6611861.

- Why am I interested? -

This is my 3rd time running and I don't want to be considered a failure!

But seriously, as I have mentioned in the past, I have strong
feelings for OpenStack and I want to help as much as possible to
take it to the next level.  I have a lot of technical knowledge and
experience building scalable distributed systems.  I would like to
use this knowledge for good, not evil.

- OpenStack contributions -

As I mentioned above, I was at the very first design summit, so
I've been involved with the project from the beginning.  I started
the initial work for nova-scheduler shortly after the project was
opened.  I also implemented the RPC support for kombu, making sure
to properly support reconnecting and so forth which didn't work
quite so well with the carrot code.  I've contributed a number of
improvements designed to make nova-api more performant.  I've worked
on the filter scheduler as well as designing and implementing the
first version of the Zones replacement that we named 'Cells'.  And
most recently, I was involved in the design and implementation of
the unified objects code in nova.

During Icehouse, I'm hoping to focus on performance and stabilization
while also helping to finish objects conversion.

- Summary -

I feel my years of experience contributing to and leading large scale
technical projects along with my knowledge of the OpenStack projects
will provide a good foundation for technical leadership.

Thanks,

- Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo.db] Proposal: Get rid of deleted column

2013-08-20 Thread Chris Behrens

On Aug 20, 2013, at 12:51 PM, Ed Leafe e...@openstack.org wrote:

 On Aug 20, 2013, at 2:33 PM, Chris Behrens cbehr...@codestud.com
 wrote:
 
 For instances table, we want to make sure 'uuid' is unique.  But we can't 
 put a unique constraint on that alone.  If that instance gets deleted.. we 
 should be able to create another entry with the same uuid without a problem. 
  So we need a unique constraint on uuid+deleted.  But if 'deleted' is only 0 
 or 1… we can only have 1 entry deleted and 1 entry not deleted.  Using 
 deleted=`id` to mark deletion solves that problem.  You could use 
 deleted_at… but 2 creates and deletes within the same second would not work. 
 :)
 
 This creates another problem if you ever need to delete this second instance, 
 because now you have two with the same uuid and the same deleted status.

Not with the setting of 'deleted' to the row's `id` on delete… since `id` is 
unique.
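
A loose sketch of the scheme (column set trimmed way down, not the real
model): the unique key is (uuid, deleted), live rows carry deleted=0, and
deleting a row sets deleted to its own id, so any number of deleted copies can
coexist while only one live row per uuid is allowed.

from sqlalchemy import Column, Integer, String, UniqueConstraint
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Instance(Base):
    __tablename__ = 'instances'
    __table_args__ = (UniqueConstraint('uuid', 'deleted'),)
    id = Column(Integer, primary_key=True)
    uuid = Column(String(36))
    deleted = Column(Integer, default=0)

def soft_delete(session, instance):
    # marking the row deleted with its own id frees up (uuid, 0) for reuse
    instance.deleted = instance.id
    session.add(instance)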

- Chris




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo.db] Proposal: Get rid of deleted column

2013-08-20 Thread Chris Behrens

On Aug 20, 2013, at 1:05 PM, Jay Pipes jaypi...@gmail.com wrote:

 I see the following use case:
 
 1) Create something with a unique name within your tenant
 2) Delete that
 3) Create something with the same unique name immediately after
 
 As a pointless and silly use case that we should not cater to.
 
 It's made the database schema needlessly complex IMO and added columns to a 
 unique constraint that make a DBA's job more complex in order to fulfill a 
 use case that really isn't particularly compelling.
 
 I was having a convo on IRC with Boris and stated the use case in different 
 terms:
 
 If you delete your Gmail email address, do you expect to immediately be able 
 to create a new Gmail email with the previous address?
 
 If you answer yes, then this unique constraint on the deleted column makes 
 sense to you. If you answer no, then the whole thing seems like we've spent a 
 lot of effort on something that isn't particularly useful except in random 
 test cases that try to create and delete the same thing in rapid succession. 
 And IMO, those kinds of test cases should be deleted -- hard-deleted.
 

I would answer 'no' to the gmail question.  I would answer 'yes' depending on 
what other things we may talk about.  If we put (or maybe we have this -- I 
didn't check) unique constraints on the metadata table for metadata key… It 
would be rather silly to not allow someone to reset some metadata with the same 
key immediately.  One could argue that we just un-delete the former row and 
update it, however… but I think that breaks archiving (something*I'm* not a fan 
of ;)

- Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo.db] Proposal: Get rid of deleted column

2013-08-20 Thread Chris Behrens

On Aug 20, 2013, at 3:29 PM, Vishvananda Ishaya vishvana...@gmail.com wrote:

 c) is going ot take a while. There are still quite a few places in nova,
 for example, that depend on accessing deleted records.
 
 Do you have a list of these places?
 
 No. I believe Joe Gordon did an initial look long ago. Off the top of my head 
 I remember flavors and the simple-usage extension use them.


Yeah, flavors is a problem still, I think.  Although we've moved towards fixing 
most of it.

Unfortunately the API supports showing some amount of deleted instances if you 
specify 'changes-since'.  Since I don't think 'some amount' is really 
quantified, though, we may be able to ignore that.  We should make that go away 
in v3… as long as there is some way for someone to see instances that can be 
reclaimed (a soft-delete state, which is different from DB soft-delete).

There are some periodic tasks that look at deleted records in order to sync 
things.  The one that stands out to me is '_cleanup_running_deleted_instances'.

- Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] cells checks on patches

2013-07-26 Thread Chris Behrens
I have just put up a review here:

https://review.openstack.org/#/c/38897/

which should address the exercise.sh issues when n-cell is enabled.  Hopefully 
this works in the gate like it does for me locally.  Then we can move on to 
looking at tempest.

- Chris


On Jul 15, 2013, at 6:13 AM, Andrew Laski andrew.la...@rackspace.com wrote:

 I will also be working to help get cells passing tests.  I just setup a 
 blueprint on the Nova side for this, 
 https://blueprints.launchpad.net/nova/+spec/cells-gating.
 
 On 07/13/13 at 05:00pm, Chris Behrens wrote:
 I can make a commitment to help getting cells passing.  Basically, I'd like 
 to do whatever I can to make sure we can have a useful gate on cells.  
 Unfortunately I'm going to be mostly offline for the next 10 days or so, 
 however. :)
 
 I thought there was a sec group patch up for cells, but I've not fully 
 reviewed it.
 
 The generic "cannot communicate with cell 'child'" almost sounds like some 
 other basic issue.  I'll see if I can take a peek during my layovers 
 tonight.
 
 On Jul 13, 2013, at 8:28 AM, Sean Dague s...@dague.net wrote:
 
 On 07/13/2013 10:50 AM, Dan Smith wrote:
 Currently cells can't even get past the devstack exercises, which are very 
 minor sanity checks for the environment (nothing tricky).
 
 I thought that the plan was to deprecate the devstack exercises and
 just use tempest. Is that not the case? I'd bet that the devstack
 exercises are just not even on anyone's radar. Since the excellent work
 you QA folks did to harden those tests before grizzly, I expect most
 people take them for granted now :)
 
 Digging into the logs just a bit, I see what looks like early failures
 related to missing security group issues in the cells manager log. I
 know there are some specific requirements in how things have to be set
 up for cells, so I think it's likely that we'll need to do some
 tweaking of configs to get all of this right.
 
 We enabled the test knowing that it wasn't going to pass for a while,
 and it's only been running for less than 24 hours. In the same way that
 the grenade job had (until recently) been failing on everything, the
 point of enabling the cells test now is so that we can start iterating
 on fixes so that we can hopefully have some amount of regular test
 coverage before havana.
 
 Like I said, as long as someone is going to work on it, I'm happy. :) I 
 just don't want this to be an "enable the tests and hope magic fairies 
 come to fix them" issue. That's what we did on full neutron tests, and it's 
 been bouncing around like that for a while.
 
 We are planning on disabling the devstack exercises; it wasn't so much 
 that, it's that it looks like there is a fundamental lack of functioning nova 
 on devstack for cells right now. The security groups stack trace is just a 
 side effect of cells falling over in a really low-level way (this is what's 
 before and after the trace).
 
 2013-07-13 00:12:18.605 ERROR nova.cells.scheduler 
 [req-dcbb868c-98a7-4d65-94b3-e1234c50e623 demo demo] Couldn't communicate 
 with cell 'child'
 
 2013-07-13 00:12:18.606 ERROR nova.cells.scheduler 
 [req-dcbb868c-98a7-4d65-94b3-e1234c50e623 demo demo] Couldn't communicate 
 with any cells
 
 Again, mostly I want to know that we've got a blueprint or bug that's high 
 priority and someone's working on it. It did take a while to get grenade 
 there (we're 2 bugs away from being able to do it repeatably in the gate), 
 but during that time we did have people working on it. It just takes a 
 while to get to the bottom of these issues sometimes, so I want people to 
 have a realistic expectation of how quickly we'll go from running upstream 
 to gating.
 
   -Sean
 
 --
 Sean Dague
 http://dague.net
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] cells checks on patches

2013-07-13 Thread Chris Behrens
I can make a commitment to help getting cells passing.  Basically, I'd like to 
do whatever I can to make sure we can have a useful gate on cells.  
Unfortunately I'm going to be mostly offline for the next 10 days or so, 
however. :)

I thought there was a sec group patch up for cells, but I've not fully reviewed 
it.  

The generic "cannot communicate with cell 'child'" almost sounds like some 
other basic issue.  I'll see if I can take a peek during my layovers tonight.

On Jul 13, 2013, at 8:28 AM, Sean Dague s...@dague.net wrote:

 On 07/13/2013 10:50 AM, Dan Smith wrote:
 Currently cells can't even get past the devstack exercises, which are very 
 minor sanity checks for the environment (nothing tricky).
 
 I thought that the plan was to deprecate the devstack exercises and
 just use tempest. Is that not the case? I'd bet that the devstack
 exercises are just not even on anyone's radar. Since the excellent work
 you QA folks did to harden those tests before grizzly, I expect most
 people take them for granted now :)
 
 Digging into the logs just a bit, I see what looks like early failures
 related to missing security group issues in the cells manager log. I
 know there are some specific requirements in how things have to be set
 up for cells, so I think it's likely that we'll need to do some
 tweaking of configs to get all of this right.
 
 We enabled the test knowing that it wasn't going to pass for a while,
 and it's only been running for less than 24 hours. In the same way that
 the grenade job had (until recently) been failing on everything, the
 point of enabling the cells test now is so that we can start iterating
 on fixes so that we can hopefully have some amount of regular test
 coverage before havana.
 
 Like I said, as long as someone is going to work on it, I'm happy. :) I just 
 don't want this to be an "enable the tests and hope magic fairies come to 
 fix them" issue. That's what we did on full neutron tests, and it's been 
 bouncing around like that for a while.
 
 We are planning on disabling the devstack exercises; it wasn't so much that, 
 it's that it looks like there is a fundamental lack of functioning nova on 
 devstack for cells right now. The security groups stack trace is just a side 
 effect of cells falling over in a really low-level way (this is what's before 
 and after the trace).
 
 2013-07-13 00:12:18.605 ERROR nova.cells.scheduler 
 [req-dcbb868c-98a7-4d65-94b3-e1234c50e623 demo demo] Couldn't communicate 
 with cell 'child'
 
 2013-07-13 00:12:18.606 ERROR nova.cells.scheduler 
 [req-dcbb868c-98a7-4d65-94b3-e1234c50e623 demo demo] Couldn't communicate 
 with any cells
 
 Again, mostly I want to know that we've got a blueprint or bug that's high 
 priority and someone's working on it. It did take a while to get grenade 
 there (we're 2 bugs away from being able to do it repeatably in the gate), 
 but during that time we did have people working on it. It just takes a while 
 to get to the bottom of these issues sometimes, so I want people to have a 
 realistic expectation of how quickly we'll go from running upstream to gating.
 
-Sean
 
 -- 
 Sean Dague
 http://dague.net
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] cells checks on patches

2013-07-13 Thread Chris Behrens

On Jul 13, 2013, at 8:28 AM, Sean Dague s...@dague.net wrote:

 Like I said, as long as someone is going to work on it, I'm happy. :) I just 
 don't want this to be an "enable the tests and hope magic fairies come to 
 fix them" issue. That's what we did on full neutron tests, and it's been 
 bouncing around like that for a while.
 
 We are planning on disabling the devstack exercises; it wasn't so much that, 
 it's that it looks like there is a fundamental lack of functioning nova on 
 devstack for cells right now. The security groups stack trace is just a side 
 effect of cells falling over in a really low-level way (this is what's before 
 and after the trace).
 
 2013-07-13 00:12:18.605 ERROR nova.cells.scheduler 
 [req-dcbb868c-98a7-4d65-94b3-e1234c50e623 demo demo] Couldn't communicate 
 with cell 'child'
 
 2013-07-13 00:12:18.606 ERROR nova.cells.scheduler 
 [req-dcbb868c-98a7-4d65-94b3-e1234c50e623 demo demo] Couldn't communicate 
 with any cells
 

Did you dig these out manually somehow?  It looks like, unfortunately, there's 
no screen-n-cells.txt saved in the gate, which would be extremely useful. :)  It 
looks like all the errors must be limited to that service right now… which makes 
me wonder whether devstack needs to be tweaked for cells now.

In fact, I *might* know the problem.   Some cells config options were 
deprecated, and it appears that backwards compatibility was lost.  I ran into 
this myself, and I took a stab at fixing it (I was unable to reproduce it in 
tests, but it certainly showed up in one of our environments).

We should probably commit a fix to devstack to use the new config options no 
matter what:

1) Remove the usage of compute_api_class CONF option
2) Where compute_api_class was set to the ComputeCells class in the API cell, 
instead use this config:

[cells]
cell_type=api

3) In a child cell where you did not override compute_api_class, use this:

[cells]
cell_type=compute

Maybe someone could try committing that fix to devstack for me while I'm 
traveling? :)   I wonder if that'll get us a little further along...

- Chris



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Nominating John Garbutt for nova-core

2013-06-26 Thread Chris Behrens
+1

On Jun 26, 2013, at 10:09 AM, Russell Bryant rbry...@redhat.com wrote:

 Greetings,
 
 I would like to nominate John Garbutt for the nova-core team.
 
 John has been involved with nova for a long time now.  He's primarily
 known for his great work on the xenapi driver.  However, he has been
 contributing and reviewing in other areas, as well.  Based on my
 experience with him I think he would be a good addition, so it would be
 great to have him on board to help keep up with the review load.
 
 Please respond with +1s or any concerns.
 
 References:
 
  https://review.openstack.org/#/dashboard/782
 
  https://review.openstack.org/#/q/reviewer:782,n,z
 
  https://launchpad.net/~johngarbutt/+specs?role=assignee
 
  https://launchpad.net/~johngarbutt/+bugs?role=assignee
 
 Thanks,
 
 -- 
 Russell Bryant
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Cells design issue

2013-06-21 Thread Chris Behrens

On Jun 21, 2013, at 9:16 AM, Armando Migliaccio amigliac...@vmware.com wrote:

 In my view a cell should only know about the queue it's connected to, and let 
 the 'global' message queue to do its job of dispatching the messages to the 
 right recipient: that would solve the problem altogether.
 
 Were federated queues and topic routing not considered fit for the purpose? I 
 guess the drawback with this is that it is tied to Rabbit.

If you're referring to the rabbit federation plugin, no, it was not considered. 
I'm not even sure that rabbit queues are the right way to talk cell to cell.  
But I really do not want to get into a full-blown cells communication design 
discussion here.  We can do that in another thread, if we need to do so. :)

It is what it is today and this thread is just about how to express the 
configuration for it.

Regarding Mark's config suggestion:

 On Mon, Jun 17, 2013 at 2:14 AM, Mark McLoughlin mar...@redhat.com wrote:
 I don't know whether I like it yet or not, but here's how it might look:
 
  [cells]
  parents = parent1
  children = child1, child2
 
  [cell:parent1]
  transport_url = qpid://host1/nova
 
  [cell:child1]
  transport_url = qpid://host2/child1_nova
 
  [cell:child2]
  transport_url = qpid://host2/child2_nova
[…]

Yeah, that's what I was picturing if going that route.  I guess the code for it 
is not bad at all.  But with oslo.config, can I reload (re-parse) the config 
file later, or does the service need to be restarted?
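
For the record, the parsing side of that layout looks pretty tame.  Here's a 
rough sketch with plain ConfigParser semantics (the function name and sample 
values are made up for illustration, and this says nothing about oslo.config's 
reload behavior, which is the open question):

# Illustrative only: parse the [cells] / [cell:NAME] layout suggested above.
from configparser import ConfigParser
from io import StringIO

SAMPLE = """
[cells]
parents = parent1
children = child1, child2

[cell:parent1]
transport_url = qpid://host1/nova

[cell:child1]
transport_url = qpid://host2/child1_nova

[cell:child2]
transport_url = qpid://host2/child2_nova
"""

def load_cells(text):
    cp = ConfigParser()
    cp.read_file(StringIO(text))
    listed = (cp.get('cells', 'parents', fallback='') + ',' +
              cp.get('cells', 'children', fallback=''))
    names = [n.strip() for n in listed.split(',') if n.strip()]
    # Map each listed cell to the options in its [cell:NAME] section.
    return {name: dict(cp.items('cell:%s' % name))
            for name in names if cp.has_section('cell:%s' % name)}

print(load_cells(SAMPLE))
# Re-running load_cells() against the file on disk is one crude way to pick
# up new cells without a restart; whether oslo.config can do the equivalent
# cleanly is the question above.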

- Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Compute node stats sent to the scheduler

2013-06-17 Thread Chris Behrens

On Jun 17, 2013, at 7:49 AM, Russell Bryant rbry...@redhat.com wrote:

 On 06/16/2013 11:25 PM, Dugger, Donald D wrote:
 Looking into the scheduler a bit there's an issue of duplicated effort that 
 is a little puzzling.  The database table `compute_nodes' is being updated 
 periodically with data about capabilities and resources used (memory, vcpus, 
 ...) while at the same time a periodic RPC call is being made to the 
 scheduler sending pretty much the same data.
 
 Does anyone know why we are updating the same data in two different places 
 using two different mechanisms?  Also, assuming we were to remove one of 
 these updates, which one should go?  (I thought at one point in time there 
 was a goal to create a database-free compute node, which would imply we 
 should remove the DB update.)
 
 Have you looked around to see if any code is using the data from the db?
 
 Having schedulers hit the db for the current state of all compute nodes
 all of the time would be a large additional db burden that I think we
 should avoid.  So, it makes sense to keep the rpc fanout_cast of current
 stats to schedulers.

This is actually what the scheduler uses. :)   The fanout messages are too 
infrequent and can be too laggy.  So, the scheduler was moved to using the DB a 
long, long time ago… but it was very inefficient, at first, because it looped 
through all instances.  So we added things we needed into compute_node and 
compute_node_stats so we only had to look at the hosts.  You have to pull the 
hosts anyway, so we pull the stats at the same time.

The problem is… when we stopped using certain data from the fanout messages, we 
never removed it.  We should AT LEAST do this.  But… (see below).

 
 The scheduler also does a fanout_cast to all compute nodes when it
 starts up to trigger the compute nodes to populate the cache in the
 scheduler.  It would be nice to never fanout_cast to all compute nodes
 (given that there may be a *lot* of them).  We could replace this with
 having the scheduler populate its cache from the database.

I think we should audit the remaining things that the scheduler uses from these 
messages and move them to the DB.  I believe it's limited to the hypervisor 
capabilities to compare against aggregates or some such.  I believe it's things 
that change very rarely… so an alternative can be to only send fanout messages 
when capabilities change!   We could always do that as a first step.
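
Conceptually that's just something like this on the reporting side (hypothetical 
names, not nova's actual RPC plumbing):

# Illustrative only: publish capabilities only when they actually change.
class CapabilityReporter(object):
    def __init__(self, publish):
        self._publish = publish      # e.g. a fanout_cast to the schedulers
        self._last_sent = None

    def report(self, caps):
        if caps != self._last_sent:
            self._publish(caps)
            self._last_sent = dict(caps)

sent = []
reporter = CapabilityReporter(sent.append)
reporter.report({'hypervisor_type': 'kvm', 'hypervisor_version': 1002001})
reporter.report({'hypervisor_type': 'kvm', 'hypervisor_version': 1002001})  # no-op
print(len(sent))   # 1 -- only the first call actually published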

 
 Removing the db usage completely would be nice if nothing is actually
 using it, but we'd have to look into an alternative solution for
 removing the scheduler fanout_cast to compute.

Relying on anything but the DB for current memory free, etc., is just too laggy… 
so we need to stick with it, IMO.

- Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev