Re: [openstack-dev] [nova][puppet][tripleo][kolla][fuel] Let's talk nova cell v2 deployments and upgrades

2017-01-15 Thread Matt Riedemann

On 1/13/2017 6:56 PM, Matt Riedemann wrote:

> On 1/13/2017 2:05 PM, Matt Riedemann wrote:
>
>> Documenting this is going to be a priority. We should have something up
>> for review in Nova by next week (like Monday), at least a draft.
>
> Dan Smith has a start on the docs here:
>
> https://review.openstack.org/#/c/420198/



We got some more updates in the cells v2 wiki here:

https://wiki.openstack.org/wiki/Nova-Cells-v2

We now have patches up for creating, listing and deleting cells v2 cells.

Dan Smith has a patch up to fix the cell0 DB connection naming (it was 
using the API DB connection but should be the main DB connection). This 
is going to require a change to the from-newton upgrade script in 
grenade (those patches are also up in grenade and devstack).


We've had good review on Dan's initial docs patch so I expect that to be 
merged on Monday.


--

Thanks,

Matt Riedemann


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][puppet][tripleo][kolla][fuel] Let's talk nova cell v2 deployments and upgrades

2017-01-14 Thread Shake Chen
Kolla is now adding cell support. I hope you can help review it.

https://review.openstack.org/#/c/418116/

On Sat, Jan 14, 2017 at 8:56 AM, Matt Riedemann 
wrote:

> On 1/13/2017 2:05 PM, Matt Riedemann wrote:
>
>>
>> Documenting this is going to be a priority. We should have something up
>> for review in Nova by next week (like Monday), at least a draft.
>>
>>
> Dan Smith has a start on the docs here:
>
> https://review.openstack.org/#/c/420198/
>
>
> --
>
> Thanks,
>
> Matt Riedemann
>
>
>



-- 
Shake Chen


Re: [openstack-dev] [nova][puppet][tripleo][kolla][fuel] Let's talk nova cell v2 deployments and upgrades

2017-01-13 Thread Matt Riedemann

On 1/13/2017 2:05 PM, Matt Riedemann wrote:


> Documenting this is going to be a priority. We should have something up
> for review in Nova by next week (like Monday), at least a draft.



Dan Smith has a start on the docs here:

https://review.openstack.org/#/c/420198/

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova][puppet][tripleo][kolla][fuel] Let's talk nova cell v2 deployments and upgrades

2017-01-13 Thread melanie witt

On Fri, 13 Jan 2017 14:04:27 -0700, Alex Schultz wrote:

> Just from the puppet standpoint, it's much easier to create the cell
> and populate it after the fact and run some command to sync stuff
> after the nodes have been added.  This also would be easier to consume
> for scale up/scale down actions.  I'm pretty sure that's also the case
> if you're going to implement this in ansible or some other workflow
> tooling.  I think this is a cleaner path than having to preplan your
> computes, install them and then set up your cell.


Agreed, this is a reasonable way for scale up/scale down to occur 
during deployment. The usual expectation for scale up/scale down is on a 
live system, so the compute hosts would already be registered at the 
time you are creating cells (and the existing commands support this). 
But we have a gap for the fresh install case and the scale up during 
deploy case. For others tuned in to this, there's a patch up [1] for a 
command that can create an empty cell, to which the operator can add 
hosts at a later time. The expected use of the commands would be: at 
setup time, run 'map_cell0' to create cell0 for storage of instances 
that fail to schedule, and 'create_cell' to create an empty cell 
intended for compute hosts. Then later, after compute hosts are up and 
running, run 'discover_hosts' to associate hosts with a given cell.
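
As a rough illustration of the ordering described above, the two phases 
could be scripted along the lines below. This is only a sketch: the 
subcommand names come from this thread, but the plan() and run() helpers 
and the stage names are hypothetical, command flags are omitted, and 
'create_cell' is still under review in [1], so its final form may differ.

```python
import subprocess
from typing import List

# Subcommand names are taken from the thread; flags are deliberately omitted.
SETUP_STEPS = [
    ["nova-manage", "cell_v2", "map_cell0"],    # cell0 stores instances that fail to schedule
    ["nova-manage", "cell_v2", "create_cell"],  # empty cell intended for compute hosts
]
DISCOVER_STEPS = [
    ["nova-manage", "cell_v2", "discover_hosts"],  # associate running computes with a cell
]


def plan(stage: str) -> List[List[str]]:
    """Return the commands for a deployment stage (hypothetical stage names)."""
    if stage == "setup":
        return [list(cmd) for cmd in SETUP_STEPS]
    if stage == "after_computes":
        return [list(cmd) for cmd in DISCOVER_STEPS]
    raise ValueError(f"unknown stage: {stage!r}")


def run(steps: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them when dry_run is False."""
    for cmd in steps:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failing step


if __name__ == "__main__":
    run(plan("setup"))           # at control-plane setup time
    run(plan("after_computes"))  # later, once compute hosts are up
```

Keeping the two stages separate lets a deployment tool rerun only the 
discovery stage on each scale up, without touching the cell setup.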



> Thanks. We don't have to necessarily fix simple_cell_setup if it's
> working the way it's supposed to.  It seems that we're missing pieces
> and the knowledge around when we're supposed to run them.  What made
> this worse was it seems that nova works without any cell v2 items on
> the fresh install but the upgrade has it as a hard requirement.


I think the current consensus is to leave 'simple_cell_setup' alone 
since it means "set up all the things" and if it can't, it should return 
1. Instead we'll add 'create_cell' to allow each piece to be run 
deliberately at the appropriate stages of a deployment. The 
'simple_cell_setup' command was intended to be a lightweight way for 
people to set things up during an upgrade of an existing non-cells-v1 
deployment.



> That seems like it should be made consistent and clearer for the end
> user to understand why something is failing. But I would say that
> anything that is going to be a hard requirement as part of an upgrade
> really should be documented before it's made a hard requirement.  Just
> to share, I'd also like to point to
> https://review.openstack.org/#/c/267153/ which I was only made aware
> of today.  It seems like the start of the documentation that we needed
> around this process but hasn't been merged yet.  I would personally
> like to see that completed and merged before any new hard
> requirements get merged in.


I agree. I think the thought was that the release notes covered the 
upgrade requirement, but it's clear there wasn't enough info there. 
What's really needed is a small manual that describes each command and 
its use case, along with how-to steps for each common deployment 
scenario. We'll be working on writing that.


-melanie

[1] https://review.openstack.org/#/c/332713/




Re: [openstack-dev] [nova][puppet][tripleo][kolla][fuel] Let's talk nova cell v2 deployments and upgrades

2017-01-13 Thread Alex Schultz
On Fri, Jan 13, 2017 at 1:05 PM, Matt Riedemann
 wrote:
> On 1/13/2017 11:43 AM, Alex Schultz wrote:
>>
>> Ahoy folks,
>>
>> So we've been running into issues with the addition of the cell v2
>> setup as part of the requirements for different parts of nova.  It was
>> recommended that we move the conversation to the ML to get a wider
>> audience.  Basically, cell v2 has been working its way into a
>> required thing for various parts of nova for the Ocata cycle.  We've
>> hit several issues[0][1] because of this.  I put a wider audience than
>> just nova because deployment tools need to understand how this works
>> as it impacts anyone installing or upgrading.
>>
>> What is not clear is what is the expectation around how to install and
>> configure cell v2.  When we hit the first bug in the upgrade, we
>> reached out in irc[2] around this and it seemed that there was little
>> to no documentation around how this stuff all works.  There are
>> mentions of these new commands in the release notes[3] but it's not
>> clear on the specifics for both the upgrade process and also a fresh
>> install.  We attempted to just run simple_cell_setup in the puppet
>> (and tripleo downstream) because we assumed this would handle all the
>> things.  It's become clear that this is not the case.  The latest
>> bug[1] has shown that we do not have a proper understanding of what it
>> means to set up cell v2, so I'd like to use this email to start the
>> conversation as it impacts anyone either installing Ocata fresh or
>> attempting some sort of Newton -> Ocata upgrade.
>>
>> Additionally after some conversations today on irc[4], it's also
>> become clear there is some disconnect around understanding between
>> nova folks and people who deploy as to how this thing would ideally
>> work.  So, what I would like to know is how should people be
>> installing and configuring nova cell v2? Specifically, what are the
>> steps so that the deployment tools and operators can properly handle
>> these complexities?  What are the assumptions being baked into
>> simple_cell_setup?  It seems to assume computes should exist before
>> the simple cell setup, whereas traditionally computes are the last
>> thing to be set up (for new installs).
>>
>> So, help?
>>
>> Thanks,
>> -Alex
>>
>> [0] https://bugs.launchpad.net/tripleo/+bug/1649341
>> [1] https://bugs.launchpad.net/nova/+bug/1656276
>> [2]
>> http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2016-12-12.log.html#t2016-12-12T17:38:56
>> [3] http://docs.openstack.org/releasenotes/nova/unreleased.html#id12
>> [4]
>> http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2017-01-13.log.html#t2017-01-13T14:11:37
>>
>>
>
> Thanks for starting this thread.
>
> First, I want to apologize for the lack of communication and documentation
> around this. I know this is frustrating. We've been so heads down on
> getting the changes out and getting devstack/grenade working with this stuff
> that we haven't taken the time to document it outside of the release notes,
> which isn't adequate.
>
> Without going into details, there was a major change in personnel this
> release for working on cells v2 so we've been doing some catch-up and that's
> part of why things are a bit scatter-brained.
>
> Based on the discussion in IRC this morning, I took some time to try and
> capture some of the immediate issues/todos/questions in the cells v2 wiki
> [1].
>
> Documenting this is going to be a priority. We should have something up for
> review in Nova by next week (like Monday), at least a draft.
>
> I think it's also important to realize (for the nova team) that we've been
> thinking about cells v2 deployment from an upgrade perspective, which I
> think is why we had the simple_cell_setup command asserting that you needed
> computes already registered to map to hosts and cells. As noted, this is not
> going to work in a fresh install deployment where the control services are
> set up before the computes. We're working on addressing that in [2].
>
> To compare with how simple_cell_setup works, the recently created
> 'nova-status upgrade check' command [3] is OK with there being no compute
> nodes yet (it does fail if you don't have any cell mappings though). It's OK
> with that because of the fresh install scenario. It doesn't fail but it does
> report that you need to remember to discover new hosts and map them once
> they are registered.
>

I think it's great that you've added this into a command that could be
run by the operator.  This is another tool that I would recommend
brushing up documentation on as it seems relatively new and help
people understand how 

Re: [openstack-dev] [nova][puppet][tripleo][kolla][fuel] Let's talk nova cell v2 deployments and upgrades

2017-01-13 Thread Matt Riedemann

On 1/13/2017 11:43 AM, Alex Schultz wrote:

> Ahoy folks,
>
> So we've been running into issues with the addition of the cell v2
> setup as part of the requirements for different parts of nova.  It was
> recommended that we move the conversation to the ML to get a wider
> audience.  Basically, cell v2 has been working its way into a
> required thing for various parts of nova for the Ocata cycle.  We've
> hit several issues[0][1] because of this.  I put a wider audience than
> just nova because deployment tools need to understand how this works
> as it impacts anyone installing or upgrading.
>
> What is not clear is what is the expectation around how to install and
> configure cell v2.  When we hit the first bug in the upgrade, we
> reached out in irc[2] around this and it seemed that there was little
> to no documentation around how this stuff all works.  There are
> mentions of these new commands in the release notes[3] but it's not
> clear on the specifics for both the upgrade process and also a fresh
> install.  We attempted to just run simple_cell_setup in the puppet
> (and tripleo downstream) because we assumed this would handle all the
> things.  It's become clear that this is not the case.  The latest
> bug[1] has shown that we do not have a proper understanding of what it
> means to set up cell v2, so I'd like to use this email to start the
> conversation as it impacts anyone either installing Ocata fresh or
> attempting some sort of Newton -> Ocata upgrade.
>
> Additionally after some conversations today on irc[4], it's also
> become clear there is some disconnect around understanding between
> nova folks and people who deploy as to how this thing would ideally
> work.  So, what I would like to know is how should people be
> installing and configuring nova cell v2? Specifically, what are the
> steps so that the deployment tools and operators can properly handle
> these complexities?  What are the assumptions being baked into
> simple_cell_setup?  It seems to assume computes should exist before
> the simple cell setup, whereas traditionally computes are the last
> thing to be set up (for new installs).
>
> So, help?
>
> Thanks,
> -Alex
>
> [0] https://bugs.launchpad.net/tripleo/+bug/1649341
> [1] https://bugs.launchpad.net/nova/+bug/1656276
> [2]
> http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2016-12-12.log.html#t2016-12-12T17:38:56
> [3] http://docs.openstack.org/releasenotes/nova/unreleased.html#id12
> [4]
> http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2017-01-13.log.html#t2017-01-13T14:11:37




Thanks for starting this thread.

First, I want to apologize for the lack of communication and 
documentation around this. I know this is frustrating. We've been so 
heads down on getting the changes out and getting devstack/grenade 
working with this stuff that we haven't taken the time to document it 
outside of the release notes, which isn't adequate.


Without going into details, there was a major change in personnel this 
release for working on cells v2 so we've been doing some catch-up and 
that's part of why things are a bit scatter-brained.


Based on the discussion in IRC this morning, I took some time to try and 
capture some of the immediate issues/todos/questions in the cells v2 
wiki [1].


Documenting this is going to be a priority. We should have something up 
for review in Nova by next week (like Monday), at least a draft.


I think it's also important to realize (for the nova team) that we've 
been thinking about cells v2 deployment from an upgrade perspective, 
which I think is why we had the simple_cell_setup command asserting that 
you needed computes already registered to map to hosts and cells. As 
noted, this is not going to work in a fresh install deployment where the 
control services are set up before the computes. We're working on 
addressing that in [2].


To compare with how simple_cell_setup works, the recently created 
'nova-status upgrade check' command [3] is OK with there being no 
compute nodes yet (it does fail if you don't have any cell mappings 
though). It's OK with that because of the fresh install scenario. It 
doesn't fail but it does report that you need to remember to discover 
new hosts and map them once they are registered.
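
For deployment tooling that wants to act on the result of the check, a 
wrapper could look roughly like the sketch below. The exit-code meanings 
(0 success, 1 warning, 2 failure) are an assumption taken from the 
nova-status documentation, not from this thread, and interpret() and 
run_check() are hypothetical helper names.

```python
import subprocess

# Assumed exit-code meanings for 'nova-status upgrade check'.
MEANINGS = {
    0: "success: all checks passed",
    1: "warning: at least one check needs investigation",
    2: "failure: at least one check failed; fix before upgrading",
}


def interpret(returncode: int) -> str:
    """Map an exit code to a short description; anything else is unexpected."""
    return MEANINGS.get(returncode, "unexpected error while running the checks")


def run_check() -> str:
    """Run the check on a node where nova-status is installed and configured."""
    proc = subprocess.run(["nova-status", "upgrade", "check"])
    return interpret(proc.returncode)
```

On a fresh install with cell mappings but no compute nodes yet, the 
behaviour described above means the failure branch should not be hit; 
the reminder to discover hosts appears in the command's own output.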


So for whatever reason the existing commands and code were written under 
the assumption that you'd first create computes (or already have them) 
and then map those to a cell, and we need to adjust the tooling for the 
scenario that you want to create the cell first and map hosts later. I 
think today grenade and devstack are working the same way as far as 
setting up cells v2, but eventually I think we're going to want to make 
grenade do the more specific upgrade path where we expect a compute