Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-11-06 Thread Matt Riedemann
After hacking on the PoC for a while [1] I have finally pushed up a spec 
[2]. Behold it in all its dark glory!


[1] https://review.openstack.org/#/c/603930/
[2] https://review.openstack.org/#/c/616037/

On 8/22/2018 8:23 PM, Matt Riedemann wrote:

Hi everyone,

I have started an etherpad for cells topics at the Stein PTG [1]. The 
main issue in there right now is dealing with cross-cell cold migration 
in nova.


At a high level, I am going off these requirements:

* Cells can shard across flavors (and hardware type) so operators would 
like to move users off the old flavors/hardware (old cell) to new 
flavors in a new cell.


* There is network isolation between compute hosts in different cells, 
so no ssh'ing the disk around like we do today. But the image service is 
global to all cells.


Based on this, for the initial support for cross-cell cold migration, I 
am proposing that we leverage something like shelve offload/unshelve 
masquerading as resize. We shelve offload from the source cell and 
unshelve in the target cell. This should work for both volume-backed and 
non-volume-backed servers (we use snapshots for shelved offloaded 
non-volume-backed servers).
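
To make that flow concrete, here is a very rough conceptual sketch in 
Python. Every name in it (shelve_offload, select_host, unshelve, confirm) 
is a hypothetical placeholder for conductor-level work, not actual nova 
code:

def cross_cell_cold_migrate(instance, source_cell, target_cell):
    """Sketch of 'shelve offload + unshelve masquerading as resize'."""
    # 1. Shelve offload in the source cell. For a non-volume-backed server
    #    this snapshots the root disk to the (global) image service; for a
    #    volume-backed server only the attachments are torn down.
    snapshot = source_cell.shelve_offload(instance)

    # 2. Pick a target host in the new cell (scheduling is an open issue,
    #    see the summary below).
    host = target_cell.select_host(instance)

    # 3. Unshelve in the target cell: rebuild the guest from the snapshot
    #    or re-attach its volumes, and re-bind its ports to the new host.
    target_cell.unshelve(instance, host, snapshot)

    # 4. Something resize-confirm-like to clean up the source cell record.
    source_cell.confirm(instance)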


There are, of course, some complications. The main ones that I need help 
with right now are what happens with volumes and ports attached to the 
server. Today we detach from the source and attach at the target, but 
that's assuming the storage backend and network are available to both 
hosts involved in the move of the server. Will that be the case across 
cells? I am assuming that depends on the network topology (are routed 
networks being used?) and storage backend (routed storage?). If the 
network and/or storage backend are not available across cells, how do we 
migrate volumes and ports? Cinder has a volume migrate API for admins 
but I do not know how nova would know the proper affinity per-cell to 
migrate the volume to the proper host (cinder does not have a routed 
storage concept like routed provider networks in neutron, correct?). And 
as far as I know, there is no such thing as port migration in Neutron.
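
For reference, the admin-only primitive that exists in Cinder today is the 
volume migrate action. A minimal sketch with python-cinderclient, assuming 
an already-authenticated v3 Client and that the admin somehow knows the 
destination backend - which is exactly the affinity information nova does 
not have:

def migrate_volume_to_backend(cinder, volume_id, dest_host):
    """Ask Cinder (as admin) to move a volume to a specific backend.

    dest_host uses Cinder's 'host@backend#pool' form; picking a sensible
    value per target cell is the open problem described above.
    """
    vol = cinder.volumes.get(volume_id)
    # force_host_copy=False lets the backend do a storage-assisted move if
    # it can; lock_volume=True keeps the volume from changing state mid-move.
    cinder.volumes.migrate_volume(vol, dest_host,
                                  force_host_copy=False,
                                  lock_volume=True)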


Could Placement help with the volume/port migration stuff? Neutron 
routed provider networks rely on placement aggregates to schedule the VM 
to a compute host in the same network segment as the port used to create 
the VM, however, if that segment does not span cells we are kind of 
stuck, correct?


To summarize the issues as I see them (today):

* How to deal with the targeted cell during scheduling? This is so we 
can even get out of the source cell in nova.


* How does the API deal with the same instance being in two DBs at the 
same time during the move?


* How to handle revert resize?

* How are volumes and ports handled?

I can get feedback from my company's operators based on what their 
deployment will look like for this, but that does not mean it will work 
for others, so I need as much feedback from operators, especially those 
running with multiple cells today, as possible. Thanks in advance.


[1] https://etherpad.openstack.org/p/nova-ptg-stein-cells




--

Thanks,

Matt



Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-28 Thread Matt Riedemann

On 8/27/2018 1:53 PM, Matt Riedemann wrote:

On 8/27/2018 12:11 PM, Miguel Lavalle wrote:
Isn't multiple port binding what we need in the case of ports? In my 
mind, the big motivator for multiple port binding is the ability to 
change a port's backend


Hmm, yes maybe. Nova's usage of multiple port bindings today is 
restricted to live migration which isn't what we're supporting with the 
initial cross-cell (cold) migration support, but it could be a 
dependency if that's what we need.


What I was wondering is whether there is a concept of a port spanning or 
migrating across networks. I'm assuming there isn't, and I'm not even 
sure that would be required here. But if there isn't, it would mean there 
is an implicit requirement that, for cross-cell migration to work, neutron 
networks need to span cells (and similarly storage backends would need to 
span cells).


In thinking about this again (sleepless at 3am, of course), port bindings 
don't help us here if we're orchestrating the cross-cell move using 
shelve offload, because in that case the port is unbound from the source 
host - while the instance is shelved offloaded, it has no host. When we 
unshelve in the new cell, we'd update the port binding. So there isn't 
really a use in this flow for multiple port bindings on multiple hosts 
(assuming we stick with the shelve/unshelve idea here).
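
For what it's worth, the "update the port binding" step at unshelve time 
boils down to something like the sketch below (python-neutronclient, 
assuming an already-authenticated client and that the target host can 
actually reach the port's network):

def rebind_port_to_host(neutron, port_id, dest_host):
    """Point a port's binding at the new compute host, as unshelve does.

    This is the plain single-binding update; the multiple-port-bindings
    API used for live migration is not involved in this flow.
    """
    neutron.update_port(port_id, {'port': {'binding:host_id': dest_host}})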


--

Thanks,

Matt



Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-27 Thread melanie witt

On Fri, 24 Aug 2018 10:44:16 -0500, Jay S Bryant wrote:

I haven't checked the PTG agenda yet, but is there a meeting on this?
Because we may want to have one to try to understand the requirements
and figure out if there's a way to do it with current Cinder
functionality or if we'd need something new.

Gorka,

I don't think that this has been put on the agenda yet.  Might be good
to add.  I don't think we have a cross project time officially planned
with Nova.  I will start that discussion with Melanie so that we can
cover the couple of cross projects subjects we have.


Just to update everyone, we've scheduled Cinder/Nova cross-project time 
for Thursday 9am-11am at the PTG; please add topics starting at L134 in 
the Cinder section:


https://etherpad.openstack.org/p/nova-ptg-stein

Cheers,
-melanie






Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-27 Thread Matt Riedemann

On 8/27/2018 12:11 PM, Miguel Lavalle wrote:
Isn't multiple port binding what we need in the case of ports? In my 
mind, the big motivator for multiple port binding is the ability to 
change a port's backend


Hmm, yes maybe. Nova's usage of multiple port bindings today is 
restricted to live migration which isn't what we're supporting with the 
initial cross-cell (cold) migration support, but it could be a 
dependency if that's what we need.


What I was wondering is whether there is a concept of a port spanning or 
migrating across networks. I'm assuming there isn't, and I'm not even 
sure that would be required here. But if there isn't, it would mean there 
is an implicit requirement that, for cross-cell migration to work, neutron 
networks need to span cells (and similarly storage backends would need to 
span cells).


--

Thanks,

Matt



Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-27 Thread Miguel Lavalle
Hi Matt,

Isn't multiple port binding what we need in the case of ports? In my mind,
the big motivator for multiple port binding is the ability to change a
port's backend

Best regards

Miguel

On Mon, Aug 27, 2018 at 4:32 AM, Gorka Eguileor  wrote:

> On 24/08, Jay S Bryant wrote:
> >
> >
> > On 8/23/2018 12:07 PM, Gorka Eguileor wrote:
> > > On 23/08, Dan Smith wrote:
> > > > > I think Nova should never have to rely on Cinder's hosts/backends
> > > > > information to do migrations or any other operation.
> > > > >
> > > > > In this case even if Nova had that info, it wouldn't be the
> solution.
> > > > > Cinder would reject migrations if there's an incompatibility on the
> > > > > Volume Type (AZ, Referenced backend, capabilities...)
> > > > I think I'm missing a bunch of cinder knowledge required to fully
> grok
> > > > this situation and probably need to do some reading. Is there some
> > > > reason that a volume type can't exist in multiple backends or
> something?
> > > > I guess I think of volume type as flavor, and the same definition in
> two
> > > > places would be interchangeable -- is that not the case?
> > > >
> > > Hi,
> > >
> > > I just know the basics of flavors, and they are kind of similar, though
> > > I'm sure there are quite a few differences.
> > >
> > > Sure, multiple storage arrays can meet the requirements of a Volume
> > > Type, but then when you create the volume you don't know where it's
> > > going to land. If your volume type is too generic your volume could land
> > > somewhere your cell cannot reach.
> > >
> > >
> > > > > I don't know anything about Nova cells, so I don't know the
> specifics of
> > > > > how we could do the mapping between them and Cinder backends, but
> > > > > considering the limited range of possibilities in Cinder I would
> say we
> > > > > only have Volume Types and AZs to work a solution.
> > > > I think the only mapping we need is affinity or distance. The point
> of
> > > > needing to migrate the volume would purely be because moving cells
> > > > likely means you moved physically farther away from where you were,
> > > > potentially with different storage connections and networking. It
> > > > doesn't *have* to mean that, but I think in reality it would. So the
> > > > question I think Matt is looking to answer here is "how do we move an
> > > > instance from a DC in building A to building C and make sure the
> > > > volume gets moved to some storage local in the new building so we're
> > > > not just transiting back to the original home for no reason?"
> > > >
> > > > Does that explanation help or are you saying that's fundamentally
> hard
> > > > to do/orchestrate?
> > > >
> > > > Fundamentally, the cells thing doesn't even need to be part of the
> > > > discussion, as the same rules would apply if we're just doing a
> normal
> > > > migration but need to make sure that storage remains affined to
> compute.
> > > >
> > > We could probably work something out using the affinity filter, but
> > > right now we don't have a way of doing what you need.
> > >
> > > We could probably rework the migration to accept scheduler hints to be
> > > used with the affinity filter and to accept calls with the host or the
> > > hints, that way it could migrate a volume without knowing the
> > > destination host and decide it based on affinity.
> > >
> > > We may have to do more modifications, but it could be a way to do it.
> > >
> > >
> > >
> > > > > I don't know how the Nova Placement works, but it could hold an
> > > > > equivalency mapping of volume types to cells as in:
> > > > >
> > > > >   Cell#1    Cell#2
> > > > >
> > > > > VolTypeA <--> VolTypeD
> > > > > VolTypeB <--> VolTypeE
> > > > > VolTypeC <--> VolTypeF
> > > > >
> > > > > Then it could do volume retypes (allowing migration) and that would
> > > > > properly move the volumes from one backend to another.
> > > > The only way I can think that we could do this in placement would be
> if
> > > > volume types were resource providers and we assigned them traits that
> > > > had special meaning to nova indicating equivalence. Several of the
> words
> > > > in that sentence are likely to freak out placement people, myself
> > > > included :)
> > > >
> > > > So is the concern just that we need to know what volume types in one
> > > > backend map to those in another so that when we do the migration we
> know
> > > > what to ask for? Is "they are the same name" not enough? Going back
> to
> > > > the flavor analogy, you could kinda compare two flavor definitions
> and
> > > > have a good idea if they're equivalent or not...
> > > >
> > > > --Dan
> > > In Cinder you don't get that from Volume Types, unless all your
> backends
> > > have the same hardware and are configured exactly the same.
> > >
> > > There can be some storage specific information there, which doesn't
> > > correlate to anything on other hardware.  Volume types may refer to a
> > > specific pool that has been configured in the array to use a specific 
> > > type of disks.

Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-27 Thread Gorka Eguileor
On 24/08, Jay S Bryant wrote:
>
>
> On 8/23/2018 12:07 PM, Gorka Eguileor wrote:
> > On 23/08, Dan Smith wrote:
> > > > I think Nova should never have to rely on Cinder's hosts/backends
> > > > information to do migrations or any other operation.
> > > >
> > > > In this case even if Nova had that info, it wouldn't be the solution.
> > > > Cinder would reject migrations if there's an incompatibility on the
> > > > Volume Type (AZ, Referenced backend, capabilities...)
> > > I think I'm missing a bunch of cinder knowledge required to fully grok
> > > this situation and probably need to do some reading. Is there some
> > > reason that a volume type can't exist in multiple backends or something?
> > > I guess I think of volume type as flavor, and the same definition in two
> > > places would be interchangeable -- is that not the case?
> > >
> > Hi,
> >
> > I just know the basics of flavors, and they are kind of similar, though
> > I'm sure there are quite a few differences.
> >
> > Sure, multiple storage arrays can meet the requirements of a Volume
> > Type, but then when you create the volume you don't know where it's
> > going to land. If your volume type is too generic your volume could land
> > somewhere your cell cannot reach.
> >
> >
> > > > I don't know anything about Nova cells, so I don't know the specifics of
> > > > how we could do the mapping between them and Cinder backends, but
> > > > considering the limited range of possibilities in Cinder I would say we
> > > > only have Volume Types and AZs to work a solution.
> > > I think the only mapping we need is affinity or distance. The point of
> > > needing to migrate the volume would purely be because moving cells
> > > likely means you moved physically farther away from where you were,
> > > potentially with different storage connections and networking. It
> > > doesn't *have* to mean that, but I think in reality it would. So the
> > > question I think Matt is looking to answer here is "how do we move an
> > > instance from a DC in building A to building C and make sure the
> > > volume gets moved to some storage local in the new building so we're
> > > not just transiting back to the original home for no reason?"
> > >
> > > Does that explanation help or are you saying that's fundamentally hard
> > > to do/orchestrate?
> > >
> > > Fundamentally, the cells thing doesn't even need to be part of the
> > > discussion, as the same rules would apply if we're just doing a normal
> > > migration but need to make sure that storage remains affined to compute.
> > >
> > We could probably work something out using the affinity filter, but
> > right now we don't have a way of doing what you need.
> >
> > We could probably rework the migration to accept scheduler hints to be
> > used with the affinity filter and to accept calls with the host or the
> > hints, that way it could migrate a volume without knowing the
> > destination host and decide it based on affinity.
> >
> > We may have to do more modifications, but it could be a way to do it.
> >
> >
> >
> > > > I don't know how the Nova Placement works, but it could hold an
> > > > equivalency mapping of volume types to cells as in:
> > > >
> > > >   Cell#1    Cell#2
> > > >
> > > > VolTypeA <--> VolTypeD
> > > > VolTypeB <--> VolTypeE
> > > > VolTypeC <--> VolTypeF
> > > >
> > > > Then it could do volume retypes (allowing migration) and that would
> > > > properly move the volumes from one backend to another.
> > > The only way I can think that we could do this in placement would be if
> > > volume types were resource providers and we assigned them traits that
> > > had special meaning to nova indicating equivalence. Several of the words
> > > in that sentence are likely to freak out placement people, myself
> > > included :)
> > >
> > > So is the concern just that we need to know what volume types in one
> > > backend map to those in another so that when we do the migration we know
> > > what to ask for? Is "they are the same name" not enough? Going back to
> > > the flavor analogy, you could kinda compare two flavor definitions and
> > > have a good idea if they're equivalent or not...
> > >
> > > --Dan
> > In Cinder you don't get that from Volume Types, unless all your backends
> > have the same hardware and are configured exactly the same.
> >
> > There can be some storage specific information there, which doesn't
> > correlate to anything on other hardware.  Volume types may refer to a
> > specific pool that has been configured in the array to use specific type
> > of disks.  But even the info on the type of disks is unknown to the
> > volume type.
> >
> > I haven't checked the PTG agenda yet, but is there a meeting on this?
> > Because we may want to have one to try to understand the requirements
> > and figure out if there's a way to do it with current Cinder
> > functionality or if we'd need something new.
> Gorka,
>
> I don't think that this has been put on the agenda yet.  Might be good to
> add.  

Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-24 Thread Matt Riedemann

+operators

On 8/24/2018 4:08 PM, Matt Riedemann wrote:

On 8/23/2018 10:22 AM, Sean McGinnis wrote:
I haven't gone through the workflow, but I thought shelve/unshelve could detach
the volume on shelving and reattach it on unshelve. In that workflow, assuming
the networking is in place to provide the connectivity, the nova compute host
would be connecting to the volume just like any other attach and should work
fine. The unknown or tricky part is making sure that there is the network
connectivity or routing in place for the compute host to be able to log in to
the storage target.


Yeah that's also why I like shelve/unshelve as a start since it's doing 
volume detach from the source host in the source cell and volume attach 
to the target host in the target cell.


Host aggregates in Nova, as a grouping concept, are not restricted to 
cells at all, so you could have hosts in the same aggregate which span 
cells, so I'd think that's what operators would be doing if they have 
network/storage spanning multiple cells. Having said that, host 
aggregates are not exposed to non-admin end users, so again, if we rely 
on a normal user to do this move operation via resize, the only way we 
can restrict the instance to another host in the same aggregate is via 
availability zones, which is the user-facing aggregate construct in 
nova. I know Sam would care about this because NeCTAR sets 
[cinder]/cross_az_attach=False in nova.conf so servers/volumes are 
restricted to the same AZ, but that's not the default, and specifying an 
AZ when you create a server is not required (although there is a config 
option in nova which allows operators to define a default AZ for the 
instance if the user didn't specify one).
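
For reference, these are the two nova.conf knobs mentioned above (the 
values are only examples):

[cinder]
# Keep servers and their volumes in the same availability zone (what
# NeCTAR does); the upstream default is True.
cross_az_attach = False

[DEFAULT]
# Optional default AZ applied to new servers when the user does not
# pick one.
default_schedule_zone = nova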


Anyway, my point is, there are a lot of "ifs" if it's not an 
operator/admin explicitly telling nova where to send the server if it's 
moving across cells.




If it's the other scenario mentioned where the volume needs to be migrated from
one storage backend to another storage backend, then that may require a little
more work. The volume would need to be retype'd or migrated (storage migration)
from the original backend to the new backend.


Yeah, the thing with retype/volume migration that isn't great is it 
triggers the swap_volume callback to the source host in nova, so if nova 
was orchestrating the volume retype/move, we'd need to wait for the swap 
volume to be done (not impossible) before proceeding, and only the 
libvirt driver implements the swap volume API. I've always wondered, 
what the hell do non-libvirt deployments do with respect to the volume 
retype/migration APIs in Cinder? Just disable them via policy?





--

Thanks,

Matt



Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-24 Thread Matt Riedemann

On 8/23/2018 10:22 AM, Sean McGinnis wrote:

I haven't gone through the workflow, but I thought shelve/unshelve could detach
the volume on shelving and reattach it on unshelve. In that workflow, assuming
the networking is in place to provide the connectivity, the nova compute host
would be connecting to the volume just like any other attach and should work
fine. The unknown or tricky part is making sure that there is the network
connectivity or routing in place for the compute host to be able to log in to
the storage target.


Yeah that's also why I like shelve/unshelve as a start since it's doing 
volume detach from the source host in the source cell and volume attach 
to the target host in the target cell.


Host aggregates in Nova, as a grouping concept, are not restricted to 
cells at all, so you could have hosts in the same aggregate which span 
cells, so I'd think that's what operators would be doing if they have 
network/storage spanning multiple cells. Having said that, host 
aggregates are not exposed to non-admin end users, so again, if we rely 
on a normal user to do this move operation via resize, the only way we 
can restrict the instance to another host in the same aggregate is via 
availability zones, which is the user-facing aggregate construct in 
nova. I know Sam would care about this because NeCTAR sets 
[cinder]/cross_az_attach=False in nova.conf so servers/volumes are 
restricted to the same AZ, but that's not the default, and specifying an 
AZ when you create a server is not required (although there is a config 
option in nova which allows operators to define a default AZ for the 
instance if the user didn't specify one).


Anyway, my point is, there are a lot of "ifs" if it's not an 
operator/admin explicitly telling nova where to send the server if it's 
moving across cells.




If it's the other scenario mentioned where the volume needs to be migrated from
one storage backend to another storage backend, then that may require a little
more work. The volume would need to be retype'd or migrated (storage migration)
from the original backend to the new backend.


Yeah, the thing with retype/volume migration that isn't great is it 
triggers the swap_volume callback to the source host in nova, so if nova 
was orchestrating the volume retype/move, we'd need to wait for the swap 
volume to be done (not impossible) before proceeding, and only the 
libvirt driver implements the swap volume API. I've always wondered, 
what the hell do non-libvirt deployments do with respect to the volume 
retype/migration APIs in Cinder? Just disable them via policy?


--

Thanks,

Matt



Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-24 Thread Matt Riedemann

On 8/23/2018 12:07 PM, Gorka Eguileor wrote:

I haven't checked the PTG agenda yet, but is there a meeting on this?
Because we may want to have one to try to understand the requirements
and figure out if there's a way to do it with current Cinder
functionality or if we'd need something new.


I don't see any set schedule yet for topics like we've done in the past, 
I'll ask Mel since time is getting short (~2 weeks out now). But I have 
this as an item for discussion in the etherpad [1]. In previous PTGs, we 
usually have 3 days for (mostly) vertical team stuff with Wednesday 
being our big topics days split into morning and afternoon, e.g. cells 
and placement, then Thursday is split into 1-2 hour cross-project 
sessions, e.g. nova/cinder, nova/neutron, etc, and then Friday is the 
miscellaneous everything else day for stuff on the etherpad.


[1] https://etherpad.openstack.org/p/nova-ptg-stein

--

Thanks,

Matt



Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-24 Thread Matt Riedemann

On 8/22/2018 9:14 PM, Sam Morrison wrote:

I think in our case we’d only migrate between cells if we know the network and 
storage is accessible and would never do it if not.
Thinking moving from old to new hardware at a cell level.


If it's done via the resize API at the top, initiated by a non-admin 
user, how would you prevent it? We don't really know if we're going 
across cell boundaries until the scheduler picks a host, and today we 
restrict all move operations to within the same cell. But that's part of 
the problem that needs addressing - how to tell the scheduler when it's 
OK to get target hosts for a move from all cells rather than the cell 
that the server is currently in.
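
Conceptually the missing piece is just a switch on the scheduling request, 
something like the toy sketch below - the function name and the "list of 
allowed cells" representation are made up for illustration:

def candidate_cells(instance_cell, allow_cross_cell=False):
    """Which cells may the scheduler pick target hosts from for this move?

    Today every move operation is effectively pinned to the instance's
    current cell; cross-cell cold migration needs some policy or flag to
    widen that.
    """
    if allow_cross_cell:
        return None            # None meaning "any enabled cell"
    return [instance_cell]     # status quo: stay inside the source cell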




If storage and network isn’t available ideally it would fail at the api request.


Not sure this is something we can really tell beforehand in the API, but 
maybe possible depending on whatever we come up with regarding volumes 
and ports. I expect this is a whole new orchestrated task in the 
(super)conductor when it happens. So while I think about using 
shelve/unshelve from a compute operation standpoint, I don't want to try 
and shoehorn this into existing conductor tasks.




There are also ceph-backed instances, and so this is also something to take into 
account that nova would be responsible for.


Not everyone is using ceph and it's not really something the API is 
aware of...at least not today - but long-term with shared storage 
providers in placement we might be able to leverage this for 
non-volume-backed instances, i.e. if we know the source and target host 
are on the same shared storage, regardless of cell boundary, we could 
just move rather than use snapshots (shelve). But I think phase1 is 
easiest universally if we are using snapshots to get from cell 1 to cell 2.




I’ll be in Denver so we can discuss more there too.


Awesome.

--

Thanks,

Matt



Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-24 Thread Jay S Bryant



On 8/23/2018 12:07 PM, Gorka Eguileor wrote:

On 23/08, Dan Smith wrote:

I think Nova should never have to rely on Cinder's hosts/backends
information to do migrations or any other operation.

In this case even if Nova had that info, it wouldn't be the solution.
Cinder would reject migrations if there's an incompatibility on the
Volume Type (AZ, Referenced backend, capabilities...)

I think I'm missing a bunch of cinder knowledge required to fully grok
this situation and probably need to do some reading. Is there some
reason that a volume type can't exist in multiple backends or something?
I guess I think of volume type as flavor, and the same definition in two
places would be interchangeable -- is that not the case?


Hi,

I just know the basics of flavors, and they are kind of similar, though
I'm sure there are quite a few differences.

Sure, multiple storage arrays can meet the requirements of a Volume
Type, but then when you create the volume you don't know where it's
going to land. If your volume type is too generic your volume could land
somewhere your cell cannot reach.



I don't know anything about Nova cells, so I don't know the specifics of
how we could do the mapping between them and Cinder backends, but
considering the limited range of possibilities in Cinder I would say we
only have Volume Types and AZs to work a solution.

I think the only mapping we need is affinity or distance. The point of
needing to migrate the volume would purely be because moving cells
likely means you moved physically farther away from where you were,
potentially with different storage connections and networking. It
doesn't *have* to mean that, but I think in reality it would. So the
question I think Matt is looking to answer here is "how do we move an
instance from a DC in building A to building C and make sure the
volume gets moved to some storage local in the new building so we're
not just transiting back to the original home for no reason?"

Does that explanation help or are you saying that's fundamentally hard
to do/orchestrate?

Fundamentally, the cells thing doesn't even need to be part of the
discussion, as the same rules would apply if we're just doing a normal
migration but need to make sure that storage remains affined to compute.


We could probably work something out using the affinity filter, but
right now we don't have a way of doing what you need.

We could probably rework the migration to accept scheduler hints to be
used with the affinity filter and to accept calls with the host or the
hints, that way it could migrate a volume without knowing the
destination host and decide it based on affinity.

We may have to do more modifications, but it could be a way to do it.




I don't know how the Nova Placement works, but it could hold an
equivalency mapping of volume types to cells as in:

  Cell#1    Cell#2

VolTypeA <--> VolTypeD
VolTypeB <--> VolTypeE
VolTypeC <--> VolTypeF

Then it could do volume retypes (allowing migration) and that would
properly move the volumes from one backend to another.

The only way I can think that we could do this in placement would be if
volume types were resource providers and we assigned them traits that
had special meaning to nova indicating equivalence. Several of the words
in that sentence are likely to freak out placement people, myself
included :)

So is the concern just that we need to know what volume types in one
backend map to those in another so that when we do the migration we know
what to ask for? Is "they are the same name" not enough? Going back to
the flavor analogy, you could kinda compare two flavor definitions and
have a good idea if they're equivalent or not...

--Dan

In Cinder you don't get that from Volume Types, unless all your backends
have the same hardware and are configured exactly the same.

There can be some storage specific information there, which doesn't
correlate to anything on other hardware.  Volume types may refer to a
specific pool that has been configured in the array to use specific type
of disks.  But even the info on the type of disks is unknown to the
volume type.

I haven't checked the PTG agenda yet, but is there a meeting on this?
Because we may want to have one to try to understand the requirements
and figure out if there's a way to do it with current Cinder
functionality or if we'd need something new.

Gorka,

I don't think that this has been put on the agenda yet.  Might be good 
to add.  I don't think we have a cross project time officially planned 
with Nova.  I will start that discussion with Melanie so that we can 
cover the couple of cross projects subjects we have.


Jay


Cheers,
Gorka.





Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-23 Thread Gorka Eguileor
On 23/08, Dan Smith wrote:
> > I think Nova should never have to rely on Cinder's hosts/backends
> > information to do migrations or any other operation.
> >
> > In this case even if Nova had that info, it wouldn't be the solution.
> > Cinder would reject migrations if there's an incompatibility on the
> > Volume Type (AZ, Referenced backend, capabilities...)
>
> I think I'm missing a bunch of cinder knowledge required to fully grok
> this situation and probably need to do some reading. Is there some
> reason that a volume type can't exist in multiple backends or something?
> I guess I think of volume type as flavor, and the same definition in two
> places would be interchangeable -- is that not the case?
>

Hi,

I just know the basics of flavors, and they are kind of similar, though
I'm sure there are quite a few differences.

Sure, multiple storage arrays can meet the requirements of a Volume
Type, but then when you create the volume you don't know where it's
going to land. If your volume type is too generic your volume could land
somewhere your cell cannot reach.


> > I don't know anything about Nova cells, so I don't know the specifics of
> > how we could do the mapping between them and Cinder backends, but
> > considering the limited range of possibilities in Cinder I would say we
> > only have Volume Types and AZs to work a solution.
>
> I think the only mapping we need is affinity or distance. The point of
> needing to migrate the volume would purely be because moving cells
> likely means you moved physically farther away from where you were,
> potentially with different storage connections and networking. It
> doesn't *have* to mean that, but I think in reality it would. So the
> question I think Matt is looking to answer here is "how do we move an
> instance from a DC in building A to building C and make sure the
> volume gets moved to some storage local in the new building so we're
> not just transiting back to the original home for no reason?"
>
> Does that explanation help or are you saying that's fundamentally hard
> to do/orchestrate?
>
> Fundamentally, the cells thing doesn't even need to be part of the
> discussion, as the same rules would apply if we're just doing a normal
> migration but need to make sure that storage remains affined to compute.
>

We could probably work something out using the affinity filter, but
right now we don't have a way of doing what you need.

We could probably rework the migration to accept scheduler hints to be
used with the affinity filter and to accept calls with the host or the
hints, that way it could migrate a volume without knowing the
destination host and decide it based on affinity.

We may have to do more modifications, but it could be a way to do it.



> > I don't know how the Nova Placement works, but it could hold an
> > equivalency mapping of volume types to cells as in:
> >
> >  Cell#1    Cell#2
> >
> > VolTypeA <--> VolTypeD
> > VolTypeB <--> VolTypeE
> > VolTypeC <--> VolTypeF
> >
> > Then it could do volume retypes (allowing migration) and that would
> > properly move the volumes from one backend to another.
>
> The only way I can think that we could do this in placement would be if
> volume types were resource providers and we assigned them traits that
> had special meaning to nova indicating equivalence. Several of the words
> in that sentence are likely to freak out placement people, myself
> included :)
>
> So is the concern just that we need to know what volume types in one
> backend map to those in another so that when we do the migration we know
> what to ask for? Is "they are the same name" not enough? Going back to
> the flavor analogy, you could kinda compare two flavor definitions and
> have a good idea if they're equivalent or not...
>
> --Dan

In Cinder you don't get that from Volume Types, unless all your backends
have the same hardware and are configured exactly the same.

There can be some storage specific information there, which doesn't
correlate to anything on other hardware.  Volume types may refer to a
specific pool that has been configured in the array to use specific type
of disks.  But even the info on the type of disks is unknown to the
volume type.

I haven't checked the PTG agenda yet, but is there a meeting on this?
Because we may want to have one to try to understand the requirements
and figure out if there's a way to do it with current Cinder
functionality or if we'd need something new.

Cheers,
Gorka.



Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-23 Thread Sean McGinnis
On Wed, Aug 22, 2018 at 08:23:41PM -0500, Matt Riedemann wrote:
> Hi everyone,
> 
> I have started an etherpad for cells topics at the Stein PTG [1]. The main
> issue in there right now is dealing with cross-cell cold migration in nova.
> 
> At a high level, I am going off these requirements:
> 
> * Cells can shard across flavors (and hardware type) so operators would like
> to move users off the old flavors/hardware (old cell) to new flavors in a
> new cell.
> 
> * There is network isolation between compute hosts in different cells, so no
> ssh'ing the disk around like we do today. But the image service is global to
> all cells.
> 
> Based on this, for the initial support for cross-cell cold migration, I am
> proposing that we leverage something like shelve offload/unshelve
> masquerading as resize. We shelve offload from the source cell and unshelve
> in the target cell. This should work for both volume-backed and
> non-volume-backed servers (we use snapshots for shelved offloaded
> non-volume-backed servers).
> 
> There are, of course, some complications. The main ones that I need help
> with right now are what happens with volumes and ports attached to the
> server. Today we detach from the source and attach at the target, but that's
> assuming the storage backend and network are available to both hosts
> involved in the move of the server. Will that be the case across cells? I am
> assuming that depends on the network topology (are routed networks being
> used?) and storage backend (routed storage?). If the network and/or storage
> backend are not available across cells, how do we migrate volumes and ports?
> Cinder has a volume migrate API for admins but I do not know how nova would
> know the proper affinity per-cell to migrate the volume to the proper host
> (cinder does not have a routed storage concept like routed provider networks
> in neutron, correct?). And as far as I know, there is no such thing as port
> migration in Neutron.
> 

Just speaking to iSCSI storage, I know some deployments do not route their
storage traffic. If this is the case, then both cells would need to have access
to the same subnet to still access the volume.

I'm also referring to the case where the migration is from one compute host to
another compute host, and not from one storage backend to another storage
backend.

I haven't gone through the workflow, but I thought shelve/unshelve could detach
the volume on shelving and reattach it on unshelve. In that workflow, assuming
the networking is in place to provide the connectivity, the nova compute host
would be connecting to the volume just like any other attach and should work
fine. The unknown or tricky part is making sure that there is the network
connectivity or routing in place for the compute host to be able to log in to
the storage target.

If it's the other scenario mentioned where the volume needs to be migrated from
one storage backend to another storage backend, then that may require a little
more work. The volume would need to be retype'd or migrated (storage migration)
from the original backend to the new backend.

Again, in this scenario at some point there needs to be network connectivity
between cells to copy over that data.

There is no storage-offloaded migration in this situation, so Cinder can't
currently optimize how that data gets from the original volume backend to the
new one. It would require a host copy of all the data on the volume (an often
slow and expensive operation) and it would require that the host doing the data
copy has access to both the original backend and then new backend.



Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-23 Thread Dan Smith
> I think Nova should never have to rely on Cinder's hosts/backends
> information to do migrations or any other operation.
>
> In this case even if Nova had that info, it wouldn't be the solution.
> Cinder would reject migrations if there's an incompatibility on the
> Volume Type (AZ, Referenced backend, capabilities...)

I think I'm missing a bunch of cinder knowledge required to fully grok
this situation and probably need to do some reading. Is there some
reason that a volume type can't exist in multiple backends or something?
I guess I think of volume type as flavor, and the same definition in two
places would be interchangeable -- is that not the case?

> I don't know anything about Nova cells, so I don't know the specifics of
> how we could do the mapping between them and Cinder backends, but
> considering the limited range of possibilities in Cinder I would say we
> only have Volume Types and AZs to work a solution.

I think the only mapping we need is affinity or distance. The point of
needing to migrate the volume would purely be because moving cells
likely means you moved physically farther away from where you were,
potentially with different storage connections and networking. It
doesn't *have* to mean that, but I think in reality it would. So the
question I think Matt is looking to answer here is "how do we move an
instance from a DC in building A to building C and make sure the
volume gets moved to some storage local in the new building so we're
not just transiting back to the original home for no reason?"

Does that explanation help or are you saying that's fundamentally hard
to do/orchestrate?

Fundamentally, the cells thing doesn't even need to be part of the
discussion, as the same rules would apply if we're just doing a normal
migration but need to make sure that storage remains affined to compute.

> I don't know how the Nova Placement works, but it could hold an
> equivalency mapping of volume types to cells as in:
>
>  Cell#1    Cell#2
>
> VolTypeA <--> VolTypeD
> VolTypeB <--> VolTypeE
> VolTypeC <--> VolTypeF
>
> Then it could do volume retypes (allowing migration) and that would
> properly move the volumes from one backend to another.

The only way I can think that we could do this in placement would be if
volume types were resource providers and we assigned them traits that
had special meaning to nova indicating equivalence. Several of the words
in that sentence are likely to freak out placement people, myself
included :)

So is the concern just that we need to know what volume types in one
backend map to those in another so that when we do the migration we know
what to ask for? Is "they are the same name" not enough? Going back to
the flavor analogy, you could kinda compare two flavor definitions and
have a good idea if they're equivalent or not...
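
A toy illustration of the "compare the definitions" idea - not a Cinder or 
Nova API, and it only works to the extent the extra specs are not 
backend-specific, which is the caveat raised elsewhere in the thread:

def volume_types_equivalent(specs_a, specs_b,
                            backend_local_keys=('volume_backend_name',)):
    """Compare two volume types' extra_specs, ignoring backend-local keys."""
    def strip(specs):
        return {k: v for k, v in specs.items()
                if k not in backend_local_keys}
    return strip(specs_a) == strip(specs_b)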

--Dan



Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-23 Thread Gorka Eguileor
On 22/08, Matt Riedemann wrote:
> Hi everyone,
>
> I have started an etherpad for cells topics at the Stein PTG [1]. The main
> issue in there right now is dealing with cross-cell cold migration in nova.
>
> At a high level, I am going off these requirements:
>
> * Cells can shard across flavors (and hardware type) so operators would like
> to move users off the old flavors/hardware (old cell) to new flavors in a
> new cell.
>
> * There is network isolation between compute hosts in different cells, so no
> ssh'ing the disk around like we do today. But the image service is global to
> all cells.
>
> Based on this, for the initial support for cross-cell cold migration, I am
> proposing that we leverage something like shelve offload/unshelve
> masquerading as resize. We shelve offload from the source cell and unshelve
> in the target cell. This should work for both volume-backed and
> non-volume-backed servers (we use snapshots for shelved offloaded
> non-volume-backed servers).
>
> There are, of course, some complications. The main ones that I need help
> with right now are what happens with volumes and ports attached to the
> server. Today we detach from the source and attach at the target, but that's
> assuming the storage backend and network are available to both hosts
> involved in the move of the server. Will that be the case across cells? I am
> assuming that depends on the network topology (are routed networks being
> used?) and storage backend (routed storage?). If the network and/or storage
> backend are not available across cells, how do we migrate volumes and ports?
> Cinder has a volume migrate API for admins but I do not know how nova would
> know the proper affinity per-cell to migrate the volume to the proper host
> (cinder does not have a routed storage concept like routed provider networks
> in neutron, correct?). And as far as I know, there is no such thing as port
> migration in Neutron.

Hi Matt,

I think Nova should never have to rely on Cinder's hosts/backends
information to do migrations or any other operation.

In this case even if Nova had that info, it wouldn't be the solution.
Cinder would reject migrations if there's an incompatibility on the
Volume Type (AZ, Referenced backend, capabilities...)

I don't know anything about Nova cells, so I don't know the specifics of
how we could do the mapping between them and Cinder backends, but
considering the limited range of possibilities in Cinder I would say we
only have Volume Types and AZs to work a solution.

>
> Could Placement help with the volume/port migration stuff? Neutron routed
> provider networks rely on placement aggregates to schedule the VM to a
> compute host in the same network segment as the port used to create the VM,
> however, if that segment does not span cells we are kind of stuck, correct?
>

I don't know how the Nova Placement works, but it could hold an
equivalency mapping of volume types to cells as in:

 Cell#1    Cell#2

VolTypeA <--> VolTypeD
VolTypeB <--> VolTypeE
VolTypeC <--> VolTypeF

Then it could do volume retypes (allowing migration) and that would
properly move the volumes from one backend to another.
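
A rough sketch of what that equivalency mapping plus retype could look 
like, assuming a python-cinderclient Client object; the mapping itself, 
and where it would live, are exactly the open question:

# Illustrative only: a static per-cell volume type equivalence map.
VOLUME_TYPE_MAP = {
    'cell2': {'VolTypeA': 'VolTypeD',
              'VolTypeB': 'VolTypeE',
              'VolTypeC': 'VolTypeF'},
}

def retype_for_cell(cinder, volume, target_cell):
    new_type = VOLUME_TYPE_MAP[target_cell][volume.volume_type]
    # 'on-demand' allows Cinder to migrate the volume to a backend that
    # satisfies the new type if the current backend cannot.
    cinder.volumes.retype(volume, new_type, 'on-demand')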

Cheers,
Gorka.


> To summarize the issues as I see them (today):
>
> * How to deal with the targeted cell during scheduling? This is so we can
> even get out of the source cell in nova.
>
> * How does the API deal with the same instance being in two DBs at the same
> time during the move?
>
> * How to handle revert resize?
>
> * How are volumes and ports handled?
>
> I can get feedback from my company's operators based on what their
> deployment will look like for this, but that does not mean it will work for
> others, so I need as much feedback from operators, especially those running
> with multiple cells today, as possible. Thanks in advance.
>
> [1] https://etherpad.openstack.org/p/nova-ptg-stein-cells
>
> --
>
> Thanks,
>
> Matt
>



Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-22 Thread Sam Morrison
I think in our case we’d only migrate between cells if we know the network and 
storage is accessible and would never do it if not. 
Thinking moving from old to new hardware at a cell level.

If storage and network isn’t available ideally it would fail at the api request.

There are also ceph-backed instances, and so this is also something to take into 
account that nova would be responsible for.

I’ll be in Denver so we can discuss more there too.

Cheers,
Sam





> On 23 Aug 2018, at 11:23 am, Matt Riedemann  wrote:
> 
> Hi everyone,
> 
> I have started an etherpad for cells topics at the Stein PTG [1]. The main 
> issue in there right now is dealing with cross-cell cold migration in nova.
> 
> At a high level, I am going off these requirements:
> 
> * Cells can shard across flavors (and hardware type) so operators would like 
> to move users off the old flavors/hardware (old cell) to new flavors in a new 
> cell.
> 
> * There is network isolation between compute hosts in different cells, so no 
> ssh'ing the disk around like we do today. But the image service is global to 
> all cells.
> 
> Based on this, for the initial support for cross-cell cold migration, I am 
> proposing that we leverage something like shelve offload/unshelve 
> masquerading as resize. We shelve offload from the source cell and unshelve 
> in the target cell. This should work for both volume-backed and 
> non-volume-backed servers (we use snapshots for shelved offloaded 
> non-volume-backed servers).
> 
> There are, of course, some complications. The main ones that I need help with 
> right now are what happens with volumes and ports attached to the server. 
> Today we detach from the source and attach at the target, but that's assuming 
> the storage backend and network are available to both hosts involved in the 
> move of the server. Will that be the case across cells? I am assuming that 
> depends on the network topology (are routed networks being used?) and storage 
> backend (routed storage?). If the network and/or storage backend are not 
> available across cells, how do we migrate volumes and ports? Cinder has a 
> volume migrate API for admins but I do not know how nova would know the 
> proper affinity per-cell to migrate the volume to the proper host (cinder 
> does not have a routed storage concept like routed provider networks in 
> neutron, correct?). And as far as I know, there is no such thing as port 
> migration in Neutron.
> 
> Could Placement help with the volume/port migration stuff? Neutron routed 
> provider networks rely on placement aggregates to schedule the VM to a 
> compute host in the same network segment as the port used to create the VM, 
> however, if that segment does not span cells we are kind of stuck, correct?
> 
> To summarize the issues as I see them (today):
> 
> * How to deal with the targeted cell during scheduling? This is so we can 
> even get out of the source cell in nova.
> 
> * How does the API deal with the same instance being in two DBs at the same 
> time during the move?
> 
> * How to handle revert resize?
> 
> * How are volumes and ports handled?
> 
> I can get feedback from my company's operators based on what their deployment 
> will look like for this, but that does not mean it will work for others, so I 
> need as much feedback from operators, especially those running with multiple 
> cells today, as possible. Thanks in advance.
> 
> [1] https://etherpad.openstack.org/p/nova-ptg-stein-cells
> 
> -- 
> 
> Thanks,
> 
> Matt




[openstack-dev] [nova][cinder][neutron] Cross-cell cold migration

2018-08-22 Thread Matt Riedemann

Hi everyone,

I have started an etherpad for cells topics at the Stein PTG [1]. The 
main issue in there right now is dealing with cross-cell cold migration 
in nova.


At a high level, I am going off these requirements:

* Cells can shard across flavors (and hardware type) so operators would 
like to move users off the old flavors/hardware (old cell) to new 
flavors in a new cell.


* There is network isolation between compute hosts in different cells, 
so no ssh'ing the disk around like we do today. But the image service is 
global to all cells.


Based on this, for the initial support for cross-cell cold migration, I 
am proposing that we leverage something like shelve offload/unshelve 
masquerading as resize. We shelve offload from the source cell and 
unshelve in the target cell. This should work for both volume-backed and 
non-volume-backed servers (we use snapshots for shelved offloaded 
non-volume-backed servers).


There are, of course, some complications. The main ones that I need help 
with right now are what happens with volumes and ports attached to the 
server. Today we detach from the source and attach at the target, but 
that's assuming the storage backend and network are available to both 
hosts involved in the move of the server. Will that be the case across 
cells? I am assuming that depends on the network topology (are routed 
networks being used?) and storage backend (routed storage?). If the 
network and/or storage backend are not available across cells, how do we 
migrate volumes and ports? Cinder has a volume migrate API for admins 
but I do not know how nova would know the proper affinity per-cell to 
migrate the volume to the proper host (cinder does not have a routed 
storage concept like routed provider networks in neutron, correct?). And 
as far as I know, there is no such thing as port migration in Neutron.


Could Placement help with the volume/port migration stuff? Neutron 
routed provider networks rely on placement aggregates to schedule the VM 
to a compute host in the same network segment as the port used to create 
the VM, however, if that segment does not span cells we are kind of 
stuck, correct?


To summarize the issues as I see them (today):

* How to deal with the targeted cell during scheduling? This is so we 
can even get out of the source cell in nova.


* How does the API deal with the same instance being in two DBs at the 
same time during the move?


* How to handle revert resize?

* How are volumes and ports handled?

I can get feedback from my company's operators based on what their 
deployment will look like for this, but that does not mean it will work 
for others, so I need as much feedback from operators, especially those 
running with multiple cells today, as possible. Thanks in advance.


[1] https://etherpad.openstack.org/p/nova-ptg-stein-cells

--

Thanks,

Matt
