Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-02-02 Thread Daniel P. Berrange
On Mon, Feb 02, 2015 at 01:21:31PM -0500, Andrew Laski wrote:
> 
> On 02/02/2015 11:26 AM, Daniel P. Berrange wrote:
> >On Mon, Feb 02, 2015 at 11:19:45AM -0500, Andrew Laski wrote:
> >>On 02/02/2015 05:58 AM, Daniel P. Berrange wrote:
> >>>On Sun, Feb 01, 2015 at 11:20:08AM -0800, Noel Burton-Krahn wrote:
> Thanks for bringing this up, Daniel.  I don't think it makes sense to have
> a timeout on live migration, but operators should be able to cancel it,
> just like any other unbounded long-running process.  For example, there's
> no timeout on file transfers, but they need an interface to report progress
> and to cancel them.  That would imply an option to cancel evacuation too.
> >>>There has been periodic talk about a generic "tasks API" in Nova for 
> >>>managing
> >>>long running operations and getting information about their progress, but I
> >>>am not sure what the status of that is. It would obviously be applicable to
> >>>migration if that's a route we took.
> >>Currently the status of a tasks API is that it would happen after the API
> >>v2.1 microversions work has created a suitable framework in which to add
> >>tasks to the API.
> >So is all work on tasks blocked by the microversions support ? I would have
> >thought that would only block places where we need to modify existing APIs.
> >Are we not able to add APIs for listing / cancelling tasks as new APIs
> >without such a dependency on microversions ?
> 
> Tasks work is certainly not blocked on waiting for microversions. There is a
> large amount of non-API-facing work that could be done to move forward the
> idea of a task driving state changes within Nova. I would very likely be
> working on that if I wasn't currently spending much of my time on cells v2.

Ok, thanks for the info. So from the POV of migration, I'll focus on the
non-API stuff, and expect the tasks work to provide the API mechanisms.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-02-02 Thread Andrew Laski


On 02/02/2015 11:26 AM, Daniel P. Berrange wrote:

On Mon, Feb 02, 2015 at 11:19:45AM -0500, Andrew Laski wrote:

On 02/02/2015 05:58 AM, Daniel P. Berrange wrote:

On Sun, Feb 01, 2015 at 11:20:08AM -0800, Noel Burton-Krahn wrote:

Thanks for bringing this up, Daniel.  I don't think it makes sense to have
a timeout on live migration, but operators should be able to cancel it,
just like any other unbounded long-running process.  For example, there's
no timeout on file transfers, but they need an interface to report progress
and to cancel them.  That would imply an option to cancel evacuation too.

There has been periodic talk about a generic "tasks API" in Nova for managing
long running operations and getting information about their progress, but I
am not sure what the status of that is. It would obviously be applicable to
migration if that's a route we took.

Currently the status of a tasks API is that it would happen after the API
v2.1 microversions work has created a suitable framework in which to add
tasks to the API.

So is all work on tasks blocked by the microversions support ? I would have
thought that would only block places where we need to modify existing APIs.
Are we not able to add APIs for listing / cancelling tasks as new APIs
without such a dependency on microversions ?


Tasks work is certainly not blocked on waiting for microversions. There 
is a large amount of non-API-facing work that could be done to move
forward the idea of a task driving state changes within Nova. I would 
very likely be working on that if I wasn't currently spending much of my 
time on cells v2.




Regards,
Daniel





Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-02-02 Thread Daniel P. Berrange
On Mon, Feb 02, 2015 at 11:19:45AM -0500, Andrew Laski wrote:
> 
> On 02/02/2015 05:58 AM, Daniel P. Berrange wrote:
> >On Sun, Feb 01, 2015 at 11:20:08AM -0800, Noel Burton-Krahn wrote:
> >>Thanks for bringing this up, Daniel.  I don't think it makes sense to have
> >>a timeout on live migration, but operators should be able to cancel it,
> >>just like any other unbounded long-running process.  For example, there's
> >>no timeout on file transfers, but they need an interface to report progress
> >>and to cancel them.  That would imply an option to cancel evacuation too.
> >There has been periodic talk about a generic "tasks API" in Nova for managing
> >long running operations and getting information about their progress, but I
> >am not sure what the status of that is. It would obviously be applicable to
> >migration if that's a route we took.
> 
> Currently the status of a tasks API is that it would happen after the API
> v2.1 microversions work has created a suitable framework in which to add
> tasks to the API.

So is all work on tasks blocked by the microversions support ? I would have
thought that would only block places where we need to modify existing APIs.
Are we not able to add APIs for listing / cancelling tasks as new APIs
without such a dependency on microversions ?

Regards,
Daniel



Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-02-02 Thread Andrew Laski


On 02/02/2015 05:58 AM, Daniel P. Berrange wrote:

On Sun, Feb 01, 2015 at 11:20:08AM -0800, Noel Burton-Krahn wrote:

Thanks for bringing this up, Daniel.  I don't think it makes sense to have
a timeout on live migration, but operators should be able to cancel it,
just like any other unbounded long-running process.  For example, there's
no timeout on file transfers, but they need an interface to report progress
and to cancel them.  That would imply an option to cancel evacuation too.

There has been periodic talk about a generic "tasks API" in Nova for managing
long running operations and getting information about their progress, but I
am not sure what the status of that is. It would obviously be applicable to
migration if that's a route we took.


Currently the status of a tasks API is that it would happen after the 
API v2.1 microversions work has created a suitable framework in which to 
add tasks to the API.




Regards,
Daniel





Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-02-02 Thread Vladik Romanovsky


- Original Message -
> From: "Daniel P. Berrange" 
> To: "Robert Collins" 
> Cc: "OpenStack Development Mailing List (not for usage questions)" 
> ,
> openstack-operat...@lists.openstack.org
> Sent: Monday, 2 February, 2015 5:56:56 AM
> Subject: Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration 
> ends
> 
> On Mon, Feb 02, 2015 at 08:24:20AM +1300, Robert Collins wrote:
> > On 31 January 2015 at 05:47, Daniel P. Berrange 
> > wrote:
> > > In working on a recent Nova migration bug
> > >
> > >   https://bugs.launchpad.net/nova/+bug/1414065
> > >
> > > I had cause to refactor the way the nova libvirt driver monitors live
> > > migration completion/failure/progress. This refactor has opened the
> > > door for doing more intelligent active management of the live migration
> > > process.
> > ...
> > > What kind of things would be the biggest win from Operators' or tenants'
> > > POV ?
> > 
> > Awesome. Couple thoughts from my perspective. Firstly, there's a bunch
> > of situation dependent tuning. One thing Crowbar does really nicely is
> > that you specify the host layout in broad abstract terms - e.g. 'first
> > 10G network link' and so on : some of your settings above like whether
> > to compress pages are going to be heavily dependent on the bandwidth
> > available (I doubt that compression is a win on a 100G link for
> > instance, and would be suspect at 10G even). So it would be nice if
> > there was a single dial or two to set and Nova would auto-calculate
> > good defaults from that (with appropriate overrides being available).
> 
> I wonder how such an idea would fit into Nova, since it doesn't really
> have that kind of knowledge about the network deployment characteristics.
> 
> > Operationally avoiding trouble is better than being able to fix it, so
> > I quite like the idea of defaulting the auto-converge option on, or
> > perhaps making it controllable via flavours, so that operators can
> > offer (and identify!) those particularly performance sensitive
> > workloads rather than having to guess which instances are special and
> > which aren't.
> 
> I'll investigate the auto-converge further to find out what the
> potential downsides of it are. If we can unconditionally enable
> it, it would be simpler than adding yet more tunables.
> 
> > Being able to cancel the migration would be good. Relatedly being able
> > to restart nova-compute while a migration is going on would be good
> > (or put differently, a migration happening shouldn't prevent a deploy
> > of Nova code: interlocks like that make continuous deployment much
> > harder).
> > 
> > If we can't already, I'd like as a user to be able to see that the
> > migration is happening (allows diagnosis of transient issues during
> > the migration). Some ops folk may want to hide that of course.
> > 
> > I'm not sure that automatically rolling back after N minutes makes
> > sense : if the impact on the cluster is significant then 1 minute vs
> > 10 doesn't intrinsically matter: what matters more is preventing too
> > many concurrent migrations, so that would be another feature that I
> > don't think we have yet: don't allow more than some N inbound and M
> > outbound live migrations to a compute host at any time, to prevent IO
> > storms. We may want to log with NOTIFICATION migrations that are still
> > progressing but appear to be having trouble completing. And of course
> > an admin API to query all migrations in progress to allow API driven
> > health checks by monitoring tools - which gives the power to manage
> > things to admins without us having to write a probably-too-simple
> > config interface.
> 
> Interesting, the point about concurrent migrations hadn't occurred to
> me before, but it does of course make sense since migration is
> primarily network bandwidth limited, though disk bandwidth is relevant
> too if doing block migration.

Indeed, there was a lot of time spent investigating this topic (in Ovirt again)
and eventually it was decided to expose a config option and allow 3 concurrent
migrations by default.

https://github.com/oVirt/vdsm/blob/master/lib/vdsm/config.py.in#L126

> 
> Regards,
> Daniel

Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-02-02 Thread Daniel P. Berrange
On Sun, Feb 01, 2015 at 11:20:08AM -0800, Noel Burton-Krahn wrote:
> Thanks for bringing this up, Daniel.  I don't think it makes sense to have
> a timeout on live migration, but operators should be able to cancel it,
> just like any other unbounded long-running process.  For example, there's
> no timeout on file transfers, but they need an interface to report progress
> and to cancel them.  That would imply an option to cancel evacuation too.

There has been periodic talk about a generic "tasks API" in Nova for managing
long running operations and getting information about their progress, but I
am not sure what the status of that is. It would obviously be applicable to
migration if that's a route we took.

Regards,
Daniel



Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-02-02 Thread Daniel P. Berrange
On Mon, Feb 02, 2015 at 08:24:20AM +1300, Robert Collins wrote:
> On 31 January 2015 at 05:47, Daniel P. Berrange  wrote:
> > In working on a recent Nova migration bug
> >
> >   https://bugs.launchpad.net/nova/+bug/1414065
> >
> > I had cause to refactor the way the nova libvirt driver monitors live
> > migration completion/failure/progress. This refactor has opened the
> > door for doing more intelligent active management of the live migration
> > process.
> ...
> > What kind of things would be the biggest win from Operators' or tenants'
> > POV ?
> 
> Awesome. Couple thoughts from my perspective. Firstly, there's a bunch
> of situation dependent tuning. One thing Crowbar does really nicely is
> that you specify the host layout in broad abstract terms - e.g. 'first
> 10G network link' and so on : some of your settings above like whether
> to compress pages are going to be heavily dependent on the bandwidth
> available (I doubt that compression is a win on a 100G link for
> instance, and would be suspect at 10G even). So it would be nice if
> there was a single dial or two to set and Nova would auto-calculate
> good defaults from that (with appropriate overrides being available).

I wonder how such an idea would fit into Nova, since it doesn't really
have that kind of knowledge about the network deployment characteristics.

> Operationally avoiding trouble is better than being able to fix it, so
> I quite like the idea of defaulting the auto-converge option on, or
> perhaps making it controllable via flavours, so that operators can
> offer (and identify!) those particularly performance sensitive
> workloads rather than having to guess which instances are special and
> which aren't.

I'll investigate the auto-converge further to find out what the
potential downsides of it are. If we can unconditionally enable
it, it would be simpler than adding yet more tunables.

> Being able to cancel the migration would be good. Relatedly being able
> to restart nova-compute while a migration is going on would be good
> (or put differently, a migration happening shouldn't prevent a deploy
> of Nova code: interlocks like that make continuous deployment much
> harder).
> 
> If we can't already, I'd like as a user to be able to see that the
> migration is happening (allows diagnosis of transient issues during
> the migration). Some ops folk may want to hide that of course.
> 
> I'm not sure that automatically rolling back after N minutes makes
> sense : if the impact on the cluster is significant then 1 minute vs
> 10 doesn't intrinsically matter: what matters more is preventing too
> many concurrent migrations, so that would be another feature that I
> don't think we have yet: don't allow more than some N inbound and M
> outbound live migrations to a compute host at any time, to prevent IO
> storms. We may want to log with NOTIFICATION migrations that are still
> progressing but appear to be having trouble completing. And of course
> an admin API to query all migrations in progress to allow API driven
> health checks by monitoring tools - which gives the power to manage
> things to admins without us having to write a probably-too-simple
> config interface.

Interesting, the point about concurrent migrations hadn't occurred to
me before, but it does of course make sense since migration is
primarily network bandwidth limited, though disk bandwidth is relevant
too if doing block migration.

Regards,
Daniel



Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-02-02 Thread Daniel P. Berrange
On Sat, Jan 31, 2015 at 03:55:23AM +0100, Vladik Romanovsky wrote:
> 
> 
> - Original Message -
> > From: "Daniel P. Berrange" 
> > To: openstack-dev@lists.openstack.org, 
> > openstack-operat...@lists.openstack.org
> > Sent: Friday, 30 January, 2015 11:47:16 AM
> > Subject: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
> > 
> > In working on a recent Nova migration bug
> > 
> >   https://bugs.launchpad.net/nova/+bug/1414065
> > 
> > I had cause to refactor the way the nova libvirt driver monitors live
> > migration completion/failure/progress. This refactor has opened the
> > door for doing more intelligent active management of the live migration
> > process.
> > 
> > As it stands today, we launch live migration, with a possible bandwidth
> > limit applied and just pray that it succeeds eventually. It might take
> > until the end of the universe and we'll happily wait that long. This is
> > pretty dumb really and I think we really ought to do better. The problem
> > is that I'm not really sure what "better" should mean, except for ensuring
> > it doesn't run forever.
> > 
> > As a demo, I pushed a quick proof of concept showing how we could easily
> > just abort live migration after say 10 minutes
> > 
> >   https://review.openstack.org/#/c/151665/
> > 
> > There are a number of possible things to consider though...
> > 
> > First how to detect when live migration isn't going to succeed.
> > 
> >  - Could do a crude timeout, eg allow 10 minutes to succeed or else.
> > 
> >  - Look at data transfer stats (memory transferred, memory remaining to
> >transfer, disk transferred, disk remaining to transfer) to determine
> >if it is making forward progress.
> 
> I think this is a better option. We could define a timeout for the progress
> and cancel if there is no progress. IIRC there were similar debates about it
> in Ovirt, we could do something similar:
> https://github.com/oVirt/vdsm/blob/master/vdsm/virt/migration.py#L430

That looks like quite a good implementation to follow. They are monitoring
progress and if they see progress stalling, then they wait a configurable
time before aborting. That should avoid prematurely aborting migrations
that are actually working, while avoiding migrations getting stuck forever.
They also have a global timeout which is based on the number of GB of RAM
the guest has, which is also a good idea compared to a one-size-fits-all
timeout.
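As a rough illustration of that shape of policy, a stall-plus-global-deadline check might look like the following Python sketch (the function name, thresholds, and the per-GiB deadline constant are invented for illustration; they are not vdsm's or Nova's actual values):

```python
def should_abort(samples, ram_gib, stall_timeout=150, secs_per_gib=64):
    """Decide whether a live migration should be aborted.

    samples  -- (seconds_since_start, bytes_remaining) progress points,
                oldest first, as reported by the migration job stats
    ram_gib  -- guest RAM size, used to scale the global deadline

    Aborts when either (a) bytes_remaining has not decreased for at
    least stall_timeout seconds, or (b) a global deadline proportional
    to the guest's RAM size has passed.
    """
    now, _remaining = samples[-1]
    # (b) global timeout scaled by guest RAM size
    if now > ram_gib * secs_per_gib:
        return True
    # (a) find the most recent sample where remaining data actually dropped
    last_progress, best = samples[0]
    for t, r in samples[1:]:
        if r < best:
            best = r
            last_progress = t
    return (now - last_progress) >= stall_timeout
```

The monitoring loop would feed this from the periodically sampled job stats and call the abort path only when it returns true, so a slow-but-moving migration is never killed prematurely.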

> > Fourth there's a question of whether we should give the tenant user or
> > cloud admin further APIs for influencing migration
> > 
> >  - Add an explicit API for cancelling migration ?
> > 
> >  - Add APIs for setting tunables like downtime, bandwidth on the fly ?
> > 
> >  - Or drive some of the tunables like downtime, bandwidth, or policies
> >like cancel vs paused from flavour or image metadata properties ?
> > 
> >  - Allow operations like evacuate to specify a live migration policy
> >eg switch to non-live migration after 5 minutes ?
> > 
> IMHO, an explicit API for cancelling migration is very much needed.
> I remember cases when migrations took 8 or more hours, leaving the
> admins helpless :)

The oVirt heuristics should avoid that stuck scenario, but I do think
we need an API anyway.

> Also, I very much like the idea of having tunables and policy to set
> in the flavours and image properties.
> To allow the administrators to set these as a "template" in the flavour
> and also to let the users to update/override or "request" these options
> as they should know the best (hopefully) what is running in their guests.

We do need to make sure the administrators can always force migration
to succeed regardless of what the user might have configured, so they
can be ensured of emergency evacuation if needed.

Regards,
Daniel



Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-02-01 Thread Robert Collins
On 31 January 2015 at 05:47, Daniel P. Berrange  wrote:
> In working on a recent Nova migration bug
>
>   https://bugs.launchpad.net/nova/+bug/1414065
>
> I had cause to refactor the way the nova libvirt driver monitors live
> migration completion/failure/progress. This refactor has opened the
> door for doing more intelligent active management of the live migration
> process.
...
> What kind of things would be the biggest win from Operators' or tenants'
> POV ?

Awesome. Couple thoughts from my perspective. Firstly, there's a bunch
of situation dependent tuning. One thing Crowbar does really nicely is
that you specify the host layout in broad abstract terms - e.g. 'first
10G network link' and so on : some of your settings above like whether
to compress pages are going to be heavily dependent on the bandwidth
available (I doubt that compression is a win on a 100G link for
instance, and would be suspect at 10G even). So it would be nice if
there was a single dial or two to set and Nova would auto-calculate
good defaults from that (with appropriate overrides being available).

Operationally avoiding trouble is better than being able to fix it, so
I quite like the idea of defaulting the auto-converge option on, or
perhaps making it controllable via flavours, so that operators can
offer (and identify!) those particularly performance sensitive
workloads rather than having to guess which instances are special and
which aren't.

Being able to cancel the migration would be good. Relatedly being able
to restart nova-compute while a migration is going on would be good
(or put differently, a migration happening shouldn't prevent a deploy
of Nova code: interlocks like that make continuous deployment much
harder).

If we can't already, I'd like as a user to be able to see that the
migration is happening (allows diagnosis of transient issues during
the migration). Some ops folk may want to hide that of course.

I'm not sure that automatically rolling back after N minutes makes
sense : if the impact on the cluster is significant then 1 minute vs
10 doesn't intrinsically matter: what matters more is preventing too
many concurrent migrations, so that would be another feature that I
don't think we have yet: don't allow more than some N inbound and M
outbound live migrations to a compute host at any time, to prevent IO
storms. We may want to log with NOTIFICATION migrations that are still
progressing but appear to be having trouble completing. And of course
an admin API to query all migrations in progress to allow API driven
health checks by monitoring tools - which gives the power to manage
things to admins without us having to write a probably-too-simple
config interface.

HTH,
Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud



Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-02-01 Thread Noel Burton-Krahn
Thanks for bringing this up, Daniel.  I don't think it makes sense to have
a timeout on live migration, but operators should be able to cancel it,
just like any other unbounded long-running process.  For example, there's
no timeout on file transfers, but they need an interface to report progress
and to cancel them.  That would imply an option to cancel evacuation too.

--
Noel


On Fri, Jan 30, 2015 at 8:47 AM, Daniel P. Berrange 
wrote:

> In working on a recent Nova migration bug
>
>   https://bugs.launchpad.net/nova/+bug/1414065
>
> I had cause to refactor the way the nova libvirt driver monitors live
> migration completion/failure/progress. This refactor has opened the
> door for doing more intelligent active management of the live migration
> process.
>
> As it stands today, we launch live migration, with a possible bandwidth
> limit applied and just pray that it succeeds eventually. It might take
> until the end of the universe and we'll happily wait that long. This is
> pretty dumb really and I think we really ought to do better. The problem
> is that I'm not really sure what "better" should mean, except for ensuring
> it doesn't run forever.
>
> As a demo, I pushed a quick proof of concept showing how we could easily
> just abort live migration after say 10 minutes
>
>   https://review.openstack.org/#/c/151665/
>
> There are a number of possible things to consider though...
>
> First how to detect when live migration isn't going to succeed.
>
>  - Could do a crude timeout, eg allow 10 minutes to succeed or else.
>
>  - Look at data transfer stats (memory transferred, memory remaining to
>transfer, disk transferred, disk remaining to transfer) to determine
>if it is making forward progress.
>
> >  - Leave it up to the admin / user to decide if it has gone on long enough
>
> The first is easy, while the second is harder but probably more reliable
> and useful for users.
>
> Second is a question of what to do when it looks to be failing
>
>  - Cancel the migration - leave it running on source. Not good if the
>admin is trying to evacuate a host.
>
>  - Pause the VM - make it complete as non-live migration. Not good if
>the guest workload doesn't like being paused
>
>  - Increase the bandwidth permitted. There is a built-in rate limit in
>QEMU overridable via nova.conf. Could argue that the admin should just
>set their desired limit in nova.conf and be done with it, but perhaps
>there's a case for increasing it in special circumstances. eg emergency
>evacuate of host it is better to waste bandwidth & complete the job,
>but for non-urgent scenarios better to limit bandwidth & accept failure
> ?
>
>  - Increase the maximum downtime permitted. This is the small time window
> >when the guest switches from source to dest. Too small and it'll never
> >switch, too large and it'll suffer unacceptable interruption.
>
> We could do some of these things automatically based on some policy
> or leave them upto the cloud admin/tenant user via new APIs
>
> Third there's question of other QEMU features we could make use of to
> stop problems in the first place
>
>  - Auto-converge flag - if you set this QEMU throttles back the CPUs
>so the guest cannot dirty ram pages as quickly. This is nicer than
>pausing CPUs altogether, but could still be an issue for guests
>which have strong performance requirements
>
>  - Page compression flag - if you set this QEMU does compression of
>pages to reduce data that has to be sent. This is basically trading
>off network bandwidth vs CPU burn. Probably a win unless you are
> >already highly overcommitted on CPU on the host
>
> Fourth there's a question of whether we should give the tenant user or
> cloud admin further APIs for influencing migration
>
>  - Add an explicit API for cancelling migration ?
>
>  - Add APIs for setting tunables like downtime, bandwidth on the fly ?
>
>  - Or drive some of the tunables like downtime, bandwidth, or policies
>like cancel vs paused from flavour or image metadata properties ?
>
>  - Allow operations like evacuate to specify a live migration policy
>eg switch to non-live migration after 5 minutes ?
>
> The current code is so crude and there's a hell of a lot of options we
> can take. I'm just not sure which is the best direction for us to go
> in.
>
> What kind of things would be the biggest win from Operators' or tenants'
> POV ?
>
> Regards,
> Daniel

Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-01-30 Thread Vladik Romanovsky


- Original Message -
> From: "Daniel P. Berrange" 
> To: openstack-dev@lists.openstack.org, openstack-operat...@lists.openstack.org
> Sent: Friday, 30 January, 2015 11:47:16 AM
> Subject: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
> 
> In working on a recent Nova migration bug
> 
>   https://bugs.launchpad.net/nova/+bug/1414065
> 
> I had cause to refactor the way the nova libvirt driver monitors live
> migration completion/failure/progress. This refactor has opened the
> door for doing more intelligent active management of the live migration
> process.
> 
> As it stands today, we launch live migration, with a possible bandwidth
> limit applied and just pray that it succeeds eventually. It might take
> until the end of the universe and we'll happily wait that long. This is
> pretty dumb really and I think we really ought to do better. The problem
> is that I'm not really sure what "better" should mean, except for ensuring
> it doesn't run forever.
> 
> As a demo, I pushed a quick proof of concept showing how we could easily
> just abort live migration after say 10 minutes
> 
>   https://review.openstack.org/#/c/151665/
> 
> There are a number of possible things to consider though...
> 
> First how to detect when live migration isn't going to succeed.
> 
>  - Could do a crude timeout, eg allow 10 minutes to succeed or else.
> 
>  - Look at data transfer stats (memory transferred, memory remaining to
>transfer, disk transferred, disk remaining to transfer) to determine
>if it is making forward progress.

I think this is a better option. We could define a timeout for the progress
and cancel if there is no progress. IIRC there were similar debates about it
in Ovirt, we could do something similar:
https://github.com/oVirt/vdsm/blob/master/vdsm/virt/migration.py#L430

> 
>  - Leave it up to the admin / user to decide if it has gone on long enough
> 
> The first is easy, while the second is harder but probably more reliable
> and useful for users.
> 
> Second is a question of what to do when it looks to be failing
> 
>  - Cancel the migration - leave it running on source. Not good if the
>    admin is trying to evacuate a host.
> 
>  - Pause the VM - make it complete as non-live migration. Not good if
>    the guest workload doesn't like being paused
> 
>  - Increase the bandwidth permitted. There is a built-in rate limit in
>    QEMU overridable via nova.conf. Could argue that the admin should just
>    set their desired limit in nova.conf and be done with it, but perhaps
>    there's a case for increasing it in special circumstances, e.g. for an
>    emergency host evacuation it is better to waste bandwidth & complete the
>    job, but for non-urgent scenarios better to limit bandwidth & accept failure ?
> 
>  - Increase the maximum downtime permitted. This is the small time window
>    when the guest switches from source to dest. Too small and it'll never
>    switch, too large and it'll suffer unacceptable interruption.
> 

In my opinion, it would be great if we could play with bandwidth and downtime
before cancelling the migration or pausing. However, it only makes sense if
there is some kind of progress in the transfer stats and not a complete
disconnect; in the latter case we should just cancel the migration.
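A "turn the knobs before giving up" policy might look something like the sketch below. Every name, threshold and step value here is invented for illustration; the actions would ultimately map onto the real libvirt primitives (migrateSetMaxSpeed(), migrateSetMaxDowntime(), abortJob()), but this is not Nova's actual logic:

```python
# Hypothetical escalation policy: given elapsed time and whether the
# transfer stats still show forward progress, decide which knob to turn
# next.  Thresholds and step values are made up for illustration.

DOWNTIME_STEPS_MS = [50, 200, 500, 2000]  # progressively larger cut-over windows

def next_action(elapsed_s, making_progress, downtime_idx,
                soft_limit_s=300, hard_limit_s=600):
    """Return (action, new_downtime_idx) for the monitoring loop to apply."""
    if not making_progress or elapsed_s > hard_limit_s:
        return ('abort', downtime_idx)       # dead transfer: cancel, stay on source
    if elapsed_s > soft_limit_s:
        if downtime_idx + 1 < len(DOWNTIME_STEPS_MS):
            return ('set_downtime', downtime_idx + 1)  # allow a longer switch-over
        return ('pause', downtime_idx)       # out of knobs: complete as non-live
    return ('wait', downtime_idx)            # still within budget: keep polling
```

The key design point is that the downtime steps are bounded, so a migration that never converges eventually falls through to pause or abort rather than escalating forever.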

> We could do some of these things automatically based on some policy
> or leave them up to the cloud admin/tenant user via new APIs
> 
> Third, there's the question of other QEMU features we could make use of to
> stop problems in the first place
> 
>  - Auto-converge flag - if you set this QEMU throttles back the CPUs
>    so the guest cannot dirty RAM pages as quickly. This is nicer than
>    pausing CPUs altogether, but could still be an issue for guests
>    which have strong performance requirements
> 
>  - Page compression flag - if you set this QEMU does compression of
>    pages to reduce data that has to be sent. This is basically trading
>    off network bandwidth vs CPU burn. Probably a win unless you are
>    already highly overcommitted on CPU on the host
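For reference, those two capabilities map onto flag bits in libvirt's virDomainMigrateFlags enum. A standalone sketch of composing them (the constant values are copied from libvirt.h and defined locally so this runs without libvirt installed; build_migrate_flags() is a made-up helper, not a Nova function):

```python
# Flag values copied from libvirt's virDomainMigrateFlags enum (libvirt.h).
# In real code the result would be passed as the flags argument to
# virDomain.migrateToURI3().
VIR_MIGRATE_LIVE          = 1 << 0
VIR_MIGRATE_PEER2PEER     = 1 << 1
VIR_MIGRATE_COMPRESSED    = 1 << 11   # compress pages before sending
VIR_MIGRATE_AUTO_CONVERGE = 1 << 13   # throttle guest CPUs until it converges

def build_migrate_flags(live=True, p2p=True, compressed=False, auto_converge=False):
    """Compose the migration flags bitmask from the chosen options."""
    flags = 0
    if live:
        flags |= VIR_MIGRATE_LIVE
    if p2p:
        flags |= VIR_MIGRATE_PEER2PEER
    if compressed:
        flags |= VIR_MIGRATE_COMPRESSED
    if auto_converge:
        flags |= VIR_MIGRATE_AUTO_CONVERGE
    return flags
```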
> 
> Fourth there's a question of whether we should give the tenant user or
> cloud admin further APIs for influencing migration
> 
>  - Add an explicit API for cancelling migration ?
> 
>  - Add APIs for setting tunables like downtime, bandwidth on the fly ?
> 
>  - Or drive some of the tunables like downtime, bandwidth, or policies
>    like cancel vs paused from flavour or image metadata properties ?
> 
>  - Allow operations like evacuate to specify a live migration policy,
>    e.g. switch to non-live migration after 5 minutes ?
> 
IMHO, an explicit API for cancelling migration is very much needed.
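Agreed. At the libvirt level the primitive already exists: virDomainAbortJob() cancels the in-flight migration and leaves the guest running on the source. A minimal sketch of what a Nova-side cancel handler could wrap (cancel_live_migration() is a hypothetical name; only abortJob() is the real API, and real code would catch libvirt.libvirtError rather than a bare Exception):

```python
def cancel_live_migration(dom):
    """Abort the domain's in-flight job (i.e. the live migration).

    dom is a libvirt.virDomain, or anything exposing the same abortJob()
    method.  virDomainAbortJob() is the real libvirt primitive; the thin
    wrapper and its return convention are just a sketch.
    """
    try:
        dom.abortJob()    # asks QEMU to abort; guest keeps running on source
        return True
    except Exception:     # real code would catch libvirt.libvirtError
        return False
```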

[openstack-dev] [nova][libvirt] RFC: ensuring live migration ends

2015-01-30 Thread Daniel P. Berrange
In working on a recent Nova migration bug

  https://bugs.launchpad.net/nova/+bug/1414065

I had cause to refactor the way the nova libvirt driver monitors live
migration completion/failure/progress. This refactor has opened the
door for doing more intelligent active management of the live migration
process.

As it stands today, we launch live migration, with a possible bandwidth
limit applied and just pray that it succeeds eventually. It might take
until the end of the universe and we'll happily wait that long. This is
pretty dumb really and I think we really ought to do better. The problem
is that I'm not really sure what "better" should mean, except for ensuring
it doesn't run forever.

As a demo, I pushed a quick proof of concept showing how we could easily
just abort live migration after, say, 10 minutes

  https://review.openstack.org/#/c/151665/

There are a number of possible things to consider though...

First, how to detect when live migration isn't going to succeed.

 - Could do a crude timeout, e.g. allow 10 minutes to succeed or else.

 - Look at data transfer stats (memory transferred, memory remaining to
   transfer, disk transferred, disk remaining to transfer) to determine
   if it is making forward progress.

 - Leave it up to the admin / user to decide if it has gone on long enough

The first is easy, while the second is harder but probably more reliable
and useful for users.

Second is a question of what to do when it looks to be failing

 - Cancel the migration - leave it running on source. Not good if the
   admin is trying to evacuate a host.

 - Pause the VM - make it complete as non-live migration. Not good if
   the guest workload doesn't like being paused

 - Increase the bandwidth permitted. There is a built-in rate limit in
   QEMU overridable via nova.conf. Could argue that the admin should just
   set their desired limit in nova.conf and be done with it, but perhaps
   there's a case for increasing it in special circumstances, e.g. for an
   emergency host evacuation it is better to waste bandwidth & complete the
   job, but for non-urgent scenarios better to limit bandwidth & accept failure ?

 - Increase the maximum downtime permitted. This is the small time window
   when the guest switches from source to dest. Too small and it'll never
   switch, too large and it'll suffer unacceptable interruption.

We could do some of these things automatically based on some policy
or leave them up to the cloud admin/tenant user via new APIs

Third, there's the question of other QEMU features we could make use of to
stop problems in the first place

 - Auto-converge flag - if you set this QEMU throttles back the CPUs
   so the guest cannot dirty RAM pages as quickly. This is nicer than
   pausing CPUs altogether, but could still be an issue for guests
   which have strong performance requirements

 - Page compression flag - if you set this QEMU does compression of
   pages to reduce data that has to be sent. This is basically trading
   off network bandwidth vs CPU burn. Probably a win unless you are
   already highly overcommitted on CPU on the host

Fourth there's a question of whether we should give the tenant user or
cloud admin further APIs for influencing migration

 - Add an explicit API for cancelling migration ?

 - Add APIs for setting tunables like downtime, bandwidth on the fly ?

 - Or drive some of the tunables like downtime, bandwidth, or policies
   like cancel vs paused from flavour or image metadata properties ?

 - Allow operations like evacuate to specify a live migration policy,
   e.g. switch to non-live migration after 5 minutes ?

The current code is so crude and there's a hell of a lot of options we
can take. I'm just not sure which is the best direction for us to go
in.

What kind of things would be the biggest win from Operators' or tenants'
POV ?

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev