Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
On Mon, Feb 02, 2015 at 01:21:31PM -0500, Andrew Laski wrote:
> On 02/02/2015 11:26 AM, Daniel P. Berrange wrote:
> > On Mon, Feb 02, 2015 at 11:19:45AM -0500, Andrew Laski wrote:
> > > On 02/02/2015 05:58 AM, Daniel P. Berrange wrote:
> > > > On Sun, Feb 01, 2015 at 11:20:08AM -0800, Noel Burton-Krahn wrote:
> > > > > Thanks for bringing this up, Daniel. I don't think it makes sense to have
> > > > > a timeout on live migration, but operators should be able to cancel it,
> > > > > just like any other unbounded long-running process. For example, there's
> > > > > no timeout on file transfers, but they need an interface to report progress
> > > > > and to cancel them. That would imply an option to cancel evacuation too.
> > > > There has been periodic talk about a generic "tasks API" in Nova for
> > > > managing long running operations and getting information about their
> > > > progress, but I am not sure what the status of that is. It would obviously
> > > > be applicable to migration if that's a route we took.
> > > Currently the status of a tasks API is that it would happen after the API
> > > v2.1 microversions work has created a suitable framework in which to add
> > > tasks to the API.
> > So is all work on tasks blocked by the microversions support? I would have
> > thought that would only block places where we need to modify existing APIs.
> > Are we not able to add APIs for listing / cancelling tasks as new APIs
> > without such a dependency on microversions?
> Tasks work is certainly not blocked on waiting for microversions. There is a
> large amount of non-API-facing work that could be done to move forward the
> idea of a task driving state changes within Nova. I would very likely be
> working on that if I wasn't currently spending much of my time on cells v2.

Ok, thanks for the info. So from the POV of migration, I'll focus on the
non-API stuff, and expect the tasks work to provide the API mechanisms.

Regards,
Daniel
--
|: http://berrange.com       -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org        -o- http://virt-manager.org :|
|: http://autobuild.org      -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
On 02/02/2015 11:26 AM, Daniel P. Berrange wrote:
> On Mon, Feb 02, 2015 at 11:19:45AM -0500, Andrew Laski wrote:
> > On 02/02/2015 05:58 AM, Daniel P. Berrange wrote:
> > > On Sun, Feb 01, 2015 at 11:20:08AM -0800, Noel Burton-Krahn wrote:
> > > > Thanks for bringing this up, Daniel. I don't think it makes sense to have
> > > > a timeout on live migration, but operators should be able to cancel it,
> > > > just like any other unbounded long-running process. For example, there's
> > > > no timeout on file transfers, but they need an interface to report progress
> > > > and to cancel them. That would imply an option to cancel evacuation too.
> > > There has been periodic talk about a generic "tasks API" in Nova for
> > > managing long running operations and getting information about their
> > > progress, but I am not sure what the status of that is. It would obviously
> > > be applicable to migration if that's a route we took.
> > Currently the status of a tasks API is that it would happen after the API
> > v2.1 microversions work has created a suitable framework in which to add
> > tasks to the API.
> So is all work on tasks blocked by the microversions support? I would have
> thought that would only block places where we need to modify existing APIs.
> Are we not able to add APIs for listing / cancelling tasks as new APIs
> without such a dependency on microversions?

Tasks work is certainly not blocked on waiting for microversions. There is a
large amount of non-API-facing work that could be done to move forward the
idea of a task driving state changes within Nova. I would very likely be
working on that if I wasn't currently spending much of my time on cells v2.
Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
On Mon, Feb 02, 2015 at 11:19:45AM -0500, Andrew Laski wrote:
> On 02/02/2015 05:58 AM, Daniel P. Berrange wrote:
> > On Sun, Feb 01, 2015 at 11:20:08AM -0800, Noel Burton-Krahn wrote:
> > > Thanks for bringing this up, Daniel. I don't think it makes sense to have
> > > a timeout on live migration, but operators should be able to cancel it,
> > > just like any other unbounded long-running process. For example, there's
> > > no timeout on file transfers, but they need an interface to report progress
> > > and to cancel them. That would imply an option to cancel evacuation too.
> > There has been periodic talk about a generic "tasks API" in Nova for
> > managing long running operations and getting information about their
> > progress, but I am not sure what the status of that is. It would obviously
> > be applicable to migration if that's a route we took.
> Currently the status of a tasks API is that it would happen after the API
> v2.1 microversions work has created a suitable framework in which to add
> tasks to the API.

So is all work on tasks blocked by the microversions support? I would have
thought that would only block places where we need to modify existing APIs.
Are we not able to add APIs for listing / cancelling tasks as new APIs
without such a dependency on microversions?

Regards,
Daniel
Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
On 02/02/2015 05:58 AM, Daniel P. Berrange wrote:
> On Sun, Feb 01, 2015 at 11:20:08AM -0800, Noel Burton-Krahn wrote:
> > Thanks for bringing this up, Daniel. I don't think it makes sense to have
> > a timeout on live migration, but operators should be able to cancel it,
> > just like any other unbounded long-running process. For example, there's
> > no timeout on file transfers, but they need an interface to report progress
> > and to cancel them. That would imply an option to cancel evacuation too.
> There has been periodic talk about a generic "tasks API" in Nova for
> managing long running operations and getting information about their
> progress, but I am not sure what the status of that is. It would obviously
> be applicable to migration if that's a route we took.

Currently the status of a tasks API is that it would happen after the API
v2.1 microversions work has created a suitable framework in which to add
tasks to the API.
Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
----- Original Message -----
> From: "Daniel P. Berrange"
> To: "Robert Collins"
> Cc: "OpenStack Development Mailing List (not for usage questions)"
>     , openstack-operat...@lists.openstack.org
> Sent: Monday, 2 February, 2015 5:56:56 AM
> Subject: Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
>
> On Mon, Feb 02, 2015 at 08:24:20AM +1300, Robert Collins wrote:
> > On 31 January 2015 at 05:47, Daniel P. Berrange wrote:
> > > In working on a recent Nova migration bug
> > >
> > >    https://bugs.launchpad.net/nova/+bug/1414065
> > >
> > > I had cause to refactor the way the nova libvirt driver monitors live
> > > migration completion/failure/progress. This refactor has opened the
> > > door for doing more intelligent active management of the live migration
> > > process.
> > ...
> > > What kind of things would be the biggest win from Operators' or tenants'
> > > POV ?
> >
> > Awesome. Couple of thoughts from my perspective. Firstly, there's a bunch
> > of situation-dependent tuning. One thing Crowbar does really nicely is
> > that you specify the host layout in broad abstract terms - e.g. 'first
> > 10G network link' and so on: some of your settings above, like whether
> > to compress pages, are going to be heavily dependent on the bandwidth
> > available (I doubt that compression is a win on a 100G link for
> > instance, and would be suspect at 10G even). So it would be nice if
> > there was a single dial or two to set and Nova would auto-calculate
> > good defaults from that (with appropriate overrides being available).
>
> I wonder how such an idea would fit into Nova, since it doesn't really
> have that kind of knowledge about the network deployment characteristics.
>
> > Operationally, avoiding trouble is better than being able to fix it, so
> > I quite like the idea of defaulting the auto-converge option on, or
> > perhaps making it controllable via flavours, so that operators can
> > offer (and identify!) those particularly performance-sensitive
> > workloads rather than having to guess which instances are special and
> > which aren't.
>
> I'll investigate auto-converge further to find out what the potential
> downsides of it are. If we can unconditionally enable it, it would be
> simpler than adding yet more tunables.
>
> > Being able to cancel the migration would be good. Relatedly, being able
> > to restart nova-compute while a migration is going on would be good
> > (or put differently, a migration happening shouldn't prevent a deploy
> > of Nova code: interlocks like that make continuous deployment much
> > harder).
> >
> > If we can't already, I'd like as a user to be able to see that the
> > migration is happening (allows diagnosis of transient issues during
> > the migration). Some ops folk may want to hide that of course.
> >
> > I'm not sure that automatically rolling back after N minutes makes
> > sense: if the impact on the cluster is significant then 1 minute vs
> > 10 doesn't intrinsically matter: what matters more is preventing too
> > many concurrent migrations, so that would be another feature that I
> > don't think we have yet: don't allow more than some N inbound and M
> > outbound live migrations to a compute host at any time, to prevent IO
> > storms. We may want to log with NOTIFICATION migrations that are still
> > progressing but appear to be having trouble completing. And of course
> > an admin API to query all migrations in progress to allow API-driven
> > health checks by monitoring tools - which gives the power to manage
> > things to admins without us having to write a probably-too-simple
> > config interface.
>
> Interesting, the point about concurrent migrations hadn't occurred to
> me before, but it does of course make sense since migration is
> primarily network bandwidth limited, though disk bandwidth is relevant
> too if doing block migration.

Indeed, there was a lot of time spent investigating this topic (in oVirt
again) and eventually it was decided to expose a config option and allow
3 concurrent migrations by default.

https://github.com/oVirt/vdsm/blob/master/lib/vdsm/config.py.in#L126
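The VDSM config default cited above (3 concurrent migrations) maps naturally onto a counting semaphore. A minimal illustrative sketch, not actual Nova or VDSM code; the class and option names are invented:

```python
import threading

class MigrationLimiter:
    """Cap concurrent live migrations per host, mirroring the
    oVirt/VDSM-style config default of 3 cited above."""

    def __init__(self, max_concurrent=3):
        self._sem = threading.BoundedSemaphore(max_concurrent)

    def try_start(self):
        # Non-blocking: refuse (rather than queue) an extra migration.
        return self._sem.acquire(blocking=False)

    def finish(self):
        self._sem.release()

limiter = MigrationLimiter(max_concurrent=3)
started = [limiter.try_start() for _ in range(4)]
print(started)  # [True, True, True, False]
```

Whether a refused migration should queue or fail fast is a policy choice; failing fast keeps the scheduler in control of retries.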
Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
On Sun, Feb 01, 2015 at 11:20:08AM -0800, Noel Burton-Krahn wrote:
> Thanks for bringing this up, Daniel. I don't think it makes sense to have
> a timeout on live migration, but operators should be able to cancel it,
> just like any other unbounded long-running process. For example, there's
> no timeout on file transfers, but they need an interface to report progress
> and to cancel them. That would imply an option to cancel evacuation too.

There has been periodic talk about a generic "tasks API" in Nova for
managing long running operations and getting information about their
progress, but I am not sure what the status of that is. It would obviously
be applicable to migration if that's a route we took.

Regards,
Daniel
Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
On Mon, Feb 02, 2015 at 08:24:20AM +1300, Robert Collins wrote:
> On 31 January 2015 at 05:47, Daniel P. Berrange wrote:
> > In working on a recent Nova migration bug
> >
> >    https://bugs.launchpad.net/nova/+bug/1414065
> >
> > I had cause to refactor the way the nova libvirt driver monitors live
> > migration completion/failure/progress. This refactor has opened the
> > door for doing more intelligent active management of the live migration
> > process.
> ...
> > What kind of things would be the biggest win from Operators' or tenants'
> > POV ?
>
> Awesome. Couple of thoughts from my perspective. Firstly, there's a bunch
> of situation-dependent tuning. One thing Crowbar does really nicely is
> that you specify the host layout in broad abstract terms - e.g. 'first
> 10G network link' and so on: some of your settings above, like whether
> to compress pages, are going to be heavily dependent on the bandwidth
> available (I doubt that compression is a win on a 100G link for
> instance, and would be suspect at 10G even). So it would be nice if
> there was a single dial or two to set and Nova would auto-calculate
> good defaults from that (with appropriate overrides being available).

I wonder how such an idea would fit into Nova, since it doesn't really
have that kind of knowledge about the network deployment characteristics.

> Operationally, avoiding trouble is better than being able to fix it, so
> I quite like the idea of defaulting the auto-converge option on, or
> perhaps making it controllable via flavours, so that operators can
> offer (and identify!) those particularly performance-sensitive
> workloads rather than having to guess which instances are special and
> which aren't.

I'll investigate auto-converge further to find out what the potential
downsides of it are. If we can unconditionally enable it, it would be
simpler than adding yet more tunables.

> Being able to cancel the migration would be good. Relatedly, being able
> to restart nova-compute while a migration is going on would be good
> (or put differently, a migration happening shouldn't prevent a deploy
> of Nova code: interlocks like that make continuous deployment much
> harder).
>
> If we can't already, I'd like as a user to be able to see that the
> migration is happening (allows diagnosis of transient issues during
> the migration). Some ops folk may want to hide that of course.
>
> I'm not sure that automatically rolling back after N minutes makes
> sense: if the impact on the cluster is significant then 1 minute vs
> 10 doesn't intrinsically matter: what matters more is preventing too
> many concurrent migrations, so that would be another feature that I
> don't think we have yet: don't allow more than some N inbound and M
> outbound live migrations to a compute host at any time, to prevent IO
> storms. We may want to log with NOTIFICATION migrations that are still
> progressing but appear to be having trouble completing. And of course
> an admin API to query all migrations in progress to allow API-driven
> health checks by monitoring tools - which gives the power to manage
> things to admins without us having to write a probably-too-simple
> config interface.

Interesting, the point about concurrent migrations hadn't occurred to
me before, but it does of course make sense since migration is
primarily network bandwidth limited, though disk bandwidth is relevant
too if doing block migration.

Regards,
Daniel
Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
On Sat, Jan 31, 2015 at 03:55:23AM +0100, Vladik Romanovsky wrote:
> ----- Original Message -----
> > From: "Daniel P. Berrange"
> > To: openstack-dev@lists.openstack.org, openstack-operat...@lists.openstack.org
> > Sent: Friday, 30 January, 2015 11:47:16 AM
> > Subject: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
> >
> > In working on a recent Nova migration bug
> >
> >    https://bugs.launchpad.net/nova/+bug/1414065
> >
> > I had cause to refactor the way the nova libvirt driver monitors live
> > migration completion/failure/progress. This refactor has opened the
> > door for doing more intelligent active management of the live migration
> > process.
> >
> > As it stands today, we launch live migration, with a possible bandwidth
> > limit applied, and just pray that it succeeds eventually. It might take
> > until the end of the universe and we'll happily wait that long. This is
> > pretty dumb really and I think we really ought to do better. The problem
> > is that I'm not really sure what "better" should mean, except for
> > ensuring it doesn't run forever.
> >
> > As a demo, I pushed a quick proof of concept showing how we could easily
> > just abort live migration after say 10 minutes
> >
> >    https://review.openstack.org/#/c/151665/
> >
> > There are a number of possible things to consider though...
> >
> > First, how to detect when live migration isn't going to succeed.
> >
> >  - Could do a crude timeout, eg allow 10 minutes to succeed or else.
> >
> >  - Look at data transfer stats (memory transferred, memory remaining to
> >    transfer, disk transferred, disk remaining to transfer) to determine
> >    if it is making forward progress.
>
> I think this is a better option. We could define a timeout for the
> progress and cancel if there is no progress. IIRC there were similar
> debates about it in oVirt, we could do something similar:
> https://github.com/oVirt/vdsm/blob/master/vdsm/virt/migration.py#L430

That looks like quite a good implementation to follow. They are monitoring
progress and if they see progress stalling, then they wait a configurable
time before aborting. That should avoid prematurely aborting migrations
that are actually working, while avoiding migrations getting stuck forever.

They also have a global timeout which is based on the number of GB of RAM
the guest has, which is also a good idea compared to a one-size-fits-all
timeout.

> > Fourth, there's a question of whether we should give the tenant user or
> > cloud admin further APIs for influencing migration
> >
> >  - Add an explicit API for cancelling migration ?
> >
> >  - Add APIs for setting tunables like downtime, bandwidth on the fly ?
> >
> >  - Or drive some of the tunables like downtime, bandwidth, or policies
> >    like cancel vs paused from flavour or image metadata properties ?
> >
> >  - Allow operations like evacuate to specify a live migration policy
> >    eg switch to non-live migrate after 5 minutes ?
>
> IMHO, an explicit API for cancelling migration is very much needed.
> I remember cases when migrations took about 8 hours, leaving the
> admins helpless :)

The oVirt heuristics should avoid that stuck scenario, but I do think we
need an API anyway.

> Also, I very much like the idea of having tunables and policy to set
> in the flavours and image properties.
> To allow the administrators to set these as a "template" in the flavour
> and also to let the users update/override or "request" these options,
> as they should know best (hopefully) what is running in their guests.

We do need to make sure the administrators can always force migration to
succeed regardless of what the user might have configured, so they can be
assured of emergency evacuation if needed.

Regards,
Daniel
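The two oVirt heuristics described above - abort only once progress has stalled for a configurable window, plus a global timeout scaled to the guest's RAM size - can be sketched as pure decision logic. All names and default values here are illustrative assumptions, not the actual VDSM implementation:

```python
class StallDetector:
    """Decide when to abort a live migration: either data-remaining
    has made no progress for `stall_timeout` seconds, or a global
    deadline proportional to guest RAM has passed (values invented
    for illustration)."""

    def __init__(self, ram_gb, stall_timeout=150, secs_per_gb=60):
        self.deadline = ram_gb * secs_per_gb   # RAM-scaled global cap
        self.stall_timeout = stall_timeout
        self.best_remaining = float('inf')
        self.last_progress_at = 0.0

    def should_abort(self, now, data_remaining):
        if data_remaining < self.best_remaining:
            # Forward progress: remember it and reset the stall clock.
            self.best_remaining = data_remaining
            self.last_progress_at = now
            stalled = False
        else:
            stalled = (now - self.last_progress_at) > self.stall_timeout
        return stalled or now > self.deadline

det = StallDetector(ram_gb=4)           # deadline = 240s in this sketch
print(det.should_abort(10, 900))        # progress -> False
print(det.should_abort(100, 800))       # still progressing -> False
print(det.should_abort(200, 800))       # stalled 100s < 150s -> False
print(det.should_abort(260, 800))       # stalled 160s > 150s -> True
```

Tracking the best-ever remaining value (rather than the last sample) means oscillating transfer stats don't keep resetting the stall clock.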
Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
On 31 January 2015 at 05:47, Daniel P. Berrange wrote:
> In working on a recent Nova migration bug
>
>    https://bugs.launchpad.net/nova/+bug/1414065
>
> I had cause to refactor the way the nova libvirt driver monitors live
> migration completion/failure/progress. This refactor has opened the
> door for doing more intelligent active management of the live migration
> process.
...
> What kind of things would be the biggest win from Operators' or tenants'
> POV ?

Awesome. Couple of thoughts from my perspective. Firstly, there's a bunch
of situation-dependent tuning. One thing Crowbar does really nicely is
that you specify the host layout in broad abstract terms - e.g. 'first
10G network link' and so on: some of your settings above, like whether
to compress pages, are going to be heavily dependent on the bandwidth
available (I doubt that compression is a win on a 100G link for instance,
and would be suspect at 10G even). So it would be nice if there was a
single dial or two to set and Nova would auto-calculate good defaults
from that (with appropriate overrides being available).

Operationally, avoiding trouble is better than being able to fix it, so
I quite like the idea of defaulting the auto-converge option on, or
perhaps making it controllable via flavours, so that operators can
offer (and identify!) those particularly performance-sensitive
workloads rather than having to guess which instances are special and
which aren't.

Being able to cancel the migration would be good. Relatedly, being able
to restart nova-compute while a migration is going on would be good
(or put differently, a migration happening shouldn't prevent a deploy
of Nova code: interlocks like that make continuous deployment much
harder).

If we can't already, I'd like as a user to be able to see that the
migration is happening (allows diagnosis of transient issues during
the migration). Some ops folk may want to hide that of course.

I'm not sure that automatically rolling back after N minutes makes
sense: if the impact on the cluster is significant then 1 minute vs 10
doesn't intrinsically matter: what matters more is preventing too many
concurrent migrations, so that would be another feature that I don't
think we have yet: don't allow more than some N inbound and M outbound
live migrations to a compute host at any time, to prevent IO storms.
We may want to log with NOTIFICATION migrations that are still
progressing but appear to be having trouble completing. And of course
an admin API to query all migrations in progress to allow API-driven
health checks by monitoring tools - which gives the power to manage
things to admins without us having to write a probably-too-simple
config interface.

HTH,
Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud
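The "admin API to query all migrations in progress" suggested above presupposes some registry of in-flight migrations. A toy sketch of what such a registry might track; the names are hypothetical, not an actual Nova API:

```python
import time

class MigrationRegistry:
    """Track in-flight migrations so a (hypothetical) admin API could
    list them with progress for monitoring-driven health checks."""

    def __init__(self):
        self._active = {}

    def start(self, instance_id, host_from, host_to):
        self._active[instance_id] = {
            'source': host_from, 'dest': host_to,
            'started_at': time.time(), 'progress_pct': 0,
        }

    def update(self, instance_id, progress_pct):
        self._active[instance_id]['progress_pct'] = progress_pct

    def finish(self, instance_id):
        self._active.pop(instance_id, None)

    def list_in_progress(self):
        # What an admin query would return: one row per migration.
        return [{'id': i, **m} for i, m in self._active.items()]

reg = MigrationRegistry()
reg.start('vm-1', 'compute-01', 'compute-02')
reg.update('vm-1', 40)
print(reg.list_in_progress())
```

A monitoring tool polling such an endpoint could flag migrations whose `progress_pct` hasn't moved between polls, which is exactly the "progressing but having trouble completing" case mentioned above.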
Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
Thanks for bringing this up, Daniel. I don't think it makes sense to have
a timeout on live migration, but operators should be able to cancel it,
just like any other unbounded long-running process. For example, there's
no timeout on file transfers, but they need an interface to report progress
and to cancel them. That would imply an option to cancel evacuation too.

--
Noel

On Fri, Jan 30, 2015 at 8:47 AM, Daniel P. Berrange wrote:
> In working on a recent Nova migration bug
>
>    https://bugs.launchpad.net/nova/+bug/1414065
>
> I had cause to refactor the way the nova libvirt driver monitors live
> migration completion/failure/progress. This refactor has opened the
> door for doing more intelligent active management of the live migration
> process.
>
> As it stands today, we launch live migration, with a possible bandwidth
> limit applied, and just pray that it succeeds eventually. It might take
> until the end of the universe and we'll happily wait that long. This is
> pretty dumb really and I think we really ought to do better. The problem
> is that I'm not really sure what "better" should mean, except for
> ensuring it doesn't run forever.
>
> As a demo, I pushed a quick proof of concept showing how we could easily
> just abort live migration after say 10 minutes
>
>    https://review.openstack.org/#/c/151665/
>
> There are a number of possible things to consider though...
>
> First, how to detect when live migration isn't going to succeed.
>
>  - Could do a crude timeout, eg allow 10 minutes to succeed or else.
>
>  - Look at data transfer stats (memory transferred, memory remaining to
>    transfer, disk transferred, disk remaining to transfer) to determine
>    if it is making forward progress.
>
>  - Leave it up to the admin / user to decide if it has gone long enough
>
> The first is easy, while the second is harder but probably more reliable
> and useful for users.
>
> Second is a question of what to do when it looks to be failing
>
>  - Cancel the migration - leave it running on source. Not good if the
>    admin is trying to evacuate a host.
>
>  - Pause the VM - make it complete as non-live migration. Not good if
>    the guest workload doesn't like being paused
>
>  - Increase the bandwidth permitted. There is a built-in rate limit in
>    QEMU overridable via nova.conf. Could argue that the admin should just
>    set their desired limit in nova.conf and be done with it, but perhaps
>    there's a case for increasing it in special circumstances. eg emergency
>    evacuate of host it is better to waste bandwidth & complete the job,
>    but for non-urgent scenarios better to limit bandwidth & accept
>    failure ?
>
>  - Increase the maximum downtime permitted. This is the small time window
>    when the guest switches from source to dest. Too small and it'll never
>    switch, too large and it'll suffer unacceptable interruption.
>
> We could do some of these things automatically based on some policy
> or leave them up to the cloud admin/tenant user via new APIs
>
> Third, there's a question of other QEMU features we could make use of to
> stop problems in the first place
>
>  - Auto-converge flag - if you set this QEMU throttles back the CPUs
>    so the guest cannot dirty ram pages as quickly. This is nicer than
>    pausing CPUs altogether, but could still be an issue for guests
>    which have strong performance requirements
>
>  - Page compression flag - if you set this QEMU does compression of
>    pages to reduce data that has to be sent. This is basically trading
>    off network bandwidth vs CPU burn. Probably a win unless you are
>    already highly overcommitted on CPU on the host
>
> Fourth, there's a question of whether we should give the tenant user or
> cloud admin further APIs for influencing migration
>
>  - Add an explicit API for cancelling migration ?
>
>  - Add APIs for setting tunables like downtime, bandwidth on the fly ?
>
>  - Or drive some of the tunables like downtime, bandwidth, or policies
>    like cancel vs paused from flavour or image metadata properties ?
>
>  - Allow operations like evacuate to specify a live migration policy
>    eg switch to non-live migrate after 5 minutes ?
>
> The current code is so crude and there's a hell of a lot of options we
> can take. I'm just not sure which is the best direction for us to go in.
>
> What kind of things would be the biggest win from Operators' or tenants'
> POV ?
>
> Regards,
> Daniel
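One way to treat "increase the maximum downtime permitted" as an on-the-fly tunable, rather than a single fixed value, is to ramp the permitted switchover pause in steps, so short migrations complete with the least interruption and only long-running ones ever see the larger values. A hedged sketch; the step schedule and defaults are invented for illustration, and applying each value would go through whatever the hypervisor driver exposes for setting migration downtime:

```python
def downtime_steps(max_downtime_ms, steps=5, initial_ms=50):
    """Yield a ramp of permitted-downtime values (in ms) from a small
    initial pause up to the configured maximum. A monitor loop would
    apply the next value each time the migration fails to converge."""
    for n in range(steps + 1):
        yield initial_ms + (max_downtime_ms - initial_ms) * n // steps

print(list(downtime_steps(500)))  # [50, 140, 230, 320, 410, 500]
```

The same shape of schedule would work for ramping the bandwidth cap, with the added wrinkle that raising bandwidth affects other tenants on the link, so it arguably belongs behind an admin-only policy.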
Re: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends
- Original Message - > From: "Daniel P. Berrange" > To: openstack-dev@lists.openstack.org, openstack-operat...@lists.openstack.org > Sent: Friday, 30 January, 2015 11:47:16 AM > Subject: [openstack-dev] [nova][libvirt] RFC: ensuring live migration ends > > In working on a recent Nova migration bug > > https://bugs.launchpad.net/nova/+bug/1414065 > > I had cause to refactor the way the nova libvirt driver monitors live > migration completion/failure/progress. This refactor has opened the > door for doing more intelligent active management of the live migration > process. > > As it stands today, we launch live migration, with a possible bandwidth > limit applied and just pray that it succeeds eventually. It might take > until the end of the universe and we'll happily wait that long. This is > pretty dumb really and I think we really ought to do better. The problem > is that I'm not really sure what "better" should mean, except for ensuring > it doesn't run forever. > > As a demo, I pushed a quick proof of concept showing how we could easily > just abort live migration after say 10 minutes > > https://review.openstack.org/#/c/151665/ > > There are a number of possible things to consider though... > > First how to detect when live migration isn't going to succeeed. > > - Could do a crude timeout, eg allow 10 minutes to succeeed or else. > > - Look at data transfer stats (memory transferred, memory remaining to >transfer, disk transferred, disk remaining to transfer) to determine >if it is making forward progress. I think this is a better option. We could define a timeout for the progress and cancel if there is no progress. IIRC there were similar debates about it in Ovirt, we could do something similar: https://github.com/oVirt/vdsm/blob/master/vdsm/virt/migration.py#L430 > > - Leave it upto the admin / user to decided if it has gone long enough > > The first is easy, while the second is harder but probably more reliable > and useful for users. 
> > Second is a question of what todo when it looks to be failing > > - Cancel the migration - leave it running on source. Not good if the >admin is trying to evacuate a host. > > - Pause the VM - make it complete as non-live migration. Not good if >the guest workload doesn't like being paused > > - Increase the bandwidth permitted. There is a built-in rate limit in >QEMU overridable via nova.conf. Could argue that the admin should just >set their desired limit in nova.conf and be done with it, but perhaps >there's a case for increasing it in special circumstances. eg emergency >evacuate of host it is better to waste bandwidth & complete the job, >but for non-urgent scenarios better to limit bandwidth & accept failure ? > > - Increase the maximum downtime permitted. This is the small time window >when the guest switches from source to dest. To small and it'll never >switch, too large and it'll suffer unacceptable interuption. > In my opinion, it would be great if we could play with bandwidth and downtime before cancelling the migration or pausing. However, It makes sense only if there is some kind of a progress in the transfer stats and not a complete disconnect. In that case we should just cancel it. > We could do some of these things automatically based on some policy > or leave them upto the cloud admin/tenant user via new APIs > > Third there's question of other QEMU features we could make use of to > stop problems in the first place > > - Auto-converge flag - if you set this QEMU throttles back the CPUs >so the guest cannot dirty ram pages as quickly. This is nicer than >pausing CPUs altogether, but could still be an issue for guests >which have strong performance requirements > > - Page compression flag - if you set this QEMU does compression of >pages to reduce data that has to be sent. This is basically trading >off network bandwidth vs CPU burn. 
>   Probably a win unless you are already highly overcommitted on CPU
>   on the host.
>
> Fourth, there's the question of whether we should give the tenant user
> or cloud admin further APIs for influencing migration:
>
> - Add an explicit API for cancelling migration?
>
> - Add APIs for setting tunables like downtime and bandwidth on the fly?
>
> - Or drive some of the tunables like downtime and bandwidth, or
>   policies like cancel vs pause, from flavour or image metadata
>   properties?
>
> - Allow operations like evacuate to specify a live migration policy,
>   eg switch to non-live migration after 5 minutes?

IMHO, an explicit API for cancelling migration is very much needed.
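One way to "play with downtime before cancelling" would be to relax the
permitted downtime in steps rather than in one jump. A minimal sketch,
assuming invented numbers and names (this is not an existing nova.conf
option):

```python
# Hypothetical sketch: compute a geometric ladder of max-downtime
# values, so a struggling migration is granted a progressively larger
# switchover window before we give up and cancel or pause the guest.

def downtime_steps(initial_ms, max_ms, steps):
    """Return downtime values (ms) growing geometrically from
    initial_ms up to max_ms over `steps` increments."""
    if steps < 1 or max_ms <= initial_ms:
        return [max_ms]
    ratio = (max_ms / float(initial_ms)) ** (1.0 / steps)
    values = [int(initial_ms * ratio ** i) for i in range(steps)]
    values.append(max_ms)
    return values
```

The monitoring loop could advance to the next step each time progress
stalls; e.g. downtime_steps(50, 500, 4) yields five values rising from
50ms to 500ms.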
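And for the flavour/image metadata idea, the driver could translate
metadata properties into a per-instance policy. The "lm:*" property
names below are invented purely for illustration; they are not real
Nova metadata keys:

```python
# Hypothetical sketch: derive a live-migration policy from image or
# flavour metadata, falling back to conservative defaults. The "lm:*"
# keys are made up for this example.

DEFAULTS = {'timeout_s': 600, 'on_timeout': 'cancel'}

def migration_policy(metadata):
    """Build a policy dict from a metadata mapping."""
    policy = dict(DEFAULTS)
    if 'lm:timeout' in metadata:
        policy['timeout_s'] = int(metadata['lm:timeout'])
    action = metadata.get('lm:on_timeout')
    if action in ('cancel', 'pause'):   # pause == complete non-live
        policy['on_timeout'] = action
    return policy
```

Unknown or missing values fall back to the defaults, so a bad property
on an image cannot leave a migration unbounded.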