2016-02-13 0:21 GMT+08:00 Murray, Paul (HP Cloud) <pmur...@hpe.com>:

> This time with a tag in case anyone is filtering…
>
> *From:* Murray, Paul (HP Cloud)
> *Sent:* 12 February 2016 16:16
> *To:* openstack-dev@lists.openstack.org
> *Subject:* [openstack-dev] Update on live migration priority
>
> The objective for the live migration priority is to improve the stability
> of migrations based on operator experience. The high level approach is to
> do the following:
>
> 1. Improve CI
> 2. Improve documentation
> 3. Improve manageability of migrations
> 4. Fix bugs
>
> In this cycle we targeted a few immediately implementable features that
> would help, specifically giving operators commands to allow them to manage
> migrations (inspect progress, force completion, and cancel) and improve
> security (split-networks and remove ssh-based resize/migration; aka storage
> pools).
>
> Most of these are on track to be completed in this cycle with the
> exception of storage pools work which is being deferred. Further details
> follow.
>
> *Expand CI coverage* – *in progress*
>
> There is a job in the experimental queue called:
> *gate-tempest-dsvm-multinode-live-migrationqueued*. This will become the
> job that performs live migration tests; any live migration tests in other
> jobs will be removed. At present the job has been configured to cover
> different storage configurations including cinder, NFS, ceph. Tests are now
> being added to the job. Patches are currently up for live migration of
> instances with swap and instances with ephemeral disks.
>
> Please trigger the experimental queue if your patches touch migrations in
> some way so we can check the stability of the jobs. Once stable and with
> sufficient tests we will promote the job from the experimental queue so
> that it always runs.
>
> See: https://review.openstack.org/#/q/topic:lm_test
>
> *Improve API docs* - *done*
>
> Some changes were made to the API guide for moving servers, including
> better descriptions for the server actions migrate, live migrate, shelve,
> resize and evacuate (
> http://developer.openstack.org/api-guide/compute/server_concepts.html#server-actions
> ) and a section that describes reasons for moving VMs with common use cases
> outlined (
> http://developer.openstack.org/api-guide/compute/server_concepts.html#moving-servers
> )
>
> *Block live migration with attached volumes* – *done*
>
> The selective block device migration API in libvirt 1.2.17 is used to
> allow block migration when volumes are attached. A follow-on patch to allow
> read-only drives to be copied in block migration has not been completed.
> This patch is required to allow iso9660 format config drives to be
> migrated. Without it only vfat config drives can be migrated. There is
> still some thought going into that – see:
> https://review.openstack.org/#/c/234659
>
> *Force complete* – *requires python-novaclient change*
>
> Force-complete forces a live migration to complete by pausing the VM and
> restarting it when it has completed migration. This is intended as a brute
> force way to make a VM complete its migration when it is taking too long.
> In the future auto-converge and post-copy will be looked at. These became
> available in qemu 2.5.
>
> Force complete is done in nova but still requires a change to
> python-novaclient to implement the CLI.
>
> *Cancel* – *in progress*
>
> Cancel stops a live migration, leaving the VM on the source host with the
> migration status left as “cancelled”. This is in progress and follows the
> pattern of force-complete. Unfortunately this needs to be bundled up into
> one patch to avoid multiple API bumps.
>
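For anyone following along, here is a rough sketch of what the two operator actions described above could look like as REST calls. The paths, the request body and the microversions (2.22 for force-complete, 2.24 for cancel) are assumptions about how these features end up being exposed under the servers/migrations sub-resource; the helper names are purely illustrative and nothing here talks to a real endpoint.

```python
import json

# Assumed microversions; the quoted mail describes both features as still landing.
FORCE_COMPLETE_MICROVERSION = "2.22"
CANCEL_MICROVERSION = "2.24"


def _headers(microversion):
    """Headers carrying the compute API microversion for one request."""
    return {
        "Content-Type": "application/json",
        "X-OpenStack-Nova-API-Version": microversion,
    }


def force_complete_request(server_id, migration_id):
    """Describe the call asking nova to force a live migration to finish."""
    return {
        "method": "POST",
        "path": f"/servers/{server_id}/migrations/{migration_id}/action",
        "headers": _headers(FORCE_COMPLETE_MICROVERSION),
        "body": json.dumps({"force_complete": None}),
    }


def cancel_request(server_id, migration_id):
    """Describe the call asking nova to abort an in-progress live migration."""
    return {
        "method": "DELETE",
        "path": f"/servers/{server_id}/migrations/{migration_id}",
        "headers": _headers(CANCEL_MICROVERSION),
        "body": None,
    }
```

The functions only build request descriptions, so they can be handed to whatever HTTP client and auth layer a deployment already uses.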
Yeah, we already have a great API concept doc about live migration, many thanks to Paul! Just a reminder that we should also add the new things from this release to the API concept doc, such as the migration resource/sub-resource and the new force-complete/cancel actions.

> Patches for review:
>
> https://review.openstack.org/#/q/status:open+topic:bp/abort-live-migration
>
> *Progress reporting* – *in progress* (no pun intended)
>
> Progress reporting introduces migrations as a sub-resource of servers and
> adds progress data to the migration record. There was some debate at the
> mid cycle and on the mailing list about how to record this transient data.
> It is a waste to keep writing it to the database, but as it is generated at
> the compute manager but examined at the API it was felt that writing it to
> the database is necessary to fit the existing architecture. The conclusion
> was that writing to the database every 5 seconds would not cause a
> significant overhead. Alternatives could be pursued later if necessary. For
> discussion see this ML thread:
> http://lists.openstack.org/pipermail/openstack-dev/2016-February/085662.html
> and the IRC meeting transcript here:
> http://eavesdrop.openstack.org/meetings/nova_live_migration/2016/nova_live_migration.2016-02-09-14.01.log.html
>
> Patches for review:
>
> https://review.openstack.org/#/q/status:open+topic:bp/live-migration-progress-report
>
> *Split networking* – *done*
>
> Split networking adds a configuration parameter to specify
> live_migration_inbound_addr as the IP address or host name to be used as
> the target for migration traffic. This allows migration traffic to be
> isolated on a network separate from other management traffic, providing an
> opportunity to isolate service levels for the two networks and improve
> security by moving unencrypted migration traffic to an isolated network.
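As a sketch of how the split-networking option might be consumed: the snippet below parses a nova.conf fragment and picks the address that migration traffic should target, falling back to the destination host name when the option is unset. The [libvirt] section placement, the sample address, the fallback behaviour and the helper name are all assumptions for illustration, not nova's actual code.

```python
import configparser

# Hypothetical nova.conf fragment for a compute node with a dedicated
# migration network; the address is made up for the example.
SAMPLE_NOVA_CONF = """
[libvirt]
live_migration_inbound_addr = 10.0.42.15
"""


def migration_target(conf_text, fallback_host):
    """Return the address migration traffic should be sent to.

    Uses live_migration_inbound_addr when it is configured; otherwise
    falls back to the host name passed in (an assumed default).
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.get(
        "libvirt", "live_migration_inbound_addr", fallback=fallback_host
    )
```

With the sample fragment, `migration_target(SAMPLE_NOVA_CONF, "compute-2")` picks the dedicated address, while an empty config yields the host name that was passed in.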
>
> *Resize/cold migrate using storage pools* – *deferred*
>
> The objective here was to change the libvirt implementation of migrate and
> resize to use libvirt storage pools instead of scp/rsync over ssh with
> passwordless keys. Storage pools are supported in all versions of libvirt
> supported by nova, so it was thought that by changing the implementation it
> would be possible to drop the ssh-based code. However, two flaws in this
> approach arose: the recently added ploop storage device does not work with
> storage pools in libvirt, and the libvirt data copy implementation is very
> inefficient and so slower than scp or rsync.
>
> The guys at Parallels kindly agreed to implement storage pools support for
> ploop in libvirt and this work is already making progress. Work was also
> started in libvirt to improve the copy performance. These features will be
> available in a future release, so we will need to maintain the old ssh-based
> migration for libvirt as well as refactor and implement the storage pools
> based alternative.
>
> Work has started on refactoring the libvirt driver code but the following
> blueprints will be deferred beyond Mitaka:
>
> http://specs.openstack.org/openstack/nova-specs/specs/mitaka/approved/use-libvirt-storage-pools.html
>
> http://specs.openstack.org/openstack/nova-specs/specs/mitaka/approved/migrate-libvirt-volumes.html
>
> *Deprecate migration flags* - *done*
>
> There are a lot of migration flags used with libvirt that are either
> redundant or can be inferred from the deployed configuration. These are
> being deprecated and will be removed in the next cycle.
>
> See:
>
> https://review.openstack.org/#/q/project:openstack/nova+branch:master+topic:deprecate-migration-flags-config
>
> Feel free to respond with corrections or additions.
>
> Regards,
> Paul
>
> Paul Murray
> Technical Lead, HPE Cloud
> Hewlett Packard Enterprise
> +44 117 316 2527
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev