Hi Jay, Timofei,

Thank you for the info.

On 10/27/2015 08:02 AM, Jay Pipes wrote:
On 10/22/2015 11:13 AM, Tang Chen wrote:
On 10/22/2015 05:17 AM, Joshua Harlow wrote:
Overall I'm very much inclined to have three state machines (one
for each type), vs the mix-mash of all three into one state machine
(which causes the confusion around states in the first diagram in
that paste).

That is an idea. But I would prefer to have one single state machine
for migration, because resize and evacuate are reusing migration.
They can be in one state machine.

Evacuate does *not* migrate/move anything. Evacuate *rebuilds* VMs from their original source image.

Well, I just dug into the source code. I think evacuate means different things on the nova server side and on the client side. In nova compute, the evacuate API does call the rebuild process, as you said. But novaclient has a command, "nova host-evacuate-live", which live-migrates all running VMs, and that made me believe that evacuate also migrates VMs. Please refer to:

https://github.com/openstack/python-novaclient/blob/master/novaclient/v2/contrib/host_evacuate_live.py#L72

I think this is also one reason why I keep getting confused by all these concepts: cold-migrate, evacuate, evacuate-live, rebuild, resize.


About the migration type, I can see that Timofei has tried to split live-migration into 3 types:
1. block_live_migrate
2. live_migrate_file_level_storage
3. live_migrate_block_storage

I think that split is at the driver level, not the user level. It is based on the type of storage the VM is using. And I think migration type should be a multi-level thing.

Since I'm still a little confused by all the types of migration, I'd like to share my understanding; if it is correct, I think we can improve things like this.

1. OpenStack now supports resizing a VM to another compute node. If we set "allow_resize_to_same_host", it also supports local resize. If we are not using memory/CPU hotplug, resize results in a shutdown and reconfiguration of the VM. So there should be 2 types of resize: live (using hotplug) and cold (often resizing the primary disk).

2. Evacuate also has 2 types: live (equivalent to live-migrate) and cold (rebuild). But evacuate itself does nothing; I mean there is no actual process called evacuate. evacuate() is just an API that calls rebuild_instance().

This is from the user level.

So finally, the migration type would be like this:

  user              compute                      driver

  live-migrate
  live-evacuate     live-migrate
  live-resize                                    memory/CPU hotplug

  cold-migrate                                   storage type, etc
  cold-evacuate     cold-migrate
  cold-resize       (to self or not)

  rebuild           rebuild
                    (this is not a migration)

I mean maybe we should handle different things at different levels. In compute, if the flow is too complex, we can define some more helper functions to make the main flow easier to understand.
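
To make the layering concrete, here is a minimal Python sketch of one reading of the table above. It assumes that each group of three user-level operations collapses onto the compute-level flow listed beside it, and that rebuild stays separate. Every name in it (ComputeFlow, USER_TO_COMPUTE, compute_flow_for, is_migration) is hypothetical and does not exist in nova.

```python
from enum import Enum


class ComputeFlow(Enum):
    """Compute-level flows from the table (hypothetical names)."""
    LIVE_MIGRATE = "live-migrate"
    COLD_MIGRATE = "cold-migrate"
    REBUILD = "rebuild"  # listed in the table as "not a migration"


# Assumed mapping: user-level operation -> compute-level flow.
# This is an interpretation of the table, not existing nova behavior.
USER_TO_COMPUTE = {
    "live-migrate": ComputeFlow.LIVE_MIGRATE,
    "live-evacuate": ComputeFlow.LIVE_MIGRATE,
    "live-resize": ComputeFlow.LIVE_MIGRATE,
    "cold-migrate": ComputeFlow.COLD_MIGRATE,
    "cold-evacuate": ComputeFlow.COLD_MIGRATE,
    "cold-resize": ComputeFlow.COLD_MIGRATE,
    "rebuild": ComputeFlow.REBUILD,
}


def compute_flow_for(user_operation):
    """Return the compute-level flow that would serve a user-level operation."""
    try:
        return USER_TO_COMPUTE[user_operation]
    except KeyError:
        raise ValueError("unknown user-level operation: %r" % (user_operation,))


def is_migration(user_operation):
    """True for every operation except rebuild."""
    return compute_flow_for(user_operation) is not ComputeFlow.REBUILD
```

With a mapping like this, the compute layer would only need to implement three flows, and driver-level details such as hotplug or storage type would stay out of the user-facing names.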


I support Nikola in that I believe the different migration types should have different state machines entirely (but be as consistent as possible in the naming of terminal states like "finished" vs "done" etc)

OK. Agreed. And maybe also introduce state machines for task_state and vm_state.
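
As a rough illustration of separate state machines that still share terminal state names, here is a small Python sketch. The intermediate states and event names are invented for the example and are not taken from nova; only the idea of consistent terminal states ("finished", plus an assumed "failed") comes from the discussion. Neither machine has a state that waits for user input.

```python
class StateMachine:
    """Tiny FSM: a transition table plus the current state."""

    # Terminal states shared by every migration type (assumed names).
    TERMINAL = frozenset({"finished", "failed"})

    def __init__(self, name, initial, transitions):
        self.name = name
        self.state = initial
        self.transitions = transitions  # {state: {event: next_state}}

    def fire(self, event):
        """Apply an event, or raise ValueError if it is not allowed."""
        try:
            self.state = self.transitions[self.state][event]
        except KeyError:
            raise ValueError(
                "%s: event %r not allowed in state %r"
                % (self.name, event, self.state)
            )
        return self.state

    @property
    def done(self):
        return self.state in self.TERMINAL


def live_migration_fsm():
    # Hypothetical states for a live migration.
    return StateMachine("live-migration", "queued", {
        "queued": {"start": "preparing", "error": "failed"},
        "preparing": {"prepared": "running", "error": "failed"},
        "running": {"completed": "finished", "error": "failed"},
    })


def cold_migration_fsm():
    # Hypothetical states for a cold migration.
    return StateMachine("cold-migration", "queued", {
        "queued": {"start": "powering_off", "error": "failed"},
        "powering_off": {"powered_off": "copying_disk", "error": "failed"},
        "copying_disk": {"copied": "powering_on", "error": "failed"},
        "powering_on": {"completed": "finished", "error": "failed"},
    })
```

Because both tables end in the same terminal names, a caller can check `done` without knowing which migration type it is looking at.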


It would be very helpful if the designer of the migration process
could share his idea. But if it is just some code modified by many
people many times, I think we should remove the confusing states and
give an easier, better state machine.

There isn't a designer of the migration process :( The original (crap, IMHO) API from Rackspace Cloud Servers API was used for the resize functionality in the compute API and it's been a source of confusion and frustration ever since. Relying on a manual confirmation or revert input from the user was and continues to be a horrible idea.

Agreed.


I believe strongly that we should deprecate the existing migrate, resize, and live-migrate APIs in favor of a single consolidated, consistent "move" REST API that would have the following characteristics:

* No manual or wait-input states in any FSM graph

Yes.

* Removal of the term "resize" from the API entirely (the target resource sizing is an attribute of the move operation, not a different type of API operation in and of itself)

Maybe we can define it at a different level, as I said above. Not sure.

* Transition to a task-based API for poll-state requests. This means that in order for a caller to determine the state of a VM the caller would call something like GET /servers/<UUID>/tasks/<UUID> in order to see the history of state changes or subtask operations for a particular request to move a VM

Yes.
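
From the caller's side, polling such a task resource might look like the Python sketch below. The API does not exist yet, so everything here is an assumption: the response is taken to be a dict with a "state" key, the terminal state names are "finished" and "failed", and `fetch` stands in for whatever HTTP client performs the GET.

```python
import time

# Assumed terminal state names; the proposal does not define them.
TERMINAL_STATES = frozenset({"finished", "failed"})


def task_url(server_id, task_id):
    """Build the path from the proposal: /servers/<UUID>/tasks/<UUID>."""
    return "/servers/%s/tasks/%s" % (server_id, task_id)


def wait_for_task(fetch, server_id, task_id, interval=2.0, timeout=600.0):
    """Poll a task until it reaches a terminal state.

    `fetch` is any callable that takes a URL path and returns a dict
    with a "state" key (an assumed response shape). Returns the list of
    distinct states observed, in order.
    """
    deadline = time.monotonic() + timeout
    url = task_url(server_id, task_id)
    history = []
    while True:
        state = fetch(url)["state"]
        # Record a state only when it differs from the previous one.
        if not history or history[-1] != state:
            history.append(state)
        if state in TERMINAL_STATES:
            return history
        if time.monotonic() >= deadline:
            raise TimeoutError("task %s did not finish in time" % task_id)
        time.sleep(interval)
```

Since no state waits for user input, a loop like this always ends in a terminal state or a timeout, without any confirm or revert call from the user.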


Timofei Durakov (cc'd) has a blueprint for splitting the live-migration types into separate task classes here:

https://review.openstack.org/#/c/225910/

I think there's a lot of good ideas in that proposal. Please do have a look at it.

Thanks very much.


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev