On Wed, 08 Aug 2012 21:51:50 -0400, Edison Su <[email protected]> wrote:

From: Mice Xia [mailto:[email protected]]
...
For following scenarios I need some suggestions:
a) 'vm snapshot, detach volume and attach it to another VM, rollback
snapshot',

This is not a problem. A volume and its (several) snapshots are described in the HYPERVISOR's native metadata file closely associated with the volume. If the hypervisor doesn't actually manage that relationship and the associated housekeeping tasks, CloudStack has no business trying to implement it. The option is simply not available to the user.

Cloud automation is NOT about inventing new things, it's about automating what is currently possible within the capabilities of the particular hypervisor and storage mixture at hand. Where cloud software goes wrong is thinking it's the master of everything. It's NOT, and never will be. All configuration and runtime state must be pulled from the actual devices (hypervisors, storage arrays, network gear, etc.) because things will change underneath it. It is unforgivably naive to assume there won't be outside influence. Of course we'd all like to pretend that the cloud knows all and that it is always authoritative, but if you write the software under that assumption, the users will be mighty ticked off when it's not the case and things break left and right.
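To make the "pull state from the device" point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `CloudRecord`, `reconcile`, and the volume names are made up for illustration and are not CloudStack classes or APIs. The only idea it shows is that the cloud's own record is treated as a cache that gets overwritten by whatever the hypervisor reports.

```python
from dataclasses import dataclass, field


@dataclass
class CloudRecord:
    """What the cloud layer last believed about one VM (a cache, not the truth)."""
    vm_name: str
    volumes: set = field(default_factory=set)


def reconcile(cached, reported_volumes):
    """Overwrite the cached view with what the hypervisor reports.

    Returns (added, removed) so the caller can log out-of-band changes.
    """
    added = reported_volumes - cached.volumes
    removed = cached.volumes - reported_volumes
    # The device is authoritative: replace the cache, never "correct" the device.
    cached.volumes = set(reported_volumes)
    return added, removed


# Hypothetical scenario: an admin detached vol-b and attached vol-c by hand.
record = CloudRecord("vm-01", {"vol-a", "vol-b"})
added, removed = reconcile(record, {"vol-a", "vol-c"})
print(sorted(added), sorted(removed))
```

In a real deployment the `reported_volumes` set would come from a query against the hypervisor; that call is deliberately left out here because it differs per hypervisor.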

b) 'vm snapshot, detach and destroy volume, rollback snapshot',

That's not a valid operation. A snapshot is ALWAYS associated with its parent 'disk'. If you delete the parent, all snapshots are deleted with it.

Three candidate solutions that I can figure out now,
1) disallow detach volumes if specified VM has VM snapshots.

uh, no.

2) allow to snapshot/rollback, for a), this will result in two volumes,
one attached to anther VM, one attached to VM that rollback from
snapshot; for

This is a matter for the hypervisor or storage array. AFAIK ESX doesn't let you do this even if the parent disk is marked RO, since it puts an exclusive lock on the volume. Storage arrays pull this off by thin-cloning the source so the VM thinks it has its own disk even when it didn't start out that way.

Again, CloudStack isn't about doing storage trickery. It's about making the proper calls to the hypervisor or (maybe?) the storage array to do such fancy cloning. So if you want to provision a VM and attach a particular volume/snapshot sequence that isn't already locked for use, then no problem; treat it like you own it outright and launch. Any successive creation/destruction of snapshots is the purview of that one active VM and doesn't require any fancy footwork. If, on the other hand, the source volume+snapshot is already open for use, then you have to ask the storage provider for a thick or thin clone, and you lose the ability to go back in time without trashing the private copy and re-attaching/re-cloning to the original.
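The two cases above boil down to a single branch. A rough sketch in Python, with made-up names (`AttachPlan`, `plan_attach`, and the action strings are illustrative assumptions, not anything CloudStack or a hypervisor exposes):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttachPlan:
    action: str                    # what to ask the hypervisor/storage to do
    keeps_snapshot_history: bool   # can the VM still roll back on the original chain?


def plan_attach(chain_locked: bool) -> AttachPlan:
    """Decide how to give a VM a volume+snapshot chain.

    chain_locked is assumed to come from asking the hypervisor whether
    another VM already holds the chain open.
    """
    if not chain_locked:
        # Nobody else holds it: use it as-is; snapshots stay with this VM.
        return AttachPlan("attach_direct", keeps_snapshot_history=True)
    # Already in use: request a thick or thin clone and give up rollback
    # on the original chain.
    return AttachPlan("request_clone", keeps_snapshot_history=False)


free_plan = plan_attach(chain_locked=False)
busy_plan = plan_attach(chain_locked=True)
print(free_plan.action, busy_plan.action)
```

Note that the sketch only decides what to request. The cloning itself is still left to the hypervisor or storage provider.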

The use case for cloud is 99.95% "dumb". Let's not complicate the situation unnecessarily. If qcow/rbd/lvm can't do it with their standard command set, then it's not available. If you're using 3par/emc/netapp as the storage and they do have the proper calls, then you can permit such things. Actually, let me rephrase that. Any fancy cloning or snapshotting calls MUST be sent to the hypervisor only!! It is up to the hypervisor to issue qcow/rbd/lvm/netapp/emc commands to get it done. Why? Hypervisor vendors actually test this stuff and have the resources and incentive to make sure it works. Cloud software again has no business trying to take over the hypervisor's role. If KVM/Xen are deficient in some aspect, then fix the hypervisor; do not try to use cloud software to band-aid the matter.

This early in the CloudStack lifecycle, complicated or exotic operations should probably require deliberate manual involvement anyhow.

For VM based snapshot, we should not allow user to dynamically change(attach or detach) VM's disks if there are VM-based snapshot taken on this VM.

Again, no. It is not up to the cloud to manage/track volume/snapshot state. It instead queries the hypervisor for that information.


It seems to me (admittedly a very new member here) that there is enormous feature creep being introduced. Cloud automation is really fairly straightforward.

for any hypervisor
  where is it (IP, logical/physical location)
  what flavor (KVM, Xen, ESX) and version - tells us which command set to use
  what network gear is attached and what ports need to be modified to add/drop VLAN membership at the switch level
  what is its utilization

for any storage provider
  how to interface with it
  what storage containers are defined
  to which hypervisors should they be attached, and attachment parameters

for any VM
  to which account does it belong
  which disks does it need and on what storage container to place them
  where to find the master disk templates (if any)
  what network bridge/portgroups it needs to have interfaces on
  what MAC to assign the various interfaces (or have Hypervisor auto-assign)
  what VLANs to assign to interface or interface aliases
  what IP address to assign to the interfaces
  are there any locality or performance factors to influence selection of deployment destination

Basically all the cloud does is dynamically create what amounts to a VMware .vmx file or a libvirt instantiation script and run it against the chosen hypervisor host. If needed, it also preps the hypervisor with network underpinnings and attaches appropriate storage in order to run the guest. Yes, I realize I simplified a bit, but there isn't a whole lot more to it.
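As an illustration of "dynamically create a libvirt instantiation script", the sketch below builds a bare-bones libvirt domain XML from a few of the per-VM answers listed above (disk, bridge, MAC). It uses only Python's standard `xml.etree.ElementTree`. The function name `build_domain_xml`, its parameters, and the sample values are assumptions made for this example; the output is a minimal skeleton that has not been validated against a running libvirt, and a real definition would need more elements.

```python
import xml.etree.ElementTree as ET


def build_domain_xml(name, memory_mib, vcpus, disk_path, bridge, mac=None):
    """Return a minimal libvirt-style <domain> definition as a string."""
    domain = ET.Element("domain", type="kvm")
    ET.SubElement(domain, "name").text = name
    ET.SubElement(domain, "memory", unit="MiB").text = str(memory_mib)
    ET.SubElement(domain, "vcpu").text = str(vcpus)
    os_el = ET.SubElement(domain, "os")
    ET.SubElement(os_el, "type", arch="x86_64").text = "hvm"

    devices = ET.SubElement(domain, "devices")

    # One qcow2-backed disk on the chosen storage container.
    disk = ET.SubElement(devices, "disk", type="file", device="disk")
    ET.SubElement(disk, "driver", name="qemu", type="qcow2")
    ET.SubElement(disk, "source", file=disk_path)
    ET.SubElement(disk, "target", dev="vda", bus="virtio")

    # One NIC on the requested bridge; omit <mac> to let the hypervisor pick.
    nic = ET.SubElement(devices, "interface", type="bridge")
    if mac is not None:
        ET.SubElement(nic, "mac", address=mac)
    ET.SubElement(nic, "source", bridge=bridge)

    return ET.tostring(domain, encoding="unicode")


# Hypothetical sample values.
xml_text = build_domain_xml(
    name="guest-01",
    memory_mib=2048,
    vcpus=2,
    disk_path="/var/lib/libvirt/images/guest-01.qcow2",
    bridge="br0",
    mac="52:54:00:12:34:56",
)
print(xml_text)
```

The resulting string is what would then be handed to the chosen hypervisor host; the hand-off itself is not shown.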
