First, let me apologize for the confusing terms, because some words here are overloaded:

A volume…
In CloudStack terms, it is a disk attached to a VM.
In NetApp terms, it is an NFS volume, analogous to CloudStack primary storage, where all the CloudStack volumes are stored.

A snapshot…
In CloudStack terms, it is a backup of a VM.
In NetApp terms, it is a point-in-time copy of all the contents of a NetApp volume, which is used to create an analogous CloudStack snapshot for (up to) every CloudStack volume on that primary storage.

There are several reasons that an API for snapshotting multiple volumes is more attractive to us than calling a single-volume API over and over. A lot of it has to do with how we actually create the snapshots. Unlike a hypervisor snapshot, when we create a VM snapshot, the entire primary storage is backed up (but only the requested volume has an entry added to the DB). On top of this, our hardware has a hard limit of 255 storage-volume-level snapshots. So, if there were 255 VMs on a single primary storage and each one of them performed a backup, no more backups could be taken until we started removing the oldest ones (without some trickery that we are currently working on).

Some might say a solution to this would be queueing the requests and waiting until they're all finished, but that seems much more error-prone and hackish compared to simply allowing multiple VM volumes to be specified. This is a request both for optimizing the backend and for improving the experience for users. What happens when a user says they want to back up 30 VM volumes at the same time? Is it not a cleaner experience to simply select all the volumes they want to back up, then click backup once? This way, the storage provider is given all the volumes at once, and if they have some way of optimizing the request based on their hardware or software, they can take advantage of that. It can even be designed in such a way that if storage providers don't want to be given all the volumes at once, they can be called with each one individually, so as to remain backwards compatible.

Now, I'm also not saying that these two solutions can't co-exist.
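
To make the backwards-compatible idea a bit more concrete, here is a rough sketch in Java. Everything in it is made up for illustration: SnapshotDriver, PerVolumeDriver, BatchingDriver and the String-based signatures are hypothetical and are not the existing Storage Subsystem API, and it assumes a Java version with default interface methods. The multi-volume call falls back to the single-volume call unless a provider overrides it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical driver interface (illustrative only, not the real CloudStack API).
interface SnapshotDriver {
    // Existing-style call: snapshot a single VM volume, return a snapshot id.
    String takeSnapshot(String volumeId);

    // Proposed-style call: by default, fall back to one call per volume,
    // so providers that do nothing special keep working unchanged.
    default List<String> takeSnapshots(List<String> volumeIds) {
        List<String> snapshotIds = new ArrayList<>();
        for (String volumeId : volumeIds) {
            snapshotIds.add(takeSnapshot(volumeId));
        }
        return snapshotIds;
    }
}

// A provider that does not care about batching: it inherits the fallback.
class PerVolumeDriver implements SnapshotDriver {
    int backendCalls = 0;

    @Override
    public String takeSnapshot(String volumeId) {
        backendCalls++; // one backend operation per VM volume
        return "snap-" + volumeId;
    }
}

// A provider whose backend snapshots the whole storage volume in one go.
class BatchingDriver implements SnapshotDriver {
    int backendCalls = 0;

    @Override
    public String takeSnapshot(String volumeId) {
        // A single request is just a batch of one.
        return takeSnapshots(Collections.singletonList(volumeId)).get(0);
    }

    @Override
    public List<String> takeSnapshots(List<String> volumeIds) {
        backendCalls++; // one storage-level snapshot covers every requested volume
        List<String> snapshotIds = new ArrayList<>();
        for (String volumeId : volumeIds) {
            snapshotIds.add("snap-" + volumeId);
        }
        return snapshotIds;
    }
}

class BatchSnapshotDemo {
    public static void main(String[] args) {
        List<String> volumes = Arrays.asList("vol-1", "vol-2", "vol-3");

        PerVolumeDriver perVolume = new PerVolumeDriver();
        perVolume.takeSnapshots(volumes);

        BatchingDriver batching = new BatchingDriver();
        batching.takeSnapshots(volumes);

        System.out.println("per-volume backend calls: " + perVolume.backendCalls);
        System.out.println("batching backend calls: " + batching.backendCalls);
    }
}
```

With three volumes, the fallback provider performs three backend operations while the batching provider performs one, which is the saving that matters to us given the 255-snapshot limit.
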

Even if we have the ability to back up multiple volumes at once, nothing is stopping users from backing them up one by one, so queueing is still something we may have to implement. However, I think extending the subsystem API to grant storage providers the ability to leverage any optimization they can without having to queue is a cleaner solution. If the concern is how users interpret what is going on in the backend, I think we can find some way to make that clear to them.

-Chris
--
Chris Suich
chris.su...@netapp.com
NetApp Software Engineer
Data Center Platforms – Cloud Solutions Citrix, Cisco & Red Hat

On Sep 18, 2013, at 12:26 PM, Alex Huang <alex.hu...@citrix.com> wrote:

> That's my read on the proposal also but, Chris, please clarify. I don't
> think the end user will see the change. It's an optimization for interfacing
> with the storage backend.
>
> --Alex
>
>> -----Original Message-----
>> From: Marcus Sorensen [mailto:shadow...@gmail.com]
>> Sent: Wednesday, September 18, 2013 9:22 AM
>> To: dev@cloudstack.apache.org
>> Subject: Re: [PROPOSAL] Storage Subsystem API Interface Additions
>>
>> Perhaps he needs to elaborate on the use case and what he means by more
>> efficient. He may be referring to multiple volumes in the sense of
>> snapshotting the ROOT disks for 10 different VMs.
>>
>> On Wed, Sep 18, 2013 at 10:10 AM, Darren Shepherd
>> <darren.s.sheph...@gmail.com> wrote:
>>> Here's my general concern about multiple volume snapshots at once.
>>> Giving such a feature leads the user to believe that snapshotting
>>> multiple volumes at once will give them consistency across the volumes
>>> in the snapshot.
>>> This is not true, and difficult to do with many hypervisors, and
>>> typically requires an agent in the VM. A single snapshot, as exists
>>> today, is really crash consistent, meaning that there may exist
>>> unsync'd data. To do a true multi volume snapshot requires a "quiesce"
>>> functionality in the VM.
>>> So you do pause I/O queues, fsync, fsync, snapshot, snapshot, unpause I/O.
>>>
>>> I might be fine with the option of allowing multiple volumeId's to
>>> be specified in the snapshot API, but it needs to be clear that those
>>> snapshots may be taken sequentially and they are all independently
>>> crash consistent. But, if you make that clear, then why even have the API?
>>> Essentially it is the same as doing multiple snapshot API commands.
>>>
>>> So really I would lean towards having the multiple snapshotting
>>> supported in the driver or storage subsystem, but not exposed to the
>>> user. You can easily accomplish it by having a timed window on
>>> snapshotting. So every 10 seconds you do snapshots; if 5 requests
>>> have queued in the last 10 seconds, you do them all at once. This could
>>> be implemented as a framework thing.
>>> If your provider implements "SnapshotBatching" interface and that has
>>> a getBatchWindowTime(), then the framework can detect that it should
>>> try to queue up some snapshot requests and send them to the driver in
>>> a batch. Or that could be implemented in the driver itself. I would
>>> lean toward doing it in the driver and if that goes well, we look at
>>> pulling the functionality into core ACS.
>>>
>>> Darren
>>>
>>>
>>> On Wed, Sep 18, 2013 at 5:22 AM, SuichII, Christopher <
>>> chris.su...@netapp.com> wrote:
>>>
>>>> I would like to raise for discussion the idea of adding a couple
>>>> methods to the Storage Subsystem API interface. Currently,
>>>> takeSnapshot() and
>>>> revertSnapshot() only support single VM volumes. We have a use case
>>>> for snapshotting multiple VM volumes at the same time. For us, it is
>>>> more efficient to snapshot them all at once rather than snapshot VM
>>>> Volumes individually and this seems like a more elegant solution than
>>>> queueing the requests within our plugin.
>>>>
>>>> Based on my investigation, this should require:
>>>> -Two additional APIs to be invoked from the UI
>>>> -Two additional methods added to the Storage Subsystem API interface
>>>> -Changes in between the API level and invoking the Storage Subsystem
>>>> API implementations (I know this is broad and vague), mainly around
>>>> the SnapshotManager/Impl
>>>>
>>>> There are a couple topics we would like discussion on:
>>>> -Would this be beneficial/detrimental/neutral to other storage providers?
>>>> -How should we handle the addition of new methods to the Storage
>>>> Subsystem API interface? Default them to throw an
>>>> UnsupportedOperationException?
>>>> Default to calling the single VM volume version multiple times?
>>>> -Does anyone see any issues with allowing multiple snapshots to be
>>>> taken at the same time or letting storage providers have a list of
>>>> all the requested volumes to back up?
>>>>
>>>> Please let me know if I've missed any major topics for discussion or
>>>> if anything needs clarification.
>>>>
>>>> Thanks,
>>>> Chris
>>>> --
>>>> Chris Suich
>>>> chris.su...@netapp.com
>>>> NetApp Software Engineer
>>>> Data Center Platforms - Cloud Solutions Citrix, Cisco & Red Hat
>>>>