> -----Original Message-----
> From: Wido den Hollander [mailto:[email protected]]
> Sent: Friday, November 09, 2012 6:33 AM
> To: [email protected]
> Cc: Edison Su; Marcus Sorensen ([email protected]); 'Umasankar
> Mukkara'; 'John Burwell'
> Subject: Re: Update on the storage refactor
>
> Hi Edison,
>
> Thank you for the update!
>
> On 11/08/2012 10:05 PM, Edison Su wrote:
> > Hi All,
> > I sent out a storage refactor RFC a few months ago
> > (http://markmail.org/message/6sdig3hwt5puyvsc), but only started coding
> > in the last few weeks.
>
> I just checked out the javelin branch. Could you try to make the commits a
> bit more descriptive, or maybe rebase locally to squash multiple commits
> into one before pushing? It's kind of hard to follow.
>
> Nevertheless, great work!
>
> > Here is my latest proposal
> > (https://cwiki.apache.org/confluence/display/CLOUDSTACK/Storage+subsystem+2.0),
> > and the sample code is at
> > engine/storage/src/org/apache/cloudstack/storage/volume/,
> > engine/storage/src/org/apache/cloudstack/storage/datastore/ and
> > engine/storage/src/org/apache/cloudstack/storage/image/ on the javelin
> > branch.
> >
> > The idea is very simple: delegate the logic to different components with
> > different identities. I'll try my best to get rid of the monolithic
> > managers, which are hard to extend.
> >
> > Feedback and comments are welcome.
>
> Is the API on the wiki definitive? It seems the SnapshotService is missing:
> - List
> - Destroy
Thanks, I'll add them into the code.

> Apart from that:
>
> "BackupService will manage the snapshot on backup storage (s3/nfs etc.)
>
> BackupMigrationService will handle how to move snapshots from primary
> storage to secondary storage, or vice versa."
>
> Does that mean we will have "Backup Storage" AND "Secondary Storage"?

To me, the term "secondary storage" is too generalized. It has too many responsibilities, which makes the code more and more complicated. In general, there are three main services:

Image service: where to get and create templates and ISOs
Volume service: where to get and create volumes
Backup service: where to get and put backups

The three "where"s can operate on different physical storages, or they can share the same physical storage. The services themselves should not know about each other, even if they use the same physical storage. Take Ceph as an example: I think it can be used as both backup storage and volume storage, while swift/s3/nfs can be used as both backup storage and image storage. The point here is that one storage can have different roles (backup, image or volume); how CloudStack uses the storage depends on how the admin configures it.

The model I am trying to build has three parts: datastore provider, datastore, and driver.

Datastore provider is responsible for the life cycle of a particular storage type or storage vendor.

Datastore represents a physical storage. It has three subclasses: primarydatastore, imagedatastore and backupdatastore. Each subclass has a different API.

Driver represents the code dealing with the actual storage device; it may talk to the storage device directly, or just send a command to the resource.

In a system, there can be many datastore providers. One physical storage can be added into the system under one datastore provider, configured with multiple roles, and attached to one scope (either per zone, per pod, per cluster or per host).
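To make the split concrete, here is a minimal sketch of the provider/datastore/driver layering described above. All class and method names here are illustrative stand-ins, not the actual javelin-branch code; the Ceph store carrying two roles mirrors the example in the text.

```java
import java.util.HashMap;
import java.util.Map;

public class DataStoreModel {
    // Driver: the only layer that talks to the actual storage device
    // (or sends a command to a resource).
    interface DataStoreDriver {
        String execute(String command);
    }

    // One datastore subtype per role; each service only sees the role it needs.
    interface PrimaryDataStore {
        String createVolume(String name);
    }
    interface BackupDataStore {
        String putBackup(String volumeName);
    }

    // A single physical store (e.g. Ceph) can carry both roles, but the
    // volume service and backup service never learn about each other.
    static class CephStore implements PrimaryDataStore, BackupDataStore {
        private final DataStoreDriver driver = cmd -> "rbd:" + cmd; // stub driver
        public String createVolume(String name)     { return driver.execute("create " + name); }
        public String putBackup(String volumeName)  { return driver.execute("backup " + volumeName); }
    }

    // Provider: owns the life cycle of one storage type/vendor and hands out
    // role-typed views of a physical store.
    static class CephProvider {
        private final Map<Long, CephStore> stores = new HashMap<>();
        long addStore() { long id = stores.size() + 1; stores.put(id, new CephStore()); return id; }
        PrimaryDataStore getPrimaryDataStore(long id) { return stores.get(id); }
        BackupDataStore  getBackupDataStore(long id)  { return stores.get(id); }
    }

    public static void main(String[] args) {
        CephProvider provider = new CephProvider();
        long id = provider.addStore();
        // Two services use the same physical store through different role APIs.
        System.out.println(provider.getPrimaryDataStore(id).createVolume("vol-1"));
        System.out.println(provider.getBackupDataStore(id).putBackup("vol-1"));
    }
}
```

The key design point the sketch tries to show: the role interfaces, not the concrete store, are what the services depend on, so adding a new vendor means adding a provider, not touching the services.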
When a service wants to use a data store, it will first ask the data store provider for a datastore object (either primarydatastore, imagedatastore, or backupdatastore), then call that object's API to do the actual work. The sample code is in the *.datastore* package.

For backward compatibility, "add secondary storage" can be implemented as "add image storage" and "add backup storage" together, which is the case where one storage has two roles. By separating the roles from the underlying storage, we can give admins more flexibility in building their systems.

What do you think?

> Imho it should be enough to have just Secondary Storage, but we should
> support both NFS, CIFS and object storage (S3, Swift, Ceph, etc, etc) as
> secondary storage.
>
> Since we are actually only storing some objects with metadata, that
> shouldn't be a problem.
>
> On the KVM platform we are now using Qemu to convert images, and that
> requires the destination to be a file, but we can always work around that,
> I think.
>
> Imho the requirement to have an actual filesystem as Secondary Storage
> should be gone.
>
> NFS has its issues; think about an NFS server dying while a qemu-img
> process is running. It will go into status D and will block until the NFS
> comes back.
>
> Wido
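As a footnote to the backward-compatibility point above, the "one storage, two roles" mapping could be sketched like this. Again, the names are hypothetical, not real CloudStack classes:

```java
import java.util.ArrayList;
import java.util.List;

public class SecondaryStorageCompat {
    enum Role { IMAGE, BACKUP }

    // One physical store with an admin-configured set of roles.
    static class Store {
        final String url;
        final List<Role> roles = new ArrayList<>();
        Store(String url) { this.url = url; }
    }

    // Legacy "add secondary storage" becomes adding the image and
    // backup roles together on a single physical store.
    static Store addSecondaryStorage(String url) {
        Store s = new Store(url);
        s.roles.add(Role.IMAGE);
        s.roles.add(Role.BACKUP);
        return s;
    }

    public static void main(String[] args) {
        Store s = addSecondaryStorage("nfs://host/export");
        System.out.println(s.url + " roles=" + s.roles);
    }
}
```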
