On Fri, 07 Sep 2012 14:23:08 +0800, Shu Ming <shum...@linux.vnet.ibm.com> wrote:
> On 2012-9-7 13:21, M. Mohan Kumar wrote:
> > On Thu, 6 Sep 2012 18:59:19 -0400 (EDT), Ayal Baron <aba...@redhat.com> 
> > wrote:
> >>
> >> ----- Original Message -----
> >>> ----- Original Message -----
> >>>> From: "M. Mohan Kumar" <mo...@in.ibm.com>
> >>>> To: vdsm-devel@lists.fedorahosted.org
> >>>> Sent: Wednesday, July 25, 2012 1:26:15 PM
> >>>> Subject: [vdsm] [RFC] GlusterFS domain specific changes
> >>>>
> >>>>
> >>>> We are developing a GlusterFS server translator to export block
> >>>> devices
> >>>> as regular files to the client. Using block devices to serve VM
> >>>> images
> >>>> gives performance improvements, since it avoids some file system
> >>>> bottlenecks in the host kernel. Goal is to use one block device(ie
> >>>> file
> >>>> at the client side) per VM image and feed this file to QEMU to get
> >>>> the
> >>>> performance improvements. QEMU will talk to glusterfs server
> >>>> directly
> >>>> using libgfapi.
> >>>>
> >>>> Currently we support only exporting Volume groups and Logical
> >>>> Volumes. Logical volumes are exported as regular files to the
> >>>> client.
> >> Are you actually using LVM behind the scenes?
> >> If so, why bother with exposing the LVs as files and not raw block devices?
> >>
> > Ayal,
> >
> > The idea is to provide a FS interface for managing block devices. One
> > can mount the Block Device Gluster Volume and create a LV and size it
> > just by
> >   $ touch lv1
> >   $ truncate -s5G lv1
> >
> > And other file commands can be used to clone LVs, snapshot LVs
> >   $ ln lv1 lv2 # clones
> >   $ ln -s lv1 lv1.sn # creates snapshot
> Do we have a special reason to use "ln"?
> Why not use "cp" as the command to do the snapshot instead of "ln"?

cp opens the source file read-only, opens or creates the destination
file for writing, and then issues a series of reads on the source file,
writing each chunk to the destination until the end of the source file.

But we can't apply this to a logical volume copy (or clone): when we
create a logical volume we have to specify its size, and that's not
possible with the above approach. open/create does not take a size
parameter, so we can't create the destination LV with the required size.
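
To make the limitation concrete, here is a minimal sketch of the open/read/write loop described above, written in Python against ordinary files. The function name naive_copy and the buffer size are illustrative assumptions, not how any particular cp is implemented. Note that the create call takes only flags and a permission mode, so the destination's final size is unknown when it is created.

```python
import os


def naive_copy(src_path, dst_path, bufsize=64 * 1024):
    """Copy src to dst with a plain open/read/write loop (illustrative only)."""
    src = os.open(src_path, os.O_RDONLY)
    try:
        # Create/open the destination: flags and a permission mode only.
        # There is no argument through which a size could be passed.
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(src, bufsize)
                if not chunk:
                    break  # end of source file
                # os.write may write fewer bytes than requested, so loop.
                view = memoryview(chunk)
                while view:
                    written = os.write(dst, view)
                    view = view[written:]
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

On a regular file system the destination simply grows as data is written, which is exactly what a logical volume cannot do.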

But if I use the link interface to copy LVs, VFS/FUSE/GlusterFS provides
a link() interface that takes the source file and the destination file
name. In the BD xlator link() code, I get the size of the source LV,
create the destination LV with that size, and copy the contents.
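
The order of operations in that link() path can be sketched as follows. This is a Python stand-in that uses regular files in place of logical volumes; clone_with_known_size is a hypothetical name and none of this is the actual BD xlator code. The point is only that both names are known up front, so the destination can be created at its full size before any data is copied.

```python
import os
import shutil


def clone_with_known_size(src_path, dst_path):
    """Size-first clone: look up the source size, create the destination
    at that size, then copy the contents (files stand in for LVs)."""
    # Step 1: get the size of the source.
    size = os.stat(src_path).st_size

    # Step 2: create the destination at full size up front. For an LV this
    # would be the point where the size must be given at creation time.
    with open(dst_path, "wb") as dst:
        dst.truncate(size)

    # Step 3: copy the contents into the pre-sized destination.
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        shutil.copyfileobj(src, dst)

    return size
```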

This problem could be solved if we had a syscall copyfile(source, dest,
size). There have been discussions in the past about a copyfile()
interface, which could be used in this scenario.

> >
> > By enabling this feature GlusterFS can directly export storage in
> > a SAN. We are planning to add a feature to also export LUNs as regular
> > files in the future.
> IMO, the major feature of GlusterFS is to export distributed local disks
> to the clients.
> If we have a SAN in the backend, that means the storage block devices
> should be exported
> to clients naturally.  Why do we need GlusterFS to export the block
> devices in the SAN?

By enabling this feature we allow GlusterFS to work with local storage,
NAS storage and SAN storage, i.e. it allows machines that are not
directly connected to the SAN to access block devices from the SAN.

Also, providing block devices as VM disk images has some advantages:
 * it does not incur host-side filesystem overhead
 * if storage arrays provide storage offload features such as flashcopy,
   they can be exploited (these offloads are usually at the LUN level)
