On Mon, May 02, 2011 at 04:29:49PM -0500, Adam Litke wrote:
> After several long distractions, I am back to working on disk streaming.
> Before I hit the list with a new series of patches, I was hoping to
> reach some middle ground on the proposed streaming API.
>
> On Fri, 2011-04-08 at 14:31 +0100, Daniel P. Berrange wrote:
> > On Thu, Apr 07, 2011 at 04:31:59PM -0500, Adam Litke wrote:
> > > /*
> > > + * Disk Streaming
> > > + */
> > > +typedef enum {
> > > + VIR_STREAM_DISK_ONE = 1, /* Stream a single disk unit */
> > > + VIR_STREAM_DISK_START = 2, /* Stream the entire disk */
> > > + VIR_STREAM_DISK_STOP = 4, /* Stop streaming a disk */
> > > +} virDomainStreamDiskFlags;
> >
> > Using flags to combine two separate tasks into one single API
> > is rather unpleasant. As raised in the previous patch, the API
> > should also be taking a offset+length in bytes, then there is
> > no need for a special case transfer of an individual sector.
>
> This is a fair point. I will work with Stefan to support an
> offset/length qemu API. Since there doesn't seem to be a great way to
> query device sizes, I think we do need a convenient way to request that
> the entire disk be streamed. We could either do this with a flag or by
> overriding (offset==0 && length==0) to mean stream the entire device.
> Do you have a preference?
Since length==0 is otherwise meaningless, it is fine to use that
to indicate "until end of disk". This is consistent with
what we do for virStorageVolUpload/Download which allow length=0
to indicate "until end of disk".
> > > +#define VIR_STREAM_PATH_BUFLEN 1024
> > > +#define VIR_STREAM_DISK_MAX_STREAMS 10
> > > +
> > > +typedef struct _virStreamDiskState virStreamDiskState;
> > > +struct _virStreamDiskState {
> > > + char path[VIR_STREAM_PATH_BUFLEN];
> > > + /*
> > > + * The unit of measure for size and offset is unspecified. These
> > > fields
> > > + * are meant to indicate the progress of a continuous streaming
> > > operation.
> > > + */
> > > + unsigned long long offset; /* Current offset of active streaming */
> > > + unsigned long long size; /* Disk size */
> > > +};
> > > +typedef virStreamDiskState *virStreamDiskStatePtr;
> > > +
> > > +unsigned long long virDomainStreamDisk(virDomainPtr dom,
> > > + const char *path,
> > > + unsigned long long offset,
> > > + unsigned int flags);
> > > +
> > > +int virDomainStreamDiskInfo(virDomainPtr dom,
> > > + virStreamDiskStatePtr
> > > states,
> > > + unsigned int nr_states,
> > > + unsigned int flags);
> >
> > I would have liked it if we could use the existing JobInfo APIs for
> > getting progress information, but if we need to allow concurrent
> > usage for multiple disks per-VM, we can't. I think we should still
> > use a similar style of API though.
>
> The goal is to eventually support concurrent streams. Therefore, we
> will probably need to roll our own Status and Abort APIs (see my
> proposed API below).
>
> > There also doesn't appear to be a precise way to determine if the
> > copying of an entire disk failed part way through, and if so, how
> > much was actually copied.
> >
> > Taking all the previous points together, I think the API needs to
> > look like this:
> >
> > typedef enum {
> > /* If set, virDomainBlockAllocate() will return immediately
> > * allowing polling for operation completion & status
> > */
> > VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK,
> > } virDomainBlockAllocateFlags;
> >
> > /*
> > * @path: fully qualified filename of the virtual disk
>
> I probably misnamed it, but path is actually the device alias, not a
> path to an image file.
Hmm, I wonder whether that is a good choice or not. The other
APIs all use the disk path. Perhaps we could use that as default
and have a flag to indicate whether the path should be treated as
a device alias instead. Thus getting both options.
>
> > * @offset: logical position in bytes, within the virtual disk
> > * @length: amount of data to copy, in bytes starting from @offset
> > * @flags: One of virDomainBlockAllocateFlags, or zero
> > *
> > * Ensure the virtual disk @path is fully allocated
> > * between @offset and @offset+@length. If a backing
> > * store is present, data will be filled from the
> > * backing store, otherwise data will be fileld with
> > * zeros
> > *
> > * If @flags contains VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK,
> > * this API will return immediately after initiating the
> > * copy, otherwise it will block until copying is complete
> > *
> > */
> > int virDomainBlockAllocate(virDomainPtr dom,
> > const char *path,
> > unsigned long long offset,
> > unsigned long long length,
> > unsigned int flags);
> >
> >
> > /*
> > * @path: fully qualified filename of the virtual disk
> > * @info: allocated struct to return progress info
> > *
> > * Query the progress of a disk allocation job. This
> > * API must be used when virDomainBlockAllocate() was
> > * invoked with the VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK
> > * flag set.
> > *
> > * The @info.type field will indicate whether the job
> > * was completed successfully, or failed part way
> > * through.
> > *
> > * The @info data progress fields will contain current
> > * progress information.
> > *
> > * The hypervisor driver may optionally chose to also
> > * fillin a time estimate for completion.
> > */
> > int virDomainBlockGetJobInfo(virDomainPtr dom,
> > const char *path,
> > virDomainJobInfoPtr info);
> >
> > /*
> > * @path: fully qualified filename of the virtual disk
> > *
> > * Request that a disk allocation job be aborted at
> > * the soonest opportunity. This API can only be used
> > * when virDomainBlockAllocate() was invoked with the
> > * VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK flag set.
> > */
> > int virDomainBlockAbortJob(virDomainPtr dom,
> > const char *path);
> >
> >
> > typedef struct _virDomainBlockRegion virDomainBlockRegion;
> > typedef virDomainBlockRegion *virDomainBlockRegionPtr;
> > struct _virDomainBlockRegion {
> > /* The logical offset within the file of the allocated region */
> > unsigned long long offset;
> > /* The length of the allocated region */
> > unsigned long long length;
> > };
> >
> > /*
> > * @path: fully qualified filename of the virtual disk
> > * @nregions: filled in the number of @region structs
> > * @regions: filled with a list of allocated regions
> > *
> > * Query the extents of allocated regions within the
> > * virtual disk file. The offsets in the list of regions
> > * are not guarenteed to be sorted in any explicit order.
> > */
> > int virDomainBlockGetAllocationMap(virDomainPtr dom,
> > const char *path,
> > unsigned int *nregions,
> > virDomainBlockRegionPtr *regions);
>
> I am not convinced that we really need the block allocation map stuff.
> It's a fun toy, but the layout of data within a device is far too
> low-level of a concept to be exposing to users. In my opinion,
> streaming should really be done sequentially from the start of the
> device to the end (with arbitrary chunk sizes allowed for throttling
> purposes). If an application wants to stream according to some special
> pattern, let them maintain a data structure outside of libvirt to manage
> that extra complexity. It's not an error to stream a chunk that is
> already present. The populated areas will just complete immediately.
> Therefore, there isn't a need to maintain a strict record of outstanding
> regions.
Well we can always add something like this in at a later
date, so it is fine to drop it.
> > This takes care of things for running guests. It would be
> > desirable to have the same functionality available when a
> > guest is not running, via the virStorageVol APIs. Indeed,
> > this would allow access to the allocation functionality
> > for disks not explicitly associated with any VM yet.
>
> In light of previous discussion about the complexities around storage
> formats and filesystems, I am not sure how useful such an offline API is
> going to be in practice. At any rate, I would prefer to consider that
> issue separately.
>
> So, with my points taken into account I would like to counter with the
> following API proposal:
>
> ==== snip ====
>
> typedef enum {
> /* If set, virDomainBlockAllocate() will return immediately
> * allowing polling for operation completion & status
> */
> VIR_DOMAIN_DISK_STREAM_NONBLOCK,
> } virDomainBlockStreamFlags;
>
> /*
> * @device: alias of the target block device
> * @offset: logical position in bytes, within the virtual disk
> * @length: amount of data to copy, in bytes starting from @offset
> * @flags: One of virDomainBlockAllocateFlags, or zero
> *
> * Ensure the virtual disk @device is fully allocated
> * between @offset and @offset+@length. If a backing
> * store is present, data will be filled from the
> * backing store, otherwise data will be fileld with
> * zeros.
> *
> * If @flags contains VIR_DOMAIN_DISK_STREAM_NONBLOCK,
> * this API will return immediately after initiating the
> * copy, otherwise it will block until copying is complete
> *
> */
> int virDomainBlockStream(virDomainPtr dom,
> const char *device,
> unsigned long long offset,
> unsigned long long length,
> unsigned int flags);
>
> /*
> * @device: alias of the target block device
> *
> * Request that a disk streaming job be aborted at
> * the soonest opportunity. This API can only be used
> * when virDomainBlockStream() was invoked with the
> * VIR_DOMAIN_DISK_STREAM_NONBLOCK flag set.
> */
> int virDomainBlockStreamAbort(virDomainPtr dom,
> const char *device);
>
> typedef enum {
> VIR_BLOCK_STREAM_STATE_COMPLETED = 0,
> VIR_BLOCK_STREAM_STATE_ACTIVE = 1,
> VIR_BLOCK_STREAM_STATE_FAILED = 2,
> } virBlockStreamStatus;
>
> typedef struct _virBlockStreamInfo virBlockStreamInfo;
> struct _virBlockStreamInfo {
> virBlockStreamStatus status;
Coding guidelines don't allow use of enum types in structs
or as API parameters, instead requiring 'int'.
> unsigned long long remaining; /* number of bytes remaining */
> };
> typedef virBlockStreamInfo *virBlockStreamInfoPtr;
s/virBlock/virDomainBlock/ in several places here.
>
> /*
> * @device: alias of the target block device
> * @info: allocated struct to return progress info
> *
> * Query the progress of a disk streaming job. This
> * API must be used when virDomainBlockStream() was
> * invoked with the VIR_DOMAIN_DISK_STREAM_NONBLOCK
> * flag set.
> *
> */
> int virDomainBlockStreamInfo(virDomainPtr dom,
> const char *device,
> virBlockStreamInfoPtr info);
Any reason you didn't just use the existing 'virDomainJobInfoPtr'
struct, with this new API ? In particular I think we want many
of the other fields from that struct beyond just 'remaining'.
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
--
libvir-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/libvir-list