Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-06-15 Thread Marcelo Tosatti
On Tue, Jun 07, 2011 at 01:15:02PM +0100, Stefan Hajnoczi wrote:
> On Mon, Jun 6, 2011 at 5:55 PM, Marcelo Tosatti  wrote:
> 
> I haven't reviewed this whole patch yet, but comments below.
> 
> This patch, like image streaming, may hit deadlocks due to synchronous
> I/O emulation.  I discovered this problem when working on image
> streaming and it should be solved by getting rid of the asynchronous
> context concept.  The problem is that async I/O emulation will push a
> new context, preventing existing requests from completing until the
> current context is popped again.  If the image format has dependencies
> between requests (e.g. QED allocating writes are serialized), then
> this leads to deadlock because the new request cannot complete until
> the old one does, but the old one needs to wait for the context to be
> popped.  I think you are not affected by the QED allocating write case
> since the source image is only read, not written, by live block copy.
> But you might encounter this problem in other places.

I see. This should be fixed in the context push/pop logic (or something
equivalent), as you mention.

Fixed other comments, thanks.




Re: [Qemu-devel] [patch 6/7] QEMU live block copy (update)

2011-06-15 Thread Marcelo Tosatti
On Tue, Jun 07, 2011 at 12:15:55PM +0200, Jiri Denemark wrote:
> On Mon, Jun 06, 2011 at 14:03:44 -0300, Marcelo Tosatti wrote:
> ...
> > +  "return":[
> > +{"device":"ide1-hd0",
> > +"status":"active",
> > +"info":{
> > +   "percentage":23,
> > +}
> > +},
> 
> What about using the same form of progress reporting as used by query-migrate?
> That is, instead of
> 
> "info":{
> "percentage":23
> }
> 
> the following would be reported:
> 
> "disk":{
> "transferred":123,
> "remaining":123,
> "total":246
> }
> 
> One can trivially compute percentage from that but it's impossible to get this
> kind of data when only percentage is reported. And total can even change in
> time if needed (just like it changes during migration).

Done.

> > +{"device":"ide1-hd1",
> > + "status":"failed"
> > +}
> 
> Is there any way we can get the exact error which made it fail, such as EPERM
> or ENOSPC?

The errors that can generate "failed" here are internal to QEMU, so I
don't think exporting them to the management application makes sense. These
are errors such as EIO from the file system.

Error processing should be done in the block_copy command, that is where
checks of the destination image are performed.




Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-06-09 Thread Jagane Sundar



>> Hello Stefan,
>>
>> Can you expand on this some more? I have similar concerns for Livebackup.
>>
>> At the beginning of your paragraph, did you mean 'asynchronous I/O
>> emulation' instead of 'synchronous I/O emulation'?
>>
>> Also, I don't understand the 'stack' construct that you refer to. When you
>> say 'push a new context', are you talking about what happens when a new
>> thread picks up a new async I/O req from the VM, and then proceeds to
>> execute the I/O req? What is this stack that you refer to?
>>
>> Any design documents, code snippets that I can look at, or other pointers
>> are welcome.
>
> See async.c.

Thanks - I will do so.

> There is synchronous I/O emulation in block.c for BlockDrivers that
> don't support .bdrv_read()/.bdrv_write() but only
> .bdrv_aio_readv()/.bdrv_aio_writev().  The way it works is that it
> pushes a new I/O context and then issues async I/O.  Then it runs a
> special event loop waiting for that I/O to complete.  After the I/O
> completes it pops the context again.

OK. This is the opposite of what I was thinking of. I was considering
the code that emulates async I/O using multiple threads.

It sounds like the goal of this async.c mechanism is more than just
serializing all synchronous I/O requests, right?

> The point of the context is that completions only get called for the
> current context.  Therefore callers of the synchronous I/O functions
> don't need to worry that the state of the world might change during
> their "synchronous" operation - only their own I/O operation can
> complete.  If a pending async I/O completes during synchronous I/O
> emulation, its callback is not invoked until the bottom half (BH) is
> called after the async context is popped.  This guarantees that the
> synchronous operation and its caller have completed before I/O
> completion callbacks are invoked for pending async I/O.

OK. This might not be a problem for Livebackup, after all.
Livebackup transparently interposes on the async I/O offered by the driver.
Hence the async.c mechanism is layered neatly above the
async_block_driver+livebackup layer and should work correctly.

I will take a closer look at this, and check for sanity...


Thanks,
Jagane




Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-06-08 Thread Stefan Hajnoczi
On Wed, Jun 8, 2011 at 4:10 PM, Jagane Sundar  wrote:
> On 6/7/2011 5:15 AM, Stefan Hajnoczi wrote:
>>
>> On Mon, Jun 6, 2011 at 5:55 PM, Marcelo Tosatti
>>  wrote:
>>
>> I haven't reviewed this whole patch yet, but comments below.
>>
>> This patch, like image streaming, may hit deadlocks due to synchronous
>> I/O emulation.  I discovered this problem when working on image
>> streaming and it should be solved by getting rid of the asynchronous
>> context concept.  The problem is that async I/O emulation will push a
> new context, preventing existing requests from completing until the
>> current context is popped again.  If the image format has dependencies
>> between requests (e.g. QED allocating writes are serialized), then
>> this leads to deadlock because the new request cannot complete until
>> the old one does, but the old one needs to wait for the context to be
>> popped.  I think you are not affected by the QED allocating write case
>> since the source image is only read, not written, by live block copy.
>> But you might encounter this problem in other places.
>>
> Hello Stefan,
>
> Can you expand on this some more? I have similar concerns for Livebackup.
>
> At the beginning of your paragraph, did you mean 'asynchronous I/O
> emulation' instead of 'synchronous I/O emulation'?
>
> Also, I don't understand the 'stack' construct that you refer to. When you
> say 'push a new context', are you talking about what happens when a new
> thread picks up a new async I/O req from the VM, and then proceeds to
> execute the I/O req? What is this stack that you refer to?
>
> Any design documents, code snippets that I can look at, or other pointers
> are welcome.

See async.c.

There is synchronous I/O emulation in block.c for BlockDrivers that
don't support .bdrv_read()/.bdrv_write() but only
.bdrv_aio_readv()/.bdrv_aio_writev().  The way it works is that it
pushes a new I/O context and then issues async I/O.  Then it runs a
special event loop waiting for that I/O to complete.  After the I/O
completes it pops the context again.
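
Roughly, the pattern looks like this (a sketch, not the literal block.c
code; async_context_push()/async_context_pop() live in async.c and
qemu_aio_wait() drives the nested event loop):

typedef struct {
    int done;
    int ret;
} SyncIOCB;

static void sync_read_cb(void *opaque, int ret)
{
    SyncIOCB *s = opaque;
    s->ret = ret;
    s->done = 1;    /* unblocks the waiting loop below */
}

static int emulated_bdrv_read(BlockDriverState *bs, int64_t sector_num,
                              uint8_t *buf, int nb_sectors)
{
    SyncIOCB sync = { .done = 0 };
    struct iovec iov = {
        .iov_base = buf,
        .iov_len  = nb_sectors * BDRV_SECTOR_SIZE,
    };
    QEMUIOVector qiov;

    qemu_iovec_init_external(&qiov, &iov, 1);

    async_context_push();       /* completions of pre-existing requests
                                   are deferred from here on */
    bdrv_aio_readv(bs, sector_num, &qiov, nb_sectors, sync_read_cb, &sync);
    while (!sync.done) {
        qemu_aio_wait();        /* the "special event loop": only this
                                   context's completions can fire */
    }
    async_context_pop();        /* deferred completions run later via a
                                   bottom half */
    return sync.ret;            /* NULL-aiocb error handling omitted */
}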

The point of the context is that completions only get called for the
current context.  Therefore callers of the synchronous I/O functions
don't need to worry that the state of the world might change during
their "synchronous" operation - only their own I/O operation can
complete.  If a pending async I/O completes during synchronous I/O
emulation, its callback is not invoked until the bottom half (BH) is
called after the async context is popped.  This guarantees that the
synchronous operation and its caller have completed before I/O
completion callbacks are invoked for pending async I/O.

The problem is that QED allocating writes are serialized and cannot
make progress if a pending request is unable to complete.  Preventing
the pending request from completing deadlocks QEMU.  The same thing
could happen in other places.

I'm bringing this up in case anyone hits such a deadlock.  I think we
can solve this in the future by eliminating the async context concept.
Kevin and I have discussed how that can be done, but it is not yet
possible, or at least not trivial, to do.

Stefan



Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-06-08 Thread Jagane Sundar

On 6/7/2011 5:15 AM, Stefan Hajnoczi wrote:

> On Mon, Jun 6, 2011 at 5:55 PM, Marcelo Tosatti  wrote:
>
> I haven't reviewed this whole patch yet, but comments below.
>
> This patch, like image streaming, may hit deadlocks due to synchronous
> I/O emulation.  I discovered this problem when working on image
> streaming and it should be solved by getting rid of the asynchronous
> context concept.  The problem is that async I/O emulation will push a
> new context, preventing existing requests from completing until the
> current context is popped again.  If the image format has dependencies
> between requests (e.g. QED allocating writes are serialized), then
> this leads to deadlock because the new request cannot complete until
> the old one does, but the old one needs to wait for the context to be
> popped.  I think you are not affected by the QED allocating write case
> since the source image is only read, not written, by live block copy.
> But you might encounter this problem in other places.

Hello Stefan,

Can you expand on this some more? I have similar concerns for Livebackup.

At the beginning of your paragraph, did you mean 'asynchronous I/O
emulation' instead of 'synchronous I/O emulation'?

Also, I don't understand the 'stack' construct that you refer to. When
you say 'push a new context', are you talking about what happens when a
new thread picks up a new async I/O req from the VM, and then proceeds
to execute the I/O req? What is this stack that you refer to?

Any design documents, code snippets that I can look at, or other pointers
are welcome.

Thanks,
Jagane



Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-06-07 Thread Stefan Hajnoczi
On Mon, Jun 6, 2011 at 5:55 PM, Marcelo Tosatti  wrote:

I haven't reviewed this whole patch yet, but comments below.

This patch, like image streaming, may hit deadlocks due to synchronous
I/O emulation.  I discovered this problem when working on image
streaming and it should be solved by getting rid of the asynchronous
context concept.  The problem is that async I/O emulation will push a
new context, preventing existing requests from completing until the
current context is popped again.  If the image format has dependencies
between requests (e.g. QED allocating writes are serialized), then
this leads to deadlock because the new request cannot complete until
the old one does, but the old one needs to wait for the context to be
popped.  I think you are not affected by the QED allocating write case
since the source image is only read, not written, by live block copy.
But you might encounter this problem in other places.

> +#define BLOCK_SIZE (BDRV_SECTORS_PER_DIRTY_CHUNK << BDRV_SECTOR_BITS)

Should this be called DIRTY_CHUNK_SIZE?

> +static void alloc_aio_bitmap(BdrvCopyState *s)
> +{
> +    BlockDriverState *bs = s->src;
> +    int64_t bitmap_size;
> +
> +    bitmap_size = (bdrv_getlength(bs) >> BDRV_SECTOR_BITS) +
> +            BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
> +    bitmap_size /= BDRV_SECTORS_PER_DIRTY_CHUNK * 8;
> +
> +    s->aio_bitmap = qemu_mallocz(bitmap_size);
> +}

There is bitmap.h, which has a bunch of common bitmap code that could be reused.
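
For illustration, a hypothetical rewrite on top of bitmap.h could look
like this (sketch only; bitmap_new()/test_bit()/set_bit()/clear_bit()
are the existing helpers, and as a later reply in the thread notes, the
'int' index width is an obstacle for very large images):

#include "bitmap.h"

static void alloc_aio_bitmap(BdrvCopyState *s)
{
    int64_t nb_chunks = (bdrv_getlength(s->src) >> BDRV_SECTOR_BITS) /
                        BDRV_SECTORS_PER_DIRTY_CHUNK + 1;

    s->aio_bitmap = bitmap_new(nb_chunks);    /* zero-filled, sized in bits */
}

static bool aio_inflight(BdrvCopyState *s, int64_t sector)
{
    long chunk = sector / BDRV_SECTORS_PER_DIRTY_CHUNK;

    if (s->aio_bitmap &&
        (sector << BDRV_SECTOR_BITS) < bdrv_getlength(s->src)) {
        return test_bit(chunk, s->aio_bitmap);
    }
    return false;
}

static void set_aio_inflight(BdrvCopyState *s, int64_t sector_num,
                             int nb_sectors, int set)
{
    long start = sector_num / BDRV_SECTORS_PER_DIRTY_CHUNK;
    long end = (sector_num + nb_sectors - 1) / BDRV_SECTORS_PER_DIRTY_CHUNK;

    for (; start <= end; start++) {
        if (set) {
            set_bit(start, s->aio_bitmap);
        } else {
            clear_bit(start, s->aio_bitmap);
        }
    }
}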

> +static void blk_copy_issue_read(BdrvCopyState *s, int64_t sector,
> +                                int nr_sectors)
> +{
> +    BdrvCopyBlock *blk = qemu_mallocz(sizeof(BdrvCopyBlock));
> +    blk->buf = qemu_mallocz(BLOCK_SIZE);

qemu_blockalign() should be used to meet block device alignment requirements.
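
I.e., roughly (a sketch of the suggested change, reusing the patch's own
structures):

/* Bounce buffers come from the block layer's aligned allocator and must
 * be released with qemu_vfree() rather than qemu_free(). */
static void blk_copy_alloc_buf(BdrvCopyBlock *blk, BdrvCopyState *s)
{
    blk->buf = qemu_blockalign(s->src, BLOCK_SIZE);  /* was qemu_mallocz() */
}

static void blk_copy_free_buf(BdrvCopyBlock *blk)
{
    qemu_vfree(blk->buf);                            /* was qemu_free() */
}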

> +    blk->state = s;
> +    blk->sector = sector;
> +    blk->nr_sectors = nr_sectors;
> +    QLIST_INSERT_HEAD(&s->io_list, blk, list);
> +
> +    blk->iov.iov_base = blk->buf;
> +    blk->iov.iov_len = nr_sectors * BDRV_SECTOR_SIZE;
> +    qemu_iovec_init_external(&blk->qiov, &blk->iov, 1);
> +
> +    s->inflight_reads++;
> +    blk->time = qemu_get_clock_ns(rt_clock);
> +    blk->aiocb = bdrv_aio_readv(s->src, sector, &blk->qiov, nr_sectors,
> +                                blk_copy_read_cb, blk);
> +    if (!blk->aiocb) {
> +        s->error = 1;
> +        goto error;
> +    }
> +
> +    return;
> +
> +error:
> +    s->inflight_reads--;
> +    QLIST_REMOVE(blk, list);
> +    qemu_free(blk->buf);

qemu_vfree() after you switch to qemu_blockalign() above.

> +    qemu_free(blk);
> +}

> +static void blkcopy_cleanup(BdrvCopyState *s)
> +{

Need to free dst, at least on error?

> +    assert(s->inflight_reads == 0);
> +    assert(QLIST_EMPTY(&s->io_list));
> +    bdrv_set_dirty_tracking(s->src, 0);
> +    drive_put_ref(drive_get_by_blockdev(s->src));
> +    bdrv_set_in_use(s->src, 0);
> +    if (s->stage >= STAGE_DIRTY) {
> +        qemu_free(s->aio_bitmap);
> +    }
> +    qemu_del_timer(s->aio_timer);

Missing qemu_free_timer().

> +}
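
Presumably the fix is a one-liner right after the deletion (sketch):

    qemu_del_timer(s->aio_timer);
    qemu_free_timer(s->aio_timer);   /* release the QEMUTimer itself */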

> +static int bdrv_copy_setup(const char *device, const char *filename,
> +                           bool shared_base)
> +{
> +    int64_t sectors;
> +    BdrvCopyState *blkcopy, *safe;
> +    BlockDriverState *src, *dst;
> +
> +    src = bdrv_find(device);
> +    if (!src) {
> +        qerror_report(QERR_DEVICE_NOT_FOUND, device);
> +        return -1;
> +    }
> +
> +    dst = bdrv_new("");
> +    if (bdrv_open(dst, filename, src->open_flags, NULL) < 0) {
> +        bdrv_delete(dst);
> +        qerror_report(QERR_OPEN_FILE_FAILED, filename);
> +        return -1;
> +    }
> +
> +    QLIST_FOREACH_SAFE(blkcopy, &block_copy_list, list, safe) {
> +        if (!strcmp(blkcopy->device_name, src->device_name)) {
> +            if (blkcopy->stage == STAGE_SWITCH_FINISHED || blkcopy->failed) {
> +                blkcopy_free(blkcopy);
> +            } else {
> +                qerror_report(QERR_IN_PROGRESS, "block copy");

dst is leaked.

> +                return -1;
> +            }
> +        }
> +    }
> +
> +    sectors = bdrv_getlength(src) >> BDRV_SECTOR_BITS;
> +    if (sectors != bdrv_getlength(dst) >> BDRV_SECTOR_BITS) {
> +        qerror_report(QERR_BLOCKCOPY_IMAGE_SIZE_DIFFERS);
> +        return -1;

dst is leaked.

Stefan



Re: [Qemu-devel] [patch 6/7] QEMU live block copy (update)

2011-06-07 Thread Jiri Denemark
On Mon, Jun 06, 2011 at 14:03:44 -0300, Marcelo Tosatti wrote:
...
>  SQMP
> +query-block-copy
> +----------------
> +
> +Live block copy status.
> +
> +Each block copy instance information is stored in a json-object and the
> +returned value is a json-array of all instances.
> +
> +Each json-object contains the following:
> +
> +- "device": device name (json-string)
> +- "status": block copy status (json-string)
> +- Possible values: "active", "failed", "mirrored", "completed", meaning:
> +- failed: block copy failed.
> +- active: block copy active, copying to destination image.
> +- mirrored: block copy active, finished copying to destination
> +  image, writes are mirrored.
> +- completed: block copy completed.
> +
> +- "info": A json-object with the statistics information, if status is 
> "active":
> +- "percentage": percentage completed (json-int)
> +
> +Example:
> +
> +Block copy for "ide1-hd0" active and block copy for "ide1-hd1" failed:
> +
> +-> { "execute": "query-block-copy" }
> +<- {
> +  "return":[
> +{"device":"ide1-hd0",
> +"status":"active",
> +"info":{
> +   "percentage":23,
> +}
> +},

What about using the same form of progress reporting as used by query-migrate?
That is, instead of

"info":{
"percentage":23
}

the following would be reported:

"disk":{
"transferred":123,
"remaining":123,
"total":246
}

One can trivially compute percentage from that but it's impossible to get this
kind of data when only percentage is reported. And total can even change in
time if needed (just like it changes during migration).
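
(With the numbers above, a client computes transferred * 100 / total,
e.g. 123 * 100 / 246 = 50%, and that stays valid even if "total" grows
later.)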

> +{"device":"ide1-hd1",
> + "status":"failed"
> +}

Is there any way we can get the exact error which made it fail, such as EPERM
or ENOSPC?

> +  ]
> +   }
> +
> +EQMP

Jirka



Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-06-05 Thread Marcelo Tosatti
On Tue, May 31, 2011 at 07:53:44PM +0300, Avi Kivity wrote:
> >Don't see the need for that, management can simply wait for livecopy to
> >finish or cancel livecopy and restart on destination after migration.
> 
> They can do it, but I imagine it's pretty hard for them.

Well, they have to handle migration failures anyway.

> 
> >But yeah, it should be implemented sometime...
> >
> 
> 
> We should make an effort not to introduce interdependencies which
> make management's life harder.

While this is a good principle there is no way around complexity for
management here. If migration is supported, the block copy target image
filenames must be specified on the destination.




Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-06-05 Thread Marcelo Tosatti
On Tue, May 24, 2011 at 10:15:09PM +0300, Blue Swirl wrote:
> > +static bool aio_inflight(BdrvCopyState *s, int64_t sector)
> > +{
> > +    int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
> > +
> > +    if (s->aio_bitmap &&
> > +        (sector << BDRV_SECTOR_BITS) < bdrv_getlength(s->src)) {
> > +        return !!(s->aio_bitmap[chunk / (sizeof(unsigned long) * 8)] &
> > +            (1UL << (chunk % (sizeof(unsigned long) * 8))));
> 
> Please use the bitmap functions in bitmap.h, also in the next function.

Can't do: this bitmap can be larger than 2^32 bits, while bitmap.h uses
'int' for the index.

> > +    if (bdrv_open(s->src, s->src_filename, open_flags, NULL) < 0) {
> > +        error_report("%s: %s: cannot fallback to source image\n", __func__,
> > +                     s->src_filename);
> 
> Below qerror_report() is used.

What is the problem? Don't get the point.

Fixed other comments, thanks for reviewing.




Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-05-31 Thread Marcelo Tosatti
On Sun, May 29, 2011 at 11:54:25AM +0300, Avi Kivity wrote:
> On 05/24/2011 12:31 AM, Marcelo Tosatti wrote:
> >Support live image copy + switch. That is, copy an image backing
> >a guest hard disk to a destination image (destination image must
> >be created separately), and switch to this copy.
> >
> >Command syntax:
> >
> >block_copy device filename [-i] -- live block copy device to image
> >  -i for incremental copy (base image shared between src and 
> > destination)
> >
> >Please refer to qmp-commands diff for more details.
> 
> IMO it would have been nicer to use the mirror driver for all
> copying; there would be no STAGE_DIRTY; simply a background process
> that copies over all blocks, taking care not to conflict with
> ongoing writes.  

Don't see the advantage of doing that. Disadvantages:

- Guest write performance is affected during copying (guest writes
  compete with stream of writes from copy).
- Complexity to handle copy write conflict (there is no need to handle
  that with current solution).
- Inability to have the mirror functionality in a separate driver.

> It would also remove the conflict with migration.

There is no fundamental problem with migration, it's simply unsupported
now.

> But that's an implementation detail and can be changed later.




Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-05-31 Thread Marcelo Tosatti
On Tue, May 31, 2011 at 07:14:57PM +0300, Avi Kivity wrote:
> On 05/31/2011 07:06 PM, Marcelo Tosatti wrote:
> >On Sun, May 29, 2011 at 11:54:25AM +0300, Avi Kivity wrote:
> >>  On 05/24/2011 12:31 AM, Marcelo Tosatti wrote:
> >>  >Support live image copy + switch. That is, copy an image backing
> >>  >a guest hard disk to a destination image (destination image must
> >>  >be created separately), and switch to this copy.
> >>  >
> >>  >Command syntax:
> >>  >
> >>  >block_copy device filename [-i] -- live block copy device to image
> >>  >   -i for incremental copy (base image shared between src 
> >> and destination)
> >>  >
> >>  >Please refer to qmp-commands diff for more details.
> >>
> >>  IMO it would have been nicer to use the mirror driver for all
> >>  copying; there would be no STAGE_DIRTY; simply a background process
> >>  that copies over all blocks, taking care not to conflict with
> >>  ongoing writes.
> >
> >Don't see the advantage of doing that.
> 
> No dirty logging
> No conflict with migration
> Simpler conceptually (IMO)
> 
> >Disadvantages:
> >
> >- Guest write performance is affected during copying (guest writes
> >   compete with stream of writes from copy).
> 
> Competes anyway with your background task?

No because guest writes are to the source and copy writes are to
destination (which are potentially different disks or set of disks).

> >- Complexity to handle copy write conflict (there is no need to handle
> >   that with current solution).
> 
> If new write from source A overlaps an in-flight write from source
> B, hold it in a list off the B request, and re-issue it on B
> completion.
> 
> >- Inability to have the mirror functionality in a separate driver.
> 
> Why?

Because you need to handle the interaction between guest writes and 
copy writes someplace which can be aware of both.

For example, in-flight copy writes must be invalidated in the guest
write path.

> >>  It would also remove the conflict with migration.
> >
> >There is no fundamental problem with migration, it's simply unsupported
> >now.
> 
> Why?

Don't see the need for that, management can simply wait for livecopy to
finish or cancel livecopy and restart on destination after migration.

But yeah, it should be implemented sometime...




Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-05-31 Thread Avi Kivity

On 05/31/2011 07:38 PM, Marcelo Tosatti wrote:

> > > Disadvantages:
> > >
> > > - Guest write performance is affected during copying (guest writes
> > >   compete with stream of writes from copy).
> >
> > Competes anyway with your background task?
>
> No because guest writes are to the source and copy writes are to
> destination (which are potentially different disks or set of disks).


Your copy task reads from the source disk and writes to the destination 
disk.  A guest write doesn't touch the destination disk, but schedules a 
later write due to dirty tracking.  In all, the total number of writes 
should be similar.  Background copy is more efficient since multiple
writes to the same location are coalesced, but that's not very common 
(except perhaps for transaction logs).



> > > - Inability to have the mirror functionality in a separate driver.
> >
> > Why?
>
> Because you need to handle the interaction between guest writes and
> copy writes someplace which can be aware of both.
>
> For example, in-flight copy writes must be invalidated in the guest
> write path.


guest -> device -> mirror driver -> [ source disk, destination disk ]


> > > >  It would also remove the conflict with migration.
> > >
> > > There is no fundamental problem with migration, it's simply unsupported
> > > now.
> >
> > Why?
>
> Don't see the need for that, management can simply wait for livecopy to
> finish or cancel livecopy and restart on destination after migration.

They can do it, but I imagine it's pretty hard for them.

> But yeah, it should be implemented sometime...




We should make an effort not to introduce interdependencies which make 
management's life harder.


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-05-31 Thread Avi Kivity

On 05/31/2011 07:06 PM, Marcelo Tosatti wrote:

> On Sun, May 29, 2011 at 11:54:25AM +0300, Avi Kivity wrote:
> > On 05/24/2011 12:31 AM, Marcelo Tosatti wrote:
> > > Support live image copy + switch. That is, copy an image backing
> > > a guest hard disk to a destination image (destination image must
> > > be created separately), and switch to this copy.
> > >
> > > Command syntax:
> > >
> > > block_copy device filename [-i] -- live block copy device to image
> > >   -i for incremental copy (base image shared between src and
> > > destination)
> > >
> > > Please refer to qmp-commands diff for more details.
> >
> > IMO it would have been nicer to use the mirror driver for all
> > copying; there would be no STAGE_DIRTY; simply a background process
> > that copies over all blocks, taking care not to conflict with
> > ongoing writes.
>
> Don't see the advantage of doing that.


No dirty logging
No conflict with migration
Simpler conceptually (IMO)


> Disadvantages:
>
> - Guest write performance is affected during copying (guest writes
>   compete with stream of writes from copy).


Competes anyway with your background task?


> - Complexity to handle copy write conflict (there is no need to handle
>   that with current solution).


If new write from source A overlaps an in-flight write from source B, 
hold it in a list off the B request, and re-issue it on B completion.
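
Sketched out (illustrative names only, not code from the patch):

#include <stdbool.h>
#include <stdint.h>
#include "qemu-queue.h"

typedef struct MirrorWrite MirrorWrite;
struct MirrorWrite {
    int64_t sector;
    int nr_sectors;
    QLIST_HEAD(, MirrorWrite) waiters;    /* writes parked behind this one */
    QLIST_ENTRY(MirrorWrite) next;
};

static QLIST_HEAD(, MirrorWrite) inflight = QLIST_HEAD_INITIALIZER(inflight);

static bool overlaps(const MirrorWrite *a, const MirrorWrite *b)
{
    return a->sector < b->sector + b->nr_sectors &&
           b->sector < a->sector + a->nr_sectors;
}

static void submit_write(MirrorWrite *w)
{
    MirrorWrite *other;

    QLIST_INIT(&w->waiters);
    QLIST_FOREACH(other, &inflight, next) {
        if (overlaps(w, other)) {
            /* Park the new write on the in-flight one it overlaps. */
            QLIST_INSERT_HEAD(&other->waiters, w, next);
            return;
        }
    }
    QLIST_INSERT_HEAD(&inflight, w, next);
    /* ... issue the actual bdrv_aio_writev() for w here ... */
}

static void write_complete(MirrorWrite *w)
{
    MirrorWrite *parked, *tmp;

    QLIST_REMOVE(w, next);
    QLIST_FOREACH_SAFE(parked, &w->waiters, next, tmp) {
        QLIST_REMOVE(parked, next);
        submit_write(parked);             /* re-issue on completion */
    }
}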



> - Inability to have the mirror functionality in a separate driver.


Why?


> >  It would also remove the conflict with migration.
>
> There is no fundamental problem with migration, it's simply unsupported
> now.


Why?


> >  But that's an implementation detail and can be changed later.




--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-05-29 Thread Avi Kivity

On 05/24/2011 12:31 AM, Marcelo Tosatti wrote:

> Support live image copy + switch. That is, copy an image backing
> a guest hard disk to a destination image (destination image must
> be created separately), and switch to this copy.
>
> Command syntax:
>
> block_copy device filename [-i] -- live block copy device to image
>   -i for incremental copy (base image shared between src and
> destination)
>
> Please refer to qmp-commands diff for more details.


IMO it would have been nicer to use the mirror driver for all copying; 
there would be no STAGE_DIRTY; simply a background process that copies 
over all blocks, taking care not to conflict with ongoing writes.  It 
would also remove the conflict with migration.


But that's an implementation detail and can be changed later.

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [patch 6/7] QEMU live block copy

2011-05-24 Thread Blue Swirl
On Tue, May 24, 2011 at 12:31 AM, Marcelo Tosatti  wrote:
> Support live image copy + switch. That is, copy an image backing
> a guest hard disk to a destination image (destination image must
> be created separately), and switch to this copy.
>
> Command syntax:
>
> block_copy device filename [-i] -- live block copy device to image
>             -i for incremental copy (base image shared between src and 
> destination)
>
> Please refer to qmp-commands diff for more details.
>
> Signed-off-by: Marcelo Tosatti 
>
> Index: qemu-block-copy/block-copy.c
> ===================================================================
> --- /dev/null
> +++ qemu-block-copy/block-copy.c
> @@ -0,0 +1,754 @@
> +/*
> + * QEMU live block copy
> + *
> + * Copyright (C) 2010 Red Hat Inc.
> + *
> + * Authors: Marcelo Tosatti 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu-common.h"
> +#include "block_int.h"
> +#include "blockdev.h"
> +#include "qemu-queue.h"
> +#include "qemu-timer.h"
> +#include "monitor.h"
> +#include "block-copy.h"
> +#include "migration.h"
> +#include "sysemu.h"
> +#include "qjson.h"
> +#include 
> +
> +#define BLOCK_SIZE (BDRV_SECTORS_PER_DIRTY_CHUNK << BDRV_SECTOR_BITS)
> +#define MAX_IS_ALLOCATED_SEARCH 65536
> +
> +/*
> + * Stages:
> + *
> + * STAGE_BULK: bulk reads/writes in progress
> + * STAGE_BULK_FINISHED: bulk reads finished, bulk writes in progress
> + * STAGE_DIRTY: bulk writes finished, dirty reads/writes in progress
> + * STAGE_MIRROR_WRITES: copy finished, writes mirrored to both images.
> + * STAGE_SWITCH_FINISHED: switched to new image.
> + */
> +
> +enum BdrvCopyStage {
> +    STAGE_BULK,
> +    STAGE_BULK_FINISHED,
> +    STAGE_DIRTY,
> +    STAGE_MIRROR_WRITES,
> +    STAGE_SWITCH_FINISHED,
> +};
> +
> +typedef struct BdrvCopyState {
> +    BlockDriverState *src;
> +    BlockDriverState *dst;
> +    bool shared_base;
> +
> +    int64_t curr_sector;
> +    int64_t completed_sectors;
> +    int64_t nr_sectors;
> +
> +    enum BdrvCopyStage stage;
> +    int inflight_reads;
> +    int error;
> +    int failed;
> +    int cancelled;
> +    QLIST_HEAD(, BdrvCopyBlock) io_list;
> +    unsigned long *aio_bitmap;
> +    QEMUTimer *aio_timer;
> +    QLIST_ENTRY(BdrvCopyState) list;
> +
> +    int64_t blocks;
> +    int64_t total_time;
> +
> +    char src_device_name[32];
> +    char src_filename[1024];

A #define for the buffer size would be nice.

> +} BdrvCopyState;
> +
> +typedef struct BdrvCopyBlock {
> +    BdrvCopyState *state;
> +    uint8_t *buf;
> +    int64_t sector;
> +    int64_t nr_sectors;
> +    struct iovec iov;
> +    QEMUIOVector qiov;
> +    BlockDriverAIOCB *aiocb;
> +    int64_t time;
> +    QLIST_ENTRY(BdrvCopyBlock) list;
> +} BdrvCopyBlock;
> +
> +static QLIST_HEAD(, BdrvCopyState) block_copy_list =
> +    QLIST_HEAD_INITIALIZER(block_copy_list);
> +
> +static void alloc_aio_bitmap(BdrvCopyState *s)
> +{
> +    BlockDriverState *bs = s->src;
> +    int64_t bitmap_size;
> +
> +    bitmap_size = (bdrv_getlength(bs) >> BDRV_SECTOR_BITS) +
> +            BDRV_SECTORS_PER_DIRTY_CHUNK * 8 - 1;
> +    bitmap_size /= BDRV_SECTORS_PER_DIRTY_CHUNK * 8;
> +
> +    s->aio_bitmap = qemu_mallocz(bitmap_size);
> +}
> +
> +static bool aio_inflight(BdrvCopyState *s, int64_t sector)
> +{
> +    int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
> +
> +    if (s->aio_bitmap &&
> +        (sector << BDRV_SECTOR_BITS) < bdrv_getlength(s->src)) {
> +        return !!(s->aio_bitmap[chunk / (sizeof(unsigned long) * 8)] &
> +            (1UL << (chunk % (sizeof(unsigned long) * 8))));

Please use the bitmap functions in bitmap.h, also in the next function.

> +    } else {
> +        return 0;
> +    }
> +}
> +
> +static void set_aio_inflight(BdrvCopyState *s, int64_t sector_num,
> +                             int nb_sectors, int set)
> +{
> +    int64_t start, end;
> +    unsigned long val, idx, bit;
> +
> +    start = sector_num / BDRV_SECTORS_PER_DIRTY_CHUNK;
> +    end = (sector_num + nb_sectors - 1) / BDRV_SECTORS_PER_DIRTY_CHUNK;
> +
> +    for (; start <= end; start++) {
> +        idx = start / (sizeof(unsigned long) * 8);
> +        bit = start % (sizeof(unsigned long) * 8);
> +        val = s->aio_bitmap[idx];
> +        if (set) {
> +            if (!(val & (1UL << bit))) {
> +                val |= 1UL << bit;
> +            }
> +        } else {
> +            if (val & (1UL << bit)) {
> +                val &= ~(1UL << bit);
> +            }
> +        }
> +        s->aio_bitmap[idx] = val;
> +    }
> +}
> +
> +static void blkcopy_set_stage(BdrvCopyState *s, enum BdrvCopyStage stage)
> +{
> +    s->stage = stage;
> +
> +    switch (stage) {
> +    case STAGE_BULK:
> +        BLKDBG_EVENT(s->dst->file, BLKDBG_BLKCOPY_STAGE_BULK);
> +        break;
> +    case STAGE_BULK_FINISHED:
> +        BLKDBG_EVENT(s->dst->file, BLKDBG_BLKCOPY_STAGE_BULK_FINISHED);
> +