Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-09-13 Thread Kevin Wolf
Am 24.07.2012 13:04, schrieb Paolo Bonzini:
 This adds the monitor commands that start the mirroring job.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  blockdev.c   |  133 
 --
  hmp-commands.hx  |   21 +
  hmp.c|   28 
  hmp.h|1 +
  qapi-schema.json |   35 ++
  qmp-commands.hx  |   42 +
  trace-events |2 +-
  7 files changed, 258 insertions(+), 4 deletions(-)
 
 diff --git a/blockdev.c b/blockdev.c
 index 192a9db..4b4574a 100644
 --- a/blockdev.c
 +++ b/blockdev.c
 @@ -21,6 +21,8 @@
  #include trace.h
  #include arch_init.h
  
 +static void block_job_cb(void *opaque, int ret);
 +
  static QTAILQ_HEAD(drivelist, DriveInfo) drives = 
 QTAILQ_HEAD_INITIALIZER(drives);
  
  static const char *const if_name[IF_COUNT] = {
 @@ -825,6 +827,131 @@ exit:
  return;
  }
  
 +void qmp_drive_mirror(const char *device, const char *target,
 +  bool has_format, const char *format,
 +  enum MirrorSyncMode sync,
 +  bool has_mode, enum NewImageMode mode,
 +  bool has_speed, int64_t speed, Error **errp)
 +{
 +BlockDriverInfo bdi;
 +BlockDriverState *bs;
 +BlockDriverState *source, *target_bs;
 +BlockDriver *proto_drv;
 +BlockDriver *drv = NULL;
 +Error *local_err = NULL;
 +int flags;
 +uint64_t size;
 +int ret;
 +
 +if (!has_speed) {
 +speed = 0;
 +}
 +if (!has_mode) {
 +mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
 +}
 +
 +bs = bdrv_find(device);
 +if (!bs) {
 +error_set(errp, QERR_DEVICE_NOT_FOUND, device);
 +return;
 +}
 +
 +if (!has_format) {
 +format = mode == NEW_IMAGE_MODE_EXISTING ? NULL : 
 bs-drv-format_name;
 +}
 +if (format) {
 +drv = bdrv_find_format(format);
 +if (!drv) {
 +error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
 +return;
 +}
 +}
 +
 +if (!bdrv_is_inserted(bs)) {
 +error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
 +return;
 +}
 +
 +if (bdrv_in_use(bs)) {
 +error_set(errp, QERR_DEVICE_IN_USE, device);
 +return;
 +}
 +
 +flags = bs-open_flags | BDRV_O_RDWR;

Do we take care to make the image read-only again after completion?

Kevin



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-09-13 Thread Paolo Bonzini
Il 13/09/2012 15:15, Kevin Wolf ha scritto:
  +flags = bs-open_flags | BDRV_O_RDWR;
 Do we take care to make the image read-only again after completion?

Not at the file-descriptor level, but bs-read_only is indeed restored
to true via bdrv_swap.

Doing it on the file descriptor is possible with Jeff's bdrv_reopen patches.

Paolo



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-09-13 Thread Kevin Wolf
Am 13.09.2012 15:24, schrieb Paolo Bonzini:
 Il 13/09/2012 15:15, Kevin Wolf ha scritto:
 +flags = bs-open_flags | BDRV_O_RDWR;
 Do we take care to make the image read-only again after completion?
 
 Not at the file-descriptor level, but bs-read_only is indeed restored
 to true via bdrv_swap.
 
 Doing it on the file descriptor is possible with Jeff's bdrv_reopen patches.

Ah, right, obviously you can't do it before Jeff's patches are in. But
yes, this is what I meant.

Kevin



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-09-13 Thread Paolo Bonzini
Il 13/09/2012 15:26, Kevin Wolf ha scritto:
 +flags = bs-open_flags | BDRV_O_RDWR;
  Do we take care to make the image read-only again after completion?
  
  Not at the file-descriptor level, but bs-read_only is indeed restored
  to true via bdrv_swap.
  
  Doing it on the file descriptor is possible with Jeff's bdrv_reopen 
  patches.
 Ah, right, obviously you can't do it before Jeff's patches are in. But
 yes, this is what I meant.

Still, the guest won't be able to issue writes after switching to the
destination, if it couldn't do so before.

Paolo



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-31 Thread Kevin Wolf
Am 24.07.2012 13:04, schrieb Paolo Bonzini:
 This adds the monitor commands that start the mirroring job.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com

[ Moving the discussion upstream ]

 Why make all of it inaccessible?  Everything except target device access
 does have a stable API.  The target device access can be delayed to 1.3,
 together with the much-needed QMP schema introspection.

I'm not even sure about the QMP mirror command itself.

I don't really like it, it does too many things at once: It can create
the target image file, it opens the target and it actually starts the
mirroring. It's rather bad at the first two steps, because it doesn't
take any options. This means that it can't create qcow2v3 images, for
example. Or you can't mirror into a backup with cache=unsafe while
running your real VM on cache=writethrough.

Having an all-in-one mirror command is a nice feature for HMP, but for
QMP it's more like a design problem.

Now I see you have called it drive-mirror, so that kind of implies that
it's not the final blockdev-mirror but just a QMP version of a command
primarily designed for HMP. As such this restricted functionality may be
acceptable, but it's not like everything is already perfect and there's
no room for discussion.

Kevin



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-31 Thread Paolo Bonzini
Il 31/07/2012 11:26, Kevin Wolf ha scritto:
 Am 24.07.2012 13:04, schrieb Paolo Bonzini:
 This adds the monitor commands that start the mirroring job.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 
 [ Moving the discussion upstream ]
 
 Why make all of it inaccessible?  Everything except target device access
 does have a stable API.  The target device access can be delayed to 1.3,
 together with the much-needed QMP schema introspection.
 
 I'm not even sure about the QMP mirror command itself.
 
 I don't really like it, it does too many things at once: It can create
 the target image file, it opens the target and it actually starts the
 mirroring. It's rather bad at the first two steps, because it doesn't
 take any options. This means that it can't create qcow2v3 images, for
 example. Or you can't mirror into a backup with cache=unsafe while
 running your real VM on cache=writethrough.

Yes, though this can be worked around with mode: 'existing'.

 Having an all-in-one mirror command is a nice feature for HMP, but for
 QMP it's more like a design problem.
 
 Now I see you have called it drive-mirror

I thought this was your idea. :)

 , so that kind of implies that
 it's not the final blockdev-mirror but just a QMP version of a command
 primarily designed for HMP. As such this restricted functionality may be
 acceptable, but it's not like everything is already perfect and there's
 no room for discussion.

We keep going back to the same point that we do not have -blockdev, but
it's becoming a bit frustrating to always rehash this same point...

Paolo



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-31 Thread Kevin Wolf
Am 31.07.2012 11:33, schrieb Paolo Bonzini:
 Il 31/07/2012 11:26, Kevin Wolf ha scritto:
 Am 24.07.2012 13:04, schrieb Paolo Bonzini:
 This adds the monitor commands that start the mirroring job.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com

 [ Moving the discussion upstream ]

 Why make all of it inaccessible?  Everything except target device access
 does have a stable API.  The target device access can be delayed to 1.3,
 together with the much-needed QMP schema introspection.

 I'm not even sure about the QMP mirror command itself.

 I don't really like it, it does too many things at once: It can create
 the target image file, it opens the target and it actually starts the
 mirroring. It's rather bad at the first two steps, because it doesn't
 take any options. This means that it can't create qcow2v3 images, for
 example. Or you can't mirror into a backup with cache=unsafe while
 running your real VM on cache=writethrough.
 
 Yes, though this can be worked around with mode: 'existing'.

True. Only the problem with image creation, though, not the one with
bdrv_open() flags, right?

 Having an all-in-one mirror command is a nice feature for HMP, but for
 QMP it's more like a design problem.

 Now I see you have called it drive-mirror
 
 I thought this was your idea. :)

Hm, then probably we discussed similar things before. :-)

 , so that kind of implies that
 it's not the final blockdev-mirror but just a QMP version of a command
 primarily designed for HMP. As such this restricted functionality may be
 acceptable, but it's not like everything is already perfect and there's
 no room for discussion.
 
 We keep going back to the same point that we do not have -blockdev, but
 it's becoming a bit frustrating to always rehash this same point...

The question is whether we need it at all. We do have a drive_add
if=none, and for creating a mirror target that should actually be enough.

Kevin



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-31 Thread Paolo Bonzini
Il 31/07/2012 11:46, Kevin Wolf ha scritto:
 I'm not even sure about the QMP mirror command itself.

 I don't really like it, it does too many things at once: It can create
 the target image file, it opens the target and it actually starts the
 mirroring. It's rather bad at the first two steps, because it doesn't
 take any options. This means that it can't create qcow2v3 images, for
 example. Or you can't mirror into a backup with cache=unsafe while
 running your real VM on cache=writethrough.

 Yes, though this can be worked around with mode: 'existing'.
 
 True. Only the problem with image creation, though, not the one with
 bdrv_open() flags, right?

Yeah, but do you really care about for example io=threads vs. io=native?
 The only interesting one is cache=unsafe; the mirror should enable
writeback caching on the target (bdrv_swap will disable it if needed;
I'll change this in the next submission), so cache=writethrough vs.
writeback doesn't matter.

 Having an all-in-one mirror command is a nice feature for HMP, but for
 QMP it's more like a design problem.

 Now I see you have called it drive-mirror

 I thought this was your idea. :)
 
 Hm, then probably we discussed similar things before. :-)
 
 , so that kind of implies that
 it's not the final blockdev-mirror but just a QMP version of a command
 primarily designed for HMP. As such this restricted functionality may be
 acceptable, but it's not like everything is already perfect and there's
 no room for discussion.

 We keep going back to the same point that we do not have -blockdev, but
 it's becoming a bit frustrating to always rehash this same point...
 
 The question is whether we need it at all. We do have a drive_add
 if=none, and for creating a mirror target that should actually be enough.

But not for creating images.  That would require qemu-img invocation.

If you're okay with always using an existing image in the QMP case (and
moving image creation to the HMP implementation), we can do it.  But I'm
not sure I like it, I think it's excessive in the other direction.

Paolo



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-31 Thread Kevin Wolf
Am 31.07.2012 12:02, schrieb Paolo Bonzini:
 Il 31/07/2012 11:46, Kevin Wolf ha scritto:
 I'm not even sure about the QMP mirror command itself.

 I don't really like it, it does too many things at once: It can create
 the target image file, it opens the target and it actually starts the
 mirroring. It's rather bad at the first two steps, because it doesn't
 take any options. This means that it can't create qcow2v3 images, for
 example. Or you can't mirror into a backup with cache=unsafe while
 running your real VM on cache=writethrough.

 Yes, though this can be worked around with mode: 'existing'.

 True. Only the problem with image creation, though, not the one with
 bdrv_open() flags, right?
 
 Yeah, but do you really care about for example io=threads vs. io=native?
  The only interesting one is cache=unsafe; the mirror should enable
 writeback caching on the target (bdrv_swap will disable it if needed;
 I'll change this in the next submission), so cache=writethrough vs.
 writeback doesn't matter.

Can we really make it writeback unconditionally? For a passive mirror it
probably doesn't make a difference, but what happens when the user stops
the mirroring and switches to the target? Will it stay writeback?

The same goes for aio=native/threads. Probably not interesting for the
mirror, but well afterwards.

Another interesting thing is I/O throttling. The mirror currently
implements rate limiting itself, but is there really a reason why we
can't reuse regular I/O throttling on the target?

 Having an all-in-one mirror command is a nice feature for HMP, but for
 QMP it's more like a design problem.

 Now I see you have called it drive-mirror

 I thought this was your idea. :)

 Hm, then probably we discussed similar things before. :-)

 , so that kind of implies that
 it's not the final blockdev-mirror but just a QMP version of a command
 primarily designed for HMP. As such this restricted functionality may be
 acceptable, but it's not like everything is already perfect and there's
 no room for discussion.

 We keep going back to the same point that we do not have -blockdev, but
 it's becoming a bit frustrating to always rehash this same point...

 The question is whether we need it at all. We do have a drive_add
 if=none, and for creating a mirror target that should actually be enough.
 
 But not for creating images.  That would require qemu-img invocation.

Yeah, either qemu-img or another monitor command. I believe that in
practice libvirt will do this anyway if this is the only way to specify
image creation options.

 If you're okay with always using an existing image in the QMP case (and
 moving image creation to the HMP implementation), we can do it.  But I'm
 not sure I like it, I think it's excessive in the other direction.

If you think it's helpful, we could make it optional and have a mode
'blockdev' where you don't specify a file name but a blockdev name. But
this is an approach that feels a bit HMPish...

Kevin



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-31 Thread Paolo Bonzini
Il 31/07/2012 12:25, Kevin Wolf ha scritto:
 Yeah, but do you really care about for example io=threads vs. io=native?
  The only interesting one is cache=unsafe; the mirror should enable
 writeback caching on the target (bdrv_swap will disable it if needed;
 I'll change this in the next submission), so cache=writethrough vs.
 writeback doesn't matter.
 
 Can we really make it writeback unconditionally? For a passive mirror it
 probably doesn't make a difference, but what happens when the user stops
 the mirroring and switches to the target? Will it stay writeback?

bdrv_swap takes care of it just fine.

 The same goes for aio=native/threads. Probably not interesting for the
 mirror, but well afterwards.

Actually it is interesting for the mirror.  Passive mirroring can only
benefit from lower latency.

But yes, bdrv_swap would not copy this one.  Right now we always use the
same aio method as the source (at worst it is ignored by the protocol),
so it is not a problem.

 Another interesting thing is I/O throttling. The mirror currently
 implements rate limiting itself, but is there really a reason why we
 can't reuse regular I/O throttling on the target?

I thought about it, but I'm worried of what happens when I/O throttling
kicks in, and how it interacts with pause/resume/cancel.

 Having an all-in-one mirror command is a nice feature for HMP, but for
 QMP it's more like a design problem.

 Now I see you have called it drive-mirror

 I thought this was your idea. :)

 Hm, then probably we discussed similar things before. :-)

 , so that kind of implies that
 it's not the final blockdev-mirror but just a QMP version of a command
 primarily designed for HMP. As such this restricted functionality may be
 acceptable, but it's not like everything is already perfect and there's
 no room for discussion.

 We keep going back to the same point that we do not have -blockdev, but
 it's becoming a bit frustrating to always rehash this same point...

 The question is whether we need it at all. We do have a drive_add
 if=none, and for creating a mirror target that should actually be enough.

 But not for creating images.  That would require qemu-img invocation.
 
 Yeah, either qemu-img or another monitor command. I believe that in
 practice libvirt will do this anyway if this is the only way to specify
 image creation options.

Playing devil's advocate because you've almost convinced me, we have the
same problem for blockdev-snapshot-sync.  Now drive-mirror is a bit
different because you can use it to reshape an image to something
else, but the same could be done with snapshot + streaming in many cases.

 If you're okay with always using an existing image in the QMP case (and
 moving image creation to the HMP implementation), we can do it.  But I'm
 not sure I like it, I think it's excessive in the other direction.
 
 If you think it's helpful, we could make it optional and have a mode
 'blockdev' where you don't specify a file name but a blockdev name. But
 this is an approach that feels a bit HMPish...

I think having a few limited knobs for image creation make some sense
(not all QMP clients need to be as sophisticated as libvirt), but that's
actually an interesting idea (as it is in general to piggyback on
drive_add).

Still, it leaves something to be desired.  It's not that it feels
HMP-ish, it's that it's overloading target a bit too much.  I would
prefer to keep drive-mirror for simple clients, and have a separate
blockdev-mirror that must have a blockdev target.  But doing the same
with blockdev-snapshot-sync will always look like duct-tape, because the
blockdev name is already taken. :(  Man, sometimes it feels like we're
not getting one thing right.

Paolo



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-31 Thread Kevin Wolf
Am 31.07.2012 12:51, schrieb Paolo Bonzini:
 Il 31/07/2012 12:25, Kevin Wolf ha scritto:
 Yeah, but do you really care about for example io=threads vs. io=native?
  The only interesting one is cache=unsafe; the mirror should enable
 writeback caching on the target (bdrv_swap will disable it if needed;
 I'll change this in the next submission), so cache=writethrough vs.
 writeback doesn't matter.

 Can we really make it writeback unconditionally? For a passive mirror it
 probably doesn't make a difference, but what happens when the user stops
 the mirroring and switches to the target? Will it stay writeback?
 
 bdrv_swap takes care of it just fine.

Ah, the switch uses bdrv_swap? Then that one is fine indeed.

 The same goes for aio=native/threads. Probably not interesting for the
 mirror, but well afterwards.
 
 Actually it is interesting for the mirror.  Passive mirroring can only
 benefit from lower latency.
 
 But yes, bdrv_swap would not copy this one.  Right now we always use the
 same aio method as the source (at worst it is ignored by the protocol),
 so it is not a problem.

Fair enough, though there may be cases where you'd really want to
switch, like migrating from a block device to a file on NFS.

 Another interesting thing is I/O throttling. The mirror currently
 implements rate limiting itself, but is there really a reason why we
 can't reuse regular I/O throttling on the target?
 
 I thought about it, but I'm worried of what happens when I/O throttling
 kicks in, and how it interacts with pause/resume/cancel.

bdrv_co_write won't return until the request has been throttled, so it
should be mostly transparent. The effect that it could have is that
pausing the mirror could take a little bit longer to complete (though
not much, as there is only one mirror request at the same time). But
iirc, pausing a block job was async anyway.

Any other aspect I'm missing?

 Having an all-in-one mirror command is a nice feature for HMP, but for
 QMP it's more like a design problem.

 Now I see you have called it drive-mirror

 I thought this was your idea. :)

 Hm, then probably we discussed similar things before. :-)

 , so that kind of implies that
 it's not the final blockdev-mirror but just a QMP version of a command
 primarily designed for HMP. As such this restricted functionality may be
 acceptable, but it's not like everything is already perfect and there's
 no room for discussion.

 We keep going back to the same point that we do not have -blockdev, but
 it's becoming a bit frustrating to always rehash this same point...

 The question is whether we need it at all. We do have a drive_add
 if=none, and for creating a mirror target that should actually be enough.

 But not for creating images.  That would require qemu-img invocation.

 Yeah, either qemu-img or another monitor command. I believe that in
 practice libvirt will do this anyway if this is the only way to specify
 image creation options.
 
 Playing devil's advocate because you've almost convinced me, we have the
 same problem for blockdev-snapshot-sync.  Now drive-mirror is a bit
 different because you can use it to reshape an image to something
 else, but the same could be done with snapshot + streaming in many cases.

Yes, blockdev-snapshot-sync is more or less the same case. We were aware
from the beginning that it's not the right command, but apparently
didn't think of drive_add.

 If you're okay with always using an existing image in the QMP case (and
 moving image creation to the HMP implementation), we can do it.  But I'm
 not sure I like it, I think it's excessive in the other direction.

 If you think it's helpful, we could make it optional and have a mode
 'blockdev' where you don't specify a file name but a blockdev name. But
 this is an approach that feels a bit HMPish...
 
 I think having a few limited knobs for image creation make some sense
 (not all QMP clients need to be as sophisticated as libvirt), but that's
 actually an interesting idea (as it is in general to piggyback on
 drive_add).
 
 Still, it leaves something to be desired.  It's not that it feels
 HMP-ish, it's that it's overloading target a bit too much.  I would
 prefer to keep drive-mirror for simple clients, and have a separate
 blockdev-mirror that must have a blockdev target.  But doing the same
 with blockdev-snapshot-sync will always look like duct-tape, because the
 blockdev name is already taken. :(  Man, sometimes it feels like we're
 not getting one thing right.

blockdev-snapshot isn't taken yet. However, having the two side by side
would imply that blockdev-snapshot is async, which I believe is
currently not the most urgent of our concerns...

Or actually, it might not even matter any more, because the thing that
really takes some time is creating and opening the image. Once you have
the blockdev, there's no point in making things async any more.

Kevin



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-31 Thread Paolo Bonzini
Il 31/07/2012 13:13, Kevin Wolf ha scritto:
 Am 31.07.2012 12:51, schrieb Paolo Bonzini:
 Il 31/07/2012 12:25, Kevin Wolf ha scritto:
 Yeah, but do you really care about for example io=threads vs. io=native?
  The only interesting one is cache=unsafe; the mirror should enable
 writeback caching on the target (bdrv_swap will disable it if needed;
 I'll change this in the next submission), so cache=writethrough vs.
 writeback doesn't matter.

 Can we really make it writeback unconditionally? For a passive mirror it
 probably doesn't make a difference, but what happens when the user stops
 the mirroring and switches to the target? Will it stay writeback?

 bdrv_swap takes care of it just fine.
 
 Ah, the switch uses bdrv_swap? Then that one is fine indeed.
 
 The same goes for aio=native/threads. Probably not interesting for the
 mirror, but well afterwards.

 Actually it is interesting for the mirror.  Passive mirroring can only
 benefit from lower latency.

 But yes, bdrv_swap would not copy this one.  Right now we always use the
 same aio method as the source (at worst it is ignored by the protocol),
 so it is not a problem.
 
 Fair enough, though there may be cases where you'd really want to
 switch, like migrating from a block device to a file on NFS.
 
 Another interesting thing is I/O throttling. The mirror currently
 implements rate limiting itself, but is there really a reason why we
 can't reuse regular I/O throttling on the target?

 I thought about it, but I'm worried of what happens when I/O throttling
 kicks in, and how it interacts with pause/resume/cancel.
 
 bdrv_co_write won't return until the request has been throttled, so it
 should be mostly transparent.

At the end of this series I use bdrv_aio_readv/writev.

 The effect that it could have is that
 pausing the mirror could take a little bit longer to complete (though
 not much, as there is only one mirror request at the same time).

Not anymore. :)

 But iirc, pausing a block job was async anyway.

Yes, it is, and job-busy nicely abstracts the hairy parts.

 Any other aspect I'm missing?

No, that should be ok.  Though I'm not sure if it's so useful to apply
throttling on the target.  It's more useful to throttle the source
(making writes slower than reads will help the job's convergence) and
copy at full steam to the target.

 If you're okay with always using an existing image in the QMP case (and
 moving image creation to the HMP implementation), we can do it.  But I'm
 not sure I like it, I think it's excessive in the other direction.

 If you think it's helpful, we could make it optional and have a mode
 'blockdev' where you don't specify a file name but a blockdev name. But
 this is an approach that feels a bit HMPish...

 I think having a few limited knobs for image creation make some sense
 (not all QMP clients need to be as sophisticated as libvirt), but that's
 actually an interesting idea (as it is in general to piggyback on
 drive_add).

 Still, it leaves something to be desired.  It's not that it feels
 HMP-ish, it's that it's overloading target a bit too much.  I would
 prefer to keep drive-mirror for simple clients, and have a separate
 blockdev-mirror that must have a blockdev target.  But doing the same
 with blockdev-snapshot-sync will always look like duct-tape, because the
 blockdev name is already taken. :(  Man, sometimes it feels like we're
 not getting one thing right.
 
 blockdev-snapshot isn't taken yet. However, having the two side by side
 would imply that blockdev-snapshot is async, which I believe is
 currently not the most urgent of our concerns...
 
 Or actually, it might not even matter any more, because the thing that
 really takes some time is creating and opening the image. Once you have
 the blockdev, there's no point in making things async any more.

Right, blockdev-snapshot would really be just a bdrv_append operation.
/me smiles. :)  So let's keep drive-mirror as is, and later add
blockdev-mirror.

Paolo



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-31 Thread Kevin Wolf
Am 31.07.2012 13:25, schrieb Paolo Bonzini:
 Il 31/07/2012 13:13, Kevin Wolf ha scritto:
 Am 31.07.2012 12:51, schrieb Paolo Bonzini:
 Il 31/07/2012 12:25, Kevin Wolf ha scritto:
 Another interesting thing is I/O throttling. The mirror currently
 implements rate limiting itself, but is there really a reason why we
 can't reuse regular I/O throttling on the target?

 I thought about it, but I'm worried of what happens when I/O throttling
 kicks in, and how it interacts with pause/resume/cancel.

 bdrv_co_write won't return until the request has been throttled, so it
 should be mostly transparent.
 
 At the end of this series I use bdrv_aio_readv/writev.
 
 The effect that it could have is that
 pausing the mirror could take a little bit longer to complete (though
 not much, as there is only one mirror request at the same time).
 
 Not anymore. :)

Hm, I see. Makes it a bit more involved, but then the logic should be
almost the same as you already need to complete the mirror.

 But iirc, pausing a block job was async anyway.
 
 Yes, it is, and job-busy nicely abstracts the hairy parts.
 
 Any other aspect I'm missing?
 
 No, that should be ok.  Though I'm not sure if it's so useful to apply
 throttling on the target.  It's more useful to throttle the source
 (making writes slower than reads will help the job's convergence) and
 copy at full steam to the target.

But doesn't the rate limiting of the mirror already throttle the target?
Which isn't too bad, because I think at least in the initial phase
you'll want to have both source and target throttled (later the target
is automatically throttled to the level of the source, except for bitmap
granularity artefacts).

 If you're okay with always using an existing image in the QMP case (and
 moving image creation to the HMP implementation), we can do it.  But I'm
 not sure I like it, I think it's excessive in the other direction.

 If you think it's helpful, we could make it optional and have a mode
 'blockdev' where you don't specify a file name but a blockdev name. But
 this is an approach that feels a bit HMPish...

 I think having a few limited knobs for image creation make some sense
 (not all QMP clients need to be as sophisticated as libvirt), but that's
 actually an interesting idea (as it is in general to piggyback on
 drive_add).

 Still, it leaves something to be desired.  It's not that it feels
 HMP-ish, it's that it's overloading target a bit too much.  I would
 prefer to keep drive-mirror for simple clients, and have a separate
 blockdev-mirror that must have a blockdev target.  But doing the same
 with blockdev-snapshot-sync will always look like duct-tape, because the
 blockdev name is already taken. :(  Man, sometimes it feels like we're
 not getting one thing right.

 blockdev-snapshot isn't taken yet. However, having the two side by side
 would imply that blockdev-snapshot is async, which I believe is
 currently not the most urgent of our concerns...

 Or actually, it might not even matter any more, because the thing that
 really takes some time is creating and opening the image. Once you have
 the blockdev, there's no point in making things async any more.
 
 Right, blockdev-snapshot would really be just a bdrv_append operation.
 /me smiles. :)  So let's keep drive-mirror as is, and later add
 blockdev-mirror.

Ok, that's fair enough.

Kevin



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-31 Thread Paolo Bonzini
Il 31/07/2012 14:17, Kevin Wolf ha scritto:
 No, that should be ok.  Though I'm not sure if it's so useful to apply
 throttling on the target.  It's more useful to throttle the source
 (making writes slower than reads will help the job's convergence) and
 copy at full steam to the target.
 
 But doesn't the rate limiting of the mirror already throttle the target?

Of course whatever you throttle (any of job, source, target) will have
an effect on the other two as well.

IMO, the target is perhaps the least useful to throttle.  It is more
interesting to play with the source, because that's guest visible.
Slowing down the target, while letting the guest run at full speed is
unlikely to help convergence of the job.

On the other hand, the job and target speeds are really duplicates of
each other, so the job speed is really just as useless.

So it sounds like removing the job speed is a good idea.  If needed,
libvirt can implement it later with a named block device for the target
+ I/O throttling.

 Which isn't too bad, because I think at least in the initial phase
 you'll want to have both source and target throttled (later the target
 is automatically throttled to the level of the source, except for bitmap
 granularity artefacts).

The target is always throttled to the level of the source and vice
versa.  The target can never be written faster than you read the source;
and slowing down the target will keep buffers busy so you cannot read
more from the source.

Paolo



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-27 Thread Paolo Bonzini
Il 27/07/2012 01:42, Eric Blake ha scritto:
 On 07/24/2012 05:04 AM, Paolo Bonzini wrote:
 This adds the monitor commands that start the mirroring job.

 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  blockdev.c   |  133 
 --
  hmp-commands.hx  |   21 +
  hmp.c|   28 
  hmp.h|1 +
  qapi-schema.json |   35 ++
  qmp-commands.hx  |   42 +
  trace-events |2 +-
  7 files changed, 258 insertions(+), 4 deletions(-)
 
 This command only allows mirroring from the top or the complete disk,
 but no intermediate points.  That is, given:
 base - sn1 - sn2
 I can mirror sn2 in isolation, or the entire chain including base, but I
 cannot mirror exactly sn1+sn2.  Instead, I would have to do a mirror of
 sn2 then follow up with a partial block-stream to rebase that mirror on
 top of base.  I guess this is okay, but it feels a bit limited to have
 to do it in two steps instead of one.

I couldn't think of a use case for this:

- preparing for a failing disk requires to keep all snapshots
unmodified, so mirroring only the top

- migration with non-shared storage requires to copy the whole disk
contents (unless you want to do part of the copy outside QEMU)

- collapsing a tower of images to raw for performance requires a copy
of the whole disk contents

And we need the sync parameter anyway (for 'none' and in the future for
the dirty bitmap), so I preferred to keep the API simple.

 [Hmm, your later patches introduce the ability to have a persistent
 mapping file; perhaps this can be exploited to have the user pre-create
 a mapping file that shows the portions allocated by sn1+sn2 as the
 starting point, but I'm not there in the patch series yet to know if
 this is the case.]

These are not in the posted patches (and in fact have not been developed
yet :)).

 +++ b/qapi-schema.json
 @@ -1367,6 +1367,41 @@
'returns': 'str' }
  
  ##
 +# @drive-mirror
 +#
 +# Start mirroring a block device's writes to a new destination.
 +#
 +# @device:  the name of the device whose writes should be mirrored.
 +#
 +# @target: the target of the new image. If the file exists, or if it
 +#  is a device, the existing file/device will be used as the new
 +#  destination.  If it does not exist, a new file will be created.
 +#
 +# @format: #optional the format of the new destination, default is to
 +#  probe is @mode is 'existing', else the format of the source
 
 s/probe is/probe if/
 
 +#
 +# @mode: #optional whether and how QEMU should create a new image, default 
 is
 +#'absolute-paths'.
 +#
 +# @speed:  #optional the maximum speed, in bytes per second
 +#
 +# @sync: what parts of the disk image should be copied to the destination
 +#(all the disk, only the sectors allocated in the topmost image, or
 +#only new I/O).
 
 Is there any reason 'sync' is listed last here, but before 'mode' in the
 JSON?

No, no reason.

 +#
 +# Returns: nothing on success
 +#  If @device is not a valid block device, DeviceNotFound
 +#  If @target can't be opened, OpenFileFailed
 +#  If @format is invalid, InvalidBlockFormat
 +#
 +# Since 1.2
 +##
 +{ 'command': 'drive-mirror',
 +  'data': { 'device': 'str', 'target': 'str', '*format': 'str',
 +'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
 +'*speed': 'int' } }
 +
 
 +drive-mirror
 +
 +
 +Start mirroring a block device's writes to a new destination. target
 +specifies the target of the new image. If the file exists, or if it is
 +a device, it will be used as the new destination for writes. If does not
 +exist, a new file will be created. format specifies the format of the
 +mirror image, default is to probe if mode='existing', else the format
 +of the source.
 +
 +Arguments:
 +
 +- device: device name to operate on (json-string)
 +- target: name of new image file (json-string)
 +- format: format of new image (json-string, optional)
 +- mode: how an image file should be created into the target
 +  file/device (NewImageMode, optional, default 'absolute-paths')
 
 Should we call out all the possible 'NewImageMode' strings here,...
 
 +- speed: maximum speed of the streaming job, in bytes per second
 +  (json-int)
 
 Mention that this is optional.
 
 +- sync: what parts of the disk image should be copied to the destination;
 +  possibilities include full for all the disk, top for only the sectors
 +  allocated in the topmost image, or none to only replicate new I/O
 +  (MirrorSyncMode).
 
 ...given that we did the same for all the possible MirrorSyncMode strings?

We could.  The difference between the two is that NewImageMode is used
also in other commands.  In the end, I'm not even sure of the usefulness
of this documentation since the .json schema is much better and could be
converted to a format with hyperlinks.

Paolo



Re: [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-26 Thread Eric Blake
On 07/24/2012 05:04 AM, Paolo Bonzini wrote:
 This adds the monitor commands that start the mirroring job.
 
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  blockdev.c   |  133 
 --
  hmp-commands.hx  |   21 +
  hmp.c|   28 
  hmp.h|1 +
  qapi-schema.json |   35 ++
  qmp-commands.hx  |   42 +
  trace-events |2 +-
  7 files changed, 258 insertions(+), 4 deletions(-)

This command only allows mirroring from the top or the complete disk,
but no intermediate points.  That is, given:
base - sn1 - sn2
I can mirror sn2 in isolation, or the entire chain including base, but I
cannot mirror exactly sn1+sn2.  Instead, I would have to do a mirror of
sn2 then follow up with a partial block-stream to rebase that mirror on
top of base.  I guess this is okay, but it feels a bit limited to have
to do it in two steps instead of one.

[Hmm, your later patches introduce the ability to have a persistent
mapping file; perhaps this can be exploited to have the user pre-create
a mapping file that shows the portions allocated by sn1+sn2 as the
starting point, but I'm not there in the patch series yet to know if
this is the case.]

 +++ b/qapi-schema.json
 @@ -1367,6 +1367,41 @@
'returns': 'str' }
  
  ##
 +# @drive-mirror
 +#
 +# Start mirroring a block device's writes to a new destination.
 +#
 +# @device:  the name of the device whose writes should be mirrored.
 +#
 +# @target: the target of the new image. If the file exists, or if it
 +#  is a device, the existing file/device will be used as the new
 +#  destination.  If it does not exist, a new file will be created.
 +#
 +# @format: #optional the format of the new destination, default is to
 +#  probe is @mode is 'existing', else the format of the source

s/probe is/probe if/

 +#
 +# @mode: #optional whether and how QEMU should create a new image, default is
 +#'absolute-paths'.
 +#
 +# @speed:  #optional the maximum speed, in bytes per second
 +#
 +# @sync: what parts of the disk image should be copied to the destination
 +#(all the disk, only the sectors allocated in the topmost image, or
 +#only new I/O).

Is there any reason 'sync' is listed last here, but before 'mode' in the
JSON?

 +#
 +# Returns: nothing on success
 +#  If @device is not a valid block device, DeviceNotFound
 +#  If @target can't be opened, OpenFileFailed
 +#  If @format is invalid, InvalidBlockFormat
 +#
 +# Since 1.2
 +##
 +{ 'command': 'drive-mirror',
 +  'data': { 'device': 'str', 'target': 'str', '*format': 'str',
 +'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
 +'*speed': 'int' } }
 +

 +drive-mirror
 +
 +
 +Start mirroring a block device's writes to a new destination. target
 +specifies the target of the new image. If the file exists, or if it is
 +a device, it will be used as the new destination for writes. If does not
 +exist, a new file will be created. format specifies the format of the
 +mirror image, default is to probe if mode='existing', else the format
 +of the source.
 +
 +Arguments:
 +
 +- device: device name to operate on (json-string)
 +- target: name of new image file (json-string)
 +- format: format of new image (json-string, optional)
 +- mode: how an image file should be created into the target
 +  file/device (NewImageMode, optional, default 'absolute-paths')

Should we call out all the possible 'NewImageMode' strings here,...

 +- speed: maximum speed of the streaming job, in bytes per second
 +  (json-int)

Mention that this is optional.

 +- sync: what parts of the disk image should be copied to the destination;
 +  possibilities include full for all the disk, top for only the sectors
 +  allocated in the topmost image, or none to only replicate new I/O
 +  (MirrorSyncMode).

...given that we did the same for all the possible MirrorSyncMode strings?

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command

2012-07-24 Thread Paolo Bonzini
This adds the monitor commands that start the mirroring job.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 blockdev.c   |  133 --
 hmp-commands.hx  |   21 +
 hmp.c|   28 
 hmp.h|1 +
 qapi-schema.json |   35 ++
 qmp-commands.hx  |   42 +
 trace-events |2 +-
 7 files changed, 258 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 192a9db..4b4574a 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -21,6 +21,8 @@
 #include trace.h
 #include arch_init.h
 
+static void block_job_cb(void *opaque, int ret);
+
 static QTAILQ_HEAD(drivelist, DriveInfo) drives = 
QTAILQ_HEAD_INITIALIZER(drives);
 
 static const char *const if_name[IF_COUNT] = {
@@ -825,6 +827,131 @@ exit:
 return;
 }
 
+void qmp_drive_mirror(const char *device, const char *target,
+  bool has_format, const char *format,
+  enum MirrorSyncMode sync,
+  bool has_mode, enum NewImageMode mode,
+  bool has_speed, int64_t speed, Error **errp)
+{
+BlockDriverInfo bdi;
+BlockDriverState *bs;
+BlockDriverState *source, *target_bs;
+BlockDriver *proto_drv;
+BlockDriver *drv = NULL;
+Error *local_err = NULL;
+int flags;
+uint64_t size;
+int ret;
+
+if (!has_speed) {
+speed = 0;
+}
+if (!has_mode) {
+mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS;
+}
+
+bs = bdrv_find(device);
+if (!bs) {
+error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+return;
+}
+
+if (!has_format) {
+format = mode == NEW_IMAGE_MODE_EXISTING ? NULL : bs-drv-format_name;
+}
+if (format) {
+drv = bdrv_find_format(format);
+if (!drv) {
+error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
+return;
+}
+}
+
+if (!bdrv_is_inserted(bs)) {
+error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
+return;
+}
+
+if (bdrv_in_use(bs)) {
+error_set(errp, QERR_DEVICE_IN_USE, device);
+return;
+}
+
+flags = bs-open_flags | BDRV_O_RDWR;
+source = bs-backing_hd;
+if (!source  sync == MIRROR_SYNC_MODE_TOP) {
+sync = MIRROR_SYNC_MODE_FULL;
+}
+
+proto_drv = bdrv_find_protocol(target);
+if (!proto_drv) {
+error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
+return;
+}
+
+if (sync == MIRROR_SYNC_MODE_FULL  mode != NEW_IMAGE_MODE_EXISTING) {
+/* create new image w/o backing file */
+assert(format  drv);
+bdrv_get_geometry(bs, size);
+size *= 512;
+ret = bdrv_img_create(target, format,
+  NULL, NULL, NULL, size, flags);
+} else {
+switch (mode) {
+case NEW_IMAGE_MODE_EXISTING:
+ret = 0;
+break;
+case NEW_IMAGE_MODE_ABSOLUTE_PATHS:
+/* create new image with backing file */
+ret = bdrv_img_create(target, format,
+  source-filename,
+  source-drv-format_name,
+  NULL, -1, flags);
+break;
+default:
+abort();
+}
+}
+
+if (ret) {
+error_set(errp, QERR_OPEN_FILE_FAILED, target);
+return;
+}
+
+target_bs = bdrv_new();
+ret = bdrv_open(target_bs, target, flags | BDRV_O_NO_BACKING, drv);
+
+if (ret  0) {
+bdrv_delete(target_bs);
+error_set(errp, QERR_OPEN_FILE_FAILED, target);
+return;
+}
+
+/* We need a backing file if we will copy parts of a cluster.  */
+if (bdrv_get_info(target_bs, bdi) = 0  bdi.cluster_size != 0 
+bdi.cluster_size = BDRV_SECTORS_PER_DIRTY_CHUNK * 512) {
+ret = bdrv_ensure_backing_file(target_bs);
+if (ret  0) {
+bdrv_delete(target_bs);
+error_set(errp, QERR_OPEN_FILE_FAILED, target);
+return;
+}
+}
+
+mirror_start(bs, target_bs, speed, sync, block_job_cb, bs, local_err);
+if (local_err != NULL) {
+bdrv_delete(target_bs);
+error_propagate(errp, local_err);
+return;
+}
+
+/* Grab a reference so hotplug does not delete the BlockDriverState from
+ * underneath us.
+ */
+drive_get_ref(drive_get_by_blockdev(bs));
+}
+
+
 
 static void eject_device(BlockDriverState *bs, int force, Error **errp)
 {
@@ -1049,12 +1176,12 @@ void qmp_block_resize(const char *device, int64_t size, 
Error **errp)
 }
 }
 
-static void block_stream_cb(void *opaque, int ret)
+static void block_job_cb(void *opaque, int ret)
 {
 BlockDriverState *bs = opaque;
 QObject *obj;
 
-trace_block_stream_cb(bs, bs-job, ret);
+trace_block_job_cb(bs, bs-job, ret);
 
 assert(bs-job);
 obj = qobject_from_block_job(bs-job);
@@ -1101,7 +1228,7 @@ void