Re: [Qemu-block] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present

2015-10-01 Thread Kevin Wolf
Am 30.09.2015 um 17:11 hat Jeff Cody geschrieben:
> On Mon, Sep 28, 2015 at 04:23:16PM +0100, Stefan Hajnoczi wrote:
> > On Sun, Sep 27, 2015 at 11:29:18PM -0400, Jeff Cody wrote:
> > > +if (s->zero_cycle) {
> > > +ret = bdrv_get_block_status(s->target, sector_num, nb_sectors, 
> > > );
> > > +if (!(ret & BDRV_BLOCK_ZERO)) {
> > > +bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> > > +  s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> > > +  mirror_write_complete, op);
> > 
> > mirror_write_complete will advance s->common.offset.  Won't the progress
> > be incorrect if we do that for both zeroing and regular mirroring?
> 
> Good point.  However, Is it really wrong to count it in the progress,
> if we do the zero mirror pass?  I

It's wrong as long as you increment the progress (offset), but don't
consider it in the expected value for completion (length).

Kevin



Re: [Qemu-block] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present

2015-10-01 Thread Jeff Cody
On Mon, Sep 28, 2015 at 04:23:16PM +0100, Stefan Hajnoczi wrote:
> On Sun, Sep 27, 2015 at 11:29:18PM -0400, Jeff Cody wrote:
> > +if (s->zero_cycle) {
> > +ret = bdrv_get_block_status(s->target, sector_num, nb_sectors, 
> > );
> > +if (!(ret & BDRV_BLOCK_ZERO)) {
> > +bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> > +  s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> > +  mirror_write_complete, op);
> 
> mirror_write_complete will advance s->common.offset.  Won't the progress
> be incorrect if we do that for both zeroing and regular mirroring?

Good point.  However, Is it really wrong to count it in the progress,
if we do the zero mirror pass?  I



Re: [Qemu-block] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present

2015-09-30 Thread Paolo Bonzini
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256



On 29/09/2015 10:39, Kevin Wolf wrote:
> bdrv_has_zero_init() takes care of that, in theory. The "problem"
> here is that the target is opened with BDRV_O_NO_BACKING, so the
> block layer doesn't consider this an image with a backing file.

I think bdrv_has_zero_init() is working right. If you read the qcow2
file as it was opened (i.e. with BDRV_O_NO_BACKING), unallocated areas
will indeed read as zeroes.

Of course if the file is opened with BDRV_O_NO_BACKING but does have a
backing file, you ought not to read unallocated areas at all.

So it's not the answer (of bdrv_has_zero_init) that is wrong, but the
question that was not well-specified.

> Is there anything better than bs->backing_hd that we could check?

It's simply sync == 'full', I think.  Then the problematic case never
even reaches bdrv_has_zero_init.

Paolo
-BEGIN PGP SIGNATURE-
Version: GnuPG v2

iQEcBAEBCAAGBQJWCqRqAAoJEL/70l94x66DgcUH/jN8VFkGpZxXS5b+TnU8BeGV
Xmr3AjqICYS4K1mKcuu20GKZO5QSTh4Z7p/Igo2KmiGqven2kT/NIvjPRlSv4tqZ
Vov6AaamO6OIme+nA0hYbc3ANUY+b/7CqL8tDb3rKzah0FeFMSi1x7Who7aOCTQs
IjsJ37/ay+mGmPR9akDAfqjJjGPBJFL9dxz/0pgdPDUyj7IwvyolgGZ49rGNzoHE
/86Dy23ET16HQHDOz3afsrLHf9gxGZFCMsLJostqH0cuMs2sk1qnY9i9xEXYUM00
XNoaVafwCeH1ypXHNcP+GWtbbHBaMJJtmoRFB72VDRPq39XvpzWhifbKK3+c2Qw=
=/2aT
-END PGP SIGNATURE-



Re: [Qemu-block] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present

2015-09-29 Thread Kevin Wolf
Am 28.09.2015 um 19:32 hat Max Reitz geschrieben:
> On 28.09.2015 05:29, Jeff Cody wrote:
> > During mirror, if the target device does not have support zero
> > initialization, a mirror may result in a corrupt image.
> > 
> > For instance, on mirror to a host device with format = raw, whatever
> > random data is on the target device will still be there for unallocated
> > sectors.
> > 
> > This is because during the mirror, we set the dirty bitmap to copy only
> > sectors allocated above 'base'.  In the case of target devices where we
> > cannot assume unallocated sectors will be read as zeroes, we need to
> > explicitely zero out this data.
> > 
> > In order to avoid zeroing out all sectors of the target device prior to
> > mirroring, we do zeroing as part of the block job.  A second dirty
> > bitmap cache is created, to track sectors that are unallocated above
> > 'base'.  These sectors are then checked for status of BDRV_BLOCK_ZERO
> > on the target - if they are not, then zeroes are explicitly written.
> > 
> > This only occurs under two conditions:
> > 
> > 1. 'mode' != "existing"
> > 2. bdrv_has_zero_init(target) == NULL
> > 
> > We perform the mirroring through mirror_iteration() as before, except
> > in two passes.  If the above two conditions are met, the first pass
> > is using the bitmap tracking unallocated sectors, to write the needed
> > zeroes.  Then, the second pass is performed, to mirror the actual data
> > as before.
> > 
> > If the above two conditions are not met, then the first pass is skipped,
> > and only the second pass (the one with the actual data) is performed.
> > 
> > Signed-off-by: Jeff Cody 

> > @@ -737,6 +787,7 @@ static void mirror_start_job(BlockDriverState *bs, 
> > BlockDriverState *target,
> >  return;
> >  }
> >  
> > +s->zero_unallocated = !existing && !bdrv_has_zero_init(target);
> 
> I think this should be set only if we're doing a full mirror operation.
> For instance, I could do a none, top or incremental mirror to a new
> qcow2 file, which would give it a backing file, obviously. You're lucky
> in that qcow2 claims to always have zero initialization, when this is in
> fact not true (someone's ought to fix that...): With a backing file, an
> overlay file just cannot have zero initialization, it's impossible
> (well, unless the backing file is completely zero).

bdrv_has_zero_init() takes care of that, in theory. The "problem" here
is that the target is opened with BDRV_O_NO_BACKING, so the block layer
doesn't consider this an image with a backing file.

Is there anything better than bs->backing_hd that we could check?

Kevin


pgp_RaNnppJDl.pgp
Description: PGP signature


Re: [Qemu-block] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present

2015-09-28 Thread Paolo Bonzini


On 28/09/2015 05:29, Jeff Cody wrote:
> This only occurs under two conditions:
> 
> 1. 'mode' != "existing"
> 2. bdrv_has_zero_init(target) == NULL
> 

I'm not sure if mode != "existing" actually matters.  I think what
actually matters is sync == "full".

The reasons are:

1) with sync != "full", unallocated target sectors should remain
unallocated on the destination because they are supposed to point to the
backing file.

2) even with mode == "existing" you expect the data to be consistent at
the end of the mirroring

Paolo



Re: [Qemu-block] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present

2015-09-28 Thread Kevin Wolf
Am 28.09.2015 um 05:29 hat Jeff Cody geschrieben:
> During mirror, if the target device does not have support zero
> initialization, a mirror may result in a corrupt image.

I think you want to check this sentence. ("During mirror [...], a
mirror may result [...]")

> For instance, on mirror to a host device with format = raw, whatever
> random data is on the target device will still be there for unallocated
> sectors.
> 
> This is because during the mirror, we set the dirty bitmap to copy only
> sectors allocated above 'base'.  In the case of target devices where we
> cannot assume unallocated sectors will be read as zeroes, we need to
> explicitely zero out this data.
> 
> In order to avoid zeroing out all sectors of the target device prior to
> mirroring, we do zeroing as part of the block job.  A second dirty
> bitmap cache is created, to track sectors that are unallocated above
> 'base'.  These sectors are then checked for status of BDRV_BLOCK_ZERO
> on the target - if they are not, then zeroes are explicitly written.

Why do you need a bitmap? You never change the bitmap after initialising
it, so couldn't you instead just check the allocation status when you
need it?

In fact, why do we need two passes? I would have expected that commit
dcfb3beb already does the trick, with checking allocation status and
writing zeroes during the normal single pass.

If that commit fails to solve the problem, I guess I first need to
understand why before I can continue reviewing this one...

> This only occurs under two conditions:
> 
> 1. 'mode' != "existing"
> 2. bdrv_has_zero_init(target) == NULL
> 
> We perform the mirroring through mirror_iteration() as before, except
> in two passes.  If the above two conditions are met, the first pass
> is using the bitmap tracking unallocated sectors, to write the needed
> zeroes.  Then, the second pass is performed, to mirror the actual data
> as before.
> 
> If the above two conditions are not met, then the first pass is skipped,
> and only the second pass (the one with the actual data) is performed.
> 
> Signed-off-by: Jeff Cody 
> ---
>  block/mirror.c| 109 
> ++
>  blockdev.c|   2 +-
>  include/block/block_int.h |   3 +-
>  qapi/block-core.json  |   6 ++-
>  4 files changed, 87 insertions(+), 33 deletions(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 405e5c4..b599176 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -50,7 +50,9 @@ typedef struct MirrorBlockJob {
>  int64_t bdev_length;
>  unsigned long *cow_bitmap;
>  BdrvDirtyBitmap *dirty_bitmap;
> -HBitmapIter hbi;
> +HBitmapIter zero_hbi;
> +HBitmapIter allocated_hbi;
> +HBitmapIter *hbi;
>  uint8_t *buf;
>  QSIMPLEQ_HEAD(, MirrorBuffer) buf_free;
>  int buf_free_count;
> @@ -60,6 +62,8 @@ typedef struct MirrorBlockJob {
>  int sectors_in_flight;
>  int ret;
>  bool unmap;
> +bool zero_unallocated;
> +bool zero_cycle;
>  bool waiting_for_io;
>  } MirrorBlockJob;
>  
> @@ -166,10 +170,10 @@ static uint64_t coroutine_fn 
> mirror_iteration(MirrorBlockJob *s)
>  int pnum;
>  int64_t ret;
>  
> -s->sector_num = hbitmap_iter_next(>hbi);
> +s->sector_num = hbitmap_iter_next(s->hbi);
>  if (s->sector_num < 0) {
> -bdrv_dirty_iter_init(s->dirty_bitmap, >hbi);
> -s->sector_num = hbitmap_iter_next(>hbi);
> +bdrv_dirty_iter_init(s->dirty_bitmap, s->hbi);
> +s->sector_num = hbitmap_iter_next(s->hbi);
>  trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
>  assert(s->sector_num >= 0);
>  }
> @@ -287,7 +291,7 @@ static uint64_t coroutine_fn 
> mirror_iteration(MirrorBlockJob *s)
>   */
>  if (next_sector > hbitmap_next_sector
>  && bdrv_get_dirty(source, s->dirty_bitmap, next_sector)) {
> -hbitmap_next_sector = hbitmap_iter_next(>hbi);
> +hbitmap_next_sector = hbitmap_iter_next(s->hbi);
>  }
>  
>  next_sector += sectors_per_chunk;
> @@ -300,25 +304,34 @@ static uint64_t coroutine_fn 
> mirror_iteration(MirrorBlockJob *s)
>  s->sectors_in_flight += nb_sectors;
>  trace_mirror_one_iteration(s, sector_num, nb_sectors);
>  
> -ret = bdrv_get_block_status_above(source, NULL, sector_num,
> -  nb_sectors, );
> -if (ret < 0 || pnum < nb_sectors ||
> -(ret & BDRV_BLOCK_DATA && !(ret & BDRV_BLOCK_ZERO))) {
> -bdrv_aio_readv(source, sector_num, >qiov, nb_sectors,
> -   mirror_read_complete, op);
> -} else if (ret & BDRV_BLOCK_ZERO) {
> -bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> -  s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> -  mirror_write_complete, op);
> +if (s->zero_cycle) {
> +ret = 

Re: [Qemu-block] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present

2015-09-28 Thread Kevin Wolf
Am 28.09.2015 um 05:29 hat Jeff Cody geschrieben:
> During mirror, if the target device does not have support zero
> initialization, a mirror may result in a corrupt image.
> 
> For instance, on mirror to a host device with format = raw, whatever
> random data is on the target device will still be there for unallocated
> sectors.
> 
> This is because during the mirror, we set the dirty bitmap to copy only
> sectors allocated above 'base'.  In the case of target devices where we
> cannot assume unallocated sectors will be read as zeroes, we need to
> explicitely zero out this data.
> 
> In order to avoid zeroing out all sectors of the target device prior to
> mirroring, we do zeroing as part of the block job.  A second dirty
> bitmap cache is created, to track sectors that are unallocated above
> 'base'.  These sectors are then checked for status of BDRV_BLOCK_ZERO
> on the target - if they are not, then zeroes are explicitly written.
> 
> This only occurs under two conditions:
> 
> 1. 'mode' != "existing"
> 2. bdrv_has_zero_init(target) == NULL
> 
> We perform the mirroring through mirror_iteration() as before, except
> in two passes.  If the above two conditions are met, the first pass
> is using the bitmap tracking unallocated sectors, to write the needed
> zeroes.  Then, the second pass is performed, to mirror the actual data
> as before.
> 
> If the above two conditions are not met, then the first pass is skipped,
> and only the second pass (the one with the actual data) is performed.
> 
> Signed-off-by: Jeff Cody 

Also, this makes qemu-iotests 097 fail for me.

Kevin



Re: [Qemu-block] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present

2015-09-28 Thread Stefan Hajnoczi
On Sun, Sep 27, 2015 at 11:29:18PM -0400, Jeff Cody wrote:
> +if (s->zero_cycle) {
> +ret = bdrv_get_block_status(s->target, sector_num, nb_sectors, 
> );
> +if (!(ret & BDRV_BLOCK_ZERO)) {
> +bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> +  s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> +  mirror_write_complete, op);

mirror_write_complete will advance s->common.offset.  Won't the progress
be incorrect if we do that for both zeroing and regular mirroring?



Re: [Qemu-block] [PATCH 3/3] block: mirror - zero unallocated target sectors when zero init not present

2015-09-28 Thread Max Reitz
On 28.09.2015 05:29, Jeff Cody wrote:
> During mirror, if the target device does not have support zero
> initialization, a mirror may result in a corrupt image.
> 
> For instance, on mirror to a host device with format = raw, whatever
> random data is on the target device will still be there for unallocated
> sectors.
> 
> This is because during the mirror, we set the dirty bitmap to copy only
> sectors allocated above 'base'.  In the case of target devices where we
> cannot assume unallocated sectors will be read as zeroes, we need to
> explicitely zero out this data.
> 
> In order to avoid zeroing out all sectors of the target device prior to
> mirroring, we do zeroing as part of the block job.  A second dirty
> bitmap cache is created, to track sectors that are unallocated above
> 'base'.  These sectors are then checked for status of BDRV_BLOCK_ZERO
> on the target - if they are not, then zeroes are explicitly written.
> 
> This only occurs under two conditions:
> 
> 1. 'mode' != "existing"
> 2. bdrv_has_zero_init(target) == NULL
> 
> We perform the mirroring through mirror_iteration() as before, except
> in two passes.  If the above two conditions are met, the first pass
> is using the bitmap tracking unallocated sectors, to write the needed
> zeroes.  Then, the second pass is performed, to mirror the actual data
> as before.
> 
> If the above two conditions are not met, then the first pass is skipped,
> and only the second pass (the one with the actual data) is performed.
> 
> Signed-off-by: Jeff Cody 
> ---
>  block/mirror.c| 109 
> ++
>  blockdev.c|   2 +-
>  include/block/block_int.h |   3 +-
>  qapi/block-core.json  |   6 ++-
>  4 files changed, 87 insertions(+), 33 deletions(-)
> 
> diff --git a/block/mirror.c b/block/mirror.c
> index 405e5c4..b599176 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -50,7 +50,9 @@ typedef struct MirrorBlockJob {
>  int64_t bdev_length;
>  unsigned long *cow_bitmap;
>  BdrvDirtyBitmap *dirty_bitmap;
> -HBitmapIter hbi;
> +HBitmapIter zero_hbi;
> +HBitmapIter allocated_hbi;
> +HBitmapIter *hbi;
>  uint8_t *buf;
>  QSIMPLEQ_HEAD(, MirrorBuffer) buf_free;
>  int buf_free_count;
> @@ -60,6 +62,8 @@ typedef struct MirrorBlockJob {
>  int sectors_in_flight;
>  int ret;
>  bool unmap;
> +bool zero_unallocated;
> +bool zero_cycle;
>  bool waiting_for_io;
>  } MirrorBlockJob;
>  
> @@ -166,10 +170,10 @@ static uint64_t coroutine_fn 
> mirror_iteration(MirrorBlockJob *s)
>  int pnum;
>  int64_t ret;
>  
> -s->sector_num = hbitmap_iter_next(>hbi);
> +s->sector_num = hbitmap_iter_next(s->hbi);
>  if (s->sector_num < 0) {
> -bdrv_dirty_iter_init(s->dirty_bitmap, >hbi);
> -s->sector_num = hbitmap_iter_next(>hbi);
> +bdrv_dirty_iter_init(s->dirty_bitmap, s->hbi);
> +s->sector_num = hbitmap_iter_next(s->hbi);
>  trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
>  assert(s->sector_num >= 0);
>  }
> @@ -287,7 +291,7 @@ static uint64_t coroutine_fn 
> mirror_iteration(MirrorBlockJob *s)
>   */
>  if (next_sector > hbitmap_next_sector
>  && bdrv_get_dirty(source, s->dirty_bitmap, next_sector)) {
> -hbitmap_next_sector = hbitmap_iter_next(>hbi);
> +hbitmap_next_sector = hbitmap_iter_next(s->hbi);
>  }
>  
>  next_sector += sectors_per_chunk;
> @@ -300,25 +304,34 @@ static uint64_t coroutine_fn 
> mirror_iteration(MirrorBlockJob *s)
>  s->sectors_in_flight += nb_sectors;
>  trace_mirror_one_iteration(s, sector_num, nb_sectors);
>  
> -ret = bdrv_get_block_status_above(source, NULL, sector_num,
> -  nb_sectors, );
> -if (ret < 0 || pnum < nb_sectors ||
> -(ret & BDRV_BLOCK_DATA && !(ret & BDRV_BLOCK_ZERO))) {
> -bdrv_aio_readv(source, sector_num, >qiov, nb_sectors,
> -   mirror_read_complete, op);
> -} else if (ret & BDRV_BLOCK_ZERO) {
> -bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> -  s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> -  mirror_write_complete, op);
> +if (s->zero_cycle) {
> +ret = bdrv_get_block_status(s->target, sector_num, nb_sectors, 
> );
> +if (!(ret & BDRV_BLOCK_ZERO)) {
> +bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
> +  s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
> +  mirror_write_complete, op);
> +}
>  } else {
> -assert(!(ret & BDRV_BLOCK_DATA));
> -bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
> - mirror_write_complete, op);
> +ret = bdrv_get_block_status_above(source, NULL, sector_num,
> +