Re: [PATCH v7 33/47] mirror: Deal with filters

2020-08-20 Thread Max Reitz
On 19.08.20 18:50, Kevin Wolf wrote:
> On 25.06.2020 at 17:22, Max Reitz wrote:
>> This includes some permission limiting (for example, we only need to
>> take the RESIZE permission for active commits where the base is smaller
>> than the top).
>>
>> Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to
>> "target_backing_bs", because that is what it really refers to.
>>
>> Signed-off-by: Max Reitz 
> 
>> @@ -1682,6 +1721,7 @@ static BlockJob *mirror_start_job(
>>  s->zero_target = zero_target;
>>  s->copy_mode = copy_mode;
>>  s->base = base;
>> +s->base_overlay = bdrv_find_overlay(bs, base);
>>  s->granularity = granularity;
>>  s->buf_size = ROUND_UP(buf_size, granularity);
>>  s->unmap = unmap;
> 
> Is this valid without freezing the links between base_overlay and base?

Er...

> Actually, I guess we should freeze everything between bs and base (for
> base != NULL) and it's a preexisting problem that just happens to affect
> this code, too.

Yes, that’s how it looks to me, too.  I don’t think that has anything to
do with this patch.

> Or maybe freezing everything is too much. We only want to make sure that
> no non-filter is inserted between base and base_overlay and that base
> (and now base_overlay) always stay in the backing chain of bs. But what
> options apart from freezing do we have to achieve this?

I don’t know of any, and I don’t know whether anyone would actually care
if we were to just freeze everything.

> Why is using base_overlay even better than using base? Assuming there is
> a good reason, maybe the commit message could spell it out.

The problem is that querying the block status for a filter node falls
through to the underlying data-carrying node.  So if there’s a filter on
top of @base, and we query for is_allocated_above above @base, then
we’ll include @base, which we do not want.
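
To make the fall-through concrete, here is a toy model of it.  It is not
QEMU code: ToyNode, toy_is_allocated() and toy_is_allocated_above() are
made-up stand-ins that only mimic the behaviour described above for a
single cluster, under the assumption that a filter simply reports its
child's status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a graph node; NOT QEMU's BlockDriverState. */
typedef struct ToyNode {
    bool is_filter;        /* filters carry no data of their own */
    bool allocated;        /* does this node itself hold the cluster? */
    struct ToyNode *child; /* filtered child or COW backing node */
} ToyNode;

/* Status of one node: a filter falls through to the data-carrying node. */
static bool toy_is_allocated(const ToyNode *n)
{
    while (n != NULL && n->is_filter) {
        n = n->child;
    }
    return n != NULL && n->allocated;
}

/*
 * Is the cluster allocated in [top, stop), or in [top, stop] when
 * include_stop is set?  Walks the chain node by node.
 */
static bool toy_is_allocated_above(const ToyNode *top, const ToyNode *stop,
                                   bool include_stop)
{
    assert(top != NULL);
    for (const ToyNode *n = top; n != NULL; n = n->child) {
        if (n == stop && !include_stop) {
            break;
        }
        if (toy_is_allocated(n)) {
            return true;
        }
        if (n == stop) {
            break;
        }
    }
    return false;
}
```

With a chain top -> filter -> base where only base holds the cluster,
querying "above base" (exclusive) reports the cluster as allocated because
the filter answers for base.  Querying up to and including the overlay
(top, the first non-filter above base) does not.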

Max





Re: [PATCH v7 33/47] mirror: Deal with filters

2020-08-19 Thread Kevin Wolf
On 25.06.2020 at 17:22, Max Reitz wrote:
> This includes some permission limiting (for example, we only need to
> take the RESIZE permission for active commits where the base is smaller
> than the top).
> 
> Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to
> "target_backing_bs", because that is what it really refers to.
> 
> Signed-off-by: Max Reitz 

> @@ -1682,6 +1721,7 @@ static BlockJob *mirror_start_job(
>  s->zero_target = zero_target;
>  s->copy_mode = copy_mode;
>  s->base = base;
> +s->base_overlay = bdrv_find_overlay(bs, base);
>  s->granularity = granularity;
>  s->buf_size = ROUND_UP(buf_size, granularity);
>  s->unmap = unmap;

Is this valid without freezing the links between base_overlay and base?

Actually, I guess we should freeze everything between bs and base (for
base != NULL) and it's a preexisting problem that just happens to affect
this code, too.

Or maybe freezing everything is too much. We only want to make sure that
no non-filter is inserted between base and base_overlay and that base
(and now base_overlay) always stay in the backing chain of bs. But what
options apart from freezing do we have to achieve this?

Why is using base_overlay even better than using base? Assuming there is
a good reason, maybe the commit message could spell it out.

Kevin




Re: [PATCH v7 33/47] mirror: Deal with filters

2020-07-24 Thread Andrey Shinkevich

On 24.07.2020 12:49, Max Reitz wrote:

On 22.07.20 20:31, Andrey Shinkevich wrote:

On 25.06.2020 18:22, Max Reitz wrote:

This includes some permission limiting (for example, we only need to
take the RESIZE permission for active commits where the base is smaller
than the top).

Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to
"target_backing_bs", because that is what it really refers to.

Signed-off-by: Max Reitz 
---
   qapi/block-core.json |   6 ++-
   block/mirror.c   | 118 +--
   blockdev.c   |  36 +
   3 files changed, 121 insertions(+), 39 deletions(-)


...

diff --git a/block/mirror.c b/block/mirror.c
index 469acf4600..770de3b34e 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -42,6 +42,7 @@ typedef struct MirrorBlockJob {
   BlockBackend *target;
   BlockDriverState *mirror_top_bs;
   BlockDriverState *base;
+    BlockDriverState *base_overlay;
     /* The name of the graph node to replace */
   char *replaces;
@@ -677,8 +678,10 @@ static int mirror_exit_common(Job *job)
    &error_abort);
   if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
   BlockDriverState *backing = s->is_none_mode ? src : s->base;
-    if (backing_bs(target_bs) != backing) {
-    bdrv_set_backing_hd(target_bs, backing, &local_err);
+    BlockDriverState *unfiltered_target =
bdrv_skip_filters(target_bs);
+
+    if (bdrv_cow_bs(unfiltered_target) != backing) {


I just worry about a filter node of the concurrent job right below the
unfiltered_target.

Having a concurrent job on the target sounds extremely problematic in
itself (because at least for most of the mirror job, the target isn’t in
a consistent state).  Is that a real use case?



It might happen in TestParallelOps of iotests #30, but I am not sure right
now. I am going to apply my series with the copy-on-read filter for the
stream job on top of this one and will see then.


Andrey





The filter has unfiltered_target in its parent list.
Will that filter node be replaced correctly then?

I’m also not quite sure what you mean.  We need to attach the source’s
backing chain to the target here, so we go down to the first node that
might accept COW backing files (by invoking bdrv_skip_filters()).  That
should be correct no matter what kind of filters are on it.



I meant the case where a filter is removed with bdrv_replace_node()
afterwards. As I mentioned above, I am going to test that case later.


Andrey



+    /*
+ * The topmost node with
+ * bdrv_skip_filters(filtered_target) ==
bdrv_skip_filters(target)
+ */
+    filtered_target = bdrv_cow_bs(bdrv_find_overlay(bs, target));
+
+    assert(bdrv_skip_filters(filtered_target) ==
+   bdrv_skip_filters(target));
+
+    /*
+ * XXX BLK_PERM_WRITE needs to be allowed so we don't block
+ * ourselves at s->base (if writes are blocked for a node,
they are
+ * also blocked for its backing file). The other options
would be a
+ * second filter driver above s->base (== target).
+ */
+    iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
+
+    for (iter = bdrv_filter_or_cow_bs(bs); iter != target;
+ iter = bdrv_filter_or_cow_bs(iter))
+    {
+    if (iter == filtered_target) {


For one filter node only?

No, iter_shared_perms is never reset, so it retains the
BLK_PERM_CONSISTENT_READ flag until the end of the loop.



Yes, that's right. Clear.

Andrey





+    /*
+ * From here on, all nodes are filters on the base.
+ * This allows us to share BLK_PERM_CONSISTENT_READ.
+ */
+    iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
+    }
+
   ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
- BLK_PERM_WRITE_UNCHANGED |
BLK_PERM_WRITE,
- errp);
+ iter_shared_perms, errp);
   if (ret < 0) {
   goto fail;
   }

...

@@ -3042,6 +3053,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error
**errp)
    " named node of the graph");
   goto out;
   }
+    replaces_node_name = arg->replaces;


What is the idea behind the variable substitution?

Looks like a remnant from v6, where there was an

if (arg->has_replaces) {
 ...
 replaces_node_name = arg->replaces;
} else if (unfiltered_bs != bs) {
 replaces_node_name = unfiltered_bs->node_name;
}

But I moved that logic to blockdev_mirror_common() in this version.

So it’s just useless now and replaces_node_name shouldn’t exist.

Max





Re: [PATCH v7 33/47] mirror: Deal with filters

2020-07-24 Thread Max Reitz
On 22.07.20 20:31, Andrey Shinkevich wrote:
> On 25.06.2020 18:22, Max Reitz wrote:
>> This includes some permission limiting (for example, we only need to
>> take the RESIZE permission for active commits where the base is smaller
>> than the top).
>>
>> Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to
>> "target_backing_bs", because that is what it really refers to.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>   qapi/block-core.json |   6 ++-
>>   block/mirror.c   | 118 +--
>>   blockdev.c   |  36 +
>>   3 files changed, 121 insertions(+), 39 deletions(-)
>>
> ...
>> diff --git a/block/mirror.c b/block/mirror.c
>> index 469acf4600..770de3b34e 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -42,6 +42,7 @@ typedef struct MirrorBlockJob {
>>   BlockBackend *target;
>>   BlockDriverState *mirror_top_bs;
>>   BlockDriverState *base;
>> +    BlockDriverState *base_overlay;
>>     /* The name of the graph node to replace */
>>   char *replaces;
>> @@ -677,8 +678,10 @@ static int mirror_exit_common(Job *job)
>>    &error_abort);
>>   if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
>>   BlockDriverState *backing = s->is_none_mode ? src : s->base;
>> -    if (backing_bs(target_bs) != backing) {
>> -    bdrv_set_backing_hd(target_bs, backing, &local_err);
>> +    BlockDriverState *unfiltered_target =
>> bdrv_skip_filters(target_bs);
>> +
>> +    if (bdrv_cow_bs(unfiltered_target) != backing) {
> 
> 
> I just worry about a filter node of the concurrent job right below the
> unfiltered_target.

Having a concurrent job on the target sounds extremely problematic in
itself (because at least for most of the mirror job, the target isn’t in
a consistent state).  Is that a real use case?

> The filter has unfiltered_target in its parent list.
> Will that filter node be replaced correctly then?

I’m also not quite sure what you mean.  We need to attach the source’s
backing chain to the target here, so we go down to the first node that
might accept COW backing files (by invoking bdrv_skip_filters()).  That
should be correct no matter what kind of filters are on it.
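
As a rough illustration of "go down to the first node that might accept
COW backing files", here is a toy sketch.  It is not QEMU code:
ChainNode, chain_skip_filters() and chain_attach_backing() are
hypothetical names that only model the idea of skipping filters before
touching the backing link.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a graph node; NOT QEMU's BlockDriverState. */
typedef struct ChainNode {
    bool is_filter;          /* filters only pass requests through */
    struct ChainNode *child; /* filtered child or COW backing node */
} ChainNode;

/* Return the first non-filter node at or below n (NULL if there is none). */
static ChainNode *chain_skip_filters(ChainNode *n)
{
    while (n != NULL && n->is_filter) {
        n = n->child;
    }
    return n;
}

/*
 * Attach backing below the first non-filter node of target's chain and
 * return the node that received it.  Filters above it stay untouched.
 */
static ChainNode *chain_attach_backing(ChainNode *target, ChainNode *backing)
{
    ChainNode *unfiltered = chain_skip_filters(target);

    assert(unfiltered != NULL);
    if (unfiltered->child != backing) {
        unfiltered->child = backing;
    }
    return unfiltered;
}
```

In this model the number and kind of filters on top of the target make no
difference: the backing link is always changed on the image node below
them, and the filters' own child links are left alone.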
>> +    /*
>> + * The topmost node with
>> + * bdrv_skip_filters(filtered_target) ==
>> bdrv_skip_filters(target)
>> + */
>> +    filtered_target = bdrv_cow_bs(bdrv_find_overlay(bs, target));
>> +
>> +    assert(bdrv_skip_filters(filtered_target) ==
>> +   bdrv_skip_filters(target));
>> +
>> +    /*
>> + * XXX BLK_PERM_WRITE needs to be allowed so we don't block
>> + * ourselves at s->base (if writes are blocked for a node,
>> they are
>> + * also blocked for its backing file). The other options
>> would be a
>> + * second filter driver above s->base (== target).
>> + */
>> +    iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
>> +
>> +    for (iter = bdrv_filter_or_cow_bs(bs); iter != target;
>> + iter = bdrv_filter_or_cow_bs(iter))
>> +    {
>> +    if (iter == filtered_target) {
> 
> 
> For one filter node only?

No, iter_shared_perms is never reset, so it retains the
BLK_PERM_CONSISTENT_READ flag until the end of the loop.
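
A toy sketch of that accumulation, for illustration only: the TOY_PERM_*
constants and toy_collect_shared_perms() are made up here and merely
stand in for the BLK_PERM_* flags and the loop in the patch.  Nodes are
represented by their index in the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values; stand-ins for the BLK_PERM_* flags. */
enum {
    TOY_PERM_CONSISTENT_READ = 1u << 0,
    TOY_PERM_WRITE           = 1u << 1,
    TOY_PERM_WRITE_UNCHANGED = 1u << 2,
};

/*
 * Record the shared permissions used for each of n_nodes intermediate
 * nodes.  The accumulator is set once before the loop and only ever
 * gains bits, so every node from filtered_target_idx onwards shares
 * CONSISTENT_READ, not just the one node where the flag is added.
 */
static void toy_collect_shared_perms(size_t n_nodes,
                                     size_t filtered_target_idx,
                                     uint32_t *out)
{
    uint32_t shared = TOY_PERM_WRITE_UNCHANGED | TOY_PERM_WRITE;

    assert(out != NULL);
    for (size_t i = 0; i < n_nodes; i++) {
        if (i == filtered_target_idx) {
            shared |= TOY_PERM_CONSISTENT_READ;
        }
        out[i] = shared;
    }
}
```
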

>> +    /*
>> + * From here on, all nodes are filters on the base.
>> + * This allows us to share BLK_PERM_CONSISTENT_READ.
>> + */
>> +    iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
>> +    }
>> +
>>   ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
>> - BLK_PERM_WRITE_UNCHANGED |
>> BLK_PERM_WRITE,
>> - errp);
>> + iter_shared_perms, errp);
>>   if (ret < 0) {
>>   goto fail;
>>   }
> ...
>> @@ -3042,6 +3053,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error
>> **errp)
>>    " named node of the graph");
>>   goto out;
>>   }
>> +    replaces_node_name = arg->replaces;
> 
> 
> What is the idea behind the variable substitution?

Looks like a remnant from v6, where there was an

if (arg->has_replaces) {
...
replaces_node_name = arg->replaces;
} else if (unfiltered_bs != bs) {
replaces_node_name = unfiltered_bs->node_name;
}

But I moved that logic to blockdev_mirror_common() in this version.

So it’s just useless now and replaces_node_name shouldn’t exist.

Max





Re: [PATCH v7 33/47] mirror: Deal with filters

2020-07-22 Thread Andrey Shinkevich

On 25.06.2020 18:22, Max Reitz wrote:

This includes some permission limiting (for example, we only need to
take the RESIZE permission for active commits where the base is smaller
than the top).

Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to
"target_backing_bs", because that is what it really refers to.

Signed-off-by: Max Reitz 
---
  qapi/block-core.json |   6 ++-
  block/mirror.c   | 118 +--
  blockdev.c   |  36 +
  3 files changed, 121 insertions(+), 39 deletions(-)


...

diff --git a/block/mirror.c b/block/mirror.c
index 469acf4600..770de3b34e 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -42,6 +42,7 @@ typedef struct MirrorBlockJob {
  BlockBackend *target;
  BlockDriverState *mirror_top_bs;
  BlockDriverState *base;
+BlockDriverState *base_overlay;
  
  /* The name of the graph node to replace */

  char *replaces;
@@ -677,8 +678,10 @@ static int mirror_exit_common(Job *job)
   &error_abort);
  if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
  BlockDriverState *backing = s->is_none_mode ? src : s->base;
-if (backing_bs(target_bs) != backing) {
-bdrv_set_backing_hd(target_bs, backing, &local_err);
+BlockDriverState *unfiltered_target = bdrv_skip_filters(target_bs);
+
+if (bdrv_cow_bs(unfiltered_target) != backing) {



I just worry about a filter node of the concurrent job right below the 
unfiltered_target. The filter has unfiltered_target in its parent list. 
Will that filter node be replaced correctly then?



Andrey

...


+/*
+ * The topmost node with
+ * bdrv_skip_filters(filtered_target) == bdrv_skip_filters(target)
+ */
+filtered_target = bdrv_cow_bs(bdrv_find_overlay(bs, target));
+
+assert(bdrv_skip_filters(filtered_target) ==
+   bdrv_skip_filters(target));
+
+/*
+ * XXX BLK_PERM_WRITE needs to be allowed so we don't block
+ * ourselves at s->base (if writes are blocked for a node, they are
+ * also blocked for its backing file). The other options would be a
+ * second filter driver above s->base (== target).
+ */
+iter_shared_perms = BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE;
+
+for (iter = bdrv_filter_or_cow_bs(bs); iter != target;
+ iter = bdrv_filter_or_cow_bs(iter))
+{
+if (iter == filtered_target) {



For one filter node only?



+/*
+ * From here on, all nodes are filters on the base.
+ * This allows us to share BLK_PERM_CONSISTENT_READ.
+ */
+iter_shared_perms |= BLK_PERM_CONSISTENT_READ;
+}
+
  ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
- BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE,
- errp);
+ iter_shared_perms, errp);
  if (ret < 0) {
  goto fail;
  }

...

@@ -3042,6 +3053,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
   " named node of the graph");
  goto out;
  }
+replaces_node_name = arg->replaces;



What is the idea behind the variable substitution?

Perhaps this change could be split out into a separate patch.

Andrey





[PATCH v7 33/47] mirror: Deal with filters

2020-06-25 Thread Max Reitz
This includes some permission limiting (for example, we only need to
take the RESIZE permission for active commits where the base is smaller
than the top).

Use this opportunity to rename qmp_drive_mirror()'s "source" BDS to
"target_backing_bs", because that is what it really refers to.

Signed-off-by: Max Reitz 
---
 qapi/block-core.json |   6 ++-
 block/mirror.c   | 118 +--
 blockdev.c   |  36 +
 3 files changed, 121 insertions(+), 39 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index df87855429..0b8ccd30aa 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1943,7 +1943,8 @@
 #
 # @replaces: with sync=full graph node name to be replaced by the new
 #image when a whole image copy is done. This can be used to repair
-#broken Quorum files. (Since 2.1)
+#broken Quorum files.  By default, @device is replaced, although
+#implicitly created filters on it are kept. (Since 2.1)
 #
 # @mode: whether and how QEMU should create a new image, default is
 #'absolute-paths'.
@@ -2254,7 +2255,8 @@
 #
 # @replaces: with sync=full graph node name to be replaced by the new
 #image when a whole image copy is done. This can be used to repair
-#broken Quorum files.
+#broken Quorum files.  By default, @device is replaced, although
+#implicitly created filters on it are kept.
 #
 # @speed:  the maximum speed, in bytes per second
 #
diff --git a/block/mirror.c b/block/mirror.c
index 469acf4600..770de3b34e 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -42,6 +42,7 @@ typedef struct MirrorBlockJob {
 BlockBackend *target;
 BlockDriverState *mirror_top_bs;
 BlockDriverState *base;
+BlockDriverState *base_overlay;
 
 /* The name of the graph node to replace */
 char *replaces;
@@ -677,8 +678,10 @@ static int mirror_exit_common(Job *job)
  &error_abort);
 if (!abort && s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
 BlockDriverState *backing = s->is_none_mode ? src : s->base;
-if (backing_bs(target_bs) != backing) {
-bdrv_set_backing_hd(target_bs, backing, &local_err);
+BlockDriverState *unfiltered_target = bdrv_skip_filters(target_bs);
+
+if (bdrv_cow_bs(unfiltered_target) != backing) {
+bdrv_set_backing_hd(unfiltered_target, backing, &local_err);
 if (local_err) {
 error_report_err(local_err);
 local_err = NULL;
@@ -740,7 +743,7 @@ static int mirror_exit_common(Job *job)
  * valid.
  */
 block_job_remove_all_bdrv(bjob);
-bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
+bdrv_replace_node(mirror_top_bs, mirror_top_bs->backing->bs, &error_abort);
 
 /* We just changed the BDS the job BB refers to (with either or both of the
  * bdrv_replace_node() calls), so switch the BB back so the cleanup does
@@ -786,7 +789,6 @@ static void coroutine_fn mirror_throttle(MirrorBlockJob *s)
 static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 {
 int64_t offset;
-BlockDriverState *base = s->base;
 BlockDriverState *bs = s->mirror_top_bs->backing->bs;
 BlockDriverState *target_bs = blk_bs(s->target);
 int ret;
@@ -837,7 +839,8 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 return 0;
 }
 
-ret = bdrv_is_allocated_above(bs, base, false, offset, bytes, &count);
+ret = bdrv_is_allocated_above(bs, s->base_overlay, true, offset, bytes,
+  &count);
 if (ret < 0) {
 return ret;
 }
@@ -936,7 +939,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
 } else {
 s->target_cluster_size = BDRV_SECTOR_SIZE;
 }
-if (backing_filename[0] && !target_bs->backing &&
+if (backing_filename[0] && !bdrv_backing_chain_next(target_bs) &&
 s->granularity < s->target_cluster_size) {
 s->buf_size = MAX(s->buf_size, s->target_cluster_size);
 s->cow_bitmap = bitmap_new(length);
@@ -1116,8 +1119,9 @@ static void mirror_complete(Job *job, Error **errp)
 if (s->backing_mode == MIRROR_OPEN_BACKING_CHAIN) {
 int ret;
 
-assert(!target->backing);
-ret = bdrv_open_backing_file(target, NULL, "backing", errp);
+assert(!bdrv_backing_chain_next(target));
+ret = bdrv_open_backing_file(bdrv_skip_filters(target), NULL,
+ "backing", errp);
 if (ret < 0) {
 return;
 }
@@ -1565,8 +1569,8 @@ static BlockJob *mirror_start_job(
 MirrorBlockJob *s;
 MirrorBDSOpaque *bs_opaque;
 BlockDriverState *mirror_top_bs;
-bool target_graph_mod;
 bool target_is_backing;
+uint64_t target_perms, target_shared_perms;
 Error *local_err = NULL;
 int ret;
 
@@ -1585,7 +1589,7 @@ static BlockJob