On 27.11.2020 at 15:45, Vladimir Sementsov-Ogievskiy wrote:
> Rewrite bdrv_check_perm(), bdrv_abort_perm_update() and bdrv_set_perm()
> to update nodes in topological sort order instead of simple DFS. With
> topologically sorted nodes, we update a node only when all its parents
> already updated. With DFS it's not so.
>
> Consider the following example:
>
>     A -+
>     |  |
>     |  v
>     |  B
>     |  |
>     v  |
>     C<-+
>
> A is parent for B and C, B is parent for C.
>
> Obviously, to update permissions, we should go in order A B C, so, when
> we update C, all parent permissions already updated.

I wondered for a moment why this order is obvious. Taking a permission
on A may mean that we need to take the permission on C, too.

The answer is (or so I think) that the whole operation is atomic so the
half-updated state will never be visible to a caller, but this is about
calculating the right permissions. Permissions a node needs on its
children may depend on what its parents requested, but parent
permissions never depend on what children request.

Ok, makes sense.

> But with current
> approach (simple recursion) we can update in sequence A C B C (C is
> updated twice). On first update of C, we consider old B permissions, so
> doing wrong thing. If it succeed, all is OK, on second C update we will
> finish with correct graph. But if the wrong thing failed, we break the
> whole process for no reason (it's possible that updated B permission
> will be less strict, but we will never check it).
>
> Also new approach gives a way to simultaneously and correctly update
> several nodes, we just need to run bdrv_topological_dfs() several times
> to add all nodes and their subtrees into one topologically sorted list
> (next patch will update bdrv_replace_node() in this manner).
>
> Test test_parallel_perm_update() is now passing, so move it out of
> debugging "if".
>
> We also need to support ignore_children in
> bdrv_check_parents_compliance().
>
> For test 283 order of parents compliance check is changed.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
> ---
>  block.c                     | 103 +++++++++++++++++++++++++++++-------
>  tests/test-bdrv-graph-mod.c |   4 +-
>  tests/qemu-iotests/283.out  |   2 +-
>  3 files changed, 86 insertions(+), 23 deletions(-)
>
> diff --git a/block.c b/block.c
> index 92bfcbedc9..81ccf51605 100644
> --- a/block.c
> +++ b/block.c
> @@ -1994,7 +1994,9 @@ static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, Error **errp)
>      return false;
>  }
>
> -static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
> +static bool bdrv_check_parents_compliance(BlockDriverState *bs,
> +                                          GSList *ignore_children,
> +                                          Error **errp)
>  {
>      BdrvChild *a, *b;
>
> @@ -2005,7 +2007,9 @@ static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
>       */
>      QLIST_FOREACH(a, &bs->parents, next_parent) {
>          QLIST_FOREACH(b, &bs->parents, next_parent) {
> -            if (a == b) {
> +            if (a == b || g_slist_find(ignore_children, a) ||
> +                g_slist_find(ignore_children, b))

'a' should be checked in the outer loop, no reason to repeat the same
check all the time in the inner loop.

> +            {
>                  continue;
>              }
>
> @@ -2034,6 +2038,29 @@ static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
>      }
>  }
>
> +static GSList *bdrv_topological_dfs(GSList *list, GHashTable *found,
> +                                    BlockDriverState *bs)

It would be good to have a comment that explains the details of the
contract. In particular, this seems to require that @list is already
topologically sorted, and it's complete in the sense that if a node is
in the list, all of its children are in the list, too.
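
To make that contract concrete, here is a minimal standalone sketch (not
QEMU code) of the same prepend-after-children DFS, run on the A/B/C graph
from the commit message. ToyNode, ToyList and the toy_* functions are
hypothetical stand-ins, and a per-node flag replaces the GHashTable. It
suggests why a later call can only extend a list that is already sorted
and complete: new nodes are prepended, and any node already marked as
found is skipped together with its subtree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_NODES 8
#define TOY_MAX_CHILDREN 4

/* Hypothetical stand-in for BlockDriverState; not a QEMU type. */
typedef struct ToyNode {
    char name;
    struct ToyNode *children[TOY_MAX_CHILDREN];
    size_t nb_children;
    int found;              /* plays the role of the @found hash table */
} ToyNode;

/* Hypothetical stand-in for the GSList of nodes. */
typedef struct ToyList {
    ToyNode *nodes[TOY_MAX_NODES];
    size_t len;
} ToyList;

/* Prepend like g_slist_prepend(): the newest entry becomes the head. */
static void toy_list_prepend(ToyList *list, ToyNode *node)
{
    assert(list->len < TOY_MAX_NODES);
    memmove(&list->nodes[1], &list->nodes[0],
            list->len * sizeof(list->nodes[0]));
    list->nodes[0] = node;
    list->len++;
}

/*
 * Add @node and its not-yet-found descendants to @list. A node is
 * prepended only after all of its children, so it ends up in front of
 * them. Assumes @list is already sorted and contains the whole subtree
 * of every node that is marked as found.
 */
static void toy_topological_dfs(ToyList *list, ToyNode *node)
{
    size_t i;

    if (node->found) {
        return;
    }
    node->found = 1;

    for (i = 0; i < node->nb_children; i++) {
        toy_topological_dfs(list, node->children[i]);
    }
    toy_list_prepend(list, node);
}

/*
 * Build the example graph (A -> C, A -> B, B -> C), optionally add a
 * second root D -> B, and write the resulting order into @out, which
 * must hold at least 5 bytes.
 */
void toy_example_order(char *out, int add_second_root)
{
    ToyNode c = { .name = 'C' };
    ToyNode b = { .name = 'B', .children = { &c }, .nb_children = 1 };
    /* C is listed before B: the order where simple recursion goes wrong */
    ToyNode a = { .name = 'A', .children = { &c, &b }, .nb_children = 2 };
    ToyNode d = { .name = 'D', .children = { &b }, .nb_children = 1 };
    ToyList list = { .len = 0 };
    size_t i;

    toy_topological_dfs(&list, &a);
    if (add_second_root) {
        /* Second call on the same list: B and C are skipped as found */
        toy_topological_dfs(&list, &d);
    }

    for (i = 0; i < list.len; i++) {
        out[i] = list.nodes[i]->name;
    }
    out[list.len] = '\0';
}
```

Even with A's children visited in the order C, B, the list comes out as
A B C. Adding the second root D above B extends it to D A B C, which is
still a valid topological order.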

> +{
> +    BdrvChild *child;
> +    g_autoptr(GHashTable) local_found = NULL;
> +
> +    if (!found) {
> +        assert(!list);
> +        found = local_found = g_hash_table_new(NULL, NULL);
> +    }
> +
> +    if (g_hash_table_contains(found, bs)) {
> +        return list;
> +    }
> +    g_hash_table_add(found, bs);
> +
> +    QLIST_FOREACH(child, &bs->children, next) {
> +        list = bdrv_topological_dfs(list, found, child->bs);
> +    }
> +
> +    return g_slist_prepend(list, bs);
> +}
> +
>  static void bdrv_child_set_perm_commit(void *opaque)
>  {
>      BdrvChild *c = opaque;
> @@ -2098,10 +2125,10 @@ static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
>   * A call to this function must always be followed by a call to bdrv_set_perm()
>   * or bdrv_abort_perm_update().
>   */

One big source of confusion for me when trying to understand this was
that bdrv_check_perm() is a misnomer since commit f962e96150e and the
above comment isn't really accurate any more.

The function doesn't only check the validity of the new permissions in
advance of actually making the change, but it already updates the
permissions of all child nodes (though not of its root node).

So we have gone from the original check/set/abort model (which the
function names still suggest) to a prepare/commit/rollback model.

I think some comment updates are in order, and possibly we should rename
some functions, too.
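
To illustrate what I mean by prepare/commit/rollback, here is a rough toy
model (not QEMU code, and not a claim about how BdrvChild stores its
state). ToyEdge, the toy_* functions and the "allowed" bit mask are all
hypothetical. The point is only that prepare already writes the new
value, so later checks see it, while rollback can still restore the old
one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one parent/child edge; not a QEMU type. */
typedef struct ToyEdge {
    unsigned perm;          /* value that later checks will see */
    unsigned backup_perm;   /* old value, kept for rollback */
    int has_backup;
} ToyEdge;

/* Prepare: apply the new value right away, remembering the old one. */
static void toy_prepare(ToyEdge *e, unsigned new_perm)
{
    if (!e->has_backup) {
        e->backup_perm = e->perm;
        e->has_backup = 1;
    }
    e->perm = new_perm;
}

/* Commit: the new value stays, the backup is dropped. */
static void toy_commit(ToyEdge *e)
{
    e->has_backup = 0;
}

/* Rollback: restore the value from before the first prepare. */
static void toy_rollback(ToyEdge *e)
{
    if (e->has_backup) {
        e->perm = e->backup_perm;
        e->has_backup = 0;
    }
}

/*
 * Update @n edges to @new_perms. An update fails if it requests bits
 * outside @allowed (a made-up rule for this sketch). On failure every
 * edge prepared so far is rolled back and -1 is returned; on success
 * all edges are committed and 0 is returned.
 */
int toy_update_all(ToyEdge *edges, const unsigned *new_perms, size_t n,
                   unsigned allowed)
{
    size_t i, j;

    for (i = 0; i < n; i++) {
        toy_prepare(&edges[i], new_perms[i]);
        if (new_perms[i] & ~allowed) {
            for (j = 0; j <= i; j++) {
                toy_rollback(&edges[j]);
            }
            return -1;
        }
    }

    for (i = 0; i < n; i++) {
        toy_commit(&edges[i]);
    }
    return 0;
}
```

In this model a failed update leaves every edge with its old value, even
though the earlier edges had already been changed when the failure was
detected.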

> -static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> -                           uint64_t cumulative_perms,
> -                           uint64_t cumulative_shared_perms,
> -                           GSList *ignore_children, Error **errp)
> +static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> +                                uint64_t cumulative_perms,
> +                                uint64_t cumulative_shared_perms,
> +                                GSList *ignore_children, Error **errp)
>  {
>      BlockDriver *drv = bs->drv;
>      BdrvChild *c;
> @@ -2166,21 +2193,43 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>      /* Check all children */
>      QLIST_FOREACH(c, &bs->children, next) {
>          uint64_t cur_perm, cur_shared;
> -        GSList *cur_ignore_children;
>
>          bdrv_child_perm(bs, c->bs, c, c->role, q,
>                          cumulative_perms, cumulative_shared_perms,
>                          &cur_perm, &cur_shared);
> +        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);

This "added" line is actually old code. What is removed here is the
recursive call of bdrv_check_update_perm(). This is what the code below
will have to replace.

> +    }
> +
> +    return 0;
> +}
> +
> +static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> +                           uint64_t cumulative_perms,
> +                           uint64_t cumulative_shared_perms,
> +                           GSList *ignore_children, Error **errp)
> +{
> +    int ret;
> +    BlockDriverState *root = bs;
> +    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
> +
> +    for ( ; list; list = list->next) {
> +        bs = list->data;
> +
> +        if (bs != root) {
> +            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
> +                return -EINVAL;
> +            }

At this point bs still has the old permissions, but we don't access
them. As we're going in topological order, the parents have already been
updated if they were a child covered in bdrv_node_check_perm(), so we're
checking the relevant values. Good.

What about the root node? If I understand correctly, the parents of the
root node wouldn't have been checked in the old code. In the new state,
the parent BdrvChild already has to contain the new permission.

In bdrv_refresh_perms(), we already check parent conflicts, so no change
for all callers going through it. Good.

bdrv_reopen_multiple() is less obvious. It passes permissions from the
BDRVReopenState, without applying the permissions first. Do we check the
old parent permissions instead of the new state here?

> +            bdrv_get_cumulative_perm(bs, &cumulative_perms,
> +                                     &cumulative_shared_perms);
> +        }
>
> -        cur_ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
> -        ret = bdrv_check_update_perm(c->bs, q, cur_perm, cur_shared,
> -                                     cur_ignore_children, errp);
> -        g_slist_free(cur_ignore_children);
> +        ret = bdrv_node_check_perm(bs, q, cumulative_perms,
> +                                   cumulative_shared_perms,
> +                                   ignore_children, errp);

We use the original ignore_children for every node in the sorted list.
The old code extends it with all nodes in the path to each node.

For the bdrv_check_update_perm() call that is now replaced with
bdrv_check_parents_compliance(), I think this was necessary because
bdrv_check_update_perm() always assumes adding a new edge, so if you
update one instead of adding it, you have to ignore it so that it can't
conflict with itself. This isn't necessary any more now because we just
update and then check for consistency.

For passing to bdrv_node_check_perm() it doesn't make a difference
anyway because the parameter is now unused (and should probably be
removed).

>          if (ret < 0) {
>              return ret;
>          }
> -
> -        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
>      }
>
>      return 0;

A tricky patch to understand, but I think it's right for the most part.

Kevin