> > > > The main purpose of this is mirror.c and commit.c would form BDS loops on > > completion. > > These callers could break the look manually but the code would fail > > if a loop is not breaked and the blocker function are called on it. > > So the blocker code have to handle recursion loops. > > I think commit/mirror/etc should break any loops prior to calling > recursive functions on those loops (just like it should do before > calling bdrv_unref(), etc..). Otherwise, I think the recursive > functions make assumptions that may be true in certain contexts, but > not necessarily all. > > (Hmm, I believe that Fam had a series that did some nice cleanup on > bdrv_drop_intermediate() and related areas that did some loop > breaking, IIRC).
Ok I could use that as a basis. > > > > I don't think this particular test is a failure point. > > > > With the way it is written, the individual BDS is blocked with the > same reason pointer, but not necessarily the whole chain - unless you > make the assumption that blockers will only be used via the recursive > interface, and never individually on a node. there is no more a no recursive version with this patch so this assumption will be respected. > > The caller doesn't have an interface to check that the chain is not > blocked - bdrv_op_is_blocked_by() doesn't operate recursively. > > If we tried to block a chain that has some portion already blocked for > the same reason, shouldn't that be an error? Why should we allow this ? My understanding is that blocking something should be associated to a single operation whatever they are. So one operation to block implying one different reason is not so strange. > > > > + > > > > + /* block first for recursion loop protection to work */ > > > > + bdrv_do_op_block(bs, op, reason); > > > > + > > > > + bdrv_op_block(bs->file, base, op, reason); > > > > + > > > > + if (bs->drv && bs->drv->supports_backing) { > > > > + bdrv_op_block(bs->backing_hd, base, op, reason); > > > > + } > > > > + > > > > + if (bs->drv && bs->drv->bdrv_op_recursive_block) { > > > > + bs->drv->bdrv_op_recursive_block(bs, base, op, reason); > > > > > > Do we need to allow .bdrv_op_recursive_block() to fail and return > > > error (and handle it, of course)? > > > > I don't know yet: but lets let this question float a little more in the > > mail thread. > > > > > > > > > > > > > > > To reach this state the caller code would have to do the following twisted > > sequence. > > > > block(image3, with_reason1) > > unblock(image2, with_reason1) > > block(image1, with_reason1) > > > > There is no such sequence in the code thanks to the base argument and we > > could > > enforce that no such sequence ever get written. > > > > If we ignore blockdev-add and scenarios where an image node may have > multiple overlays, we might be able to assume that such a sequence > could not exist. > > But in that case, should this negative check result in an error? > > It would seem at this point we would have encountered one of these > scenarios: > > 1.) An invalid block/unblock state in the chain, if we assume that no > such sequence should exist > > 2.) We tried to unblock more than we originally blocked > > > > > > > I would assume that bdrv_op_unblock(image2, NULL, reason) would still > > > unblock image1, even if image2 was unblocked. > > > > Should we also assume that bdrv_op_unblock(image4, NULL, reason) with > > image4 being > > image3 parent unblock everything underneath ? > > > > I think we either do that, or return an error. But to have > bdrv_op_unblock() (or bdrv_op_block()) silently stop at some point in > the chain prior to reaching 'base' doesn't seem correct to me. > Maybe you are right. I don't mind rewriting the patchset with error handling and without the recursion loop avoidance code given I find Fam's loop breaking patches on the list. I remember trying to write loop breaking by myself just before 2.1 and it was annoying. Best regards Benoît