On Thu, Sep 01, 2022 at 11:04:09AM +0200, Laszlo Ersek wrote:
...
> > 
> > Rather than open-code a cleanup loop on all error paths, I instead fix
> > the problem by adding a meta_valid witness that is only set to true in
> > select places.  The obvious place is when handling NBD_REP_ACK; but it

[1]

> > also makes sense if we didn't call NBD_OPT_SET_META_CONTEXT at all: if
> > negotiating structured replies failed (including oldstyle servers), or
> > if the user never called nbd_add_meta_context() and therefore doesn't
> > care (blindly returning 0 to nbd_can_meta_context() in those cases is
> > fine; our existing tests/oldstyle and tests/newstyle-limited cover

[2]

> > that).  However, whereas we previously always returned a 0/1 answer
> > for nbd_can_meta_context once we are in transmission phase, this new
> > witness now means that in the corner case of explicit server failure
> > to NBD_OPT_SET_META_CONTEXT, nbd_can_meta_context() returns -1 to call
> > attention to the earlier server failure (although it is something that
> > is unlikely enough in practice that I doubt anyone will experience it
> > in the wild).
> > 
> > The change in nbd_can_meta_context() to use h->meta_valid instead of
> > h->eflags has an additional latent semantic difference not observable
> > at this time (because both h->meta_valid and h->eflags are set in
> > tandem by successful nbd_opt_go()/nbd_opt_info() in their current
> > two-command sequence), but which matters to future patches adding new

[3]

> > APIs.  On the one hand, once we add nbd_set_request_meta_context() to
> > stop nbd_opt_go() from performing NBD_OPT_SET_META_CONTEXT, we want
> > nbd_can_meta_context() to fail during transmission phase if the
> > context was not set elsewhere during negotiation, rather than blindly

patch 4/12 adds a test for this

> > returning 0.  On the other hand, when manually calling
> > nbd_opt_set_meta_context() during negotiation, we do not want it to
> > touch h->eflags, but do want nbd_can_meta_context() to start working
> > right away.

patch 11/12 adds a test for this

> > 
> > Note that the use of a witness flag also allows me to slightly change
> > when the list of previously-negotiated contexts is freed - instead of
> > doing it in nbd_internal_reset_size_and_flags(), that function now
> > simply marks any existing vector as invalid; and the actual wipe is

[4]

> > done when starting a new NBD_OPT_SET_META_CONTEXT or when closing

[5]

> > struct nbd_handle.  There are still some oddities to be cleaned up in

[6]

> > later patches by moving when the flag is marked invalid (for example,
> > we really want nbd_set_export_name() to wipe both flags and meta

patch 9/12

> > contexts, but the act of NBD_OPT_GO should not wipe contexts, and the

patch 4/12

> > act of NBD_OPT_SET_META_CONTEXT should not wipe flags), but I'm

patch 7/12

> > leaving those further tweaks for later patches to make this one easier
> > to review.
...

> > +++ b/generator/states-newstyle-opt-meta-context.c
> > @@ -29,6 +29,9 @@ STATE_MACHINE {
> >     */
> >    assert (h->gflags & LIBNBD_HANDSHAKE_FLAG_FIXED_NEWSTYLE);
> >    nbd_internal_reset_size_and_flags (h);
> > +  for (i = 0; i < h->meta_contexts.len; ++i)
> > +    free (h->meta_contexts.ptr[i].name);
> > +  meta_vector_reset (&h->meta_contexts);
> 
> This is code movement out of nbd_internal_reset_size_and_flags(), OK.

yep, this is the wipe [5] moved here because it was taken out of [4]

> 
> >    if (h->opt_current == NBD_OPT_LIST_META_CONTEXT) {
> >      assert (h->opt_mode);
> >      assert (h->structured_replies);
> > @@ -44,7 +47,7 @@ STATE_MACHINE {
> >      }
> >    }
> > 
> > -  assert (h->meta_contexts.len == 0);
> > +  assert (!h->meta_valid);
> 
> OK, this is the new predicate that nbd_internal_reset_size_and_flags()
> now ensures.

yep, matches up with [4]

> 
> > 
> >    /* Calculate the length of the option request data. */
> >    len = 4 /* exportname len */ + strlen (h->export_name) + 4 /* nr queries 
> > */;
> > @@ -194,8 +197,10 @@ STATE_MACHINE {
> >        CALL_CALLBACK (h->opt_cb.completion, &err);
> >        nbd_internal_free_option (h);
> >      }
> > -    else
> > +    else {
> >        SET_NEXT_STATE (%^OPT_GO.START);
> > +      h->meta_valid = true;
> > +    }
> >      break;
> >    case NBD_REP_META_CONTEXT:  /* A context. */
> >      if (len > maxpayload)
> 
> IIUC, this modifies the NBD_REP_ACK handling for
> NBD_OPT_SET_META_CONTEXT. IIUC we've consumed all the mappings, so we
> can now set the witness to "OK". Seems OK.

yep, matches to [1].  All other error paths out of this state no
longer set meta_valid, which saves me the effort of having to
open-code cleanup into each of those points.

> 
> > diff --git a/generator/states-reply-structured.c 
> > b/generator/states-reply-structured.c
> > index aaf75b8..f18c999 100644
> > --- a/generator/states-reply-structured.c
> > +++ b/generator/states-reply-structured.c
> > @@ -445,6 +445,7 @@ STATE_MACHINE {
> >      assert (CALLBACK_IS_NOT_NULL (cmd->cb.fn.extent));
> >      assert (h->bs_entries);
> >      assert (length >= 12);
> > +    assert (h->meta_valid);
> > 
> >      /* Need to byte-swap the entries returned, but apart from that we
> >       * don't validate them.
> 
> This seems to express that we may only have initiated a block status
> request (for which we are now processing the replies --
> REPLY.STRUCTURED_REPLY.RECV_BS_ENTRIES) in case we had a valid meta
> context list first. Seems plausible. (A bit lower down in the context,
> not quoted, we "Look up the context ID" in the meta context vector, so
> yes, very sensible assertion.)

Yeah, the assertion is useful to prove we aren't reading a
half-populated h->meta_contexts vector left half-populated after a
failed NBD_OPT_SET_META_CONTEXT; and it is met because
nbd_unlocked_aio_block_status() checked before ever issuing
NBD_CMD_BLOCK_STATUS.

> 
> > diff --git a/lib/flags.c b/lib/flags.c
> > index 87c20ee..91efc1a 100644
> > --- a/lib/flags.c
> > +++ b/lib/flags.c
> > @@ -37,9 +37,7 @@ nbd_internal_reset_size_and_flags (struct nbd_handle *h)
> > 
> >    h->exportsize = 0;
> >    h->eflags = 0;
> > -  for (i = 0; i < h->meta_contexts.len; ++i)
> > -    free (h->meta_contexts.ptr[i].name);
> > -  meta_vector_reset (&h->meta_contexts);
> 
> Moved to NEWSTYLE.OPT_META_CONTEXT.START; OK.

This is [4]; it is actually moved to both
NEWSTYLE.OPT_META_CONTEXT.START [5], and to handle cleanup [6].

> 
> (This is the only place we reset the meta context vector, pre-patch.)
> 
> > +  h->meta_valid = false;
> 
> Replacing the above logic, OK.

Yes. As of this patch, calling nbd_internal_reset_size_and_flags()
still wipes both size and contexts state, unchanged from status quo
before; but later patches in the series separate the wipe of contexts
state to other locations.

> 
> >    h->block_minimum = 0;
> >    h->block_preferred = 0;
> >    h->block_maximum = 0;
> > @@ -75,6 +73,11 @@ nbd_internal_set_size_and_flags (struct nbd_handle *h,
> >      eflags &= ~NBD_FLAG_SEND_FAST_ZERO;
> >    }
> > 
> > +  if (!h->structured_replies || h->request_meta_contexts.len == 0) {
> > +    assert (h->meta_contexts.len == 0);
> > +    h->meta_valid = true;
> > +  }
> > +
> >    h->exportsize = exportsize;
> >    h->eflags = eflags;
> >    return 0;
> 
> Hmmm. This gives me a run for my money...
> 
> I think the condition here describes the case when we're never going to
> send an NBD_OPT_SET_META_CONTEXT -- meaning that we're never going to
> reach the "meta_valid = true" assignment in the ACK handling for
> NBD_OPT_SET_META_CONTEXT. Still we need to qualify our empty metacontext
> vector as valid somewhere, so nbd_internal_set_size_and_flags() is being
> proposed as the location for that.
> 
> The question is where nbd_internal_set_size_and_flags() is called
> *from*. I find:
> 
> - NEWSTYLE.OPT_EXPORT_NAME.CHECK_REPLY
> - NBD_REP_INFO | NBD_INFO_EXPORT
> - OLDSTYLE.CHECK
> 
> We also have a comment on nbd_internal_set_size_and_flags() saying
> 
> /* Set the export size and eflags on the handle, validating them.
>  * This is called from the state machine when either the newstyle or
>  * oldstyle negotiation reaches the point that these are available.
>  */
> 
> Aha! So, also in accordance with the commit message, I need to replace
> "we're never going to reach the ACK for NBD_OPT_SET_META_CONTEXT" with
> "we could never have reached the ACK for NBD_OPT_SET_META_CONTEXT".
> 
> I'm not sure if this is the *only place* (or the *best place*) to set
> the witness [*], but it does look like a "good one".

This is point [2] in the commit message.  Yeah, my alternatives were
to either find common code that is reached by both oldstyle and
OPT_EXPORT_NAME code (the two cases where we can't even attempt
NBD_OPT_SET_META_CONTEXT) - which is this function, or to open-code
the same effect into both callers.  I chose to put it once in the
common code, in part because...

> 
> [*] The commit message speaks about a "latent semantic difference" which
> is currently masked by our managing meta_valid and eflags strictly in
> tandem. Hopefully I'll understand that paragraph better when looking at
> later patches in the series.

... later in the series (4/12), this gets tweaked again by further
gating the logic here to only set h->meta_valid=true when
h->request_meta is set, rather than having to tweak all three points
that transition into transmission phase (even though only one of those
three points could have even attempted NBD_OPT_SET_META_CONTEXT), per
point [3].

> 
> > @@ -194,9 +197,8 @@ nbd_unlocked_can_meta_context (struct nbd_handle *h, 
> > const char *name)
> >  {
> >    size_t i;
> > 
> > -  if (h->eflags == 0) {
> > -    set_error (EINVAL, "server has not returned export flags, "
> > -               "you need to connect to the server first");
> > +  if (h->request_meta_contexts.len && !h->meta_valid) {
> > +    set_error (EINVAL, "need a successful server meta context request 
> > first");
> >      return -1;
> >    }
> 
> This is not easy to understand at first, but I kind of convinced myself
> with the following argument: the condition for successfully querying the
> context list is
> 
>   h->request_meta_contexts.len == 0 || h->meta_valid
> 
> i.e., we never needed to ask the server, or the server responded (and we
> parsed the response successfully). The *result* list may or may not be
> empty at this point (it could be empty for two reasons: we never asked
> for any contexts, or we did but the server supports none); either way,
> the result list is valid for searching.
> 
> OK.

Correct reading.

> 
> > 
> > diff --git a/lib/handle.c b/lib/handle.c
> > index 8713e18..03f45a4 100644
> > --- a/lib/handle.c
> > +++ b/lib/handle.c
> > @@ -115,6 +115,8 @@ nbd_create (void)
> >  void
> >  nbd_close (struct nbd_handle *h)
> >  {
> > +  size_t i;
> > +
> >    nbd_internal_set_error_context ("nbd_close");
> > 
> >    if (h == NULL)
> > @@ -127,6 +129,9 @@ nbd_close (struct nbd_handle *h)
> > 
> >    free (h->bs_entries);
> >    nbd_internal_reset_size_and_flags (h);
> > +  for (i = 0; i < h->meta_contexts.len; ++i)
> > +    free (h->meta_contexts.ptr[i].name);
> > +  meta_vector_reset (&h->meta_contexts);
> >    nbd_internal_free_option (h);
> >    free_cmd_list (h->cmds_to_issue);
> >    free_cmd_list (h->cmds_in_flight);
> 
> OK.

This is [6], needed because it was moved out of [4].

> 
> > diff --git a/lib/rw.c b/lib/rw.c
> > index 81f8f35..2ab85f3 100644
> > --- a/lib/rw.c
> > +++ b/lib/rw.c
> > @@ -482,7 +482,7 @@ nbd_unlocked_aio_block_status (struct nbd_handle *h,
> >        return -1;
> >      }
> > 
> > -    if (h->meta_contexts.len == 0) {
> > +    if (!h->meta_valid || h->meta_contexts.len == 0) {
> >        set_error (ENOTSUP, "did not negotiate any metadata contexts, "
> >                   "either you did not call nbd_add_meta_context before "
> >                   "connecting or the server does not support it");
> > 
> 
> Conversely, here the condition for passing is
> 
>   h->meta_valid && h->meta_contexts.len > 0
> 
> meaning we asked the server, and it supports at least one of the
> requested meta contexts.
> 
> I think the patch is correct, but whether it is *complete* as well is
> something I can't easily evaluate. With that caveat:
> 
> Reviewed-by: Laszlo Ersek <ler...@redhat.com>

Thank you for a thorough review, and I'm glad my thorough commit
message helped.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Reply via email to