On Sun, 2025-08-24 at 23:00 +0200, Martin Wilck wrote:
> On Thu, 2025-07-10 at 14:11 -0400, Benjamin Marzinski wrote:
> > When a reservation key is changed from one non-zero value to
> > another,
> > libmpathpersist checks if the old key is still holding the
> > reservation,
> > and preempts it if it is. This was only safe if two nodes never
> > share
> > the same key. If a node uses the same key as another node that is
> > holding the reservation, and switches keys so that it no longer
> > matches,
> > it will end up preempting the reservation. This is clearly
> > unexpected
> > behavior, and it can come about by a simple accident of registering
> > a
> > device with the wrong key, and then immediately fixing it.
> > 
> > To handle this, add code to track if a device is the reservation
> > holder
> > to multipathd. multipathd now has three new commands "getprhold",
> > "setprhold", and "unsetprhold". These commands work like the
> > equivalent
> > *prstatus commands. libmpathpersist calls setprhold on RESERVE
> > commands
> > and PREEMPT commands when the preempted key is holding the
> > reservation
> > and unsetprhold on RELEASE commands. Also, calling unsetprflag
> > causes
> > prhold to be unset as well, so CLEAR commands and REGISTER commands
> > with
> > a 0x0 service action key will also unset prhold. libmpathpersist()
> > will
> > also unset prhold if it notices that the device cannot be holding a
> > reservation in preempt_missing_path().
> > 
> > When a new multipath device is created, its initial prhold state is
> > PR_UNKNOWN until it checks the current reservation, just like with
> > prflag. If multipathd ever finds that a device's registration has
> > been
> > preempted or cleared in update_map_pr(), it unsets prhold, just
> > like
> > with prflag.
> > 
> > Now, before libmpathpersist preempts a reservation when changing
> > keys
> > it also checks if multipathd thinks that the device is holding
> > the reservation. If it does not, then libmpathpersist won't preempt
> > the key. It will assume that another node is holding the
> > reservation
> > with the same key.
> > 
> > I should note that this safety check only stops a node not holding
> > the
> > reservation from preempting the node holding the reservation. If
> > the
> > node holding the reservation changes its key, but it fails to
> > change
> > the
> > resevation key, because that path is down or gone, it will still
> > issue
> > the preempt to move the reservation to a usable path, even if
> > another
> > node is using the same key. This will remove the registrations for
> > that
> > other node. It also will not work correctly if multipathd stops
> > tracking
> > a device for some reason. It's only a best-effort safety check.
> > 
> > Signed-off-by: Benjamin Marzinski <bmarz...@redhat.com>
> > ---
> >  libmpathpersist/mpath_persist_int.c | 87 ++++++++++++++++++++++++-
> > --
> > --
> >  libmpathpersist/mpath_updatepr.c    | 19 ++++++-
> >  libmpathpersist/mpathpr.h           |  2 +
> >  libmultipath/structs.h              |  1 +
> >  multipathd/callbacks.c              |  3 +
> >  multipathd/cli.c                    |  4 +-
> >  multipathd/cli.h                    |  3 +
> >  multipathd/cli_handlers.c           | 48 ++++++++++++++++
> >  multipathd/main.c                   | 33 ++++++++++-
> >  9 files changed, 182 insertions(+), 18 deletions(-)
> > 
> 
> Two questions all the way down.
> 
> >  
> > +static void check_prhold(struct multipath *mpp, struct path *pp)
> > +{
> > +   struct prin_resp resp = {0};
> > +   int status;
> > +
> > +   if (mpp->prflag == PR_UNSET) {
> > +           mpp->prhold = PR_UNSET;
> > +           return;
> > +   }
> > +   if (mpp->prflag != PR_SET || mpp->prhold != PR_UNKNOWN)
> > +           return;
> 
> How can this situation come to pass? prflag must be PR_UNKNOWN. 
> Shouldn't we reset prhold to PR_UNKNOWN as well?

Forget this remark. I was confused.

> 
> > +
> > +   status = prin_do_scsi_ioctl(pp->dev, MPATH_PRIN_RRES_SA,
> > &resp, 0);
> > +   if (status != MPATH_PR_SUCCESS) {
> > +           condlog (0, "%s: pr in read reservation command
> > failed: %d",
> > +                    mpp->wwid, status);
> > +           return;
> > +   }
> > +   mpp->prhold = PR_UNSET;
> > +   if (!resp.prin_descriptor.prin_readresv.additional_length)
> > +           return;
> > +
> > +   if (memcmp(&mpp->reservation_key,
> > resp.prin_descriptor.prin_readresv.key, 8) == 0)
> > +           mpp->prhold = PR_SET;
> > +}
> > +
> >  static void mpath_pr_event_handle(struct path *pp)
> >  {
> >     struct multipath *mpp = pp->mpp;
> > @@ -4233,8 +4259,13 @@ static void mpath_pr_event_handle(struct
> > path
> > *pp)
> >             return;
> >     }
> >  
> > -   if (update_map_pr(mpp, pp) != MPATH_PR_SUCCESS)
> > +   if (update_map_pr(mpp, pp) != MPATH_PR_SUCCESS) {
> > +           if (mpp->prflag == PR_UNSET)
> > +                   mpp->prhold = PR_UNSET;
> 
> Perhaps we should move setting prhold to update_map_pr()?

You fixed this in your 2nd set.

Martin

Reply via email to