Hi Ben,

On Thu, 2025-07-10 at 14:10 -0400, Benjamin Marzinski wrote:
> A problem that mpathpersist has with making SCSI Persistent
> Resevations
> to a multipath device work like they do to individual SCSI devices is
> that some of the paths to a multipath device might be down or missing
> when the mpathpersist commands are run. Multipath handles registering
> a
> new key pretty well. If paths are unavailable at the time of the
> command, the key is registered when they later become available.  But
> if
> the multipath device is also holding a reservation on one of its
> paths,
> things get trickier.
> 
> If a persistent reservation is being held by an unsuable path of a
> multipath device (the path can either be down or completely removed),
> libmpathpersist can't change it just by forwarding the regular
> persistent reservation commands. This can cause problems both for the
> RELEASE command and the REGISTER and REGISTER AND IGNORE commands if
> they are used to change from one key to another. If the path holding
> the
> reservation is unavailable, the reservation won't be released or have
> its key changed, as expected. I wish the problem of having a
> reservation
> key changed while it is holding the reservation was simply a
> theoretical
> one, but there are enterprise users of multipath that need this
> capability.
> 
> This patchset deals with both of these problems. libmpathpersist
> always
> had code to handle releasing a reservation held by an unavailable
> path,
> but the existing method is broken. It relies on poorly supported
> optional features of SCSI Persistent Reservations: the READ FULL
> STATUS
> command and specifying Initiator Ports with the REGISTER command
> (SIP_C). Also, fixing its current issues would additionally require
> supporting the All Target Ports option (ATP_C). This existing
> workaround
> has been redesigned to use the PREEMPT command instead. Key changes
> where the path holding the reservation is unavailable were not
> previously handled by libmpathpersist. This patchset also handles
> them
> using the PREEMPT command.
> 
> patches 0001-0008 are simply cleanups an prep work.
> patches 0009-0010 redesign the RELEASE command workaround
> patch 0011 creates a workaround for the REGISTER command
> patch 0012 makes this work for the REGISTER AND IGNORE command as
> well
> patches 0013-0014 are more prep work
> patch 0015 Adds a safety check before preempting a key, so that only
>            devices that are holding a reservation will do the
>            preemption
> 
> These workarounds depend on the fact that managing scsi persistent
> reservations for multipath devices has never worked correctly if
> multiple nodes use the same persistent reservation key for the same
> device. multipathd needs to be able to register the key on new paths,
> but the key could have been preempted since it was registered for the
> multipath device. To make sure that multipath isn't automatically
> registering paths for a device that has been preempted, multipathd
> checks that the configured key has already been registered by other
> paths of the device.  Unfortunately, there is no way to verify that
> the
> other paths to the multipath device on this node are the ones holding
> the keys in question. If another node is using the same key for the
> device, those registered keys could belong to paths on the other
> node,
> and there could be no registered keys from this node (likely due to
> being preempted). If multipathd registered the key for this new path
> because it saw another node's keys, then a preempted node would
> suddenly
> gain access to the storage.
> 
> As far as the workarounds from this patchset go, if there was another
> node using the same key, then preempting the key in these workarounds
> would preempt the other node as well. The safety check in patch 0015
> tries to make sure that this only happens when the current is holding
> the reservation, but this isn't foolproof.
> 
> I should also note that while working on this patchset, I noticed a
> number of other issues with the persistent reservation code, mostly
> involving changing a registered key. I'll be fixing them in a future
> patchset.
> 
> Benjamin Marzinski (15):
>   multipathd: remove thread from mpath_pr_event_handle
>   libmpathpersist: remove uneeded wrapper function.
>   libmpathpersist: reduce log level for persistent reservation
> checking
>   libmpathpersist: remove pointless update_map_pr ret value code
>   multipathd: use update_map_pr in mpath_pr_event_handle
>   libmpathpersist: limit changing prflag in update_map_pr
>   multipathd: Don't call update_map_pr unnecessarily
>   libmpathpersist: remove useless function send_prout_activepath
>   limpathpersist: redesign failed release workaround
>   libmpathpersist: fail the release if all threads fail
>   limpathpersist: Handle changing key corner case
>   libmapthpersist: Handle REGISTER AND IGNORE changing key corner
> case
>   libmultipath: rename prflag_value enums
>   libmpathpersist: use a switch statement for prout command
> finalizing
>   libmpathpersist: Add safety check for preempting on key change
> 
>  libmpathpersist/libmpathpersist.version |   2 +-
>  libmpathpersist/mpath_persist_int.c     | 505 ++++++++++++++--------
> --
>  libmpathpersist/mpath_persist_int.h     |   2 +-
>  libmpathpersist/mpath_updatepr.c        |  74 +++-
>  libmpathpersist/mpathpr.h               |   3 +
>  libmultipath/libmultipath.version       |   1 +
>  libmultipath/structs.h                  |  10 +-
>  multipathd/callbacks.c                  |   3 +
>  multipathd/cli.c                        |   4 +-
>  multipathd/cli.h                        |   3 +
>  multipathd/cli_handlers.c               |  70 +++-
>  multipathd/main.c                       | 149 +++----
>  12 files changed, 483 insertions(+), 343 deletions(-)

For the series, except those patches I've replied to:
Reviewed-by: Martin Wilck <mwi...@suse.com>

I have taken the liberty to apply the changes suggested by the clang-
format style checker and the fixes for the previously mentioned gcc4.8
compile errors in my tip branch. I also already added my Reviewed-by:
tags there.

I should add that while the logic of this patch set makes a lot of
sense to me and I've done my best to do a careful review, I don't feel
that understand the possible corner cases well enough to be able to
confirm that they can't possibly have unwanted side effects. Doing so
would require a lot more testing and deep-diving into the matter than
I've had time for. You are more exprienced with PR scenarios than me. 
Your analysis of the various cases is as solid as it gets, AFAICT.

Thanks
Martin








Martin

[1] https://github.com/openSUSE/multipath-tools/tree/tip

Reply via email to