On Tue, 2025-08-26 at 17:07 -0400, Benjamin Marzinski wrote:
> On Tue, Aug 26, 2025 at 12:06:50PM +0200, Martin Wilck wrote:
> 
> 
> The only situations where we will do the suspend is when a multipath
> device preempts the reservation key that it is holding, either because
> it was explicitly told to (which is something that the SCSI spec
> specifically calls out as valid, and is in fact the only way to change
> the reservation type of an existing reservation) or because we failed
> to release a reservation, since the path holding the reservation is
> down.

Right. This is what I forgot while reasoning about the noflush
semantics. Which, thanks to this discussion, I think I finally
understand correctly.

> If we do a flushing suspend, all queued IO will be failed, regardless
> of the user's no_path_retry setting. We really don't want to do that.

I agree.

> > I tend to think that sending PRIN or PROUT commands to queueing
> > devices is an extreme corner case. Perhaps we should just refuse to
> > do this (realizing that we can't avoid a device entering queueing
> > mode while we're processing PR commands, but that's an even more
> > extreme corner case).
> 
> libmpathpersist runs the path checkers before it sends any commands
> and doesn't send PRIN or PROUT commands to paths that are down. If
> there are no usable paths, libmpathpersist will just return failure.
> I have no plans to change any of this.
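
For illustration only, the behavior described in the quoted paragraph
can be sketched roughly as follows. This is a toy model in Python, not
the actual libmpathpersist code: the Path class, the "up" flag, and
send_pr_command() are made-up names, and whether a given PR command is
sent on one path or on all usable paths is not modelled here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Path:
    name: str
    up: bool  # hypothetical: result of running the path checker


def send_pr_command(paths: List[Path], send: Callable[[Path], bool]) -> bool:
    """Send a PR command only over paths that the checker reports as up.

    Returns False without sending anything if no path is usable.
    """
    # Paths that are down are skipped entirely.
    usable = [p for p in paths if p.up]
    if not usable:
        # No usable paths: just report failure.
        return False
    # Simplification: try every usable path, succeed only if all succeed.
    results = [send(p) for p in usable]
    return all(results)
```

With one path up and one down, the callback would only ever see the
path that is up; with all paths down, it is never called and the
function returns failure.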

Side note: There's a realistic chance that persistent reservation
commands may succeed on paths that appear down for regular IO, e.g. in
ALUA STANDBY state. But I doubt that it makes sense to try that.

> 1. sg_persist doesn't force the filesystem to quiesce, so why should
> mpathpersist? Nothing that is doing persistent reservations should
> rely on that happening when they change a reservation.
> 
> 2. We won't be losing our ability to write to the disk. Like I said,
> there are only two cases where we try to preempt ourselves and
> trigger this suspend: either we were explicitly told to preempt
> ourselves, or we failed doing a release because the path holding the
> reservation was down. In both cases, we keep our registered keys.

You're right. Sorry for the confusion, and thanks for taking the time
to explain.

Martin
