On Thu, May 15, 2025 at 10:15:53AM +0200, Kevin Wolf wrote:
> On 13.05.2025 at 15:51, Stefan Hajnoczi wrote:
> > On Tue, May 13, 2025 at 01:37:30PM +0200, Kevin Wolf wrote:
> > > When scsi-block is used on a host multipath device, it runs into the
> > > problem that the kernel dm-mpath doesn't know anything about SCSI or
> > > SG_IO and therefore can't decide if a SG_IO request returned an error
> > > and needs to be retried on a different path. Instead of getting working
> > > failover, an error is returned to scsi-block and handled according to
> > > the configured error policy. Obviously, this is not what users want;
> > > they want working failover.
> > > 
> > > QEMU can parse the SG_IO result and determine whether this could have
> > > been a path error, but just retrying the same request could just send it
> > > to the same failing path again and result in the same error.
> > > 
> > > With a kernel that supports the DM_MPATH_PROBE_PATHS ioctl on dm-mpath
> > > block devices (queued in the device mapper tree for Linux 6.16), we can
> > > tell the kernel to probe all paths and tell us if any usable paths
> > > remained. If so, we can now retry the SG_IO ioctl and expect it to be
> > > sent to a working path.
> > > 
> > > Signed-off-by: Kevin Wolf <kw...@redhat.com>
> > > ---
> > >  block/file-posix.c | 82 +++++++++++++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 81 insertions(+), 1 deletion(-)
> > 
> > Maybe the probability of retry success would be higher with a delay so
> > that intermittent issues have time to resolve themselves. Either way,
> > the patch looks good.
> 
> I don't think adding a delay here would be helpful. The point of
> multipath isn't that you wait until a bad path comes back, but that you
> just switch to a different path until it is restored.

That's not what this loop does. DM_MPATH_PROBE_PATHS probes all paths
and fails when no paths are available. The delay would only apply in the
case when there are no paths available.

If the point is not to wait until some path comes back, then why loop at
all?
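
To make the question concrete, here is a rough, standalone sketch of the
retry loop as described in the commit message. It is not the code from the
patch: the SG_IO submission and the DM_MPATH_PROBE_PATHS ioctl are replaced
by hypothetical callbacks (submit_fn, probe_fn), the io_result enum and the
fake_dev test double are made up for illustration, and the retry limit is
an arbitrary parameter. The comment marks the only spot where a delay could
have any effect.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes; the real code inspects the SG_IO status. */
enum io_result { IO_OK, IO_PATH_ERROR, IO_OTHER_ERROR };

/* Hypothetical stand-ins for the SG_IO request and the probe ioctl. */
typedef enum io_result (*submit_fn)(void *opaque);
typedef int (*probe_fn)(void *opaque);   /* 0 if a usable path remains */

enum io_result submit_with_failover(submit_fn submit, probe_fn probe,
                                    void *opaque, int max_retries)
{
    assert(submit && probe);

    enum io_result ret = submit(opaque);
    for (int i = 0; i < max_retries && ret == IO_PATH_ERROR; i++) {
        if (probe(opaque) != 0) {
            /* No usable path is left. This is the only case in which
             * waiting before another attempt could change the outcome. */
            break;
        }
        /* At least one path survived the probe, so retry right away. */
        ret = submit(opaque);
    }
    return ret;
}

/* Test double: fails with a path error a fixed number of times. */
struct fake_dev {
    int path_errors_left;
    bool usable_path;
    int submits;
    int probes;
};

enum io_result fake_submit(void *opaque)
{
    struct fake_dev *d = opaque;
    d->submits++;
    if (d->path_errors_left > 0) {
        d->path_errors_left--;
        return IO_PATH_ERROR;
    }
    return IO_OK;
}

int fake_probe(void *opaque)
{
    struct fake_dev *d = opaque;
    d->probes++;
    return d->usable_path ? 0 : -1;
}
```

With the fake device, two path errors followed by success cost three
submissions and two probes, while a device with no usable path gives up
after a single probe without retrying.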

> Ideally, calling DM_MPATH_PROBE_PATHS would just remove all the bad
> paths instantaneously and we would either be able to send the request
> using one of the remaining good paths, or know that we have to fail. In
> practice, the ioctl will probably have to wait for a timeout, so you get
> a delay anyway.

True, if the read requests themselves time out rather than failing with
an immediate error, then no delay is required. I guess both cases can
happen and userspace has no visibility aside from measuring the time
spent in the ioctl.
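
For illustration only, measuring the time spent in the probe could look
roughly like the sketch below. Everything here is an assumption: the
helper names (now_ms, remaining_delay_ms, timed_probe, instant_fail) are
hypothetical, the probe is an opaque callback rather than the real ioctl,
and the idea of padding a fast failure up to some minimum interval is just
one possible policy, not something the patch does.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock reading in milliseconds. */
int64_t now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* If the probe already took at least min_interval_ms (for example
 * because it waited for a timeout), no extra delay is needed. If it
 * failed immediately, return how long is left to wait. */
int64_t remaining_delay_ms(int64_t elapsed_ms, int64_t min_interval_ms)
{
    return elapsed_ms >= min_interval_ms ? 0 : min_interval_ms - elapsed_ms;
}

/* Run the probe callback and report how long it took. */
int timed_probe(int (*probe)(void *), void *opaque, int64_t *elapsed_ms)
{
    int64_t start = now_ms();
    int ret = probe(opaque);
    *elapsed_ms = now_ms() - start;
    return ret;
}

/* Test double: a probe that fails without waiting for anything. */
int instant_fail(void *opaque)
{
    (void)opaque;
    return -1;
}
```

A caller would pass the elapsed time from timed_probe() to
remaining_delay_ms() and sleep for the result before probing again.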

> What could potentially be useful is a new error policy [rw]error=retry
> with a configurable delay. This wouldn't be in file-posix, but in the
> devices after file-posix came to the conclusion that there is currently
> no usable path. On the other hand, retrying indefinitely is probably not
> what you want either, so that could quickly become rather complex.

Yes, it will be complex. I have no objection to this patch as-is.

Stefan
