On Tue, 2025-05-13 at 10:00 +0200, Martin Wilck wrote:
> > If you think it does, is there another reason why you didn't try
> > this before?
>
> It didn't occur to me back then that we could fail paths without
> retrying in the kernel.
>
> Perhaps we could have the sg driver pass the blk_status_t (which is
> available on the sg level) to device mapper somehow in the sg_io_hdr
> structure? That way we could entirely avoid the layering violation
> between SCSI and dm. Not sure if that would be acceptible to
> Christoph, as blk_status_t is supposed to be exclusive to the kernel.
> Can we find a way to make sure it's passed to DM, but not to user
> space?

I have to correct myself. I was confused by my old patches, which
contain special casing for SG_IO. The current upstream code does of
course not support special-casing SG_IO in any way. device-mapper
looks neither at the ioctl `cmd` value nor at any arguments, and has
only the Unix error code to examine when the ioctl returns. The device
mapper layer has access to *less* information than the user space
process that issued the ioctl. Adding hooks to the sg driver wouldn't
buy us anything in this situation.

If we can't change this, we can't fail paths in the SG_IO error code
path, end of story.

With Kevin's patch 1/2 applied, it would in principle be feasible to
special-case SG_IO, handle it in dm-multipath, retrieve the
blk_status_t somehow, and possibly initiate path failover. This way
we'd at least keep the generic dm layer clean of SCSI-specific code.
But still, the end result would look very similar to the attempt from
2021 and would therefore probably lead us nowhere.

I'm still not too fond of DM_MPATH_PROBE_PATHS_CMD, but I can't offer
a better solution at this time. If the side issues are fixed, it will
be an improvement over the current upstream situation, where we can do
no path failover at all.

In the long term, we should evaluate alternatives.
If my conjecture in my previous post is correct and we need only
PRIN/PROUT commands, there might be a better solution than scsi-block
for our customers. Using regular block IO should actually also improve
performance.

Regards
Martin