On Wed, Feb 25, 2026 at 03:36:12PM +0000, John Garry wrote:
> For a scmd which suffers failover, requeue the master bio of each bio
> attached to its request.
>
> A handler is added in the scsi_driver structure to lookup a
> mpath_disk from a request. This is needed because the scsi_disk structure
> will manage the mpath_disk, and the code core has no method to look this
> up from the scsi_scmnd.
>
> Failover occurs when the scsi_cmnd has failed and it is discovered that the
> original scsi_device has transport down.
>
> Signed-off-by: John Garry <[email protected]>
> ---
> drivers/scsi/scsi_error.c | 12 ++++++
> drivers/scsi/scsi_lib.c | 9 +++-
> drivers/scsi/scsi_multipath.c | 80 +++++++++++++++++++++++++++++++++++
> include/scsi/scsi.h | 1 +
> include/scsi/scsi_driver.h | 3 ++
> include/scsi/scsi_multipath.h | 14 ++++++
> 6 files changed, 118 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index f869108fd9693..0fd1b46764c3f 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -40,6 +40,7 @@
> #include <scsi/scsi_ioctl.h>
> #include <scsi/scsi_dh.h>
> #include <scsi/scsi_devinfo.h>
> +#include <scsi/scsi_multipath.h>
> #include <scsi/sg.h>
>
> #include "scsi_priv.h"
> @@ -1901,12 +1902,16 @@ bool scsi_noretry_cmd(struct scsi_cmnd *scmd)
> enum scsi_disposition scsi_decide_disposition(struct scsi_cmnd *scmd)
> {
I'm no scsi expert, but the checks in scsi_decide_disposition() don't
look quite right to me. You only failover in cases were it the code
might want to retry (and when the device is offline), but there are
other cases, when you don't want to retry the same path, where it would
make sense to try another path, DID_TRANSPORT_FAILFAST for example.
As a more general issue, since you are already cloning the bios,
couldn't you just trigger the failovers in scsi_mpath_clone_end_io().
blk_path_error() can tell you which bios should be retried. That seems
like a much simpler approach than trying to intercept the request before
it completes the clones, and I don't see what you're losing by letting
the clones complete, and dealing requeueing there (again, I'm no scsi
expert).
> enum scsi_disposition rtn;
> + struct request *req = scsi_cmd_to_rq(scmd);
>
> /*
> * if the device is offline, then we clearly just pass the result back
> * up to the top level.
> */
> if (!scsi_device_online(scmd->device)) {
> + if (scsi_is_mpath_request(req))
> + return scsi_mpath_failover_disposition(scmd);
> +
> SCSI_LOG_ERROR_RECOVERY(5, scmd_printk(KERN_INFO, scmd,
> "%s: device offline - report as SUCCESS\n", __func__));
> return SUCCESS;
> diff --git a/drivers/scsi/scsi_multipath.c b/drivers/scsi/scsi_multipath.c
> index c3e0f792e921f..16b1f84fc552c 100644
> --- a/drivers/scsi/scsi_multipath.c
> +++ b/drivers/scsi/scsi_multipath.c
> @@ -518,6 +518,86 @@ void scsi_mpath_put_head(struct scsi_mpath_head
> *scsi_mpath_head)
> }
> EXPORT_SYMBOL_GPL(scsi_mpath_put_head);
>
> +bool scsi_is_mpath_request(struct request *req)
> +{
> + return is_mpath_request(req);
> +}
> +EXPORT_SYMBOL_GPL(scsi_is_mpath_request);
> +
> +static inline void bio_list_add_clone_master(struct bio_list *bl,
> + struct bio *clone)
> +{
> + struct scsi_mpath_clone_bio *scsi_mpath_clone_bio;
> + struct bio *master_bio;
> +
> + if (clone->bi_next)
> + bio_list_add_clone_master(bl, clone->bi_next);
This is a pretty minimal function, but a request could have a lot of
merged bios, so you probably want to hande them iteratively, instead of
recursing into the function, and eating up stack space unnecessarily.
Also, this reverses the bio's order when adding them to the requeue
list.
-Ben
> +
> + scsi_mpath_clone_bio = scsi_mpath_to_master_bio(clone);
> + master_bio = scsi_mpath_clone_bio->master_bio;
> +
> + if (bl->tail)
> + bl->tail->bi_next = master_bio;
> + else
> + bl->head = master_bio;
> +
> + bl->tail = master_bio;
> +
> + bio_put(clone);
> +}