On 11/13/18 11:57 AM, Keith Busch wrote:
> The scsi timeout error handling had been directly updating the request
> state to prevent a natural completion and error handling from completing
> the same request twice. Fix this layering violation by having scsi
> control the fate of its commands with scsi owned flags rather than
> use blk-mq's.
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 61babcb269ab..c680171ca201 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1635,8 +1635,18 @@ static blk_status_t scsi_mq_prep_fn(struct request
> *req)
>
> static void scsi_mq_done(struct scsi_cmnd *cmd)
> {
> + if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
> + return;
> trace_scsi_dispatch_cmd_done(cmd);
> blk_mq_complete_request(cmd->request);
> +
> +#ifdef CONFIG_FAIL_IO_TIMEOUT
> + /*
> + * Clearing complete here serves only to allow the desired recovery to
> + * escalate on blk_rq_should_fake_timeout()'s error injection.
> + */
> + clear_bit(__SCMD_COMPLETE, &cmd->flags);
> +#endif
> }
We could have this be:
static void scsi_mq_done(struct scsi_cmnd *cmd)
{
if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
return;
trace_scsi_dispatch_cmd_done(cmd);
if (blk_mq_complete_request(cmd->request)) {
/*
* Clearing complete here serves only to allow the
* desired recovery to escalate on
* blk_rq_should_fake_timeout()'s error injection.
*/
clear_bit(__SCMD_COMPLETE, &cmd->flags);
}
}
with
bool blk_mq_complete_request(struct request *rq)
{
if (unlikely(blk_should_fake_timeout(rq->q)))
return true;
__blk_mq_complete_request(rq);
return false;
}
and not have this CONFIG_FAIL_IO_TIMEOUT dependency, but that'd be a bit
more expensive.
Was going to suggest a request flag, but the request is gone at this
point. So that won't really work...
I'm with your solution as well, fwiw.
--
Jens Axboe