On 11/13/18 11:57 AM, Keith Busch wrote:
> The scsi timeout error handling had been directly updating the request
> state to prevent a natural completion and error handling from completing
> the same request twice. Fix this layering violation by having scsi
> control the fate of its commands with scsi owned flags rather than
> use blk-mq's.
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 61babcb269ab..c680171ca201 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1635,8 +1635,18 @@ static blk_status_t scsi_mq_prep_fn(struct request 
> *req)
>  
>  static void scsi_mq_done(struct scsi_cmnd *cmd)
>  {
> +     if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
> +             return;
>       trace_scsi_dispatch_cmd_done(cmd);
>       blk_mq_complete_request(cmd->request);
> +
> +#ifdef CONFIG_FAIL_IO_TIMEOUT
> +     /*
> +      * Clearing complete here serves only to allow the desired recovery to
> +      * escalate on blk_rq_should_fake_timeout()'s error injection.
> +      */
> +     clear_bit(__SCMD_COMPLETE, &cmd->flags);
> +#endif
>  }

We could have this be:

static void scsi_mq_done(struct scsi_cmnd *cmd)
{
        if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
                return;
        trace_scsi_dispatch_cmd_done(cmd);

        if (blk_mq_complete_request(cmd->request)) {
                /*
                 * Clearing complete here serves only to allow the
                 * desired recovery to escalate on
                 * blk_rq_should_fake_timeout()'s error injection.
                 */
                clear_bit(__SCMD_COMPLETE, &cmd->flags);
        }
}

with

bool blk_mq_complete_request(struct request *rq)
{
        if (unlikely(blk_should_fake_timeout(rq->q)))
                return true;
        __blk_mq_complete_request(rq);
        return false;
}

and not have this CONFIG_FAIL_IO_TIMEOUT dependency, but that'd be a bit
more expensive.

Was going to suggest a request flag, but the request is gone at this
point. So that won't really work...

I'm with your solution as well, fwiw.

-- 
Jens Axboe

Reply via email to