Re: [PATCH 2/3] scsi: Do not rely on blk-mq for double completions

Bart Van Assche Wed, 14 Nov 2018 09:52:07 -0800

On Wed, 2018-11-14 at 09:26 -0700, Keith Busch wrote:
> The scsi timeout error handling had been directly updating the request
> state to prevent a natural completion and error handling from completing
> the same request twice. Fix this layering violation by having scsi
> control the fate of its commands with scsi owned flags rather than
> use blk-mq's.
> 
> Signed-off-by: Keith Busch <[email protected]>
> ---
>  drivers/scsi/scsi_error.c | 17 +++--------------
>  drivers/scsi/scsi_lib.c   |  6 +++++-
>  include/scsi/scsi_cmnd.h  |  5 ++++-
>  3 files changed, 12 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index c736d61b1648..f89e829a1c51 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -199,6 +199,9 @@ scsi_abort_command(struct scsi_cmnd *scmd)
>               return FAILED;
>       }
>  
> +     if (test_and_set_bit(__SCMD_COMPLETE, &scmd->flags))
> +             return SUCCESS;
> +
>       spin_lock_irqsave(shost->host_lock, flags);
>       if (shost->eh_deadline != -1 && !shost->last_reset)
>               shost->last_reset = jiffies;
> @@ -296,20 +299,6 @@ enum blk_eh_timer_return scsi_times_out(struct request
> *req)
>               rtn = host->hostt->eh_timed_out(scmd);
>  
>       if (rtn == BLK_EH_DONE) {
> -             /*
> -              * For blk-mq, we must set the request state to complete
> now
> -              * before sending the request to the scsi error handler.
> This
> -              * will prevent a use-after-free in the event the LLD
> manages
> -              * to complete the request before the error handler
> finishes
> -              * processing this timed out request.
> -              *
> -              * If the request was already completed, then the LLD beat
> the
> -              * time out handler from transferring the request to the
> scsi
> -              * error handler. In that case we can return immediately
> as no
> -              * further action is required.
> -              */
> -             if (req->q->mq_ops && !blk_mq_mark_complete(req))
> -                     return rtn;
>               if (scsi_abort_command(scmd) != SUCCESS) {
>                       set_host_byte(scmd, DID_TIME_OUT);
>                       scsi_eh_scmd_add(scmd);
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index c7fccbb8f554..1e74137f1073 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -2044,8 +2044,11 @@ static int scsi_mq_prep_fn(struct request *req)
>  
>  static void scsi_mq_done(struct scsi_cmnd *cmd)
>  {
> +     if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
> +             return;
>       trace_scsi_dispatch_cmd_done(cmd);
> -     blk_mq_complete_request(cmd->request);
> +     if (unlikely(!blk_mq_complete_request(cmd->request)))
> +             clear_bit(__SCMD_COMPLETE, &cmd->flags);
>  }
>  
>  static void scsi_mq_put_budget(struct blk_mq_hw_ctx *hctx)
> @@ -2104,6 +2107,7 @@ static blk_status_t scsi_queue_rq(struct
> blk_mq_hw_ctx *hctx,
>                       goto out_dec_host_busy;
>               req->rq_flags |= RQF_DONTPREP;
>       } else {
> +             cmd->flags &= ~SCMD_COMPLETE;
>               blk_mq_start_request(req);
>       }



Hi Keith,

Please Cc Martin Petersen and the scsi mailing list for SCSI patches.

Regarding this patch: I think this patch introduces a subtle but severe bug
in the SCSI core, namely that if an abort is processed concurrently with
request completion with "fake timeout" enabled that the abort is ignored.

Bart.

Re: [PATCH 2/3] scsi: Do not rely on blk-mq for double completions

Reply via email to