On Wed, Nov 14, 2018 at 11:00:18AM -0700, Keith Busch wrote:
> On Wed, Nov 14, 2018 at 09:51:36AM -0800, Bart Van Assche wrote:
> > Regarding this patch: I think this patch introduces a subtle but severe bug
> > in the SCSI core, namely that if an abort is processed concurrently with
> > request completion with "fake timeout" enabled that the abort is ignored.
> 
> That requires the following occur concurrently:
> 
>   1. A real completion
>   2. A real timeout
>   3. A fake timeout
> 
> That can't happen on a production kernel, and highly improbable
> on the fake one. We can still fix it by having scsi timeout return
> BLK_EH_RESET_TIMER in this case. I didn't like adding code just to work
> around error injection, but there isn't a good alternative at the moment.

So do this instead:

--8<---
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index ff372b335ced..d343024e732a 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -199,9 +199,6 @@ scsi_abort_command(struct scsi_cmnd *scmd)
                return FAILED;
        }
 
-       if (test_and_set_bit(__SCMD_COMPLETE, &scmd->flags))
-               return SUCCESS;
-
        spin_lock_irqsave(shost->host_lock, flags);
        if (shost->eh_deadline != -1 && !shost->last_reset)
                shost->last_reset = jiffies;
@@ -299,6 +296,8 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
                rtn = host->hostt->eh_timed_out(scmd);
 
        if (rtn == BLK_EH_DONE) {
+               if (test_and_set_bit(__SCMD_COMPLETE, &scmd->flags))
+                       return BLK_EH_RESET_TIMER;
                if (scsi_abort_command(scmd) != SUCCESS) {
                        set_host_byte(scmd, DID_TIME_OUT);
                        scsi_eh_scmd_add(scmd);
-->8---

Reply via email to