On Tue, Nov 13, 2018 at 12:20:46PM -0700, Jens Axboe wrote:
> On 11/13/18 11:57 AM, Keith Busch wrote:
> > static void scsi_mq_done(struct scsi_cmnd *cmd)
> > {
> > + if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
> > + return;
> > trace_scsi_dispatch_cmd_done(cmd);
> > blk_mq_complete_request(cmd->request);
> > +
> > +#ifdef CONFIG_FAIL_IO_TIMEOUT
> > + /*
> > + * Clearing complete here serves only to allow the desired recovery to
> > + * escalate on blk_rq_should_fake_timeout()'s error injection.
> > + */
> > + clear_bit(__SCMD_COMPLETE, &cmd->flags);
> > +#endif
> > }
>
> We could have this be:
>
> static void scsi_mq_done(struct scsi_cmnd *cmd)
> {
> if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
> return;
> trace_scsi_dispatch_cmd_done(cmd);
>
> if (blk_mq_complete_request(cmd->request)) {
> /*
> * Clearing complete here serves only to allow the
> * desired recovery to escalate on
> * blk_rq_should_fake_timeout()'s error injection.
> */
> clear_bit(__SCMD_COMPLETE, &cmd->flags);
> }
> }
>
> with
>
> bool blk_mq_complete_request(struct request *rq)
> {
> if (unlikely(blk_should_fake_timeout(rq->q)))
> return true;
> __blk_mq_complete_request(rq);
> return false;
> }
>
> and not have this CONFIG_FAIL_IO_TIMEOUT dependency, but that'd be a bit
> more expensive.
I was trying to avoid every cost no matter how negligable (those are the
only types of costs left as far as I can see), but I think your proposal
might actually be necessary: if a timeout wasn't faked, clearing the
completion flag unconditionally might have a problem with a real timeout
racing with the real completion. :(