On 11/13/18 12:45 PM, Keith Busch wrote:
> On Tue, Nov 13, 2018 at 12:20:46PM -0700, Jens Axboe wrote:
>> On 11/13/18 11:57 AM, Keith Busch wrote:
>>> static void scsi_mq_done(struct scsi_cmnd *cmd)
>>> {
>>> + if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
>>> + return;
>>> trace_scsi_dispatch_cmd_done(cmd);
>>> blk_mq_complete_request(cmd->request);
>>> +
>>> +#ifdef CONFIG_FAIL_IO_TIMEOUT
>>> + /*
>>> + * Clearing complete here serves only to allow the desired recovery to
>>> + * escalate on blk_rq_should_fake_timeout()'s error injection.
>>> + */
>>> + clear_bit(__SCMD_COMPLETE, &cmd->flags);
>>> +#endif
>>> }
>>
>> We could have this be:
>>
>> static void scsi_mq_done(struct scsi_cmnd *cmd)
>> {
>> if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
>> return;
>> trace_scsi_dispatch_cmd_done(cmd);
>>
>> if (blk_mq_complete_request(cmd->request)) {
>> /*
>> * Clearing complete here serves only to allow the
>> * desired recovery to escalate on
>> * blk_rq_should_fake_timeout()'s error injection.
>> */
>> clear_bit(__SCMD_COMPLETE, &cmd->flags);
>> }
>> }
>>
>> with
>>
>> bool blk_mq_complete_request(struct request *rq)
>> {
>> if (unlikely(blk_should_fake_timeout(rq->q)))
>> return true;
>> __blk_mq_complete_request(rq);
>> return false;
>> }
>>
>> and not have this CONFIG_FAIL_IO_TIMEOUT dependency, but that'd be a bit
>> more expensive.
>
> I was trying to avoid every cost no matter how negligable (those are the
> only types of costs left as far as I can see), but I think your proposal
> might actually be necessary: if a timeout wasn't faked, clearing the
> completion flag unconditionally might have a problem with a real timeout
> racing with the real completion. :(
Yep, I think you are right... No way around it then. Are you going to
resend it with the fix?
--
Jens Axboe