On Mon, 2017-09-18 at 18:03 -0400, Keith Busch wrote:
> I think we've always known it's possible to lose a request during timeout
> handling, but just accepted that possibility. It seems to be causing
> problems, though, leading to unnecessary error escalation and IO failures.
> 
> The possibility arises when the block layer marks the request complete
> prior to running the timeout handler. If that request happens to complete
> while the handler is running, the request will be lost, inevitably
> triggering a second timeout.
> 
> This patch attempts to shorten the window for this race condition by
> clearing the started flag when the driver completes a request. The block
> layer's timeout handler will then complete the command if it observes
> the started flag is no longer set.
> 
> Note it's possible to lose the command even with this patch. It's just
> less likely to happen.

Hello Keith,

Are you sure the root cause of this race condition is in the blk-mq core?
I've never observed such behavior in any of my numerous scsi-mq tests (which
trigger timeouts). Are you sure the race you observed is not caused by a
blk_mq_reinit_tagset() call, a function that is used only by the NVMe driver
and not by any other block driver?

Bart.
