On Thu, Sep 22, 2016 at 09:03:56AM -0600, Jens Axboe wrote:
> Having to grab a mutex, for instance. We invoke ->queue_rq() with
> preemption disabled, so I'd hope that would not be the case... What
> drivers block in ->queue_rq?

I thought I had converted a lot of them to GFP_NOIO instead of GFP_ATOMIC
allocations, but I can't find any evidence of that.  Maybe it was just
my imagination, or an unsubmitted patch series.  Sorry for the

> Loop was another case that was on my radar to get rid of the queue_work
> it currently has to do. Josef is currently testing the nbd driver using
> this approach, so we should get some numbers there too. If we have to,
> we can always bump up the concurrency to mimic more of the behavior of
> having multiple workers running on the hardware queue. I'd prefer to
> still do that in blk-mq, instead of having drivers reinventing their own
> work offload functionality.

There should be a lot of numbers in the list archives from the time
that Ming did the loop conversion, as I've been trying to steer him
that way, and he actually implemented and benchmarked it.

We can't just increase the concurrency, as a single work_struct item
can't be queued multiple times even on a high-concurrency workqueue.
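
To illustrate, here is a rough userspace model of that behaviour. It is
not kernel code: the model_* names are made up for this sketch, and the
atomic flag only stands in for the pending bit that queue_work() tests
and sets before it enqueues an item. The assumption modelled here is
that the pending bit is cleared just before the work function runs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a work_struct. */
struct model_work {
	atomic_bool pending;	/* models the work item's pending bit */
	int enqueued;		/* how often the item really went on a queue */
};

static void model_init_work(struct model_work *w)
{
	atomic_init(&w->pending, false);
	w->enqueued = 0;
}

/*
 * Models queue_work(): returns false and does nothing if the item is
 * already pending, no matter how much concurrency the queue allows.
 */
static bool model_queue_work(struct model_work *w)
{
	if (atomic_exchange(&w->pending, true))
		return false;
	w->enqueued++;
	return true;
}

/* Models a worker picking the item up: pending is cleared before it runs. */
static void model_start_execution(struct model_work *w)
{
	assert(atomic_load(&w->pending));
	atomic_store(&w->pending, false);
}

/* Queue the same item n times in a row; returns how many calls took effect. */
static int model_burst(int n)
{
	struct model_work w;
	int i;

	model_init_work(&w);
	for (i = 0; i < n; i++)
		model_queue_work(&w);
	return w.enqueued;
}
```

In this model a burst of n queue attempts on one item results in a
single enqueue, so getting more parallelism would need one work item
per unit of work rather than a higher concurrency limit.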
