Re: [PATCH 2/4] block: Fix a race between request queue freezing and running queues

Ming Lei Mon, 01 Apr 2019 17:53:58 -0700

On Mon, Apr 01, 2019 at 02:20:12PM -0700, Bart Van Assche wrote:
> Any request queue data structure may change while a queue is frozen.
> Hence make sure that blk_mq_run_hw_queues() does not access any hw
> queue while a request queue is frozen.
> 
> After blk_cleanup_queue() has marked a queue as dead it is no longer
> safe to access the hardware queue data structures. This patch avoids
> that blk_mq_run_hw_queues() crashes when called during or after
> blk_cleanup_queue() has freed the hardware queues. This patch is a
> variant of a patch posted by Hannes Reinecke ("[PATCH] block: don't
> call blk_mq_run_hw_queues() for dead or dying queues "). This patch
> is similar in nature to commit c246e80d8673 ("block: Avoid that
> request_fn is invoked on a dead queue"; v3.8). An example of a crash
> that is fixed by this patch:
> 
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffff8135a10b>] sbitmap_any_bit_set+0xb/0x30
> Call Trace:
>  [<ffffffff81303a88>] blk_mq_run_hw_queues+0x48/0x90
>  [<ffffffff813053cc>] blk_mq_requeue_work+0x10c/0x120
>  [<ffffffff81098cb4>] process_one_work+0x154/0x410
>  [<ffffffff81099896>] worker_thread+0x116/0x4a0
>  [<ffffffff8109edb9>] kthread+0xc9/0xe0
>  [<ffffffff81619b05>] ret_from_fork+0x55/0x80
> 
> Cc: Christoph Hellwig <[email protected]>
> Cc: Hannes Reinecke <[email protected]>
> Cc: James Smart <[email protected]>
> Cc: Ming Lei <[email protected]>
> Cc: Jianchao Wang <[email protected]>
> Cc: Dongli Zhang <[email protected]>
> Cc: <[email protected]>
> Fixes: a063057d7c73 ("block: Fix a race between request queue removal and the 
> block cgroup controller") # v4.17.
> Reported-by: James Smart <[email protected]>
> Signed-off-by: Bart Van Assche <[email protected]>
> ---
>  block/blk-mq.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 3ff3d7b49969..652d0c6d5945 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1499,12 +1499,20 @@ void blk_mq_run_hw_queues(struct request_queue *q, 
> bool async)
>       struct blk_mq_hw_ctx *hctx;
>       int i;
>  
> +     /*
> +      * Do not run any hardware queues if the queue is frozen or if a
> +      * concurrent blk_cleanup_queue() call is removing any data
> +      * structures used by this function.
> +      */
> +     if (!percpu_ref_tryget(&q->q_usage_counter))
> +             return;
>       queue_for_each_hw_ctx(q, hctx, i) {
>               if (blk_mq_hctx_stopped(hctx))
>                       continue;
>  
>               blk_mq_run_hw_queue(hctx, async);
>       }
> +     percpu_ref_put(&q->q_usage_counter);
>  }
>  EXPORT_SYMBOL(blk_mq_run_hw_queues);


I don't see it is necessary to add percpu_ref_tryget()/percpu_ref_put()
in the fast path if we simply release all hctx resource in hctx's
release handler by the following patch:

https://lore.kernel.org/linux-block/[email protected]/T/#u

Even we can kill the percpu_ref_tryget_live()/percpu_ref_put() in
scsi_end_request().

Thanks,
Ming

Re: [PATCH 2/4] block: Fix a race between request queue freezing and running queues

Reply via email to