On 4/3/19 6:32 AM, Bart Van Assche wrote:
> Since the callback function called by blk_mq_queue_tag_busy_iter()
> may sleep calling synchronize_rcu() from __blk_mq_update_nr_hw_queues()
> is not sufficient to wait until blk_mq_queue_tag_busy_iter() has
> finished. Instead of making __blk_mq_update_nr_hw_queues() wait until
> q->q_usage_counter == 0 is globally visible, do not iterate over tags
> if the request queue is frozen.
>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Hannes Reinecke <[email protected]>
> Cc: James Smart <[email protected]>
> Cc: Ming Lei <[email protected]>
> Cc: Jianchao Wang <[email protected]>
> Cc: Keith Busch <[email protected]>
> Cc: Dongli Zhang <[email protected]>
> Cc: <[email protected]>
> Fixes: 530ca2c9bd69 ("blk-mq: Allow blocking queue tag iter callbacks") #
> v4.19.
> Signed-off-by: Bart Van Assche <[email protected]>
> ---
> block/blk-mq-tag.c | 10 +++++-----
> block/blk-mq.c | 5 +----
> 2 files changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index a4931fc7be8a..89f479634b4d 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -383,14 +383,13 @@ void blk_mq_queue_tag_busy_iter(struct request_queue
> *q, busy_iter_fn *fn,
>
> /*
> * __blk_mq_update_nr_hw_queues() updates nr_hw_queues and queue_hw_ctx
> - * while the queue is frozen. So we can use q_usage_counter to avoid
> - * racing with it. __blk_mq_update_nr_hw_queues() uses
> - * synchronize_rcu() to ensure this function left the critical section
> - * below.
> + * while the queue is frozen. Hold q_usage_counter to serialize
> + * __blk_mq_update_nr_hw_queues() against this function.
> */
> if (!percpu_ref_tryget(&q->q_usage_counter))
> return;
> -
> + if (atomic_read(&q->mq_freeze_depth))
> + goto out;
> queue_for_each_hw_ctx(q, hctx, i) {
> struct blk_mq_tags *tags = hctx->tags;
>
> @@ -405,6 +404,7 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q,
> busy_iter_fn *fn,
> bt_for_each(hctx, &tags->breserved_tags, fn, priv,
> true);
> bt_for_each(hctx, &tags->bitmap_tags, fn, priv, false);
> }
> +out:
> blk_queue_exit(q);
> }
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 652d0c6d5945..f9fc1536408d 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -3226,10 +3226,7 @@ static void __blk_mq_update_nr_hw_queues(struct
> blk_mq_tag_set *set,
>
> list_for_each_entry(q, &set->tag_list, tag_set_list)
> blk_mq_freeze_queue(q);
> - /*
> - * Sync with blk_mq_queue_tag_busy_iter.
> - */
> - synchronize_rcu();
I used to see the removal of above in one of Jianchao's patchset.
https://lore.kernel.org/patchwork/patch/1051047/
Dongli Zhang
> +
> /*
> * Switch IO scheduler to 'none', cleaning up the data associated
> * with the previous scheduler. We will switch back once we are done
>