On Mon, Feb 05, 2024 at 11:41:04AM +0900, Damien Le Moal wrote:
> On 2/5/24 11:19, Ming Lei wrote:
> >>>> +static inline void blk_zone_wplug_add_bio(struct blk_zone_wplug *zwplug,
> >>>> +                                          struct bio *bio, unsigned int 
> >>>> nr_segs)
> >>>> +{
> >>>> +        /*
> >>>> +         * Keep a reference on the BIO request queue usage. This 
> >>>> reference will
> >>>> +         * be dropped either if the BIO is failed or after it is issued 
> >>>> and
> >>>> +         * completes.
> >>>> +         */
> >>>> +        percpu_ref_get(&bio->bi_bdev->bd_disk->queue->q_usage_counter);
> >>>
> >>> It is fragile to get nested usage_counter, and same with 
> >>> grabbing/releasing it
> >>> from different contexts or even functions, and it could be much better to 
> >>> just
> >>> let block layer maintain it.
> >>>
> >>> From patch 23's change:
> >>>
> >>> +  * Zoned block device information. Reads of this information must be
> >>> +  * protected with blk_queue_enter() / blk_queue_exit(). Modifying this
> >>>
> >>> Anytime if there is in-flight bio, the block device is opened, so both 
> >>> gendisk and
> >>> request_queue are live, so not sure if this .q_usage_counter protection
> >>> is needed.
> >>
> >> Hannes also commented about this. Let me revisit this.
> > 
> > I think only queue re-configuration(blk_revalidate_zone) requires the
> > queue usage counter. Otherwise, bdev open()/close() should work just
> > fine.
> 
> I want to check FS case though. No clear if mounting FS that supports zone
> (btrfs) also uses bdev open ?

I feel the following delta change might be cleaner and easily documented:

- one IO takes single reference for both bio based and blk-mq,
- no drop & re-grab
- only grab extra reference for bio based
- two code paths share same pattern

diff --git a/block/blk-core.c b/block/blk-core.c
index 9520ccab3050..118dd789beb5 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -597,6 +597,10 @@ static void __submit_bio(struct bio *bio)
 
        if (!bio->bi_bdev->bd_has_submit_bio) {
                blk_mq_submit_bio(bio);
+       } else if (bio_zone_write_plugging(bio)) {
+               struct gendisk *disk = bio->bi_bdev->bd_disk;
+
+               disk->fops->submit_bio(bio);
        } else if (likely(bio_queue_enter(bio) == 0)) {
                struct gendisk *disk = bio->bi_bdev->bd_disk;
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f0fc61a3ec81..fc6d792747dc 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3006,8 +3006,12 @@ void blk_mq_submit_bio(struct bio *bio)
        if (blk_mq_attempt_bio_merge(q, bio, nr_segs))
                goto queue_exit;
 
+       /*
+        * Grab one reference for plugged zoned write and it will be reused in
+        * next real submission
+        */
        if (blk_queue_is_zoned(q) && blk_zone_write_plug_bio(bio, nr_segs))
-               goto queue_exit;
+               return;
 
        if (!rq) {
 new_request:
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index f6d4f511b664..87abb3f7ef30 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -514,7 +514,8 @@ static inline void blk_zone_wplug_add_bio(struct 
blk_zone_wplug *zwplug,
         * be dropped either if the BIO is failed or after it is issued and
         * completes.
         */
-       percpu_ref_get(&bio->bi_bdev->bd_disk->queue->q_usage_counter);
+       if (bio->bi_bdev->bd_has_submit_bio)
+               percpu_ref_get(&bio->bi_bdev->bd_disk->queue->q_usage_counter);
 
        /*
         * The BIO is being plugged and thus will have to wait for the on-going
@@ -760,15 +761,10 @@ static void blk_zone_wplug_bio_work(struct work_struct 
*work)
 
        blk_zone_wplug_unlock(zwplug, flags);
 
-       /*
-        * blk-mq devices will reuse the reference on the request queue usage
-        * we took when the BIO was plugged, but the submission path for
-        * BIO-based devices will not do that. So drop this reference here.
-        */
+       submit_bio_noacct_nocheck(bio);
+
        if (bio->bi_bdev->bd_has_submit_bio)
                blk_queue_exit(bio->bi_bdev->bd_disk->queue);
-
-       submit_bio_noacct_nocheck(bio);
 }
 
 static struct blk_zone_wplug *blk_zone_alloc_write_plugs(unsigned int nr_zones)

Thanks,
Ming


Reply via email to