On 10/30/25 11:13 PM, Damien Le Moal wrote:
Introduce the function blkdev_report_zones_cached() to provide a fast
report zone built using the blkdev_get_zone_info() function, which gets
zone information from a disk zones_cond array or zone write plugs.
For a large capacity SMR drive, such fast report zone can be completed
in a few millioseconds compared to several seconds completion times
when the report zone is obtained from the device.

millioseconds -> milliseconds

Does retrieving the cached zone information really require multiple
milliseconds instead of only a few microseconds?
For zoned device that do not use zone write plug resources,

zoned device -> zoned devices

+static inline bool disk_need_zone_resources(struct gendisk *disk)
+{
+       /*
+        * All mq zoned devices need zone resources so that the block layer
+        * can automatically handle write BIO plugging. BIO-based device drivers
+        * (e.g. DM devices) are normally responsible for handling zone write
+        * ordering and do not need zone resources, unless the driver requires
+        * zone append emulation.
+        */
+       return queue_is_mq(disk->queue) ||
+               queue_emulates_zone_append(disk->queue);
+}

Today queue_is_mq() returns true for request-based queues only. Since
this is the terminology used elsewhere in the block layer, maybe change "mq zoned devices" into "request-based zoned block devices"?

  static inline unsigned int disk_zone_wplugs_hash_size(struct gendisk *disk)
  {
        return 1U << disk->zone_wplugs_hash_bits;
@@ -962,6 +975,68 @@ int blkdev_get_zone_info(struct block_device *bdev, sector_t sector,
  }
  EXPORT_SYMBOL_GPL(blkdev_get_zone_info);
+/**
+ * blkdev_report_zones_cached - Get cached zones information
+ * @bdev:     Target block device
+ * @sector:   Sector from which to report zones
+ * @nr_zones: Maximum number of zones to report
+ * @cb:       Callback function called for each reported zone
+ * @data:     Private data for the callback function
+ *
+ * Description:
+ *    Similar to blkdev_report_zones() but instead of calling into the low level
+ *    device driver to get the zone report from the device, use
+ *    blkdev_get_zone_info() to generate the report from the disk zone write
+ *    plugs and zones condition array. Since calling this function without a
+ *    callback does not make sense, @cb must be specified.
+ */
+int blkdev_report_zones_cached(struct block_device *bdev, sector_t sector,
+                       unsigned int nr_zones, report_zones_cb cb, void *data)
+{
+       struct gendisk *disk = bdev->bd_disk;
+       sector_t capacity = get_capacity(disk);
+       sector_t zone_sectors = bdev_zone_sectors(bdev);
+       unsigned int idx = 0;
+       struct blk_zone zone;
+       int ret;
+
+       if (!cb || !bdev_is_zoned(bdev) ||
+           WARN_ON_ONCE(!disk->fops->report_zones))
+               return -EOPNOTSUPP;
+
+       if (!nr_zones || sector >= capacity)
+               return 0;
+
+       /*
+        * If we do not have any zone write plug resources, fallback to using
+        * the regular zone report.
+        */
+       if (!disk_need_zone_resources(disk)) {
+               struct blk_report_zones_args args = {
+                       .cb = cb,
+                       .data = data,
+                       .report_active = true,
+               };
+
+               return blkdev_do_report_zones(bdev, sector, nr_zones, &args);
+       }
+
+       for (sector = ALIGN(sector, zone_sectors);
+            sector < capacity && idx < nr_zones;
+            sector += zone_sectors, idx++) {

Please change "sector = ALIGN(sector, zone_sectors)" into something
based on bdev_offset_from_zone_start(), e.g. the following code:

        sector += zone_sectors - 1;
        sector -= bdev_offset_from_zone_start(bdev, sector);
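
For reference, here is a minimal user-space sketch of what the two lines above compute. zone_offset() is a hypothetical stand-in for bdev_offset_from_zone_start(); it assumes a power-of-two zone size and models the offset with a mask, since the body of the kernel helper is not shown in this thread. Under that assumption the result is the same zone-aligned round-up that ALIGN() produces.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/*
 * Hypothetical stand-in for bdev_offset_from_zone_start(): offset of
 * @sector inside its zone, assuming @zone_sectors is a power of two.
 */
static sector_t zone_offset(sector_t sector, sector_t zone_sectors)
{
	return sector & (zone_sectors - 1);
}

/*
 * Round @sector up to the next zone start. A sector that is already
 * zone aligned is returned unchanged.
 */
static sector_t round_up_to_zone(sector_t sector, sector_t zone_sectors)
{
	/* The mask in zone_offset() is only valid for power-of-two sizes. */
	assert(zone_sectors && (zone_sectors & (zone_sectors - 1)) == 0);

	sector += zone_sectors - 1;
	sector -= zone_offset(sector, zone_sectors);
	return sector;
}
```

With 8-sector zones, for example, round_up_to_zone(1, 8) yields 8 and round_up_to_zone(8, 8) stays at 8.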

Thanks,

Bart.
