On 10/30/25 11:13 PM, Damien Le Moal wrote:
Introduce the function blkdev_report_zones_cached() to provide a fast
report zone built using the blkdev_get_zone_info() function, which gets
zone information from a disk zones_cond array or zone write plugs.
For a large capacity SMR drive, such fast report zone can be completed
in a few millioseconds compared to several seconds completion times
when the report zone is obtained from the device.
millioseconds -> milliseconds
Does retrieving the cached zone information really require multiple
milliseconds instead of only a few microseconds?
For zoned device that do not use zone write plug resources,
zoned device -> zoned devices
+static inline bool disk_need_zone_resources(struct gendisk *disk)
+{
+ /*
+ * All mq zoned devices need zone resources so that the block layer
+ * can automatically handle write BIO plugging. BIO-based device drivers
+ * (e.g. DM devices) are normally responsible for handling zone write
+ * ordering and do not need zone resources, unless the driver requires
+ * zone append emulation.
+ */
+ return queue_is_mq(disk->queue) ||
+ queue_emulates_zone_append(disk->queue);
+}
Today queue_is_mq() returns true for request-based queues only. Since
this is the terminology used elsewhere in the block layer, maybe change
"mq zoned devices" into "request-based zoned block devices"?
static inline unsigned int disk_zone_wplugs_hash_size(struct gendisk *disk)
{
return 1U << disk->zone_wplugs_hash_bits;
@@ -962,6 +975,68 @@ int blkdev_get_zone_info(struct block_device *bdev,
sector_t sector,
}
EXPORT_SYMBOL_GPL(blkdev_get_zone_info);
+/**
+ * blkdev_report_zones_cached - Get cached zones information
+ * @bdev: Target block device
+ * @sector: Sector from which to report zones
+ * @nr_zones: Maximum number of zones to report
+ * @cb: Callback function called for each reported zone
+ * @data: Private data for the callback function
+ *
+ * Description:
+ * Similar to blkdev_report_zones() but instead of calling into the low
level
+ * device driver to get the zone report from the device, use
+ * blkdev_get_zone_info() to generate the report from the disk zone write
+ * plugs and zones condition array. Since calling this function without a
+ * callback does not make sense, @cb must be specified.
+ */
+int blkdev_report_zones_cached(struct block_device *bdev, sector_t sector,
+ unsigned int nr_zones, report_zones_cb cb, void *data)
+{
+ struct gendisk *disk = bdev->bd_disk;
+ sector_t capacity = get_capacity(disk);
+ sector_t zone_sectors = bdev_zone_sectors(bdev);
+ unsigned int idx = 0;
+ struct blk_zone zone;
+ int ret;
+
+ if (!cb || !bdev_is_zoned(bdev) ||
+ WARN_ON_ONCE(!disk->fops->report_zones))
+ return -EOPNOTSUPP;
+
+ if (!nr_zones || sector >= capacity)
+ return 0;
+
+ /*
+ * If we do not have any zone write plug resources, fallback to using
+ * the regular zone report.
+ */
+ if (!disk_need_zone_resources(disk)) {
+ struct blk_report_zones_args args = {
+ .cb = cb,
+ .data = data,
+ .report_active = true,
+ };
+
+ return blkdev_do_report_zones(bdev, sector, nr_zones, &args);
+ }
+
+ for (sector = ALIGN(sector, zone_sectors);
+ sector < capacity && idx < nr_zones;
+ sector += zone_sectors, idx++) {
Please change "sector = ALIGN(sector, zone_sectors)" into an something
based on bdev_offset_from_zone_start(), e.g. the following code:
sector += zone_sectors - 1;
sector -= bdev_offset_from_zone_start(bdev, sector);
Thanks,
Bart.