On 10/28/25 10:30, Chao Yu via Linux-f2fs-devel wrote:
On 10/27/25 21:06, Yongpeng Yang wrote:
On 10/27/25 16:35, Chao Yu via Linux-f2fs-devel wrote:
On 10/24/25 22:37, Yongpeng Yang wrote:
From: Yongpeng Yang <[email protected]>
When F2FS uses multiple block devices, each device may have a
different discard granularity. The minimum trim granularity must be
at least the maximum discard granularity across all devices, excluding
zoned devices. Use max_t() instead of the max() macro to compute the
maximum, since range.minlen (__u64) and the granularity (unsigned int)
have different types.
Signed-off-by: Yongpeng Yang <[email protected]>
---
fs/f2fs/f2fs.h | 12 ++++++++++++
fs/f2fs/file.c | 12 ++++++------
2 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 32fb2e7338b7..064bdbf463f7 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -4762,6 +4762,18 @@ static inline bool f2fs_hw_support_discard(struct f2fs_sb_info *sbi)
return false;
}
+static inline unsigned int f2fs_hw_discard_granularity(struct f2fs_sb_info *sbi)
+{
+	int i = 1;
+	unsigned int discard_granularity = bdev_discard_granularity(sbi->sb->s_bdev);
Yongpeng,
The patch makes sense to me.
One extra question: if a zoned device contains both conventional zones and
sequential zones, what discard granularity will it expose?
Thanks,
I don't have such a device. I think the exposed discard granularity should be
that of the conventional zones, since the sequential zones have a default reset
granularity of 1 zone, and no additional information is needed to indicate that.
I guess you can have a virtual one simulated by the null_blk driver?
https://zonedstorage.io/docs/getting-started/zbd-emulation#zoned-block-device-emulation-with-null_blk
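For reference, the linked guide boils down to a handful of configfs writes. A
rough, untested sketch (attribute names are from the null_blk configfs
interface; the sizes are arbitrary) that creates a nullb device with both
conventional and sequential zones and then checks the exposed granularity:

  modprobe null_blk nr_devices=0
  mkdir /sys/kernel/config/nullb/nullb0
  cd /sys/kernel/config/nullb/nullb0
  echo 4096 > blocksize      # logical/physical block size in bytes
  echo 1024 > size           # capacity in MiB
  echo 1 > memory_backed
  echo 1 > zoned
  echo 64 > zone_size        # zone size in MiB
  echo 4 > zone_nr_conv      # number of conventional zones
  echo 1 > power             # instantiate the device
  cat /sys/block/nullb*/queue/discard_granularity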
1. When using QEMU to emulate a ZNS SSD, a namespace cannot contain both
conventional zones and sequential zones at the same time. Additionally, for
the emulated zoned device, discard_granularity cannot be configured manually;
it defaults to the maximum of logical_block_size and 4KiB.
static int nvme_ns_init_blk(NvmeNamespace *ns, Error **errp)
{
    ...
    if (ns->blkconf.discard_granularity == -1) {
        ns->blkconf.discard_granularity =
            MAX(ns->blkconf.logical_block_size, MIN_DISCARD_GRANULARITY);
    }
    ...
}
In the nvme driver, the default value of discard_granularity is set to
logical_block_size:
static void nvme_config_discard(struct nvme_ns *ns, struct queue_limits *lim)
{
	...
	lim->discard_granularity = lim->logical_block_size;
	...
}
2. QEMU cannot emulate SMR HDDs. From the SCSI driver code, I found that the
discard_granularity of a SCSI device is computed as follows. The value of
sdkp->unmap_granularity is shared across multiple LUNs, meaning that both
conventional LUNs and sequential LUNs have the same sdkp->unmap_granularity.
As a result, the discard_granularity is also the same for both types of
zones. Therefore, from the driver's perspective, a zoned device that contains
both conventional zones and sequential zones will have the same
discard_granularity as other conventional devices.
static void sd_config_discard(struct scsi_disk *sdkp, struct queue_limits *lim,
		unsigned int mode)
{
	...
	lim->discard_granularity = max(sdkp->physical_block_size,
			sdkp->unmap_granularity * logical_block_size);
	...
}
static void sd_read_block_limits(struct scsi_disk *sdkp,
		struct queue_limits *lim)
{
	...
	sdkp->unmap_granularity = get_unaligned_be32(&vpd->data[28]);
	...
}
3. It seems that discard_granularity is related to logical_block_size and
physical_block_size, and is not associated with the zone size. For zoned
devices, discard_granularity is meaningless.
- nullblk_create.sh 512 2 1024 1024
- cat /sys/block/nullb1/queue/discard_*
0
0
0
0
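(As a side note, lsblk -D shows the same information per device; the
DISC-GRAN column corresponds to queue/discard_granularity, so a quick check
here would be:

  lsblk -D /dev/nullb1

which should report DISC-GRAN as 0 for this device, matching the sysfs
values above.)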
I didn't dig into more details, though. :)
Thanks,
I found that the null_blk device doesn't configure the discard_* limits.
static int null_add_dev(struct nullb_device *dev)
{
	...
	struct queue_limits lim = {
		.logical_block_size = dev->blocksize,
		.physical_block_size = dev->blocksize,
		.max_hw_sectors = dev->max_sectors,
	};
	...
}

Yongpeng
+
+	if (f2fs_is_multi_device(sbi))
+		for (; i < sbi->s_ndevs && !bdev_is_zoned(FDEV(i).bdev); i++)
+			discard_granularity = max_t(unsigned int, discard_granularity,
+					bdev_discard_granularity(FDEV(i).bdev));
+	return discard_granularity;
+}
+
static inline bool f2fs_realtime_discard_enable(struct f2fs_sb_info *sbi)
{
return (test_opt(sbi, DISCARD) && f2fs_hw_support_discard(sbi)) ||
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 6d42e2d28861..ced0f78532c9 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -2588,14 +2588,14 @@ static int f2fs_keep_noreuse_range(struct inode *inode,
static int f2fs_ioc_fitrim(struct file *filp, unsigned long arg)
{
struct inode *inode = file_inode(filp);
- struct super_block *sb = inode->i_sb;
+ struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct fstrim_range range;
int ret;
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
- if (!f2fs_hw_support_discard(F2FS_SB(sb)))
+ if (!f2fs_hw_support_discard(sbi))
return -EOPNOTSUPP;
if (copy_from_user(&range, (struct fstrim_range __user *)arg,
@@ -2606,9 +2606,9 @@ static int f2fs_ioc_fitrim(struct file *filp, unsigned long arg)
if (ret)
return ret;
- range.minlen = max((unsigned int)range.minlen,
- bdev_discard_granularity(sb->s_bdev));
- ret = f2fs_trim_fs(F2FS_SB(sb), &range);
+ range.minlen = max_t(unsigned int, range.minlen,
+ f2fs_hw_discard_granularity(sbi));
+ ret = f2fs_trim_fs(sbi, &range);
mnt_drop_write_file(filp);
if (ret < 0)
return ret;
@@ -2616,7 +2616,7 @@ static int f2fs_ioc_fitrim(struct file *filp, unsigned long arg)
if (copy_to_user((struct fstrim_range __user *)arg, &range,
sizeof(range)))
return -EFAULT;
- f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
+ f2fs_update_time(sbi, REQ_TIME);
return 0;
}
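As a usage note (the mount point and minimum size below are hypothetical): on
a multi-device f2fs where a non-zoned secondary device reports a larger
discard granularity than sb->s_bdev, a request such as

  fstrim -v -m 4096 /mnt/f2fs

will now have range.minlen clamped via f2fs_hw_discard_granularity() to the
largest granularity among the non-zoned devices, instead of only to the
granularity of the first device.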
_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel