I confirmed that the trigger commit is dbf8e63f48af, as Yi reported. I took a
look at the commit, but its diff looks fine to me, so I believe the cause is
not in the commit diff itself.

I found that the WARN is printed when f2fs is set up with multiple devices and
a read request is mapped to the very first block of the second device in the
direct read path. In this case, f2fs_map_blocks() and f2fs_map_blocks_cached()
modify map->m_pblk to be the physical block address relative to each block
device; it becomes zero when the request is mapped to the first block of the
device. However, f2fs_iomap_begin() assumes that map->m_pblk is the physical
block address within the whole f2fs address space, across all block devices.
It compares map->m_pblk against NULL_ADDR == 0, goes into the unexpected
branch, and sets an invalid iomap->length. The WARN catches the invalid
iomap->length.

This WARN is printed even for non-zoned block devices, via the following steps:

 - Create two (non-zoned), memory-backed null_blk devices of 128MB each:
   nullb0 and nullb1.
 # mkfs.f2fs /dev/nullb0 -c /dev/nullb1
 # mount -t f2fs /dev/nullb0 "${mount_dir}"
 # dd if=/dev/zero of="${mount_dir}/test.dat" bs=1M count=192
 # dd if="${mount_dir}/test.dat" of=/dev/null bs=1M count=192 iflag=direct

I created a fix candidate patch [1]. It modifies f2fs_iomap_begin() to handle
map->m_pblk as the physical block address relative to each device's start, not
as the address within the whole f2fs. I confirmed that it avoids the WARN.

However, I'm not sure the fix is good enough. map->m_pblk has dual meanings:
sometimes it holds the physical block address relative to each device, and
sometimes the address within the whole f2fs. I'm not sure under which
condition map->m_pblk has which meaning. I guess the F2FS_GET_BLOCK_DIO flag
is the condition, but f2fs_map_blocks_cached() does not ensure it.

Also, I noticed that map->m_pblk is referred to in other functions as well,
and I'm not sure whether they need a similar change to the one I made for
f2fs_iomap_begin().


I would like to hear advice from f2fs experts on the fix.


diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 26e317696b33..5232223a69e5 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1569,6 +1569,7 @@ static bool f2fs_map_blocks_cached(struct inode *inode,
                int bidx = f2fs_target_device_index(sbi, map->m_pblk);
                struct f2fs_dev_info *dev = &sbi->devs[bidx];
+               map->m_multidev_dio = true;
                map->m_bdev = dev->bdev;
                map->m_pblk -= dev->start_blk;
                map->m_len = min(map->m_len, dev->end_blk + 1 - map->m_pblk);
@@ -4211,9 +4212,11 @@ static int f2fs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
                            unsigned int flags, struct iomap *iomap,
                            struct iomap *srcmap)
 {
+       struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
        struct f2fs_map_blocks map = {};
        pgoff_t next_pgofs = 0;
-       int err;
+       block_t pblk;
+       int err, i;

        map.m_lblk = bytes_to_blks(inode, offset);
        map.m_len = bytes_to_blks(inode, offset + length - 1) - map.m_lblk + 1;
@@ -4239,12 +4242,17 @@ static int f2fs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
        /*
         * We should never see delalloc or compressed extents here based on
         * prior flushing and checks.
         */
-       if (WARN_ON_ONCE(map.m_pblk == NEW_ADDR))
+       pblk = map.m_pblk;
+       if (map.m_multidev_dio && map.m_flags & F2FS_MAP_MAPPED)
+               for (i = 0; i < sbi->s_ndevs; i++)
+                       if (FDEV(i).bdev == map.m_bdev)
+                               pblk += FDEV(i).start_blk;
+       if (WARN_ON_ONCE(pblk == NEW_ADDR))
                return -EINVAL;
-       if (WARN_ON_ONCE(map.m_pblk == COMPRESS_ADDR))
+       if (WARN_ON_ONCE(pblk == COMPRESS_ADDR))
                return -EINVAL;
-       if (map.m_pblk != NULL_ADDR) {
+       if (pblk != NULL_ADDR) {
                iomap->length = blks_to_bytes(inode, map.m_len);
                iomap->type = IOMAP_MAPPED;
                iomap->flags |= IOMAP_F_MERGED;
