Re: [f2fs-dev] [PATCH v3] f2fs: prevent writing without fallocate() for pinned files

2024-03-25 Thread Chao Yu

On 2024/3/25 23:02, Daeho Jeong wrote:

On Fri, Mar 22, 2024 at 9:26 PM Chao Yu  wrote:


On 2024/3/21 1:42, Daeho Jeong wrote:

On Wed, Mar 20, 2024 at 2:38 AM Chao Yu  wrote:


On 2024/3/20 5:23, Daeho Jeong wrote:

From: Daeho Jeong 

In a case writing without fallocate(), we can't guarantee it's allocated
in the conventional area for zoned stroage.

Signed-off-by: Daeho Jeong 
---
v2: covered the direct io case
v3: covered the mkwrite case
---
fs/f2fs/data.c | 14 --
fs/f2fs/file.c | 16 
2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index c21b92f18463..d3e5ab2736a6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1584,8 +1584,11 @@ int f2fs_map_blocks(struct inode *inode, struct 
f2fs_map_blocks *map, int flag)

/* use out-place-update for direct IO under LFS mode */
if (map->m_may_create &&
- (is_hole || (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO))) {
- if (unlikely(f2fs_cp_error(sbi))) {
+ (is_hole || (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO &&
+  !f2fs_is_pinned_file(inode {
+ if (unlikely(f2fs_cp_error(sbi)) ||
+ (f2fs_is_pinned_file(inode) && is_hole &&
+  flag != F2FS_GET_BLOCK_PRE_DIO)) {
err = -EIO;
goto sync_out;
}
@@ -3378,6 +3381,8 @@ static int prepare_write_begin(struct f2fs_sb_info *sbi,
f2fs_map_lock(sbi, flag);
locked = true;
} else if ((pos & PAGE_MASK) >= i_size_read(inode)) {
+ if (f2fs_is_pinned_file(inode))
+ return -EIO;
f2fs_map_lock(sbi, flag);
locked = true;
}
@@ -3407,6 +3412,11 @@ static int prepare_write_begin(struct f2fs_sb_info *sbi,

if (!f2fs_lookup_read_extent_cache_block(inode, index,
 _blkaddr)) {
+ if (f2fs_is_pinned_file(inode)) {
+ err = -EIO;
+ goto out;
+ }
+
if (locked) {
err = f2fs_reserve_block(, index);
goto out;
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 82277e95c88f..4db3b21c804b 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -57,7 +57,7 @@ static vm_fault_t f2fs_vm_page_mkwrite(struct vm_fault *vmf)
struct inode *inode = file_inode(vmf->vma->vm_file);
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
struct dnode_of_data dn;
- bool need_alloc = true;
+ bool need_alloc = !f2fs_is_pinned_file(inode);


Will this check races w/ pinfile get|set?


Do you mean "set/clear" case? I believe "set" case is okay, since we


Yup,


can't set if the inode already has a data block. For "clear" case, I


However, we can set pinfile on written inode in regular block device:


You're right. I missed it. Maybe I think we should keep the concept
consistent across devices regardless of zoned storage support. How
about preventing file pinning for already written inodes across all
device types? I am changing the pinfile concept by allowing the users


I'm okay with that, or we can tries to migrate data block of target file
from seq-zone to conv-zone or regular-device before setting it w/ pin flag...

Thanks,


to write on only fallocate()-ed space.



 if (f2fs_sb_has_blkzoned(sbi) && F2FS_HAS_BLOCKS(inode)) {
 ret = -EFBIG;
 goto out;
 }

Should we add the logic only if blkzoned feture is enabled?


believe mkwrite failure is okay in racy conditions caused by clearing
the pin flag. What do you think?


Or we can use filemap_invalidate_lock() in f2fs_ioc_set_pin_file() to
avoid the race condition?

Thanks,





Thanks,


int err = 0;
vm_fault_t ret;

@@ -114,19 +114,15 @@ static vm_fault_t f2fs_vm_page_mkwrite(struct vm_fault 
*vmf)
goto out_sem;
}

+ set_new_dnode(, inode, NULL, NULL, 0);
if (need_alloc) {
/* block allocation */
- set_new_dnode(, inode, NULL, NULL, 0);
err = f2fs_get_block_locked(, page->index);
- }
-
-#ifdef CONFIG_F2FS_FS_COMPRESSION
- if (!need_alloc) {
- set_new_dnode(, inode, NULL, NULL, 0);
+ } else {
err = f2fs_get_dnode_of_data(, page->index, LOOKUP_NODE);
f2fs_put_dnode();
}
-#endif
+
if (err) {
unlock_page(page);
goto out_sem;
@@ -4611,6 +4607,10 @@ static int f2fs_preallocate_blocks(struct kiocb *iocb, 
struct iov_iter *iter,
return ret;
}

+ /* For pinned files, it should be fallocate()-ed in advance. */
+ if (f2fs_is_pinned_file(inode))
+ return 0;
+
/* Do not preallocate blocks that will be written partially in 4KB. */

Re: [f2fs-dev] [bug report]WARNING: CPU: 22 PID: 44011 at fs/iomap/iter.c:51 iomap_iter+0x32b observed with blktests zbd/010

2024-03-25 Thread Chao Yu

On 2024/3/25 14:56, Shinichiro Kawasaki wrote:

On Mar 25, 2024 / 11:06, Chao Yu wrote:

On 2024/3/25 10:14, Shinichiro Kawasaki wrote:

On Mar 24, 2024 / 20:13, Chao Yu wrote:
...

Hi Shinichiro,

Can you please check below diff? IIUC, for the case: f2fs_map_blocks()
returns zero blkaddr in non-primary device, which is a verified valid
block address, we'd better to check m_flags & F2FS_MAP_MAPPED instead
of map.m_pblk != NULL_ADDR to decide whether tagging IOMAP_MAPPED flag
or not.

---
   fs/f2fs/data.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 6f66e3e4221a..41a56d4298c8 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -4203,7 +4203,7 @@ static int f2fs_iomap_begin(struct inode *inode, loff_t 
offset, loff_t length,
if (WARN_ON_ONCE(map.m_pblk == COMPRESS_ADDR))
return -EINVAL;

-   if (map.m_pblk != NULL_ADDR) {
+   if (map.m_flags & F2FS_MAP_MAPPED) {
iomap->length = blks_to_bytes(inode, map.m_len);
iomap->type = IOMAP_MAPPED;
iomap->flags |= IOMAP_F_MERGED;



Thanks Chao, I confirmed that the diff above avoids the WARN and zbd/010
failure. From that point of view, it looks good.


Thank you for the confirmation. :)



One thing I noticed is that the commit message of 8d3c1fa3fa5ea ("f2fs:
don't rely on F2FS_MAP_* in f2fs_iomap_begin") says that f2fs_map_blocks()
might be setting F2FS_MAP_* flag on a hole, and that's why the commit
avoided the F2FS_MAP_MAPPED flag check. So I was not sure if it is the
right thing to reintroduce the flag check.


I didn't see such logic in previous f2fs_map_blocks(, F2FS_GET_BLOCK_DIO) 
codebase,
I doubt it hits the same case: map.m_pblk is valid zero blkaddr which locates in
the head of secondary device? What do you think?

Quoted commit message from 8d3c1fa3fa5ea:

When testing with a mixed zoned / convention device combination, there
are regular but not 100% reproducible failures in xfstests generic/113
where the __is_valid_data_blkaddr assert hits due to finding a hole.

Previous code:

-   if (map.m_flags & (F2FS_MAP_MAPPED | F2FS_MAP_UNWRITTEN)) {
-   iomap->length = blks_to_bytes(inode, map.m_len);
-   if (map.m_flags & F2FS_MAP_MAPPED) {
-   iomap->type = IOMAP_MAPPED;
-   iomap->flags |= IOMAP_F_MERGED;
-   } else {
-   iomap->type = IOMAP_UNWRITTEN;
-   }
-   if (WARN_ON_ONCE(!__is_valid_data_blkaddr(map.m_pblk)))
-   return -EINVAL;


Hmm, I can agree with your guess. Let me add two more points:

1) The commit message says that the generic/113 failure was not 100% recreated.
So it was difficult to confirm that the commit avoided the failure, 
probably.

2) I ran zbd/010 using the kernel just before the commit 8d3c1fa3fa5ea, and
observed the WARN in the hunk you quoted above.

  WARNING: CPU: 1 PID: 1035 at fs/f2fs/data.c:4164 
f2fs_iomap_begin+0x19e/0x1b0 [f2fs]

So, it implies that the WARN observed xfstests generic/113 has same failure
scenario as blktests zbd/010, probably.


Yup,




Based on these guesses, I think your fix diff is reasonable. If you post it as a
formal patch, feel free to add my:

Tested-by: Shin'ichiro Kawasaki 


Thank you for the test!

I've submitted a formal patch, let me know, if you have any comments on it, or 
want
to update it.

https://lore.kernel.org/linux-f2fs-devel/20240325152623.797099-1-c...@kernel.org/

Thanks,


___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH v14 8/9] ext4: Move CONFIG_UNICODE defguards into the code flow

2024-03-25 Thread Gabriel Krisman Bertazi
Eugen Hristev  writes:

> From: Gabriel Krisman Bertazi 
>
> Instead of a bunch of ifdefs, make the unicode built checks part of the
> code flow where possible, as requested by Torvalds.
>
> Signed-off-by: Gabriel Krisman Bertazi 
> [eugen.hris...@collabora.com: port to 6.8-rc3]
> Signed-off-by: Eugen Hristev 
> ---
>  fs/ext4/crypto.c | 19 +++
>  fs/ext4/ext4.h   | 33 +
>  fs/ext4/namei.c  | 14 +-
>  fs/ext4/super.c  |  4 +---
>  4 files changed, 30 insertions(+), 40 deletions(-)
>
> diff --git a/fs/ext4/crypto.c b/fs/ext4/crypto.c
> index 7ae0b61258a7..1d2f8b79529c 100644
> --- a/fs/ext4/crypto.c
> +++ b/fs/ext4/crypto.c
> @@ -31,12 +31,7 @@ int ext4_fname_setup_filename(struct inode *dir, const 
> struct qstr *iname,
>  
>   ext4_fname_from_fscrypt_name(fname, );
>  
> -#if IS_ENABLED(CONFIG_UNICODE)
> - err = ext4_fname_setup_ci_filename(dir, iname, fname);
> - if (err)
> - ext4_fname_free_filename(fname);
> -#endif
> - return err;
> + return ext4_fname_setup_ci_filename(dir, iname, fname);
>  }

Oops. I ended up replying to v10 but it still applies in the latest
version. copying it here for reference:

  This shouldn't remove the error path.  It effectively reintroduces the
  memory leak fixed by commit 7ca4b085f430 ("ext4: fix memory leaks in
  ext4_fname_{setup_filename,prepare_lookup}").

  This patch was only about inlining the codeguards, so it shouldn't be
  changing the logic.

>  int ext4_fname_prepare_lookup(struct inode *dir, struct dentry *dentry,
> @@ -51,12 +46,7 @@ int ext4_fname_prepare_lookup(struct inode *dir, struct 
> dentry *dentry,
>  
>   ext4_fname_from_fscrypt_name(fname, );
>  
> -#if IS_ENABLED(CONFIG_UNICODE)
> - err = ext4_fname_setup_ci_filename(dir, >d_name, fname);
> - if (err)
> - ext4_fname_free_filename(fname);
> -#endif
> - return err;
> + return ext4_fname_setup_ci_filename(dir, >d_name, fname);
>  }
>  
>  void ext4_fname_free_filename(struct ext4_filename *fname)
> @@ -70,10 +60,7 @@ void ext4_fname_free_filename(struct ext4_filename *fname)
>   fname->usr_fname = NULL;
>   fname->disk_name.name = NULL;
>  
> -#if IS_ENABLED(CONFIG_UNICODE)
> - kfree(fname->cf_name.name);
> - fname->cf_name.name = NULL;
> -#endif
> + ext4_fname_free_ci_filename(fname);
>  }
>  
>  static bool uuid_is_zero(__u8 u[16])
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 4061d11b9763..c68f48f706cd 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -2740,8 +2740,25 @@ ext4_fsblk_t ext4_inode_to_goal_block(struct inode *);
>  
>  #if IS_ENABLED(CONFIG_UNICODE)
>  extern int ext4_fname_setup_ci_filename(struct inode *dir,
> -  const struct qstr *iname,
> -  struct ext4_filename *fname);
> + const struct qstr *iname,
> + struct ext4_filename *fname);
> +
> +static inline void ext4_fname_free_ci_filename(struct ext4_filename *fname)
> +{
> + kfree(fname->cf_name.name);
> + fname->cf_name.name = NULL;
> +}
> +#else
> +static inline int ext4_fname_setup_ci_filename(struct inode *dir,
> +const struct qstr *iname,
> +struct ext4_filename *fname)
> +{
> + return 0;
> +}
> +
> +static inline void ext4_fname_free_ci_filename(struct ext4_filename *fname)
> +{
> +}
>  #endif
>  
>  /* ext4 encryption related stuff goes here crypto.c */
> @@ -2764,16 +2781,11 @@ static inline int ext4_fname_setup_filename(struct 
> inode *dir,
>   int lookup,
>   struct ext4_filename *fname)
>  {
> - int err = 0;
>   fname->usr_fname = iname;
>   fname->disk_name.name = (unsigned char *) iname->name;
>   fname->disk_name.len = iname->len;
>  
> -#if IS_ENABLED(CONFIG_UNICODE)
> - err = ext4_fname_setup_ci_filename(dir, iname, fname);
> -#endif
> -
> - return err;
> + return ext4_fname_setup_ci_filename(dir, iname, fname);
>  }
>  
>  static inline int ext4_fname_prepare_lookup(struct inode *dir,
> @@ -2785,10 +2797,7 @@ static inline int ext4_fname_prepare_lookup(struct 
> inode *dir,
>  
>  static inline void ext4_fname_free_filename(struct ext4_filename *fname)
>  {
> -#if IS_ENABLED(CONFIG_UNICODE)
> - kfree(fname->cf_name.name);
> - fname->cf_name.name = NULL;
> -#endif
> + ext4_fname_free_ci_filename(fname);
>  }
>  
>  static inline int ext4_ioctl_get_encryption_pwsalt(struct file *filp,
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 3268cf45d9db..a5d9e5b01015 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1834,8 +1834,7 @@ static struct dentry *ext4_lookup(struct inode *dir, 
> struct dentry *dentry, unsi
>   }
>   }
>  
> -#if IS_ENABLED(CONFIG_UNICODE)
> -   

[f2fs-dev] [PATCH 2/2] f2fs: support to map continuous holes or preallocated address

2024-03-25 Thread Chao Yu
This patch supports to map continuous holes or preallocated addresses
to improve performace of lookuping mapping info during read DIO.

[testcase 1]
xfs_io -f /mnt/f2fs/hole -c "truncate 1m" -c "fsync"
xfs_io -d /mnt/f2fs/hole -c "pread -b 1m 0 1m"

[before]
f2fs_direct_IO_enter: dev = (253,16), ino = 6 pos = 0 len = 1048576 ki_flags = 
2 ki_ioprio = 0 rw = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 0, start blkaddr = 0x0, 
len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag = 3, 
err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 1, start blkaddr = 0x0, 
len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag = 3, 
err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 2, start blkaddr = 0x0, 
len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag = 3, 
err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 3, start blkaddr = 0x0, 
len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag = 3, 
err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 4, start blkaddr = 0x0, 
len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag = 3, 
err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 5, start blkaddr = 0x0, 
len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag = 3, 
err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 6, start blkaddr = 0x0, 
len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag = 3, 
err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 7, start blkaddr = 0x0, 
len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag = 3, 
err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 8, start blkaddr = 0x0, 
len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag = 3, 
err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 9, start blkaddr = 0x0, 
len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag = 3, 
err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 10, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 11, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 12, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 13, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 14, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 15, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 16, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
..
f2fs_direct_IO_exit: dev = (253,16), ino = 6 pos = 0 len = 1048576 rw = 0 ret = 
1048576

[after]
f2fs_direct_IO_enter: dev = (253,16), ino = 6 pos = 0 len = 1048576 ki_flags = 
2 ki_ioprio = 0 rw = 0
f2fs_map_blocks: dev = (253,16), ino = 6, file offset = 0, start blkaddr = 0x0, 
len = 0x100, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag = 
3, err = 0
f2fs_direct_IO_exit: dev = (253,16), ino = 6 pos = 0 len = 1048576 rw = 0 ret = 
1048576

[testcase 2]
xfs_io -f /mnt/f2fs/preallocated -c "falloc 0 1m" -c "fsync"
xfs_io -d /mnt/f2fs/preallocated -c "pread -b 1m 0 1m"

[before]
f2fs_direct_IO_enter: dev = (253,16), ino = 11 pos = 0 len = 1048576 ki_flags = 
2 ki_ioprio = 0 rw = 0
f2fs_map_blocks: dev = (253,16), ino = 11, file offset = 0, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
f2fs_map_blocks: dev = (253,16), ino = 11, file offset = 1, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
f2fs_map_blocks: dev = (253,16), ino = 11, file offset = 2, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
f2fs_map_blocks: dev = (253,16), ino = 11, file offset = 3, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
f2fs_map_blocks: dev = (253,16), ino = 11, file offset = 4, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0
f2fs_map_blocks: dev = (253,16), ino = 11, file offset = 5, start blkaddr = 
0x0, len = 0x0, flags = 0, seg_type = 1, may_create = 0, multidevice = 0, flag 
= 3, err = 0

[f2fs-dev] [PATCH 1/2] f2fs: introduce map_is_mergeable() for cleanup

2024-03-25 Thread Chao Yu
No logic changes.

Signed-off-by: Chao Yu 
---
 fs/f2fs/data.c | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 5ef1874b572a..9c000ca4f808 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1507,6 +1507,23 @@ static bool f2fs_map_blocks_cached(struct inode *inode,
return true;
 }
 
+static bool map_is_mergeable(struct f2fs_sb_info *sbi,
+   struct f2fs_map_blocks *map,
+   block_t blkaddr, int flag, int bidx,
+   int ofs)
+{
+   if (map->m_multidev_dio && map->m_bdev != FDEV(bidx).bdev)
+   return false;
+   if (map->m_pblk != NEW_ADDR &&
+   blkaddr == (map->m_pblk + ofs))
+   return true;
+   if (map->m_pblk == NEW_ADDR && blkaddr == NEW_ADDR)
+   return true;
+   if (flag == F2FS_GET_BLOCK_PRE_DIO)
+   return true;
+   return false;
+}
+
 /*
  * f2fs_map_blocks() tries to find or build mapping relationship which
  * maps continuous logical blocks to physical blocks, and return such
@@ -1653,12 +1670,7 @@ int f2fs_map_blocks(struct inode *inode, struct 
f2fs_map_blocks *map, int flag)
 
if (map->m_multidev_dio)
map->m_bdev = FDEV(bidx).bdev;
-   } else if ((map->m_pblk != NEW_ADDR &&
-   blkaddr == (map->m_pblk + ofs)) ||
-   (map->m_pblk == NEW_ADDR && blkaddr == NEW_ADDR) ||
-   flag == F2FS_GET_BLOCK_PRE_DIO) {
-   if (map->m_multidev_dio && map->m_bdev != FDEV(bidx).bdev)
-   goto sync_out;
+   } else if (map_is_mergeable(sbi, map, blkaddr, flag, bidx, ofs)) {
ofs++;
map->m_len++;
} else {
-- 
2.40.1



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


[f2fs-dev] [PATCH] f2fs: multidev: fix to recognize valid zero block address

2024-03-25 Thread Chao Yu
As reported by Yi Zhang in mailing list [1], kernel warning was catched
during zbd/010 test as below:

./check zbd/010
zbd/010 (test gap zone support with F2FS)[failed]
runtime...  3.752s
something found in dmesg:
[ 4378.146781] run blktests zbd/010 at 2024-02-18 11:31:13
[ 4378.192349] null_blk: module loaded
[ 4378.209860] null_blk: disk nullb0 created
[ 4378.413285] scsi_debug:sdebug_driver_probe: scsi_debug: trim
poll_queues to 0. poll_q/nr_hw = (0/1)
[ 4378.422334] scsi host15: scsi_debug: version 0191 [20210520]
 dev_size_mb=1024, opts=0x0, submit_queues=1, statistics=0
[ 4378.434922] scsi 15:0:0:0: Direct-Access-ZBC Linux
scsi_debug   0191 PQ: 0 ANSI: 7
[ 4378.443343] scsi 15:0:0:0: Power-on or device reset occurred
[ 4378.449371] sd 15:0:0:0: Attached scsi generic sg5 type 20
[ 4378.449418] sd 15:0:0:0: [sdf] Host-managed zoned block device
...
(See 
'/mnt/tests/gitlab.com/api/v4/projects/19168116/repository/archive.zip/storage/blktests/blk/blktests/results/nodev/zbd/010.dmesg'

WARNING: CPU: 22 PID: 44011 at fs/iomap/iter.c:51
CPU: 22 PID: 44011 Comm: fio Not tainted 6.8.0-rc3+ #1
RIP: 0010:iomap_iter+0x32b/0x350
Call Trace:
 
 __iomap_dio_rw+0x1df/0x830
 f2fs_file_read_iter+0x156/0x3d0 [f2fs]
 aio_read+0x138/0x210
 io_submit_one+0x188/0x8c0
 __x64_sys_io_submit+0x8c/0x1a0
 do_syscall_64+0x86/0x170
 entry_SYSCALL_64_after_hwframe+0x6e/0x76

Shinichiro Kawasaki helps to analyse this issue and proposes a potential
fixing patch in [2].

Quoted from reply of Shinichiro Kawasaki:

"I confirmed that the trigger commit is dbf8e63f48af as Yi reported. I took a
look in the commit, but it looks fine to me. So I thought the cause is not
in the commit diff.

I found the WARN is printed when the f2fs is set up with multiple devices,
and read requests are mapped to the very first block of the second device in the
direct read path. In this case, f2fs_map_blocks() and f2fs_map_blocks_cached()
modify map->m_pblk as the physical block address from each block device. It
becomes zero when it is mapped to the first block of the device. However,
f2fs_iomap_begin() assumes that map->m_pblk is the physical block address of the
whole f2fs, across the all block devices. It compares map->m_pblk against
NULL_ADDR == 0, then go into the unexpected branch and sets the invalid
iomap->length. The WARN catches the invalid iomap->length.

This WARN is printed even for non-zoned block devices, by following steps.

 - Create two (non-zoned) null_blk devices memory backed with 128MB size each:
   nullb0 and nullb1.
 # mkfs.f2fs /dev/nullb0 -c /dev/nullb1
 # mount -t f2fs /dev/nullb0 "${mount_dir}"
 # dd if=/dev/zero of="${mount_dir}/test.dat" bs=1M count=192
 # dd if="${mount_dir}/test.dat" of=/dev/null bs=1M count=192 iflag=direct

..."

So, the root cause of this issue is: when multi-devices feature is on,
f2fs_map_blocks() may return zero blkaddr in non-primary device, which is
a verified valid block address, however, f2fs_iomap_begin() treats it as
an invalid block address, and then it triggers the warning in iomap
framework code.

Finally, as discussed, we decide to use a more simple and direct way that
checking (map.m_flags & F2FS_MAP_MAPPED) condition instead of
(map.m_pblk != NULL_ADDR) to fix this issue.

Thanks a lot for the effort of Yi Zhang and Shinichiro Kawasaki on this
issue.

[1] 
https://lore.kernel.org/linux-f2fs-devel/CAHj4cs-kfojYC9i0G73PRkYzcxCTex=-vugrfep40g_urgv...@mail.gmail.com/
[2] 
https://lore.kernel.org/linux-f2fs-devel/gngdj77k4picagsfdtiaa7gpgnup6fsgwzsltx6milmhegmjff@iax2n4wvrqye/

Reported-by: Yi Zhang 
Closes: 
https://lore.kernel.org/linux-f2fs-devel/CAHj4cs-kfojYC9i0G73PRkYzcxCTex=-vugrfep40g_urgv...@mail.gmail.com/
Tested-by: Shin'ichiro Kawasaki 
Fixes: 1517c1a7a445 ("f2fs: implement iomap operations")
Fixes: 8d3c1fa3fa5e ("f2fs: don't rely on F2FS_MAP_* in f2fs_iomap_begin")
Signed-off-by: Chao Yu 
---
 fs/f2fs/data.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index d9494b5fc7c1..5ef1874b572a 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -4185,7 +4185,7 @@ static int f2fs_iomap_begin(struct inode *inode, loff_t 
offset, loff_t length,
if (WARN_ON_ONCE(map.m_pblk == COMPRESS_ADDR))
return -EINVAL;
 
-   if (map.m_pblk != NULL_ADDR) {
+   if (map.m_flags & F2FS_MAP_MAPPED) {
iomap->length = blks_to_bytes(inode, map.m_len);
iomap->type = IOMAP_MAPPED;
iomap->flags |= IOMAP_F_MERGED;
-- 
2.40.1



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH v3] f2fs: prevent writing without fallocate() for pinned files

2024-03-25 Thread Daeho Jeong
On Fri, Mar 22, 2024 at 9:26 PM Chao Yu  wrote:
>
> On 2024/3/21 1:42, Daeho Jeong wrote:
> > On Wed, Mar 20, 2024 at 2:38 AM Chao Yu  wrote:
> >>
> >> On 2024/3/20 5:23, Daeho Jeong wrote:
> >>> From: Daeho Jeong 
> >>>
> >>> In a case writing without fallocate(), we can't guarantee it's allocated
> >>> in the conventional area for zoned stroage.
> >>>
> >>> Signed-off-by: Daeho Jeong 
> >>> ---
> >>> v2: covered the direct io case
> >>> v3: covered the mkwrite case
> >>> ---
> >>>fs/f2fs/data.c | 14 --
> >>>fs/f2fs/file.c | 16 
> >>>2 files changed, 20 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> >>> index c21b92f18463..d3e5ab2736a6 100644
> >>> --- a/fs/f2fs/data.c
> >>> +++ b/fs/f2fs/data.c
> >>> @@ -1584,8 +1584,11 @@ int f2fs_map_blocks(struct inode *inode, struct 
> >>> f2fs_map_blocks *map, int flag)
> >>>
> >>>/* use out-place-update for direct IO under LFS mode */
> >>>if (map->m_may_create &&
> >>> - (is_hole || (f2fs_lfs_mode(sbi) && flag == 
> >>> F2FS_GET_BLOCK_DIO))) {
> >>> - if (unlikely(f2fs_cp_error(sbi))) {
> >>> + (is_hole || (f2fs_lfs_mode(sbi) && flag == F2FS_GET_BLOCK_DIO &&
> >>> +  !f2fs_is_pinned_file(inode {
> >>> + if (unlikely(f2fs_cp_error(sbi)) ||
> >>> + (f2fs_is_pinned_file(inode) && is_hole &&
> >>> +  flag != F2FS_GET_BLOCK_PRE_DIO)) {
> >>>err = -EIO;
> >>>goto sync_out;
> >>>}
> >>> @@ -3378,6 +3381,8 @@ static int prepare_write_begin(struct f2fs_sb_info 
> >>> *sbi,
> >>>f2fs_map_lock(sbi, flag);
> >>>locked = true;
> >>>} else if ((pos & PAGE_MASK) >= i_size_read(inode)) {
> >>> + if (f2fs_is_pinned_file(inode))
> >>> + return -EIO;
> >>>f2fs_map_lock(sbi, flag);
> >>>locked = true;
> >>>}
> >>> @@ -3407,6 +3412,11 @@ static int prepare_write_begin(struct f2fs_sb_info 
> >>> *sbi,
> >>>
> >>>if (!f2fs_lookup_read_extent_cache_block(inode, index,
> >>> _blkaddr)) {
> >>> + if (f2fs_is_pinned_file(inode)) {
> >>> + err = -EIO;
> >>> + goto out;
> >>> + }
> >>> +
> >>>if (locked) {
> >>>err = f2fs_reserve_block(, index);
> >>>goto out;
> >>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> >>> index 82277e95c88f..4db3b21c804b 100644
> >>> --- a/fs/f2fs/file.c
> >>> +++ b/fs/f2fs/file.c
> >>> @@ -57,7 +57,7 @@ static vm_fault_t f2fs_vm_page_mkwrite(struct vm_fault 
> >>> *vmf)
> >>>struct inode *inode = file_inode(vmf->vma->vm_file);
> >>>struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> >>>struct dnode_of_data dn;
> >>> - bool need_alloc = true;
> >>> + bool need_alloc = !f2fs_is_pinned_file(inode);
> >>
> >> Will this check races w/ pinfile get|set?
> >
> > Do you mean "set/clear" case? I believe "set" case is okay, since we
>
> Yup,
>
> > can't set if the inode already has a data block. For "clear" case, I
>
> However, we can set pinfile on written inode in regular block device:

You're right. I missed it. Maybe I think we should keep the concept
consistent across devices regardless of zoned storage support. How
about preventing file pinning for already written inodes across all
device types? I am changing the pinfile concept by allowing the users
to write on only fallocate()-ed space.

>
> if (f2fs_sb_has_blkzoned(sbi) && F2FS_HAS_BLOCKS(inode)) {
> ret = -EFBIG;
> goto out;
> }
>
> Should we add the logic only if blkzoned feture is enabled?
>
> > believe mkwrite failure is okay in racy conditions caused by clearing
> > the pin flag. What do you think?
>
> Or we can use filemap_invalidate_lock() in f2fs_ioc_set_pin_file() to
> avoid the race condition?
>
> Thanks,
>
> >
> >>
> >> Thanks,
> >>
> >>>int err = 0;
> >>>vm_fault_t ret;
> >>>
> >>> @@ -114,19 +114,15 @@ static vm_fault_t f2fs_vm_page_mkwrite(struct 
> >>> vm_fault *vmf)
> >>>goto out_sem;
> >>>}
> >>>
> >>> + set_new_dnode(, inode, NULL, NULL, 0);
> >>>if (need_alloc) {
> >>>/* block allocation */
> >>> - set_new_dnode(, inode, NULL, NULL, 0);
> >>>err = f2fs_get_block_locked(, page->index);
> >>> - }
> >>> -
> >>> -#ifdef CONFIG_F2FS_FS_COMPRESSION
> >>> - if (!need_alloc) {
> >>> - set_new_dnode(, inode, NULL, NULL, 0);
> >>> + } else {
> >>>err = f2fs_get_dnode_of_data(, page->index, 
> >>> LOOKUP_NODE);
> >>>f2fs_put_dnode();
> >>>}
> >>> -#endif
> >>> +
> >>>if (err) {
> >>>unlock_page(page);
> >>>   

Re: [f2fs-dev] [bug report]WARNING: CPU: 22 PID: 44011 at fs/iomap/iter.c:51 iomap_iter+0x32b observed with blktests zbd/010

2024-03-25 Thread Shinichiro Kawasaki via Linux-f2fs-devel
On Mar 25, 2024 / 11:06, Chao Yu wrote:
> On 2024/3/25 10:14, Shinichiro Kawasaki wrote:
> > On Mar 24, 2024 / 20:13, Chao Yu wrote:
> > ...
> > > Hi Shinichiro,
> > > 
> > > Can you please check below diff? IIUC, for the case: f2fs_map_blocks()
> > > returns zero blkaddr in non-primary device, which is a verified valid
> > > block address, we'd better to check m_flags & F2FS_MAP_MAPPED instead
> > > of map.m_pblk != NULL_ADDR to decide whether tagging IOMAP_MAPPED flag
> > > or not.
> > > 
> > > ---
> > >   fs/f2fs/data.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > > index 6f66e3e4221a..41a56d4298c8 100644
> > > --- a/fs/f2fs/data.c
> > > +++ b/fs/f2fs/data.c
> > > @@ -4203,7 +4203,7 @@ static int f2fs_iomap_begin(struct inode *inode, 
> > > loff_t offset, loff_t length,
> > >   if (WARN_ON_ONCE(map.m_pblk == COMPRESS_ADDR))
> > >   return -EINVAL;
> > > 
> > > - if (map.m_pblk != NULL_ADDR) {
> > > + if (map.m_flags & F2FS_MAP_MAPPED) {
> > >   iomap->length = blks_to_bytes(inode, map.m_len);
> > >   iomap->type = IOMAP_MAPPED;
> > >   iomap->flags |= IOMAP_F_MERGED;
> > > 
> > 
> > Thanks Chao, I confirmed that the diff above avoids the WARN and zbd/010
> > failure. From that point of view, it looks good.
> 
> Thank you for the confirmation. :)
> 
> > 
> > One thing I noticed is that the commit message of 8d3c1fa3fa5ea ("f2fs:
> > don't rely on F2FS_MAP_* in f2fs_iomap_begin") says that f2fs_map_blocks()
> > might be setting F2FS_MAP_* flag on a hole, and that's why the commit
> > avoided the F2FS_MAP_MAPPED flag check. So I was not sure if it is the
> > right thing to reintroduce the flag check.
> 
> I didn't see such logic in previous f2fs_map_blocks(, F2FS_GET_BLOCK_DIO) 
> codebase,
> I doubt it hits the same case: map.m_pblk is valid zero blkaddr which locates 
> in
> the head of secondary device? What do you think?
> 
> Quoted commit message from 8d3c1fa3fa5ea:
> 
> When testing with a mixed zoned / convention device combination, there
> are regular but not 100% reproducible failures in xfstests generic/113
> where the __is_valid_data_blkaddr assert hits due to finding a hole.
> 
> Previous code:
> 
> - if (map.m_flags & (F2FS_MAP_MAPPED | F2FS_MAP_UNWRITTEN)) {
> - iomap->length = blks_to_bytes(inode, map.m_len);
> - if (map.m_flags & F2FS_MAP_MAPPED) {
> - iomap->type = IOMAP_MAPPED;
> - iomap->flags |= IOMAP_F_MERGED;
> - } else {
> - iomap->type = IOMAP_UNWRITTEN;
> - }
> - if (WARN_ON_ONCE(!__is_valid_data_blkaddr(map.m_pblk)))
> - return -EINVAL;

Hmm, I can agree with your guess. Let me add two more points:

1) The commit message says that the generic/113 failure was not 100% recreated.
   So it was difficult to confirm that the commit avoided the failure, probably.

2) I ran zbd/010 using the kernel just before the commit 8d3c1fa3fa5ea, and
   observed the WARN in the hunk you quoted above.

 WARNING: CPU: 1 PID: 1035 at fs/f2fs/data.c:4164 
f2fs_iomap_begin+0x19e/0x1b0 [f2fs]

   So, it implies that the WARN observed xfstests generic/113 has same failure
   scenario as blktests zbd/010, probably.


Based on these guesses, I think your fix diff is reasonable. If you post it as a
formal patch, feel free to add my:

Tested-by: Shin'ichiro Kawasaki 


___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel