On 8/28/2025 9:44 AM, Chao Yu wrote:
> On 8/26/25 09:48, 王晓珺 wrote:
>> On 8/25/2025 10:08 AM, Chao Yu wrote:
>>> On 8/20/25 15:54, Wang Xiaojun wrote:
>>>> This patch fixes missing space reclamation during the recovery process.
>>>>
>>>> In the following scenarios, F2FS cannot reclaim truncated space.
>>>> case 1:
>>>> write file A, size is 1G | CP | truncate A to 1M | fsync A | SPO
>>>>
>>>> case 2:
>>>> CP | write file A, size is 1G | fsync A | truncate A to 1M | fsync A | SPO
>>>>
>>>> During the recovery process, F2FS will recover file A,
>>>> but the 1M-1G space cannot be reclaimed.
>>>>
>>>> But the combination of truncate and falloc complicates the recovery
>>>> process.
>>>> For example, in the following scenario:
>>>> write fileA 2M | fsync | truncate 256K | falloc -k 256K 1M | fsync A | SPO
>>>> The falloc (256K, 1M) needs to be recovered as pre-allocated space.
>>>>
>>>> However, in the following scenario the situation is the opposite.
>>>> write fileA 2M | fsync | falloc -k 2M 10M | fsync A | truncate 256K |
>>>> fsync A | SPO
>>>> In this scenario, the space allocated by falloc needs to be truncated.
>>>>
>>>> During the recovery process, it is difficult to distinguish
>>>> between the above two types of falloc.
>>>>
>>>> So after falloc -k, fsync needs to trigger a checkpoint.
>>>>
>>>> Fixes: d624c96fb3249 ("f2fs: add recovery routines for roll-forward")
>>>>
>>>> Signed-off-by: Wang Xiaojun <wangxiao...@vivo.com>
>>>> ---
>>>> v4: Trigger checkpoint for fsync after falloc -k
>>>> v3: Add a Fixes line.
>>>> v2: Apply Chao's suggestion from v1. No logical changes.
>>>> v1: Fix missing space reclamation during the recovery process.
>>>> ---
>>>>    fs/f2fs/checkpoint.c |  3 +++
>>>>    fs/f2fs/f2fs.h       |  3 +++
>>>>    fs/f2fs/file.c       |  8 ++++++--
>>>>    fs/f2fs/recovery.c   | 18 +++++++++++++++++-
>>>>    4 files changed, 29 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>>>> index db3831f7f2f5..775e3333097e 100644
>>>> --- a/fs/f2fs/checkpoint.c
>>>> +++ b/fs/f2fs/checkpoint.c
>>>> @@ -1151,6 +1151,9 @@ static int f2fs_sync_inode_meta(struct f2fs_sb_info *sbi)
>>>>                    if (inode) {
>>>>                            sync_inode_metadata(inode, 0);
>>>>    
>>>> +                  if (is_inode_flag_set(inode, FI_FALLOC_KEEP_SIZE))
>>>> +                          clear_inode_flag(inode, FI_FALLOC_KEEP_SIZE);
>>>> +
>>>>                            /* it's on eviction */
>>>>                            if (is_inode_flag_set(inode, FI_DIRTY_INODE))
>>>>                                    f2fs_update_inode_page(inode);
>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>> index 46be7560548c..f5a54bc848d5 100644
>>>> --- a/fs/f2fs/f2fs.h
>>>> +++ b/fs/f2fs/f2fs.h
>>>> @@ -459,6 +459,7 @@ struct fsync_inode_entry {
>>>>            struct inode *inode;    /* vfs inode pointer */
>>>>            block_t blkaddr;        /* block address locating the last fsync */
>>>>            block_t last_dentry;    /* block address locating the last dentry */
>>>> +  loff_t max_i_size;      /* previous max file size for truncate */
>>>>    };
>>>>    
>>>>    #define nats_in_cursum(jnl)             (le16_to_cpu((jnl)->n_nats))
>>>> @@ -835,6 +836,7 @@ enum {
>>>>            FI_ATOMIC_REPLACE,      /* indicate atomic replace */
>>>>            FI_OPENED_FILE,         /* indicate file has been opened */
>>>>            FI_DONATE_FINISHED,     /* indicate page donation of file has been finished */
>>>> +  FI_FALLOC_KEEP_SIZE,    /* file allocate reserved space and keep size */
>>>>            FI_MAX,                 /* max flag, never be used */
>>>>    };
>>>>    
>>>> @@ -1193,6 +1195,7 @@ enum cp_reason_type {
>>>>            CP_SPEC_LOG_NUM,
>>>>            CP_RECOVER_DIR,
>>>>            CP_XATTR_DIR,
>>>> +  CP_FALLOC_FILE,
>>>>    };
>>>>    
>>>>    enum iostat_type {
>>>> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>>>> index 42faaed6a02d..f0820f817824 100644
>>>> --- a/fs/f2fs/file.c
>>>> +++ b/fs/f2fs/file.c
>>>> @@ -236,6 +236,8 @@ static inline enum cp_reason_type need_do_checkpoint(struct inode *inode)
>>>>            else if (f2fs_exist_written_data(sbi, F2FS_I(inode)->i_pino,
>>>>                                                            XATTR_DIR_INO))
>>>>                    cp_reason = CP_XATTR_DIR;
>>>> +  else if (is_inode_flag_set(inode, FI_FALLOC_KEEP_SIZE))
>>>> +          cp_reason = CP_FALLOC_FILE;
>>>>    
>>>>            return cp_reason;
>>>>    }
>>>> @@ -1953,10 +1955,12 @@ static int f2fs_expand_inode_data(struct inode *inode, loff_t offset,
>>>>            }
>>>>    
>>>>            if (new_size > i_size_read(inode)) {
>>>> -          if (mode & FALLOC_FL_KEEP_SIZE)
>>>> +          if (mode & FALLOC_FL_KEEP_SIZE) {
>>>> +                  set_inode_flag(inode, FI_FALLOC_KEEP_SIZE);
>>> Xiaojun,
>>>
>>> Well, what about this case?
>>>
>>> falloc -k ofs size file
>>> flush all data and metadata of file
>> Hi Chao,
>> Flush all data and metadata of file, but without using fsync or CP?
> Xiaojun,
>
> I think so, or am I missing something?
>
> Thanks,

Hi Chao,
I think this case is possible. Thank you for pointing out this issue.
I will fix it in the next version.
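
To check that I read the case correctly, here is a toy user-space model of it. It is not f2fs code. It assumes that FI_FALLOC_KEEP_SIZE lives only in the in-memory inode flags, so it is lost once the inode is evicted and read back. The struct and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the in-memory inode state; not struct f2fs_inode_info. */
struct toy_inode {
	bool falloc_keep_size;	/* models FI_FALLOC_KEEP_SIZE (assumed in-memory only) */
};

/* Models fallocate() with FALLOC_FL_KEEP_SIZE beyond i_size: the flag is set. */
void toy_falloc_keep_size(struct toy_inode *in)
{
	assert(in != NULL);
	in->falloc_keep_size = true;
}

/*
 * Models "flush all data and metadata, then evict the inode" followed by a
 * later lookup: the inode is rebuilt from disk, so in-memory flags start clear.
 */
void toy_evict_and_reload(struct toy_inode *in)
{
	assert(in != NULL);
	in->falloc_keep_size = false;
}

/* Models the new branch in need_do_checkpoint(): checkpoint only if the flag is set. */
bool toy_fsync_needs_checkpoint(const struct toy_inode *in)
{
	assert(in != NULL);
	return in->falloc_keep_size;
}
```

In this model, fsync asks for a checkpoint right after falloc -k, but no longer does once the inode has been evicted and reloaded, which is the gap in your case.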

Thanks,

>> Thanks,
>>
>>> evict inode
>>> write file & fsync file won't trigger a checkpoint?
>>>
>>> Or am I missing something?
>>>
>>> Thanks,
>>>
>>>>                            file_set_keep_isize(inode);
>>>> -          else
>>>> +          } else {
>>>>                            f2fs_i_size_write(inode, new_size);
>>>> +          }
>>>>            }
>>>>    
>>>>            return err;
>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>> index 4cb3a91801b4..68b62c8a74d3 100644
>>>> --- a/fs/f2fs/recovery.c
>>>> +++ b/fs/f2fs/recovery.c
>>>> @@ -95,6 +95,7 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
>>>>            entry = f2fs_kmem_cache_alloc(fsync_entry_slab,
>>>>                                            GFP_F2FS_ZERO, true, NULL);
>>>>            entry->inode = inode;
>>>> +  entry->max_i_size = i_size_read(inode);
>>>>            list_add_tail(&entry->list, head);
>>>>    
>>>>            return entry;
>>>> @@ -796,6 +797,7 @@ static int recover_data(struct f2fs_sb_info *sbi, struct list_head *inode_list,
>>>>            while (1) {
>>>>                    struct fsync_inode_entry *entry;
>>>>                    struct folio *folio;
>>>> +          loff_t i_size;
>>>>    
>>>>                    if (!f2fs_is_valid_blkaddr(sbi, blkaddr, META_POR))
>>>>                            break;
>>>> @@ -828,6 +830,9 @@ static int recover_data(struct f2fs_sb_info *sbi, struct list_head *inode_list,
>>>>                                    break;
>>>>                            }
>>>>                            recovered_inode++;
>>>> +                  i_size = i_size_read(entry->inode);
>>>> +                  if (entry->max_i_size < i_size)
>>>> +                          entry->max_i_size = i_size;
>>>>                    }
>>>>                    if (entry->last_dentry == blkaddr) {
>>>>                            err = recover_dentry(entry->inode, folio, dir_list);
>>>> @@ -844,8 +849,19 @@ static int recover_data(struct f2fs_sb_info *sbi, struct list_head *inode_list,
>>>>                    }
>>>>                    recovered_dnode++;
>>>>    
>>>> -          if (entry->blkaddr == blkaddr)
>>>> +          if (entry->blkaddr == blkaddr) {
>>>> +                  i_size = i_size_read(entry->inode);
>>>> +                  if (entry->max_i_size > i_size) {
>>>> +                          err = f2fs_truncate_blocks(entry->inode,
>>>> +                                                  i_size, false);
>>>> +                          if (err) {
>>>> +                                  f2fs_folio_put(folio, true);
>>>> +                                  break;
>>>> +                          }
>>>> +                          f2fs_mark_inode_dirty_sync(entry->inode, true);
>>>> +                  }
>>>>                            list_move_tail(&entry->list, tmp_inode_list);
>>>> +          }
>>>>    next:
>>>>                    ra_blocks = adjust_por_ra_blocks(sbi, ra_blocks, blkaddr,
>>>>                                            next_blkaddr_of_node(folio));
>>
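
For reference, the size tracking in the recover_data() hunk quoted above can be modeled in user space as below. This is a simplified sketch, not the kernel code: toy_off_t and toy_truncate_point() are hypothetical names, the sizes stand in for the i_size seen at the checkpoint and at each recovered fsync, and the falloc -k cases are not modeled.

```c
#include <assert.h>
#include <stddef.h>

typedef long long toy_off_t;	/* hypothetical stand-in for loff_t */

/*
 * cp_size:     i_size at the last checkpoint (what add_fsync_inode() would see)
 * fsync_sizes: i_size recorded by each fsynced inode block, in log order
 *
 * Returns the offset from which blocks should be truncated after replay,
 * or -1 if the file never shrank below an earlier size.
 */
toy_off_t toy_truncate_point(toy_off_t cp_size,
			     const toy_off_t *fsync_sizes, size_t n)
{
	toy_off_t max_size = cp_size;	/* models entry->max_i_size */
	toy_off_t cur_size = cp_size;	/* models i_size_read(entry->inode) */
	size_t i;

	assert(fsync_sizes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		cur_size = fsync_sizes[i];
		if (cur_size > max_size)
			max_size = cur_size;
	}

	/* At the last fsync block: reclaim everything beyond the final size. */
	return max_size > cur_size ? cur_size : -1;
}
```

For case 1 (1G at CP, then fsync at 1M) and case 2 (empty at CP, fsync at 1G, then fsync at 1M), the model returns 1M as the truncate point. For a file that only grows, it returns -1.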


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
