On Fri, Aug 16, 2019 at 11:03:34AM +0800, Chao Yu wrote:
> There is one case that can cause data corruption:
> 
> - write 4k to fileA
> - fsync fileA; the 4k of data is written back to lbaA
> - write 4k to fileA
> - kworker flushes 4k to lbaB; the dnode containing lbaB has not been persisted yet
> - write 4k to fileB
> - kworker flushes 4k to lbaA due to SSR
> - SPOR -> the dnode with lbaA is recovered, but lbaA now contains fileB's
> data
> 
> One solution is to track the block history of every fsynced file and
> disallow SSR overwrites of newly invalidated blocks of that file.
> 
> However, during recovery, no matter whether a dnode was flushed or
> fsynced, all previous dnodes up to the last fsynced one in the node
> chain can be recovered. That means we would need to record every block
> change in flushed dnodes, which would be costly, so let's just use a
> simple fix: forbid SSR overwrite directly.
> 
> Signed-off-by: Chao Yu <yuch...@huawei.com>
> ---
>  fs/f2fs/segment.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 9d9d9a050d59..69b3b553ee6b 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -2205,9 +2205,11 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
>               if (!f2fs_test_and_set_bit(offset, se->discard_map))
>                       sbi->discard_blks--;
>  
> -             /* don't overwrite by SSR to keep node chain */
> -             if (IS_NODESEG(se->type) &&
> -                             !is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
> +             /*
> +              * SSR should never reuse block which is checkpointed
> +              * or newly invalidated.
> +              */
> +             if (!is_sbi_flag_set(sbi, SBI_CP_DISABLED)) {
>                       if (!f2fs_test_and_set_bit(offset, se->ckpt_valid_map))
>                               se->ckpt_valid_blocks++;
>               }
> -- 
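
For anyone following the thread, the sequence in the quoted commit message can
be replayed with a toy allocator. This is only a sketch under assumptions:
toy_disk, toy_alloc and toy_replay are hypothetical names, one char stands for
4k of data, and none of this is f2fs code. The "protect" flag loosely mirrors
the patched condition, which marks every newly allocated block in
ckpt_valid_map so that SSR will not reuse it; the toy never runs a checkpoint,
so the marks are never cleared.

```c
#include <stdbool.h>

#define NBLOCKS 4

/* Toy stand-in for a segment; not the f2fs on-disk layout. */
struct toy_disk {
    char data[NBLOCKS];       /* one char models 4k of file data */
    bool valid[NBLOCKS];      /* block currently holds live data */
    bool ckpt_valid[NBLOCKS]; /* block must not be reused by the allocator */
};

/*
 * Pick the first block that is neither valid nor protected and store the
 * payload there. With protect set, the new block is also marked in
 * ckpt_valid, which is the assumption standing in for the patched check.
 * Returns the block index, or -1 if nothing is free.
 */
static int toy_alloc(struct toy_disk *d, char payload, bool protect)
{
    for (int i = 0; i < NBLOCKS; i++) {
        if (d->valid[i] || d->ckpt_valid[i])
            continue;
        d->data[i] = payload;
        d->valid[i] = true;
        if (protect)
            d->ckpt_valid[i] = true;
        return i;
    }
    return -1;
}

/* Replay the sequence and return what fileA reads back after SPOR. */
static char toy_replay(bool protect)
{
    struct toy_disk d = {0};

    int lba_a = toy_alloc(&d, 'A', protect); /* write 4k to fileA */
    int persisted_dnode = lba_a;             /* fsync fileA: dnode -> lbaA */

    (void)toy_alloc(&d, 'a', protect);       /* write 4k to fileA, lands on lbaB */
    d.valid[lba_a] = false;                  /* lbaA invalidated, new dnode not persisted */

    (void)toy_alloc(&d, 'B', protect);       /* write 4k to fileB */

    return d.data[persisted_dnode];          /* SPOR: recovery follows the fsynced dnode */
}
```

In this model toy_replay(false) returns 'B' (fileA reads back fileB's data
after recovery), while toy_replay(true) returns 'A', because the invalidated
block stays off limits to the allocator.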

FYI, this commit caused xfstests generic/064 to start failing:

$ kvm-xfstests -c f2fs generic/064
...
generic/064 3s ...      [13:36:37][    5.946293] run fstests generic/064 at 2019-09-26 13:36:37
 [13:36:41]- output mismatch (see /results/f2fs/results-default/generic/064.out.bad)
    --- tests/generic/064.out   2019-09-18 04:53:46.000000000 -0700
    +++ /results/f2fs/results-default/generic/064.out.bad       2019-09-26 13:36:41.533018683 -0700
    @@ -1,2 +1,3 @@
     QA output created by 064
     Extent count after inserts is in range
    +extents mismatched before = 1 after = 50
    ...
    (Run 'diff -u /root/xfstests/tests/generic/064.out /results/f2fs/results-default/generic/064.out.bad'  to see the entire diff)
Ran: generic/064
Failures: generic/064
Failed 1 of 1 tests

