On 02/12, guoweichao wrote:
> Hi Jaegeuk,
>
> On 2018/2/12 7:32, Jaegeuk Kim wrote:
> > On 02/06, Weichao Guo wrote:
> >> Metadata can become inconsistent when a cp block in the latest
> >> checkpoint has an invalid crc caused by hardware issues:
> >> 1) write nodes into segment x;
> >> 2) write checkpoint A;
> >> 3) remove nodes in segment x;
> >> 4) write checkpoint B;
> >> 5) issue discard or write data into segment x;
> >> 6) sudden power-cut;
> >> 7) use checkpoint A after reboot as checkpoint B is invalid
> >>
> >> This inconsistency may only be found after several reboots, long after
> >> the kernel log about the invalid cp block crc has disappeared. This
> >> makes the root cause of the inconsistency hard to locate. Let us
> >> separate such external issues from f2fs logical bugs in the debug version.
> >>
> >> Signed-off-by: Weichao Guo <[email protected]>
> >> ---
> >> fs/f2fs/checkpoint.c | 8 ++++++--
> >> 1 file changed, 6 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> >> index 8b0945b..16ba96a 100644
> >> --- a/fs/f2fs/checkpoint.c
> >> +++ b/fs/f2fs/checkpoint.c
> >> @@ -737,13 +737,17 @@ static int get_checkpoint_version(struct f2fs_sb_info *sbi, block_t cp_addr,
> >> crc_offset = le32_to_cpu((*cp_block)->checksum_offset);
> >> if (crc_offset > (blk_size - sizeof(__le32))) {
> >> f2fs_msg(sbi->sb, KERN_WARNING,
> >> - "invalid crc_offset: %zu", crc_offset);
> >> + "invalid crc_offset: %zu at blk_addr: 0x%x",
> >> + crc_offset, cp_addr);
> >> + f2fs_bug_on(sbi, 1);
> >
> > I don't think we can use bug_on here, since we're easily getting this when
> > power-cut happened in the middle of checkpoint pack writes, which is an
> > expected behavior. Hmm, we need to consider another way to detect that.
> We only check CP block crc here. The two CP blocks may have different CP
> versions when power-cut happened, but their crc value should be valid. IMO,
> this patch will trigger a bug_on only when some external issues cause CP
> block crc invalid as one 4K page is persisted atomically.
Huh? This checks crc_offset, not crc. Unfortunately, my simple fault injection
test hit this bug_on within a day. The bug_on below seems to be the one you're
talking about, though.
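
To make the distinction concrete, here is a minimal user-space sketch of the
two checks. It is not the f2fs code: the block layout (offset field in the
first four bytes), toy_crc32(), cp_seal() and cp_validate() are hypothetical
stand-ins, and the checksum is a plain CRC-32 rather than the one f2fs uses.
It only illustrates that a block with garbage contents can fail the
crc_offset range check before the crc itself is ever compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK_SIZE 4096u

enum cp_check { CP_OK, CP_BAD_OFFSET, CP_BAD_CRC };

/* Hypothetical stand-in checksum: bitwise CRC-32 (reflected, 0xEDB88320). */
static uint32_t toy_crc32(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int b = 0; b < 8; b++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Assumed layout: bytes 0..3 hold the checksum offset, and the checksum of
 * bytes [0, crc_offset) is stored at crc_offset. Host byte order is used
 * for simplicity.
 */
static void cp_seal(uint8_t *blk, uint32_t crc_offset)
{
	uint32_t crc;

	assert(blk != NULL);
	assert(crc_offset <= BLK_SIZE - sizeof(uint32_t));
	memcpy(blk, &crc_offset, sizeof(crc_offset));
	crc = toy_crc32(blk, crc_offset);
	memcpy(blk + crc_offset, &crc, sizeof(crc));
}

static enum cp_check cp_validate(const uint8_t *blk)
{
	uint32_t crc_offset, stored;

	assert(blk != NULL);
	memcpy(&crc_offset, blk, sizeof(crc_offset));

	/* Check 1: the offset must leave room for a 32-bit checksum. */
	if (crc_offset > BLK_SIZE - sizeof(uint32_t))
		return CP_BAD_OFFSET;

	/* Check 2: the stored checksum must match the block contents. */
	memcpy(&stored, blk + crc_offset, sizeof(stored));
	if (stored != toy_crc32(blk, crc_offset))
		return CP_BAD_CRC;

	return CP_OK;
}
```

In this model, a block filled with 0xFF is rejected with CP_BAD_OFFSET, while
a single flipped bit in an otherwise sealed block is rejected with CP_BAD_CRC.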
>
> Thanks,
> >
> > Thanks,
> >
> >> return -EINVAL;
> >> }
> >>
> >> crc = cur_cp_crc(*cp_block);
> >> if (!f2fs_crc_valid(sbi, crc, *cp_block, crc_offset)) {
> >> - f2fs_msg(sbi->sb, KERN_WARNING, "invalid crc value");
> >> + f2fs_msg(sbi->sb, KERN_WARNING,
> >> + "invalid crc value at blk_addr: 0x%x", cp_addr);
> >> + f2fs_bug_on(sbi, 1);
> >> return -EINVAL;
> >> }
> >>
> >> --
> >> 2.10.1
> >
> > .
> >
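
For reference, steps 1) to 7) in the commit message can be modelled with a
small user-space sketch. This is a simplified model, not the f2fs mount path:
struct cp_pack, its fields and pick_checkpoint() are hypothetical names. It
assumes the newest checkpoint that validates is chosen, so an invalid
checkpoint B silently falls back to checkpoint A, which still believes
segment x holds valid nodes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one checkpoint pack after validation. */
struct cp_pack {
	uint64_t version;      /* larger means newer */
	bool crc_ok;           /* result of the cp block crc check */
	bool seg_x_has_nodes;  /* does this checkpoint treat segment x as live? */
};

/* Return the newest checkpoint that validates, or NULL if neither does. */
static const struct cp_pack *pick_checkpoint(const struct cp_pack *a,
					     const struct cp_pack *b)
{
	if (a->crc_ok && b->crc_ok)
		return (b->version > a->version) ? b : a;
	if (a->crc_ok)
		return a;
	if (b->crc_ok)
		return b;
	return NULL;
}
```

With checkpoint A at version 1 (segment x live) and checkpoint B at version 2
(segment x freed), a corrupted B makes the model return A even though
segment x may already have been discarded or overwritten in step 5).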
_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel