On 2019/1/16 下午8:01, Hugo Mills wrote: > Hi, Qu, > > On Wed, Jan 16, 2019 at 07:53:08PM +0800, Qu Wenruo wrote: >> There are at least 2 reports about memory bit flip sneaking into on-disk >> data. >> >> Currently we only have a relaxed check triggered at >> btrfs_mark_buffer_dirty() time, as it's not mandatory, only for >> CONFIG_BTRFS_FS_CHECK_INTEGRITY enabled build. >> >> This patch will address the hole by triggering comprehensive check on >> tree blocks before writing it back to disk. >> >> The timing is set to csum_tree_block() where @verify == 0. >> At that timing, we're generation csum for tree blocks before submitting >> the metadata bio, so we could avoid all the unnecessary calls at >> btrfs_mark_buffer_dirty(), but still catch enough error. > > I agree wholeheartedly with the idea of this change. Just one > question: > > How does this get reported to the user/administrator? As I > understand it, a detectably corrupt metadata page will generate an I/O > error from the filesystem before it's written to disk? How will this > show up in kernel logs?
Well, you caught me. I haven't try the error case, and in fact if it fails, it fails by triggering kernel BUG_ON(), thus you may not have a chance to see the error message from btrfs module. > Is it distinguishable in any way from a > similar error that was generated on reading such a corrupt metadata > node from the disk? Kind of distinguishable for this patch, when you hit kernel BUG_ON() at fs/btrfs/extent_io.c:4016 then it's definitely from this patch. :P > > Basically, I want to be able to distinguish this case (error > detected when writing) from the existing case (error detected when > reading) when someone shows up on IRC with a "broken filesystem". Definitely, I'll address this and the BUG_ON() in next version. Thanks, Qu > > Hugo. > >> Reported-by: Leonard Lausen <[email protected]> >> Signed-off-by: Qu Wenruo <[email protected]> >> --- >> fs/btrfs/disk-io.c | 10 ++++++++++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c >> index 8da2f380d3c0..45bf6be9e751 100644 >> --- a/fs/btrfs/disk-io.c >> +++ b/fs/btrfs/disk-io.c >> @@ -313,6 +313,16 @@ static int csum_tree_block(struct btrfs_fs_info >> *fs_info, >> return -EUCLEAN; >> } >> } else { >> + /* >> + * Here we're calculating csum before writing it to disk, >> + * do comprenhensive check here to catch memory corruption >> + */ >> + if (btrfs_header_level(buf)) >> + err = btrfs_check_node(fs_info, buf); >> + else >> + err = btrfs_check_leaf_full(fs_info, buf); >> + if (err < 0) >> + return err; >> write_extent_buffer(buf, result, 0, csum_size); >> } >> >
signature.asc
Description: OpenPGP digital signature
