Re: [PATCH 1/2] btrfs: catch cow on deleting snapshots
On Fri, Nov 30, 2018 at 12:19:18PM -0500, Josef Bacik wrote: > On Fri, Nov 30, 2018 at 05:14:54PM +, Filipe Manana wrote: > > On Fri, Nov 30, 2018 at 4:53 PM Josef Bacik wrote: > > > > > > From: Josef Bacik > > > > > > When debugging some weird extent reference bug I suspected that we were > > > changing a snapshot while we were deleting it, which could explain my > > > bug. This was indeed what was happening, and this patch helped me > > > verify my theory. It is never correct to modify the snapshot once it's > > > being deleted, so mark the root when we are deleting it and make sure we > > > complain about it when it happens. > > > > > > Signed-off-by: Josef Bacik > > > --- > > > fs/btrfs/ctree.c | 3 +++ > > > fs/btrfs/ctree.h | 1 + > > > fs/btrfs/extent-tree.c | 9 + > > > 3 files changed, 13 insertions(+) > > > > > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > > > index 5912a97b07a6..5f82f86085e8 100644 > > > --- a/fs/btrfs/ctree.c > > > +++ b/fs/btrfs/ctree.c > > > @@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct > > > btrfs_trans_handle *trans, > > > u64 search_start; > > > int ret; > > > > > > + if (test_bit(BTRFS_ROOT_DELETING, &root->state)) > > > + WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats > > > being dropped\n"); > > > > Please use btrfs_warn(), it makes sure we use a consistent message > > style, identifies the fs, etc. > > Also, "thats" should be "that is" or "that's". > > > > Ah yeah, I was following the other convention in there but we should probably > convert all of those to btrfs_warn. I'll fix the grammer thing as well, just > a > leftover from the much less code of conduct friendly message I originally had > there. Thanks, Committed with the following fixup: - WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being dropped\n"); + btrfs_error(fs_info, + "COW'ing blocks on a fs root that's being dropped");
Re: [PATCH 1/2] btrfs: catch cow on deleting snapshots
On 12/1/18 1:19 AM, Qu Wenruo wrote: > > > On 2018/12/1 上午12:52, Josef Bacik wrote: >> From: Josef Bacik >> >> When debugging some weird extent reference bug I suspected that we were >> changing a snapshot while we were deleting it, which could explain my >> bug. > > May I ask under which case we're going to modify an unlinked snapshot? > > Maybe metadata relocation? I have some old logging from a filesystem that got corrupted when doing a subvolume delete while doing metadata balance: https://syrinx.knorrie.org/~knorrie/btrfs/domokun-balance-skinny-lockup.txt This is one of the very few moments when I really lost a btrfs filesystem and had to mkfs again to move on. I'm wondering if this one looks related. I never found an explanation for it. Hans > > Thanks, > Qu > >> This was indeed what was happening, and this patch helped me >> verify my theory. It is never correct to modify the snapshot once it's >> being deleted, so mark the root when we are deleting it and make sure we >> complain about it when it happens. >> >> Signed-off-by: Josef Bacik >> --- >> fs/btrfs/ctree.c | 3 +++ >> fs/btrfs/ctree.h | 1 + >> fs/btrfs/extent-tree.c | 9 + >> 3 files changed, 13 insertions(+) >> >> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c >> index 5912a97b07a6..5f82f86085e8 100644 >> --- a/fs/btrfs/ctree.c >> +++ b/fs/btrfs/ctree.c >> @@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle >> *trans, >> u64 search_start; >> int ret; >> >> +if (test_bit(BTRFS_ROOT_DELETING, &root->state)) >> +WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being >> dropped\n"); >> + >> if (trans->transaction != fs_info->running_transaction) >> WARN(1, KERN_CRIT "trans %llu running %llu\n", >> trans->transid, >> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h >> index facde70c15ed..5a3a94ccb65c 100644 >> --- a/fs/btrfs/ctree.h >> +++ b/fs/btrfs/ctree.h >> @@ -1199,6 +1199,7 @@ enum { >> BTRFS_ROOT_FORCE_COW, >> BTRFS_ROOT_MULTI_LOG_TASKS, >> BTRFS_ROOT_DIRTY, >> +BTRFS_ROOT_DELETING, >> }; >> >> /* >> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c >> index 581c2a0b2945..dcb699dd57f3 100644 >> --- a/fs/btrfs/extent-tree.c >> +++ b/fs/btrfs/extent-tree.c >> @@ -9333,6 +9333,15 @@ int btrfs_drop_snapshot(struct btrfs_root *root, >> if (block_rsv) >> trans->block_rsv = block_rsv; >> >> +/* >> + * This will help us catch people modifying the fs tree while we're >> + * dropping it. It is unsafe to mess with the fs tree while it's being >> + * dropped as we unlock the root node and parent nodes as we walk down >> + * the tree, assuming nothing will change. If something does change >> + * then we'll have stale information and drop references to blocks we've >> + * already dropped. >> + */ >> +set_bit(BTRFS_ROOT_DELETING, &root->state); >> if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) { >> level = btrfs_header_level(root->node); >> path->nodes[level] = btrfs_lock_root_node(root); >> > -- Hans van Kranenburg signature.asc Description: OpenPGP digital signature
Re: [PATCH 1/2] btrfs: catch cow on deleting snapshots
On 2018/12/1 上午12:52, Josef Bacik wrote: > From: Josef Bacik > > When debugging some weird extent reference bug I suspected that we were > changing a snapshot while we were deleting it, which could explain my > bug. May I ask under which case we're going to modify an unlinked snapshot? Maybe metadata relocation? Thanks, Qu > This was indeed what was happening, and this patch helped me > verify my theory. It is never correct to modify the snapshot once it's > being deleted, so mark the root when we are deleting it and make sure we > complain about it when it happens. > > Signed-off-by: Josef Bacik > --- > fs/btrfs/ctree.c | 3 +++ > fs/btrfs/ctree.h | 1 + > fs/btrfs/extent-tree.c | 9 + > 3 files changed, 13 insertions(+) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index 5912a97b07a6..5f82f86085e8 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle > *trans, > u64 search_start; > int ret; > > + if (test_bit(BTRFS_ROOT_DELETING, &root->state)) > + WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being > dropped\n"); > + > if (trans->transaction != fs_info->running_transaction) > WARN(1, KERN_CRIT "trans %llu running %llu\n", > trans->transid, > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h > index facde70c15ed..5a3a94ccb65c 100644 > --- a/fs/btrfs/ctree.h > +++ b/fs/btrfs/ctree.h > @@ -1199,6 +1199,7 @@ enum { > BTRFS_ROOT_FORCE_COW, > BTRFS_ROOT_MULTI_LOG_TASKS, > BTRFS_ROOT_DIRTY, > + BTRFS_ROOT_DELETING, > }; > > /* > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c > index 581c2a0b2945..dcb699dd57f3 100644 > --- a/fs/btrfs/extent-tree.c > +++ b/fs/btrfs/extent-tree.c > @@ -9333,6 +9333,15 @@ int btrfs_drop_snapshot(struct btrfs_root *root, > if (block_rsv) > trans->block_rsv = block_rsv; > > + /* > + * This will help us catch people modifying the fs tree while we're > + * dropping it. It is unsafe to mess with the fs tree while it's being > + * dropped as we unlock the root node and parent nodes as we walk down > + * the tree, assuming nothing will change. If something does change > + * then we'll have stale information and drop references to blocks we've > + * already dropped. > + */ > + set_bit(BTRFS_ROOT_DELETING, &root->state); > if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) { > level = btrfs_header_level(root->node); > path->nodes[level] = btrfs_lock_root_node(root); > signature.asc Description: OpenPGP digital signature
Re: [PATCH 1/2] btrfs: catch cow on deleting snapshots
On Fri, Nov 30, 2018 at 05:14:54PM +, Filipe Manana wrote: > On Fri, Nov 30, 2018 at 4:53 PM Josef Bacik wrote: > > > > From: Josef Bacik > > > > When debugging some weird extent reference bug I suspected that we were > > changing a snapshot while we were deleting it, which could explain my > > bug. This was indeed what was happening, and this patch helped me > > verify my theory. It is never correct to modify the snapshot once it's > > being deleted, so mark the root when we are deleting it and make sure we > > complain about it when it happens. > > > > Signed-off-by: Josef Bacik > > --- > > fs/btrfs/ctree.c | 3 +++ > > fs/btrfs/ctree.h | 1 + > > fs/btrfs/extent-tree.c | 9 + > > 3 files changed, 13 insertions(+) > > > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > > index 5912a97b07a6..5f82f86085e8 100644 > > --- a/fs/btrfs/ctree.c > > +++ b/fs/btrfs/ctree.c > > @@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct > > btrfs_trans_handle *trans, > > u64 search_start; > > int ret; > > > > + if (test_bit(BTRFS_ROOT_DELETING, &root->state)) > > + WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being > > dropped\n"); > > Please use btrfs_warn(), it makes sure we use a consistent message > style, identifies the fs, etc. > Also, "thats" should be "that is" or "that's". > Ah yeah, I was following the other convention in there but we should probably convert all of those to btrfs_warn. I'll fix the grammer thing as well, just a leftover from the much less code of conduct friendly message I originally had there. Thanks, Josef
Re: [PATCH 1/2] btrfs: catch cow on deleting snapshots
On Fri, Nov 30, 2018 at 4:53 PM Josef Bacik wrote: > > From: Josef Bacik > > When debugging some weird extent reference bug I suspected that we were > changing a snapshot while we were deleting it, which could explain my > bug. This was indeed what was happening, and this patch helped me > verify my theory. It is never correct to modify the snapshot once it's > being deleted, so mark the root when we are deleting it and make sure we > complain about it when it happens. > > Signed-off-by: Josef Bacik > --- > fs/btrfs/ctree.c | 3 +++ > fs/btrfs/ctree.h | 1 + > fs/btrfs/extent-tree.c | 9 + > 3 files changed, 13 insertions(+) > > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c > index 5912a97b07a6..5f82f86085e8 100644 > --- a/fs/btrfs/ctree.c > +++ b/fs/btrfs/ctree.c > @@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle > *trans, > u64 search_start; > int ret; > > + if (test_bit(BTRFS_ROOT_DELETING, &root->state)) > + WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being > dropped\n"); Please use btrfs_warn(), it makes sure we use a consistent message style, identifies the fs, etc. Also, "thats" should be "that is" or "that's". With that fixed, Reviewed-by: Filipe Manana > + > if (trans->transaction != fs_info->running_transaction) > WARN(1, KERN_CRIT "trans %llu running %llu\n", >trans->transid, > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h > index facde70c15ed..5a3a94ccb65c 100644 > --- a/fs/btrfs/ctree.h > +++ b/fs/btrfs/ctree.h > @@ -1199,6 +1199,7 @@ enum { > BTRFS_ROOT_FORCE_COW, > BTRFS_ROOT_MULTI_LOG_TASKS, > BTRFS_ROOT_DIRTY, > + BTRFS_ROOT_DELETING, > }; > > /* > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c > index 581c2a0b2945..dcb699dd57f3 100644 > --- a/fs/btrfs/extent-tree.c > +++ b/fs/btrfs/extent-tree.c > @@ -9333,6 +9333,15 @@ int btrfs_drop_snapshot(struct btrfs_root *root, > if (block_rsv) > trans->block_rsv = block_rsv; > > + /* > +* This will help us catch people modifying the fs tree while we're > +* dropping it. It is unsafe to mess with the fs tree while it's > being > +* dropped as we unlock the root node and parent nodes as we walk down > +* the tree, assuming nothing will change. If something does change > +* then we'll have stale information and drop references to blocks > we've > +* already dropped. > +*/ > + set_bit(BTRFS_ROOT_DELETING, &root->state); > if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) { > level = btrfs_header_level(root->node); > path->nodes[level] = btrfs_lock_root_node(root); > -- > 2.14.3 > -- Filipe David Manana, “Whether you think you can, or you think you can't — you're right.”
[PATCH 1/2] btrfs: catch cow on deleting snapshots
From: Josef Bacik When debugging some weird extent reference bug I suspected that we were changing a snapshot while we were deleting it, which could explain my bug. This was indeed what was happening, and this patch helped me verify my theory. It is never correct to modify the snapshot once it's being deleted, so mark the root when we are deleting it and make sure we complain about it when it happens. Signed-off-by: Josef Bacik --- fs/btrfs/ctree.c | 3 +++ fs/btrfs/ctree.h | 1 + fs/btrfs/extent-tree.c | 9 + 3 files changed, 13 insertions(+) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 5912a97b07a6..5f82f86085e8 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1440,6 +1440,9 @@ noinline int btrfs_cow_block(struct btrfs_trans_handle *trans, u64 search_start; int ret; + if (test_bit(BTRFS_ROOT_DELETING, &root->state)) + WARN(1, KERN_CRIT "cow'ing blocks on a fs root thats being dropped\n"); + if (trans->transaction != fs_info->running_transaction) WARN(1, KERN_CRIT "trans %llu running %llu\n", trans->transid, diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index facde70c15ed..5a3a94ccb65c 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1199,6 +1199,7 @@ enum { BTRFS_ROOT_FORCE_COW, BTRFS_ROOT_MULTI_LOG_TASKS, BTRFS_ROOT_DIRTY, + BTRFS_ROOT_DELETING, }; /* diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 581c2a0b2945..dcb699dd57f3 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -9333,6 +9333,15 @@ int btrfs_drop_snapshot(struct btrfs_root *root, if (block_rsv) trans->block_rsv = block_rsv; + /* +* This will help us catch people modifying the fs tree while we're +* dropping it. It is unsafe to mess with the fs tree while it's being +* dropped as we unlock the root node and parent nodes as we walk down +* the tree, assuming nothing will change. If something does change +* then we'll have stale information and drop references to blocks we've +* already dropped. +*/ + set_bit(BTRFS_ROOT_DELETING, &root->state); if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) { level = btrfs_header_level(root->node); path->nodes[level] = btrfs_lock_root_node(root); -- 2.14.3