On 09/29/2011 12:25 PM, Yan, Zheng wrote:
> On 09/29/2011 10:00 AM, Liu Bo wrote:
>> The btrfs snapshotting code requires that once a root has been
>> snapshotted, we don't change it during a commit.
>>
>> But there are two cases to lead to tree corruptions:
>>
>> 1) multi-thread snapshots can commit serveral snapshots in a transaction,
>>    and this may change the src root when processing the following pending
>>    snapshots, which lead to the former snapshots corruptions;
>>
>> 2) the free inode cache was changing the roots when it root the cache,
>>    which lead to corruptions.
>>
> For the case 2, the free inode cache of newly created snapshot is invalid.
> So it's better to avoid modifying snapshotted trees.
>

For case 2, we can avoid modifying snapshotted trees, as you advise, by
flushing the dirty inode cache during create_pending_snapshot().

But for case 1, I have no idea how to do the same thing, since we cannot
commit once per snapshot; that would make performance terrible.

thanks,
liubo

>> This fixes things by making sure we force COW the block after we create a
>> snapshot during commiting a transaction, then any changes to the roots
>> will result in COW, and we get all the fs roots and snapshot roots to be
>> consistent.
>>
>> Signed-off-by: Liu Bo <[email protected]>
>> Signed-off-by: Miao Xie <[email protected]>
>> ---
>>  fs/btrfs/ctree.c       |   17 ++++++++++++++++-
>>  fs/btrfs/ctree.h       |    2 ++
>>  fs/btrfs/transaction.c |    8 ++++++++
>>  3 files changed, 26 insertions(+), 1 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>> index 011cab3..49dad7d 100644
>> --- a/fs/btrfs/ctree.c
>> +++ b/fs/btrfs/ctree.c
>> @@ -514,10 +514,25 @@ static inline int should_cow_block(struct btrfs_trans_handle *trans,
>>  				   struct btrfs_root *root,
>>  				   struct extent_buffer *buf)
>>  {
>> +	/* ensure we can see the force_cow */
>> +	smp_rmb();
>> +
>> +	/*
>> +	 * We do not need to cow a block if
>> +	 * 1) this block is not created or changed in this transaction;
>> +	 * 2) this block does not belong to TREE_RELOC tree;
>> +	 * 3) the root is not forced COW.
>> +	 *
>> +	 * What is forced COW:
>> +	 *    when we create snapshot during commiting the transaction,
>> +	 *    after we've finished coping src root, we must COW the shared
>> +	 *    block to ensure the metadata consistency.
>> +	 */
>>  	if (btrfs_header_generation(buf) == trans->transid &&
>>  	    !btrfs_header_flag(buf, BTRFS_HEADER_FLAG_WRITTEN) &&
>>  	    !(root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID &&
>> -	      btrfs_header_flag(buf, BTRFS_HEADER_FLAG_RELOC)))
>> +	      btrfs_header_flag(buf, BTRFS_HEADER_FLAG_RELOC)) &&
>> +	    !root->force_cow)
>>  		return 0;
>>  	return 1;
>>  }
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index 03912c5..bece0df 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -1225,6 +1225,8 @@ struct btrfs_root {
>>  	 * for stat.  It may be used for more later
>>  	 */
>>  	dev_t anon_dev;
>> +
>> +	int force_cow;
>>  };
>>
>>  struct btrfs_ioctl_defrag_range_args {
>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>> index 7dc36fa..bf6e2b3 100644
>> --- a/fs/btrfs/transaction.c
>> +++ b/fs/btrfs/transaction.c
>> @@ -816,6 +816,10 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans,
>>
>>  			btrfs_save_ino_cache(root, trans);
>>
>> +			/* see comments in should_cow_block() */
>> +			root->force_cow = 0;
>> +			smp_wmb();
>> +
>>  			if (root->commit_root != root->node) {
>>  				mutex_lock(&root->fs_commit_mutex);
>>  				switch_commit_root(root);
>> @@ -976,6 +980,10 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
>>  	btrfs_tree_unlock(old);
>>  	free_extent_buffer(old);
>>
>> +	/* see comments in should_cow_block() */
>> +	root->force_cow = 1;
>> +	smp_wmb();
>> +
>>  	btrfs_set_root_node(new_root_item, tmp);
>>  	/* record when the snapshot was created in key.offset */
>>  	key.offset = trans->transid;
>
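
To make the force_cow rule in the quoted patch easier to follow, here is a
minimal user-space sketch of the decision that should_cow_block() makes once
the flag is added. It is only a model under stated assumptions: the model_*
structs and functions are hypothetical stand-ins, not kernel code, the header
flags are reduced to booleans, and the smp_rmb()/smp_wmb() pairing is left out
because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an extent buffer header. */
struct model_block {
	uint64_t generation;	/* transaction that created/changed the block */
	bool written;		/* models BTRFS_HEADER_FLAG_WRITTEN */
	bool reloc;		/* models BTRFS_HEADER_FLAG_RELOC */
};

/* Hypothetical, simplified stand-in for struct btrfs_root. */
struct model_root {
	bool is_reloc_tree;	/* models objectid == BTRFS_TREE_RELOC_OBJECTID */
	bool force_cow;		/* the flag the patch adds */
};

/*
 * Mirrors the condition in the patched should_cow_block(): a block may be
 * modified in place only if it was created in this transaction, has not
 * been written out, is not a RELOC-flagged block in a non-reloc tree, and
 * the root is not in forced-COW mode.
 */
static bool model_should_cow(uint64_t transid,
			     const struct model_root *root,
			     const struct model_block *buf)
{
	assert(root != NULL && buf != NULL);

	if (buf->generation == transid &&
	    !buf->written &&
	    !(!root->is_reloc_tree && buf->reloc) &&
	    !root->force_cow)
		return false;
	return true;
}

/* Models the flag being set after the source root has been copied. */
static void model_create_pending_snapshot(struct model_root *root)
{
	root->force_cow = true;
}

/* Models the flag being cleared when the fs root is committed. */
static void model_commit_fs_root(struct model_root *root)
{
	root->force_cow = false;
}
```

In this model, a block created in the current transaction is modified in
place until model_create_pending_snapshot() runs; from then until
model_commit_fs_root() every change to that root goes through COW, which is
what keeps a second pending snapshot in the same transaction from changing
blocks already shared with the first.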
