[PATCH 3/6] btrfs: Add handling for disk split-brain scenario during fsid change
Even though fsid change without rewrite is a very quick operation, it's
still possible to experience a split-brain scenario if power loss occurs
at the right time. This patch handles the case where power failure occurs
while the first transaction (the one setting the CHANGING_FSID_V2 flag)
is being persisted on disk. This can cause the btrfs_fs_devices of this
filesystem to be created by a device which:

 a) has the CHANGING_FSID_V2 flag set but its fsid value is intact

 b) or a device which doesn't have the CHANGING_FSID_V2 flag set and its
    fsid value is intact

This situation is trivially handled by the current find_fsid code since
in both cases the devices are going to be treated like ordinary devices.
Since btrfs is always mounted using the superblock of the latest device
(the one with the highest generation number), meaning it will have the
CHANGING_FSID_V2 flag set, ensure it's being cleared on mount. On the
first transaction commit following mount all disks will have it cleared.

Signed-off-by: Nikolay Borisov
---
 fs/btrfs/disk-io.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index a458ef5b605e..6498434c2e06 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2831,10 +2831,10 @@ int open_ctree(struct super_block *sb,
 	 * the whole block of INFO_SIZE
 	 */
 	memcpy(fs_info->super_copy, bh->b_data, sizeof(*fs_info->super_copy));
-	memcpy(fs_info->super_for_commit, fs_info->super_copy,
-	       sizeof(*fs_info->super_for_commit));
 	brelse(bh);
 
+	disk_super = fs_info->super_copy;
+
 	ASSERT(!memcmp(fs_info->fs_devices->fsid, fs_info->super_copy->fsid,
 		       BTRFS_FSID_SIZE));
 
@@ -2844,6 +2844,15 @@ int open_ctree(struct super_block *sb,
 				BTRFS_FSID_SIZE));
 	}
 
+	features = btrfs_super_flags(disk_super);
+	if (features & BTRFS_SUPER_FLAG_CHANGING_FSID_V2) {
+		features &= ~BTRFS_SUPER_FLAG_CHANGING_FSID_V2;
+		btrfs_set_super_flags(disk_super, features);
+		btrfs_info(fs_info, "found metadata uuid in progress flag. Clearing");
+	}
+
+	memcpy(fs_info->super_for_commit, fs_info->super_copy,
+	       sizeof(*fs_info->super_for_commit));
 	ret = btrfs_validate_mount_super(fs_info);
 	if (ret) {
@@ -2852,7 +2861,6 @@ int open_ctree(struct super_block *sb,
 		goto fail_alloc;
 	}
 
-	disk_super = fs_info->super_copy;
 	if (!btrfs_super_root(disk_super))
 		goto fail_alloc;
 
-- 
2.7.4
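
The reordering in the diff (clear the flag in super_copy first, then
memcpy into super_for_commit) can be illustrated outside the kernel. The
following is only a user-space sketch under stated assumptions: struct
demo_super, demo_prepare_supers() and DEMO_FLAG_CHANGING_FSID_V2 are
hypothetical stand-ins, and the bit position is chosen for illustration
rather than taken from the btrfs headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the in-progress flag; bit chosen for illustration. */
#define DEMO_FLAG_CHANGING_FSID_V2 (1ULL << 36)

/* Hypothetical, heavily reduced model of a superblock. */
struct demo_super {
	uint64_t flags;
	uint64_t generation;
};

/*
 * Clear the in-progress flag in the in-memory copy, then duplicate that
 * copy for commit. Copying after the clear keeps both copies in
 * agreement, so the next commit writes the flag out as cleared.
 *
 * Returns 1 if the flag was found and cleared, 0 otherwise.
 */
static int demo_prepare_supers(struct demo_super *super_copy,
			       struct demo_super *super_for_commit)
{
	int cleared = 0;

	assert(super_copy != NULL && super_for_commit != NULL);

	if (super_copy->flags & DEMO_FLAG_CHANGING_FSID_V2) {
		super_copy->flags &= ~DEMO_FLAG_CHANGING_FSID_V2;
		cleared = 1;
	}

	memcpy(super_for_commit, super_copy, sizeof(*super_for_commit));
	return cleared;
}
```

If the memcpy ran before the clear, as in the code this patch replaces,
the commit copy in this model would still carry the flag while the
in-memory copy would not.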
[PATCH 3/6] btrfs: Add handling for disk split-brain scenario during fsid change
Even though FSID change without rewrite is a very quick operation, it's
still possible to experience a split-brain scenario if power loss occurs
at the right time. This patch handles the case where power failure occurs
while the first transaction (the one setting the FSID_CHANGING_V2 flag)
is being persisted on disk. This can cause the btrfs_fs_devices of this
filesystem to be created by a device which:

 a) has the FSID_CHANGING_V2 flag set but its fsid value is intact

 b) or a device which doesn't have the FSID_CHANGING_V2 flag set and its
    fsid value is intact

This situation is trivially handled by the current find_fsid code since
in both cases the devices are going to be treated like ordinary devices.
Since btrfs is always mounted using the superblock of the latest device
(the one with the highest generation number), meaning it will have the
FSID_CHANGING_V2 flag set, ensure it's being cleared. On the first
transaction commit following the mount all disks will have it cleared.

Signed-off-by: Nikolay Borisov
---
 fs/btrfs/disk-io.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index be2caf513e2f..9c2f46f8421a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2831,10 +2831,10 @@ int open_ctree(struct super_block *sb,
 	 * the whole block of INFO_SIZE
 	 */
 	memcpy(fs_info->super_copy, bh->b_data, sizeof(*fs_info->super_copy));
-	memcpy(fs_info->super_for_commit, fs_info->super_copy,
-	       sizeof(*fs_info->super_for_commit));
 	brelse(bh);
 
+	disk_super = fs_info->super_copy;
+
 	ASSERT(!memcmp(fs_info->fs_devices->fsid, fs_info->super_copy->fsid,
 		       BTRFS_FSID_SIZE));
 
@@ -2844,6 +2844,15 @@ int open_ctree(struct super_block *sb,
 				BTRFS_FSID_SIZE));
 	}
 
+	features = btrfs_super_flags(disk_super);
+	if (features & BTRFS_SUPER_FLAG_CHANGING_FSID_v2) {
+		features &= ~BTRFS_SUPER_FLAG_CHANGING_FSID_v2;
+		btrfs_set_super_flags(disk_super, features);
+		btrfs_info(fs_info, "Found metadata uuid in progress flag. Clearing\n");
+	}
+
+	memcpy(fs_info->super_for_commit, fs_info->super_copy,
+	       sizeof(*fs_info->super_for_commit));
 	ret = btrfs_validate_mount_super(fs_info);
 	if (ret) {
@@ -2852,7 +2861,6 @@ int open_ctree(struct super_block *sb,
 		goto fail_alloc;
 	}
 
-	disk_super = fs_info->super_copy;
 	if (!btrfs_super_root(disk_super))
 		goto fail_alloc;
 
-- 
2.7.4
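
The commit message relies on the mount using the superblock of the
latest device, i.e. the one with the highest generation. A rough
user-space sketch of that selection follows. struct demo_dev_super,
demo_latest_super() and DEMO_FLAG_CHANGING_FSID_V2 are hypothetical names
invented for this illustration, and the flag's bit position is an
assumption; this is not the kernel's device scanning code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the in-progress flag; bit chosen for illustration. */
#define DEMO_FLAG_CHANGING_FSID_V2 (1ULL << 36)

/* Hypothetical per-device view of the superblock fields that matter here. */
struct demo_dev_super {
	uint64_t generation;
	uint64_t flags;
};

/*
 * Return the superblock with the highest generation, or NULL when the
 * array is empty. On a tie the first entry seen is kept.
 */
static const struct demo_dev_super *
demo_latest_super(const struct demo_dev_super *supers, size_t count)
{
	const struct demo_dev_super *latest = NULL;
	size_t i;

	assert(supers != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (!latest || supers[i].generation > latest->generation)
			latest = &supers[i];
	}
	return latest;
}
```

In the power-loss case described above, the device that did get the
first transaction written has both the higher generation and the flag
set, so in this model it is the one selected, which is why clearing the
flag on that superblock at mount time is sufficient.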