On 19.11.18 г. 11:48 ч., fdman...@kernel.org wrote:
> From: Filipe Manana <fdman...@suse.com>
>
> The available allocation bits members from struct btrfs_fs_info are
> protected by a sequence lock, and when starting balance we access them
> incorrectly in two different ways:
>
> 1) In the read sequence lock loop at btrfs_balance() we use the values we
> read from fs_info->avail_*_alloc_bits and we can immediately do actions
> that have side effects and can not be undone (printing a message and
> jumping to a label). This is wrong because a retry might be needed, so
> our actions must not have side effects and must be repeatable as long
> as read_seqretry() returns a non-zero value. In other words, we were
> essentially ignoring the sequence lock;
>
> 2) Right below the read sequence lock loop, we were reading the values
> from avail_metadata_alloc_bits and avail_data_alloc_bits without any
> protection from concurrent writers, that is, reading them outside of
> the read sequence lock critical section.
>
> So fix this by making sure we only read the available allocation bits
> while in a read sequence lock critical section and that what we do in the
> critical section is repeatable (has nothing that can not be undone) so
> that any eventual retry that is needed is handled properly.
>
> Fixes: de98ced9e743 ("Btrfs: use seqlock to protect fs_info->avail_{data,
> metadata, system}_alloc_bits")
> Fixes: 14506127979a ("btrfs: fix a bogus warning when converting only data or
> metadata")
> Signed-off-by: Filipe Manana <fdman...@suse.com>
Reviewed-by: Nikolay Borisov <nbori...@suse.com>
> ---
> fs/btrfs/volumes.c | 39 +++++++++++++++++++++++----------------
> 1 file changed, 23 insertions(+), 16 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index f4405e430da6..223334f08530 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -3712,6 +3712,7 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
> int ret;
> u64 num_devices;
> unsigned seq;
> + bool reducing_integrity;
>
> if (btrfs_fs_closing(fs_info) ||
> atomic_read(&fs_info->balance_pause_req) ||
> @@ -3796,24 +3797,30 @@ int btrfs_balance(struct btrfs_fs_info *fs_info,
> !(bctl->sys.target & allowed)) ||
> ((bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
> (fs_info->avail_metadata_alloc_bits & allowed) &&
> - !(bctl->meta.target & allowed))) {
> - if (bctl->flags & BTRFS_BALANCE_FORCE) {
> - btrfs_info(fs_info,
> - "balance: force reducing metadata integrity");
> - } else {
> - btrfs_err(fs_info,
> - "balance: reduces metadata integrity, use --force if you want this");
> - ret = -EINVAL;
> - goto out;
> - }
> - }
> + !(bctl->meta.target & allowed)))
> + reducing_integrity = true;
> + else
> + reducing_integrity = false;
> +
> + /* if we're not converting, the target field is uninitialized */
> + meta_target = (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
> + bctl->meta.target : fs_info->avail_metadata_alloc_bits;
> + data_target = (bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
> + bctl->data.target : fs_info->avail_data_alloc_bits;
> } while (read_seqretry(&fs_info->profiles_lock, seq));
>
> - /* if we're not converting, the target field is uninitialized */
> - meta_target = (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
> - bctl->meta.target : fs_info->avail_metadata_alloc_bits;
> - data_target = (bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) ?
> - bctl->data.target : fs_info->avail_data_alloc_bits;
> + if (reducing_integrity) {
> + if (bctl->flags & BTRFS_BALANCE_FORCE) {
> + btrfs_info(fs_info,
> + "balance: force reducing metadata
> integrity");
> + } else {
> + btrfs_err(fs_info,
> + "balance: reduces metadata integrity, use --force if you want this");
> + ret = -EINVAL;
> + goto out;
> + }
> + }
> +
> if (btrfs_get_num_tolerated_disk_barrier_failures(meta_target) <
> btrfs_get_num_tolerated_disk_barrier_failures(data_target)) {
> int meta_index = btrfs_bg_flags_to_raid_index(meta_target);
>