On 09/04/2021 13:37, Filipe Manana wrote:
> On Fri, Apr 9, 2021 at 11:54 AM Johannes Thumshirn
> <johannes.thumsh...@wdc.com> wrote:
>>
>> When relocating a block group the freed up space is not discarded. On
>> devices like SSDs this hint is useful to tell the device the space is
>> freed now. On zoned block devices btrfs' discard code will reset the zone
>> the block group is on, freeing up the occupied space.
>>
>> Signed-off-by: Johannes Thumshirn <johannes.thumsh...@wdc.com>
>> ---
>>  fs/btrfs/volumes.c | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 6d9b2369f17a..d9ef8bce0cde 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -3103,6 +3103,10 @@ static int btrfs_relocate_chunk(struct btrfs_fs_info 
>> *fs_info, u64 chunk_offset)
>>         struct btrfs_root *root = fs_info->chunk_root;
>>         struct btrfs_trans_handle *trans;
>>         struct btrfs_block_group *block_group;
>> +       const bool trim = btrfs_is_zoned(fs_info) ||
>> +                               btrfs_test_opt(fs_info, DISCARD_SYNC);
>> +       u64 trimmed;
>> +       u64 length;
>>         int ret;
>>
>>         /*
>> @@ -3130,6 +3134,7 @@ static int btrfs_relocate_chunk(struct btrfs_fs_info 
>> *fs_info, u64 chunk_offset)
>>         if (!block_group)
>>                 return -ENOENT;
>>         btrfs_discard_cancel_work(&fs_info->discard_ctl, block_group);
>> +       length = block_group->length;
>>         btrfs_put_block_group(block_group);
>>
>>         trans = btrfs_start_trans_remove_block_group(root->fs_info,
>> @@ -3144,6 +3149,14 @@ static int btrfs_relocate_chunk(struct btrfs_fs_info 
>> *fs_info, u64 chunk_offset)
>>          * step two, delete the device extents and the
>>          * chunk tree entries
>>          */
>> +       if (trim) {
>> +               ret = btrfs_discard_extent(fs_info, chunk_offset, length,
>> +                                          &trimmed);
> 
> Ideally we do IO and potentially slow operations such as discard while
> not holding a transaction handle open, to avoid slowing down anyone
> trying to commit the transaction.
> 
>> +               if (ret) {
>> +                       btrfs_abort_transaction(trans, ret);
> 
> This is leaking the transaction, still need to call btrfs_end_transaction().
> 
> Making the discard before joining/starting transaction, would fix both things.

Note taken.

> 
> Now more importantly, I don't see why the freed space isn't already
> discarded at this point.
> Relocation creates delayed references to drop extents from the block
> group, and when the delayed references are run, we pin the extents
> through:
> 
> __btrfs_free_extent() -> btrfs_update_block_group()
> 
> Then at the very end of the transaction commit, we discard pinned
> extents and then unpin them, at btrfs_finish_extent_commit().
> Relocation commits the transaction at the end of
> relocate_block_group(), so the delayed references are fun, and then we
> discard and unpin their extents.
> So that's why I don't get it why the discard is needed here (at least
> when we are not on a zoned filesystem).

This is a good point to investigate, but as of now, the zone isn't reset.
Zone reset handling in btrfs is piggy backed onto discard handling, but
from testing the patchset I see the zone isn't reset:

[   81.014752] BTRFS info (device nullb0): reclaiming chunk 1073741824
[   81.015732] BTRFS info (device nullb0): relocating block group 1073741824 
flags data
[   81.798090] BTRFS info (device nullb0): found 12452 extents, stage: move 
data extents
[   82.314562] BTRFS info (device nullb0): found 12452 extents, stage: update 
data pointers 
# blkzone report -o $((1073741824 >> 9)) -c 1 /dev/nullb0
  start: 0x000200000, len 0x080000, cap 0x080000, wptr 0x0799a0 reset:0 
non-seq:0, zcond: 2(oi) [type: 2(SEQ_WRITE_REQUIRED)]

Whereas when the this patch is applied as well:
[   85.126542] BTRFS info (device nullb0): reclaiming chunk 1073741824
[   85.128211] BTRFS info (device nullb0): relocating block group 1073741824 
flags data
[   86.061831] BTRFS info (device nullb0): found 12452 extents, stage: move 
data extents
[   86.766549] BTRFS info (device nullb0): found 12452 extents, stage: update 
data pointers
# blkzone report -c 1 -o $((1073741824 >> 9)) /dev/nullb0
  start: 0x000200000, len 0x080000, cap 0x080000, wptr 0x000000 reset:0 
non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]

As a positive side effect of this, I now have code that can be used in 
xfstests to verify the patchset.

Byte,
        Johannes

Reply via email to