On 13/04/2021 14:57, Filipe Manana wrote:
> And what about the other mechanism that triggers discards on pinned
> extents, after the transaction commits the super blocks?
> Why isn't that happening (with -o discard=sync)? We create the delayed
> references to drop extents from the relocated block group, which
> results in pinning extents.
> This is the case that surprised me that it isn't working for you.

I think this is the case. I would have expected to end up in this
part of btrfs_finish_extent_commit():

                                              
        /*
         * Transaction is finished.  We don't need the lock anymore.  We
         * do need to clean up the block groups in case of a transaction
         * abort.
         */
        deleted_bgs = &trans->transaction->deleted_bgs;
        list_for_each_entry_safe(block_group, tmp, deleted_bgs, bg_list) {
                u64 trimmed = 0;

                ret = -EROFS;
                if (!TRANS_ABORTED(trans))
                        ret = btrfs_discard_extent(fs_info,
                                                   block_group->start,
                                                   block_group->length,
                                                   &trimmed);

                list_del_init(&block_group->bg_list);
                btrfs_unfreeze_block_group(block_group);
                btrfs_put_block_group(block_group);

                if (ret) {
                        const char *errstr = btrfs_decode_error(ret);
                        btrfs_warn(fs_info,
                           "discard failed while removing blockgroup: errno=%d %s",
                                   ret, errstr);
                }
        }

and the btrfs_discard_extent() over the whole block group would then trigger a
REQ_OP_ZONE_RESET operation, resetting the device's zone.
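
To illustrate why only a discard over the whole block group can end up as a
zone reset, here is a minimal user-space sketch. It is not kernel code: the
enum and the function zoned_discard_action() are hypothetical names, and it
assumes that a zone can only be reset as a whole, so a range qualifies only
if it starts on a zone boundary and spans a whole number of zones.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not a kernel API. */
enum discard_action {
	ACTION_NONE,		/* range does not cover whole zones */
	ACTION_ZONE_RESET,	/* range covers whole zones and can be reset */
};

/*
 * Decide what a discard of [start, start + len) could become on a
 * sequential zone, assuming resets only apply to complete zones.
 */
static enum discard_action zoned_discard_action(uint64_t start, uint64_t len,
						uint64_t zone_size)
{
	assert(zone_size > 0);

	if (len == 0)
		return ACTION_NONE;
	/* A single extent inside a zone fails one of these checks. */
	if (start % zone_size != 0 || len % zone_size != 0)
		return ACTION_NONE;
	return ACTION_ZONE_RESET;
}
```

Under these assumptions a zone-aligned block group maps to ACTION_ZONE_RESET,
while a small extent somewhere inside the zone does not.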

But as btrfs_delete_unused_bgs() doesn't add the block group to the
->deleted_bgs list, we never reach the code above. I /think/ (i.e. verification
pending) the -o discard=sync case works for regular block devices because each
extent is discarded on its own, by this (also in btrfs_finish_extent_commit()):

        while (!TRANS_ABORTED(trans)) {
                struct extent_state *cached_state = NULL;

                mutex_lock(&fs_info->unused_bg_unpin_mutex);
                ret = find_first_extent_bit(unpin, 0, &start, &end,
                                            EXTENT_DIRTY, &cached_state);
                if (ret) {
                        mutex_unlock(&fs_info->unused_bg_unpin_mutex);
                        break;
                }

                if (btrfs_test_opt(fs_info, DISCARD_SYNC))
                        ret = btrfs_discard_extent(fs_info, start,
                                                   end + 1 - start, NULL);

                clear_extent_dirty(unpin, start, end, &cached_state);
                unpin_extent_range(fs_info, start, end, true);
                mutex_unlock(&fs_info->unused_bg_unpin_mutex);
                free_extent_state(cached_state);
                cond_resched();
        }

If this is the case, my patch will essentially discard the data twice on a
non-zoned block device, which is certainly not ideal. So the correct fix would
be to get the block group into the 'trans->transaction->deleted_bgs' list
after relocation, which would work if we didn't check for block_group->ro in
btrfs_delete_unused_bgs(), but I suppose this check is there for a reason.
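
The double-discard concern can be shown with a toy user-space model. This is
a sketch under assumptions, not kernel code: struct pinned_range and both
functions are hypothetical, ranges use an inclusive end as in the loop above,
and only_if_zoned stands in for a gate that restricts the whole-chunk discard
to zoned filesystems.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one pinned extent; 'end' is inclusive. */
struct pinned_range {
	uint64_t start;
	uint64_t end;
};

/* Bytes discarded extent by extent, as the discard=sync loop would do. */
static uint64_t per_extent_discard_bytes(const struct pinned_range *ranges,
					 size_t count)
{
	uint64_t total = 0;

	for (size_t i = 0; i < count; i++)
		total += ranges[i].end + 1 - ranges[i].start;
	return total;
}

/*
 * Bytes discarded by an additional whole-chunk discard at relocation time.
 * Without a gate it always runs; with only_if_zoned it runs for zoned
 * filesystems only.
 */
static uint64_t relocation_discard_bytes(uint64_t chunk_len, bool zoned,
					 bool only_if_zoned)
{
	if (only_if_zoned && !zoned)
		return 0;
	return chunk_len;
}
```

In this model an ungated whole-chunk discard on a non-zoned device adds
chunk_len bytes on top of the per-extent total, while a gated one adds
nothing there and still covers the zoned case.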

How about changing the patch to the following:

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6d9b2369f17a..ba13b2ea3c6f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3103,6 +3103,9 @@ static int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
        struct btrfs_root *root = fs_info->chunk_root;
        struct btrfs_trans_handle *trans;
        struct btrfs_block_group *block_group;
+       u64 length;
        int ret;
 
        /*
@@ -3130,8 +3133,16 @@ static int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
        if (!block_group)
                return -ENOENT;
        btrfs_discard_cancel_work(&fs_info->discard_ctl, block_group);
+       length = block_group->length;
        btrfs_put_block_group(block_group);

+       /* 
+        * For a zoned filesystem we need to discard/zone-reset here, as the 
+        * discard code won't discard the whole block-group, but only single
+        * extents.
+        */
+       if (btrfs_is_zoned(fs_info)) {
+               ret = btrfs_discard_extent(fs_info, chunk_offset, length, NULL);
+               if (ret) /* Non working discard is not fatal */
+                       btrfs_warn(fs_info, "discarding chunk %llu failed",
+                                  chunk_offset);
+       }
+
        trans = btrfs_start_trans_remove_block_group(root->fs_info,
                                                     chunk_offset);
        if (IS_ERR(trans)) {
