On Wed, Apr 14, 2021 at 12:22 PM Johannes Thumshirn <johannes.thumsh...@wdc.com> wrote: > > On 14/04/2021 13:17, Filipe Manana wrote: > > Yep, that's what puzzled me, why the need to do it for non-zoned file > > systems when using -o discard=sync. > > I assumed you ran into a case where discard was not happening due to > > some bug bug in the extent pinning/unpinning mechanism. > > > >> So the correct fix would > >> be to get the block group into the 'trans->transaction->deleted_bgs' list > >> after relocation, which would work if we wouldn't check for > >> block_group->ro in > >> btrfs_delete_unused_bgs(), but I suppose this check is there for a reason. > > > > Actually the check for ->ro does not make sense anymore since I > > introduced the delete_unused_bgs_mutex in commit > > 67c5e7d464bc466471b05e027abe8a6b29687ebd. > > > > When the ->ro check was added > > (47ab2a6c689913db23ccae38349714edf8365e0a), it was meant to prevent > > the cleaner kthread and relocation tasks from calling > > btrfs_remove_chunk() concurrently, but checking for ->ro only was > > buggy, hence the addition of delete_unused_bgs_mutex later. > > > > > I'll have a look at these commits. > > > >> > >> How about changing the patch to the following: > > > > Looks good. > > However would just removing the ->ro check by enough as well? > > From how I understand the code, yes. It's a quick test, so let's just do it > and see what breaks.
The thing most worth checking is if concurrent scrubs can cause any issue with deletion of unused block groups. I think they won't, but there were races in the past I remember fixing. Several test cases from btrfs/060 to btrfs/074 exercise scrub running with fsstress, balance, etc. They are a good way to stress test this code. > > I'd prefer to just drop the ->ro check, it's less special casing for zoned > btrfs that we have to keep in mind when changing things. > > Thanks for helping me with this, > Johannes -- Filipe David Manana, “Whether you think you can, or you think you can't — you're right.”