When a file is deleted on a zoned file system, the freed space is not returned to the block group's free space, but is migrated to zone_unusable.
As this zone_unusable space is behind the current write pointer, it cannot be used for new allocations. In the current implementation a zone is reset only once all of the block group's space is accounted as zone_unusable.

This behaviour can lead to premature ENOSPC errors on a busy file system.

Instead of only reclaiming the zone once it is completely unusable, kick off a reclaim job once the amount of unusable bytes exceeds a user-configurable threshold between 51% and 100%. The threshold can be set per mounted file system via the sysfs tunable bg_reclaim_threshold, which defaults to 75%.

Similar to reclaiming unused block groups, these dirty block groups are added to a to_reclaim list. The reclaim process is then triggered on transaction commit, but only after unused block groups have been deleted, which frees space for the relocation process.

Changes to v1:
- Document sysfs parameter (David)
- Add info print for reclaim (Josef)
- Rename delete_unused_bgs_mutex to reclaim_bgs_lock (Filipe)
- Remove list_is_singular check (Filipe)
- Document use of space_info->groups_sem (Filipe)

Johannes Thumshirn (2):
  btrfs: rename delete_unused_bgs_mutex
  btrfs: zoned: automatically reclaim zones

 fs/btrfs/block-group.c       | 90 ++++++++++++++++++++++++++++++++++--
 fs/btrfs/block-group.h       |  2 +
 fs/btrfs/ctree.h             |  5 +-
 fs/btrfs/disk-io.c           | 17 +++++--
 fs/btrfs/free-space-cache.c  |  9 +++-
 fs/btrfs/sysfs.c             | 35 ++++++++++++++
 fs/btrfs/volumes.c           | 48 +++++++++----------
 fs/btrfs/volumes.h           |  1 +
 include/trace/events/btrfs.h | 12 +++++
 9 files changed, 187 insertions(+), 32 deletions(-)

-- 
2.30.0
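
For illustration only, the accounting and threshold check described in this cover letter can be modelled in user-space C roughly as follows. This is a minimal sketch, not the code from the patches: struct bg_model, threshold_valid() and free_and_check_reclaim() are hypothetical names, and treating "exceeds" as "greater than or equal" is an assumption made so that a threshold of 100% still matches a fully unusable block group.

```c
#include <stdbool.h>
#include <stdint.h>

/* Range and default of bg_reclaim_threshold as stated in the cover letter. */
#define BG_RECLAIM_THRESHOLD_MIN     51
#define BG_RECLAIM_THRESHOLD_MAX     100
#define BG_RECLAIM_THRESHOLD_DEFAULT 75

/* Hypothetical user-space model of a block group, not the kernel struct. */
struct bg_model {
	uint64_t length;        /* total size of the block group in bytes */
	uint64_t zone_unusable; /* freed bytes behind the write pointer */
};

/* Return true if the percentage lies within the accepted 51..100 range. */
static bool threshold_valid(unsigned int threshold_pct)
{
	return threshold_pct >= BG_RECLAIM_THRESHOLD_MIN &&
	       threshold_pct <= BG_RECLAIM_THRESHOLD_MAX;
}

/*
 * Account freed bytes as zone_unusable instead of free space and report
 * whether the block group should be queued for reclaim.
 *
 * Assumption: ">=" is used for the comparison, and length * threshold_pct
 * does not overflow 64 bits for realistic zone sizes.
 */
static bool free_and_check_reclaim(struct bg_model *bg, uint64_t freed,
				   unsigned int threshold_pct)
{
	uint64_t threshold_bytes = bg->length * threshold_pct / 100;

	bg->zone_unusable += freed;
	return bg->zone_unusable >= threshold_bytes;
}
```

In this model, a block group of 1000 bytes with the default threshold of 75% is not queued after 700 bytes have been freed, but is queued once a further 50 bytes are freed and zone_unusable reaches 750 bytes.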