On 1/22/21 10:28 AM, [email protected] wrote:
From: Filipe Manana <[email protected]>
After a sudden power failure we may end up with a space cache on disk that
is not valid and needs to be rebuilt from scratch.
If that happens, during log replay when we attempt to pin an extent buffer
from a log tree, at btrfs_pin_extent_for_log_replay(), we do not wait for
the space cache to be rebuilt through the call to:
btrfs_cache_block_group(cache, 1);
That is because that only waits for the task (work queue job) that loads
the space cache to change the cache state from BTRFS_CACHE_FAST to any
other value. That is ok when the space cache on disk exists and is valid,
but when the cache is not valid and needs to be rebuilt, it ends up
returning as soon as the cache state changes to BTRFS_CACHE_STARTED (done
at caching_thread()).
So this means that we can end up trying to unpin a range which is not yet
marked as free in the block group. This results in the call to
btrfs_remove_free_space() to return -EINVAL to
btrfs_pin_extent_for_log_replay(), which in turn makes the log replay fail
as well as mounting the filesystem. When this happens we got the following
in dmesg/syslog:
[72383.415114] BTRFS: device fsid 32b95b69-0ea9-496a-9f02-3f5a56dc9322 devid 1
transid 1432 /dev/sdb scanned by mount (3816007)
[72383.417837] BTRFS info (device sdb): disk space caching is enabled
[72383.418536] BTRFS info (device sdb): has skinny extents
[72383.423846] BTRFS info (device sdb): start tree-log replay
[72383.426416] BTRFS warning (device sdb): block group 30408704 has wrong
amount of free space
[72383.427686] BTRFS warning (device sdb): failed to load free space cache for
block group 30408704, rebuilding it now
[72383.454291] BTRFS: error (device sdb) in btrfs_recover_log_trees:6203:
errno=-22 unknown (Failed to pin buffers while recovering log root tree.)
[72383.456725] BTRFS: error (device sdb) in btrfs_replay_log:2253: errno=-22
unknown (Failed to recover log tree)
[72383.460241] BTRFS error (device sdb): open_ctree failed
So fix this by making sure that during log replay we wait for the caching
task to finish completely when we need to rebuild a space cache.
Fixes: e747853cae3ae3 ("btrfs: load free space cache asynchronously")
Signed-off-by: Filipe Manana <[email protected]>
This could actually happen before, it was just less likely because we'd start
the async thread from the callers context. I assume we're getting the -EINVAL
from the remove_from_bitmap() function? So we've loaded part of the free space
but not all of it, and thus get the -EINVAL. This probably needs the earlier
Fixes, all the async patch did was make it easier to hit.
<snip>
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 30b1a630dc2f..594534482ad3 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2600,6 +2600,7 @@ int btrfs_pin_extent_for_log_replay(struct
btrfs_trans_handle *trans,
u64 bytenr, u64 num_bytes)
{
struct btrfs_block_group *cache;
+ struct btrfs_caching_control *caching_ctl;
int ret;
btrfs_add_excluded_extent(trans->fs_info, bytenr, num_bytes);
@@ -2615,6 +2616,13 @@ int btrfs_pin_extent_for_log_replay(struct
btrfs_trans_handle *trans,
* the pinned extents.
*/
btrfs_cache_block_group(cache, 1);
Instead we could probably just do
if (btrfs_test_opt(fs_info, SPACE_CACHE)) {
btrfs_cache_block_group(cache, 0);
btrfs_wait_block_group_cache_done(cache);
}
here instead of changing all of the function arguments and such. Thanks,
Josef