On 2024/3/27 3:40, Kent Overstreet wrote:
On Tue, Mar 26, 2024 at 02:04:50PM +0800, Hongbo Li wrote:
Do you mean to add these assertions at the end of
bkey_cached_move_to_freelist and bkey_cached_alloc?
Assertions like this I generally put in the shutdown path - would that
have caught this bug?
If it wouldn't, we could make it a debug mode assertion - probably also
randomly 1/128th of the time or somoesuch, this will be a bit of an
expensive one otherwise
I found this by code review on this part. Do you mean I should add some
code like:
#ifdef CONFIG_BCACHEFS_DEBUG
BUG_ON(list_count_nodes(&bc->freed_pcpu) != bc->nr_freed_pcpu);
BUG_ON(list_count_nodes(&bc->freed_nonpcpu) != bc->nr_freed_nonpcpu);
#endif
after every time when the nr_freed_pcpu/nr_freed_nonpcpu changed?
Surely, It's a bit difficult for me to reproduce this problem without a
deep understand about this procedure now. And I have tried constructing
a case to call bkey_cached_alloc, but is is difficult to ensure that
bkey_cached is allocated from the bc->freed_pcpu list.
On 2024/3/26 12:17, Kent Overstreet wrote:
On Tue, Mar 26, 2024 at 12:04:56PM +0800, Hongbo Li wrote:
When allocating bkey_cached from bc->freed_pcpu list, it missed
decreasing the count of nr_freed_pcpu which would cause the mismatch
between the value of nr_freed_pcpu and the list items. This problem
also exists in moving new bkey_cached to bc->freed_pcpu list.
If these happened, the bug info may appear in
bch2_fs_btree_key_cache_exit by the follow code:
BUG_ON(list_count_nodes(&bc->freed_pcpu) != bc->nr_freed_pcpu);
BUG_ON(list_count_nodes(&bc->freed_nonpcpu) != bc->nr_freed_nonpcpu);
Please include the assertions too :)
Fixes: c65c13f0eac6 ("bcachefs: Run btree key cache shrinker less aggressively")
Signed-off-by: Hongbo Li <[email protected]>
---
fs/bcachefs/btree_key_cache.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/bcachefs/btree_key_cache.c b/fs/bcachefs/btree_key_cache.c
index 581edcb0911b..e1031b9ba3f4 100644
--- a/fs/bcachefs/btree_key_cache.c
+++ b/fs/bcachefs/btree_key_cache.c
@@ -169,6 +169,7 @@ static void bkey_cached_move_to_freelist(struct
btree_key_cache *bc,
} else {
mutex_lock(&bc->lock);
list_move_tail(&ck->list, &bc->freed_pcpu);
+ bc->nr_freed_pcpu++;
mutex_unlock(&bc->lock);
}
}
@@ -245,6 +246,7 @@ bkey_cached_alloc(struct btree_trans *trans, struct
btree_path *path,
if (!list_empty(&bc->freed_pcpu)) {
ck = list_last_entry(&bc->freed_pcpu, struct
bkey_cached, list);
list_del_init(&ck->list);
+ bc->nr_freed_pcpu--;
}
mutex_unlock(&bc->lock);
}
--
2.34.1