On 16.12.20 at 18:46, Josef Bacik wrote:
> My recent set of patches to reduce lock contention on the extent root by
> running delayed refs resulted in a regression in generic/371. This test
> fallocate()'s the fs until it's full, deletes all the files, and then
> tries to fallocate() until full again.
>
> Before my delayed refs patches we would run all of the delayed refs
> during flushing, and then would commit the transaction because we had
> plenty of pinned space to recover in order to allocate. However my
> patches made it so we weren't running the delayed refs as aggressively,
> which meant that we appeared to have less pinned space when we were
> deciding to commit the transaction.
>
> We use the space_info->total_bytes_pinned to approximate how much space
> we have pinned. It's approximate because removing a reference to an
> extent may or may not free it: there may be more references to it than
> we know of at that point. We account it as pinned at creation time
> anyway, and it is properly accounted when the delayed ref runs.
>
> We account for pinned space when the
> delayed_ref_head->total_ref_mod is < 0, because that is clearly a
> freeing operation. However there is another case, and that is where
> ->total_ref_mod == 0 && ->must_insert_reserved == 1.
>
> When we allocate a new extent, we have ->total_ref_mod == 1 and we have
> ->must_insert_reserved == 1. This is used to indicate that it is a
> brand new extent and will need to have its extent entry added before we
> modify any references on the delayed ref head. But if we subsequently
> remove that extent reference, our ->total_ref_mod will be 0, and that
> space will be pinned and freed. Accounting for this case properly
> allows for generic/371 to pass with my delayed refs patches applied.
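
To make the two cases concrete, here is a minimal standalone sketch of the
accounting rule described above. It is not the kernel code: the struct
ref_head_model and both helper functions are hypothetical names that model
only the two fields mentioned in the commit message, plus an assumed
num_bytes field for the extent size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for a delayed ref head. Only the
 * two fields discussed in the commit message are modelled; num_bytes
 * is an assumption added for the illustration.
 */
struct ref_head_model {
	int total_ref_mod;         /* net reference change queued on the head */
	bool must_insert_reserved; /* extent is brand new, entry not yet added */
	uint64_t num_bytes;        /* size of the extent */
};

/*
 * Decide whether a head should count toward the pinned estimate.
 * Case 1: a net negative ref mod is clearly a free.
 * Case 2: a brand new extent whose only reference was dropped again
 *         (net ref mod of 0) will also be pinned and freed.
 */
static bool head_counts_as_pinned(const struct ref_head_model *head)
{
	if (head->total_ref_mod < 0)
		return true;
	return head->total_ref_mod == 0 && head->must_insert_reserved;
}

/* Sum the bytes of every head that counts as pinned. */
static uint64_t pinned_estimate(const struct ref_head_model *heads, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (head_counts_as_pinned(&heads[i]))
			total += heads[i].num_bytes;
	}
	return total;
}
```

In this model, a head with total_ref_mod == 1 and must_insert_reserved == 1
(a fresh allocation) does not count, while the same head after its reference
is dropped (total_ref_mod == 0) does, which is the case the patch adds.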
>
> It's important to note that this problem exists without my delayed refs
> patches; it was merely uncovered by them.
>
> Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: Nikolay Borisov <[email protected]>