Re: [RFC PATCH v1 00/10] guest_memfd: Track amount of memory allocated on inode

David Hildenbrand (Arm) Wed, 25 Feb 2026 01:31:56 -0800

On 2/25/26 08:31, Ackerley Tng wrote:
> Ackerley Tng <[email protected]> writes:
> 
>> "David Hildenbrand (Arm)" <[email protected]> writes:
>>
>>>
>>> [...snip...]
>>>
>>>
>>> If that avoids having to implement truncation completely ourselves,
>>> that might be one option we could discuss, yes.
>>>
>>> Something like:
>>>
>>> diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
>>> index 7c753148af88..94f8bb81f017 100644
>>> --- a/Documentation/filesystems/vfs.rst
>>> +++ b/Documentation/filesystems/vfs.rst
>>> @@ -764,6 +764,7 @@ cache in your filesystem.  The following members are defined:
>>>                 sector_t (*bmap)(struct address_space *, sector_t);
>>>                 void (*invalidate_folio) (struct folio *, size_t start, size_t len);
>>>                 bool (*release_folio)(struct folio *, gfp_t);
>>> +               void (*remove_folio)(struct folio *folio);
>>>                 void (*free_folio)(struct folio *);
>>>                 ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
>>>                 int (*migrate_folio)(struct mapping *, struct folio *dst,
>>> @@ -922,6 +923,11 @@ cache in your filesystem.  The following members are defined:
>>>         its release_folio will need to ensure this.  Possibly it can
>>>         clear the uptodate flag if it cannot free private data yet.
>>>
>>> +``remove_folio``
>>> +       remove_folio is called just before the folio is removed from the
>>> +       page cache in order to allow the cleanup of properties (e.g.,
>>> +       accounting) that need the address_space mapping.
>>> +
>>>  ``free_folio``
>>>         free_folio is called once the folio is no longer visible in the
>>>         page cache in order to allow the cleanup of any private data.
>>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>>> index 8b3dd145b25e..f7f6930977a1 100644
>>> --- a/include/linux/fs.h
>>> +++ b/include/linux/fs.h
>>> @@ -422,6 +422,7 @@ struct address_space_operations {
>>>         sector_t (*bmap)(struct address_space *, sector_t);
>>>         void (*invalidate_folio) (struct folio *, size_t offset, size_t len);
>>>         bool (*release_folio)(struct folio *, gfp_t);
>>> +       void (*remove_folio)(struct folio *folio);
>>>         void (*free_folio)(struct folio *folio);
>>>         ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
>>>         /*
>>> diff --git a/mm/filemap.c b/mm/filemap.c
>>> index 6cd7974d4ada..5a810eaacab2 100644
>>> --- a/mm/filemap.c
>>> +++ b/mm/filemap.c
>>> @@ -250,8 +250,14 @@ void filemap_free_folio(struct address_space *mapping, struct folio *folio)
>>>  void filemap_remove_folio(struct folio *folio)
>>>  {
>>>         struct address_space *mapping = folio->mapping;
>>> +       void (*remove_folio)(struct folio *);
>>>
>>>         BUG_ON(!folio_test_locked(folio));
>>> +
>>> +       remove_folio = mapping->a_ops->remove_folio;
>>> +       if (unlikely(remove_folio))
>>> +               remove_folio(folio);
>>> +
>>>         spin_lock(&mapping->host->i_lock);
>>>         xa_lock_irq(&mapping->i_pages);
>>>         __filemap_remove_folio(folio, NULL);
>>>
>>
>> Thanks for this suggestion, I'll try this out and send another revision.
>>
>>>
>>> Ideally we'd perform it under the lock just after clearing folio->mapping,
>>> but I guess that might be more controversial.
>>>
> 
> I'm not sure which lock you were referring to, I hope it's not the
> inode's i_lock? Why is calling the callback under lock frowned upon?


I meant the two locks taken in filemap_remove_folio(): mapping->host->i_lock
and the xarray lock on mapping->i_pages.

I'd assume that a new callback which makes us hold these precious locks
for longer could be a problem for some people. Well, maybe, maybe not.

I guess .free_folio() is called outside the lock because it's assumed to
possibly do more expensive operations.
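
To make the ordering concrete, here is a rough userspace model of the
sketch above. It is not kernel code: struct mapping, struct folio,
lock_held, allocated_bytes and the acct_*() helpers are all made-up
stand-ins, and a single flag models both locks. It only shows that a hook
called before the locks still sees folio->mapping, while a free_folio-style
hook called after them no longer does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins only; none of these are the kernel's definitions. */
struct folio;

struct mapping_ops {
	void (*remove_folio)(struct folio *folio); /* optional, may be NULL */
	void (*free_folio)(struct folio *folio);   /* optional, may be NULL */
};

struct mapping {
	bool lock_held;                /* models i_lock plus the i_pages lock */
	const struct mapping_ops *ops;
	size_t allocated_bytes;        /* hypothetical per-inode accounting */
};

struct folio {
	struct mapping *mapping;
	size_t size;
};

/* Records what the free_folio-style hook observed, for inspection. */
static int free_saw_null_mapping;

/* Hypothetical accounting hook: relies on folio->mapping still being set. */
static void acct_remove_folio(struct folio *folio)
{
	assert(folio->mapping != NULL);
	assert(!folio->mapping->lock_held); /* called outside the locks */
	folio->mapping->allocated_bytes -= folio->size;
}

/* Runs after removal, so it can no longer reach the mapping via the folio. */
static void acct_free_folio(struct folio *folio)
{
	free_saw_null_mapping = (folio->mapping == NULL);
}

/* Models the call ordering proposed for filemap_remove_folio(). */
static void model_remove_folio(struct folio *folio)
{
	struct mapping *mapping = folio->mapping;

	/* New hook first, before any lock is taken. */
	if (mapping->ops->remove_folio)
		mapping->ops->remove_folio(folio);

	mapping->lock_held = true;
	folio->mapping = NULL;         /* the folio leaves the "page cache" */
	mapping->lock_held = false;

	/* Existing-style hook last, outside the locks again. */
	if (mapping->ops->free_folio)
		mapping->ops->free_folio(folio);
}
```

Moving the remove_folio call to just after "folio->mapping = NULL" would
model the under-the-lock variant, but then the hook would need the mapping
passed in explicitly, and whatever it does would extend the lock hold time.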

-- 
Cheers,

David
