Ackerley Tng <[email protected]> writes:

> "David Hildenbrand (Arm)" <[email protected]> writes:
>
>>
>> [...snip...]
>>
>>>> Could we maybe have a different callback (when the mapping is still
>>>> guaranteed to be around) from where we could update i_blocks on the
>>>> freeing path?
>>>
>>> Do you mean that we should add a new callback to struct
>>> address_space_operations?
>>
>> If that avoids having to implement truncation completely ourselves, that
>> might be one option we could discuss, yes.
>>
>> Something like:
>>
>> diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst
>> index 7c753148af88..94f8bb81f017 100644
>> --- a/Documentation/filesystems/vfs.rst
>> +++ b/Documentation/filesystems/vfs.rst
>> @@ -764,6 +764,7 @@ cache in your filesystem.  The following members are defined:
>>  	sector_t (*bmap)(struct address_space *, sector_t);
>>  	void (*invalidate_folio) (struct folio *, size_t start, size_t len);
>>  	bool (*release_folio)(struct folio *, gfp_t);
>> +	void (*remove_folio)(struct folio *folio);
>>  	void (*free_folio)(struct folio *);
>>  	ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
>>  	int (*migrate_folio)(struct mapping *, struct folio *dst,
>> @@ -922,6 +923,11 @@ cache in your filesystem.  The following members are defined:
>>  	its release_folio will need to ensure this.  Possibly it can
>>  	clear the uptodate flag if it cannot free private data yet.
>>  
>> +``remove_folio``
>> +	remove_folio is called just before the folio is removed from the
>> +	page cache in order to allow the cleanup of properties (e.g.,
>> +	accounting) that needs the address_space mapping.
>> +
>>  ``free_folio``
>>  	free_folio is called once the folio is no longer visible in the
>>  	page cache in order to allow the cleanup of any private data.
>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>> index 8b3dd145b25e..f7f6930977a1 100644
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -422,6 +422,7 @@ struct address_space_operations {
>>  	sector_t (*bmap)(struct address_space *, sector_t);
>>  	void (*invalidate_folio) (struct folio *, size_t offset, size_t len);
>>  	bool (*release_folio)(struct folio *, gfp_t);
>> +	void (*remove_folio)(struct folio *folio);
>>  	void (*free_folio)(struct folio *folio);
>>  	ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
>>  	/*
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index 6cd7974d4ada..5a810eaacab2 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -250,8 +250,14 @@ void filemap_free_folio(struct address_space *mapping, struct folio *folio)
>>  void filemap_remove_folio(struct folio *folio)
>>  {
>>  	struct address_space *mapping = folio->mapping;
>> +	void (*remove_folio)(struct folio *);
>>  
>>  	BUG_ON(!folio_test_locked(folio));
>> +
>> +	remove_folio = mapping->a_ops->remove_folio;
>> +	if (unlikely(remove_folio))
>> +		remove_folio(folio);
>> +
>>  	spin_lock(&mapping->host->i_lock);
>>  	xa_lock_irq(&mapping->i_pages);
>>  	__filemap_remove_folio(folio, NULL);
>>
>
> Thanks for this suggestion, I'll try this out and send another revision.
>
>>
>> Ideally we'd perform it under the lock just after clearing folio->mapping,
>> but I guess that might be more controversial.
>>

I'm not sure which lock you were referring to; I hope it's not the
inode's i_lock? Why is calling the callback under a lock frowned upon?

I found that .remove_folio also had to be called from
delete_from_page_cache_batch() for it to work. Then I saw that both
filemap_remove_folio() and delete_from_page_cache_batch() already go
through filemap_unaccount_folio(). Since, as you said, guest_memfd will
be using this callback for accounting, in RFC v2 [1] I used
.unaccount_folio instead. It is called from filemap_unaccount_folio(),
under the inode's i_lock.

[1] https://lore.kernel.org/all/[email protected]/T/

>> For accounting you need the above might be good enough, but I am not sure
>> for how many other use cases there might be.
>>
>> --
>> Cheers,
>>
>> David
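
To illustrate the approach described above (one optional hook reached from both the single-folio and the batch removal paths, invoked while the inode lock is held), here is a minimal userspace sketch in C. It is not kernel code: every type and function below is a hypothetical stand-in, the integer `locked` flag only mimics holding i_lock, and the hook signature is an assumption rather than something taken from RFC v2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct folio;

struct aops_model {
	/* Optional hook, modeled on the .unaccount_folio idea. */
	void (*unaccount_folio)(struct folio *folio);
};

struct inode_model {
	long blocks;	/* stands in for i_blocks */
	int locked;	/* stands in for holding i_lock */
};

struct mapping_model {
	struct inode_model *host;
	const struct aops_model *a_ops;
};

struct folio {
	struct mapping_model *mapping;
	long nr_blocks;
};

/* Example hook: needs folio->mapping to still be valid to reach the inode. */
static void model_unaccount(struct folio *folio)
{
	struct inode_model *inode = folio->mapping->host;

	assert(inode->locked);	/* the hook expects the lock to be held */
	inode->blocks -= folio->nr_blocks;
}

/* Shared step used by both removal paths; calls the hook only if set. */
static void unaccount_common(struct folio *folio)
{
	const struct aops_model *ops = folio->mapping->a_ops;

	if (ops->unaccount_folio)
		ops->unaccount_folio(folio);
}

/* Single-folio path: lock, unaccount, clear the mapping, unlock. */
static void remove_one(struct folio *folio)
{
	struct inode_model *inode = folio->mapping->host;

	inode->locked = 1;
	unaccount_common(folio);
	folio->mapping = NULL;
	inode->locked = 0;
}

/* Batch path: the same shared step runs for every folio under one lock. */
static void remove_batch(struct folio **folios, size_t n)
{
	struct inode_model *inode;
	size_t i;

	if (n == 0)
		return;

	inode = folios[0]->mapping->host;
	inode->locked = 1;
	for (i = 0; i < n; i++) {
		unaccount_common(folios[i]);
		folios[i]->mapping = NULL;
	}
	inode->locked = 0;
}
```

Placing the hook in the shared step means neither removal path can skip the accounting update, and the hook always runs before the mapping pointer is cleared.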

