On Mon, Mar 09, 2026, Ackerley Tng wrote:
> "David Hildenbrand (Arm)" <[email protected]> writes:
>
> > On 3/9/26 10:53, Ackerley Tng wrote:
> >> The guest memfd currently does not update the inode's i_blocks and i_bytes
> >> count when memory is allocated or freed. Hence, st_blocks returned from
> >> fstat() is always 0.
> >>
> >> Introduce byte accounting for guest memfd inodes. When a new folio is
> >> added to the filemap, add the folio's size. Use the .invalidate_folio()
> >> callback to subtract the folio's size from inode fields when folios are
> >> truncated and removed from the filemap.
> >>
> >> Signed-off-by: Ackerley Tng <[email protected]>
> >> ---
> >>  virt/kvm/guest_memfd.c | 14 ++++++++++++++
> >>  1 file changed, 14 insertions(+)
> >>
> >> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> >> index 462c5c5cb602a..77219551056a7 100644
> >> --- a/virt/kvm/guest_memfd.c
> >> +++ b/virt/kvm/guest_memfd.c
> >> @@ -136,6 +136,9 @@ static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
> >>  					 mapping_gfp_mask(inode->i_mapping),
> >>  					 policy);
> >>  	mpol_cond_put(policy);
> >>
> >> +	if (!IS_ERR(folio))
> >> +		inode_add_bytes(inode, folio_size(folio));
> >> +
> >
> > Can't we have two concurrent calls to __filemap_get_folio_mpol(), and we
> > don't really know whether our call allocated the folio or simply found
> > one (the other caller allocated) in the pagecache?
> >
>
> Ah that is true. Two threads can get past filemap_lock_folio(), then get
> to __filemap_get_folio_mpol(), and then thread 1 will return from
> __filemap_get_folio_mpol() with an allocated folio while thread 2
> returns with the folio allocated by thread 1. Both threads would end up
> incrementing the number of bytes in the inode.
>
> Sean, Vlastimil, is this a good argument for open coding, like in RFC v2
> [1]? So that guest_memfd can do inode_add_bytes() specifically when the
> folio is added to the filemap.

Heh, I assumed that was going to be _the_ argument, i.e. I was expecting the
answer to my implicit question of "if this greatly simplifies accounting" was
going to be "trying to do the right thing while using __filemap_get_folio_mpol()
is insane".

> An alternative I can think of is to add a callback that is called from
> within __filemap_add_folio(). Would that be preferred?

Probably not. Poking around, it definitely seems like guest_memfd is the
oddball. E.g. as David pointed out, even shmem participates in disk quota
stuff, and HugeTLB is its own beast. In other words, I doubt any "real"
filesystem will want to hook __filemap_add_folio() in this way.

So as I said before, "if this greatly simplifies accounting, then I'm ok with
it". And it sounds like the answer is an emphatic "yes". And again as I said
before, all I ask at this point is that the refactoring changelog focuses on
that point.

P.S. In future versions, please explain _why_ you want to add fstat() support,
i.e. why you want to account allocated bytes/folios. For folks like me that do
very little userspace programming, and even less filesystems work, fstat() not
working means nothing. Even if the answer is "because literally every other FS
in Linux works".
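For anyone skimming the thread, here is a rough userspace sketch of the double-count problem David raised. It is not kernel code: struct toy_inode, toy_get_folio() and the two bytes_with_*() helpers are hypothetical names made up for illustration, and the "race" is modelled as a serialized interleaving in which the second caller's lookup runs after the first caller's insertion. The only point is that a find-or-create helper does not tell the caller whether it allocated, so accounting at the call site counts the folio once per caller, while accounting at insertion counts it once.

```c
#include <stdbool.h>

#define TOY_FOLIO_SIZE 4096L

/* Hypothetical stand-in for an inode with a one-slot "filemap". */
struct toy_inode {
	long bytes;		/* models the inode's byte count */
	bool slot_present;	/* true once a folio sits in the slot */
};

/*
 * Models a find-or-create lookup.  Like the helper discussed in the
 * thread, it does not report whether this call inserted the folio or
 * merely found one that an earlier caller inserted.
 */
static void toy_get_folio(struct toy_inode *inode, bool account_on_insert)
{
	if (inode->slot_present)
		return;		/* found: someone else allocated it */

	inode->slot_present = true;
	if (account_on_insert)
		inode->bytes += TOY_FOLIO_SIZE;	/* counted exactly once */
}

/* Accounting at the call site: every successful lookup adds bytes. */
long bytes_with_caller_accounting(int callers)
{
	struct toy_inode inode = { 0 };

	for (int i = 0; i < callers; i++) {
		toy_get_folio(&inode, false);
		/* Cannot tell "found" from "allocated" here. */
		inode.bytes += TOY_FOLIO_SIZE;
	}
	return inode.bytes;
}

/* Accounting at the point where the folio enters the filemap. */
long bytes_with_insert_accounting(int callers)
{
	struct toy_inode inode = { 0 };

	for (int i = 0; i < callers; i++)
		toy_get_folio(&inode, true);
	return inode.bytes;
}
```

With two callers hitting the same index, the first helper reports two folios' worth of bytes for a single folio and the second reports one, which is the simplification that doing the accounting at insertion time is meant to buy.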

