On Tue, Aug 26, 2025 at 06:45:43PM +0100, Peter Maydell wrote: > On Thu, 21 Aug 2025 at 19:18, Peter Xu <pet...@redhat.com> wrote: > > I remember I provided a draft somewhere during the discussion, anyway.. I > > redid it, and attached a formal patch below that will contain the removal > > of the mr->container check (plus auto-detach when it happens). The hope is > > this should work.. and comparing to the v8 of Akihiko's, it won't make MR's > > own refcount any more complicated. > > > > If necessary, I can send a formal patch. > > This patch seems to work, in that it fixes the memory-region > related leaks I previously was seeing. Review comments below > (which are only about the commit message and docs). > > I also can't remember the details of the previous discussions about > these patches, so I'm reviewing the one below essentially from > scratch. Apologies in advance if we end up going back around > a conversation loop that we've already had... > > > Thanks, > > > > ===8<=== > > From f985c54af3e276b175bcb02725a5fb996c3f5fe5 Mon Sep 17 00:00:00 2001 > > From: Peter Xu <pet...@redhat.com> > > Date: Thu, 21 Aug 2025 12:59:02 -0400 > > Subject: [PATCH] memory: Fix leaks due to owner-shared MRs circular > > references > > > > Currently, QEMU refcounts the MR by always taking it from the owner. > > > > It should be non-issue if all the owners of MRs will properly detach all > > the MRs from their parents by memory_region_del_subregion() when the owner > > will be freed. However, it turns out many of the device emulations do not > > do that properly. It might be a challenge to fix all of such occurances. > > I think that it's not really right to cast this as "some devices > don't do this right". The pattern of "a device has a container MR C > and some other MRs A, B... which it puts into C" is a legitimate one. > If you do this then (with the current code in QEMU) there is *no* > place where a device emulation can do a manual "remove A, B.. from > the container C so the refcounts go down": the place where devices > undo what they did in instance_init is instance_deinit, but we > will never call instance_deinit because the refcount of the owning > device never goes to 0.
It should still be able to reach zero if we skip the refcount of internal MR subregions like what this patch does. Said that, I think you're right.. the problem is we have object_deinit() after object_property_del_all() (in object_finalize()), which means memory_region_finalize() will be invoked before object_deinit().. Then it looks wrong now to delete subregions in a deinit() even if reachable, because the MRs should have been freed.. > > Further, even if we had some place where devices could somehow > undo the "put A, B... in the container so the refcounts go down > correctly", it is better API design to have the memory.c code > automatically handle this situation. "This just works" is more > reliable than "this works if you do cleanup step X", because it's > impossible to have the bug where you forget to write the code to > do the cleanup step. Fair enough. > > > Without fixing it, QEMU faces circular reference issue when an owner can > > contain more than one MRs, then when they are linked within each other in > > form of subregions, those links keep the owner alive forever, causing > > memory leaks even if all the external refcounts are released for the owner. > > > > To fix that, teach memory API to stop refcount on MRs that share the same > > owner because if they share the lifecycle of the owner, then they share the > > same lifecycle between themselves. > > > > Meanwhile, allow auto-detachments of MRs during finalize() of MRs even > > against its container, as long as they belong to the same owner. > > > > The latter is needed because now it's possible to have MR finalize() happen > > in any order when they exactly share the same lifespan. In this case, we > > should allow finalize() to happen in any order of either the parent or > > child MR, however it should be guaranteed that the mr->container shares the > > same owner of this MR to be finalized. > > > > Proper document this behavior in code. > > > > This patch is heavily based on the work done by Akihiko Odaki: > > > > https://lore.kernel.org/r/cafeaca8dv40fgsci76r4yep1p-sp_qjnrdd2ozpxjx5wrs0...@mail.gmail.com > > > > Signed-off-by: Peter Xu <pet...@redhat.com> > > --- > > docs/devel/memory.rst | 9 +++++++-- > > system/memory.c | 45 ++++++++++++++++++++++++++++++++++--------- > > 2 files changed, 43 insertions(+), 11 deletions(-) > > > > diff --git a/docs/devel/memory.rst b/docs/devel/memory.rst > > index 57fb2aec76..1367c7caf7 100644 > > --- a/docs/devel/memory.rst > > +++ b/docs/devel/memory.rst > > @@ -158,8 +158,13 @@ ioeventfd) can be changed during the region lifecycle. > > They take effect > > as soon as the region is made visible. This can be immediately, later, > > or never. > > > > -Destruction of a memory region happens automatically when the owner > > -object dies. > > +Destruction of a memory region happens automatically when the owner object > > +dies. Note that the MRs under the same owner can be destroyed in any order > > +when the owner object dies. It's because the MRs will share the same > > +lifespan of the owner, no matter if its a container or child MR. It's > > +suggested to always cleanly detach the MRs under the owner object when the > > +owner object is going to be destroyed, however it is not required, as the > > +memory core will automatically detach the links when MRs are destroyed. > > I think we should not say "we suggest you always cleanly detach the MRs": > instead we should say "you can rely on this happening, so you don't > need to write manual code to do it". > > The actual code changes look OK to me. I'll send the patch out officially for review, with above point taken. Akihiko, let me know anytime when you want to either add your SoB or take over the ownership of the patch. I'll be OK with it. Thanks, -- Peter Xu