On Mon, Sep 29, 2025 at 01:45:05PM +0100, Peter Maydell wrote:
> On Fri, 26 Sept 2025 at 16:16, Peter Xu <[email protected]> wrote:
> >
> > On Fri, Sep 26, 2025 at 10:09:29AM +0100, Peter Maydell wrote:
> > > On Thu, 25 Sept 2025 at 21:06, Peter Xu <[email protected]> wrote:
> > > >
> > > > On Thu, Sep 25, 2025 at 10:03:45AM +0100, Peter Maydell wrote:
> > > > > On Wed, 24 Sept 2025 at 22:14, Peter Xu <[email protected]> wrote:
> > > > > > Side note: when I was trying to test hotplugs with i386/q35, 
> > > > > > unfortunately
> > > > > > I didn't really see when the address space was destroyed, maybe 
> > > > > > there's a
> > > > > > bug somewhere; I put that info into appendix at the end.
> > > > >
> > > > > This is https://gitlab.com/qemu-project/qemu/-/issues/2517
> > > > >
> > > > > I got blocked on that because I ran into a weird "I have some
> > > > > memory that needs to be freed by the RCU callback, but only
> > > > > after the callback has freed some other RCU stuff". I see
> > > > > Paolo made a reply on that bug -- I would need to get back
> > > > > to it and reproduce whatever it was I was doing.
> > > >
> > > > Thanks for the link, right that looks exactly like what I hit.
> > > >
> > > > I am curious if FIFO is guaranteed for RCU in general, or it is an impl
> > > > detail only specific to QEMU.
> > > >
> > > > The other thing is I feel like it should be OK to reorder callbacks, if 
> > > > all
> > > > the call_rcu() users can make sure the rcu-freed object is completely
> > > > detached from the rest world, e.g. resetting all relevant pointers to 
> > > > NULL.
> > > > With that, it seems the order won't matter too, because nobody will be 
> > > > able
> > > > to reference the internal object anyway, so the two objects (after 
> > > > reseting
> > > > all referers to NULL pointer of the inner object) are completely 
> > > > standalone.
> > >
> > > The specific ordering problem for cpu_address_space is that
> > > there's a g_new allocated array of memory which contains
> > > the AddressSpace objects (not pointers to them). The ASes need
> > > to be RCU-deallocated first so they can clean up their internal
> > > data structures; only once that has happened can we free the
> > > memory that holds the AddressSpace structs themselves.
> >
> > If it's about cpu_address_space_destroy(), then IIUC it can also be done by
> > providing a destroy_free() function so that instead of trying to serialize
> > two rcu callbacks, we could easily serialize the operations in one
> > callback.  One sample patch attached to avoid relying on order of rcu
> > enqueue.
> 
> I figured out what my problem was here: like the existing
> cpu_address_space_destroy(), it wants to first destroy the AS
> and then free the memory the AS is using. So it does the
> obvious thing:
>     address_space_destroy(cpuas->as);
>     g_free_rcu(cpuas->as, rcu);
> 
> This doesn't work, because address_space_destroy() sets
> up an RCU callback using the 'rcu' node in the AddressSpace
> struct. But then g_free_rcu() tries to do exactly the same
> thing and overwrites the info in the 'rcu' node: so we never
> call the do_address_space_destroy() hook.
> 
> (1) Is there some way we can make this "tried to use the RCU
> node twice" assert?

Good point.  Maybe we should assert node->func==NULL in call_rcu1().

> 
> (2) I think the simplest fix here is something like the
> patch you propose that does the "destroy + free" in one
> RCU callback.

Yes, I agree.

Note that Akihiko has another series to QOMify Address space.  This problem
should be relevant there too. Currently I believe it's similarly broken in
his series, but I think maybe we should fix this first on x86/arm hotplugs.

Thanks,

-- 
Peter Xu


Reply via email to