On Fri, 26 Sept 2025 at 16:16, Peter Xu <[email protected]> wrote:
>
> On Fri, Sep 26, 2025 at 10:09:29AM +0100, Peter Maydell wrote:
> > On Thu, 25 Sept 2025 at 21:06, Peter Xu <[email protected]> wrote:
> > >
> > > On Thu, Sep 25, 2025 at 10:03:45AM +0100, Peter Maydell wrote:
> > > > On Wed, 24 Sept 2025 at 22:14, Peter Xu <[email protected]> wrote:
> > > > > Side note: when I was trying to test hotplugs with i386/q35,
> > > > > unfortunately
> > > > > I didn't really see when the address space was destroyed, maybe
> > > > > there's a
> > > > > bug somewhere; I put that info into appendix at the end.
> > > >
> > > > This is https://gitlab.com/qemu-project/qemu/-/issues/2517
> > > >
> > > > I got blocked on that because I ran into a weird "I have some
> > > > memory that needs to be freed by the RCU callback, but only
> > > > after the callback has freed some other RCU stuff". I see
> > > > Paolo made a reply on that bug -- I would need to get back
> > > > to it and reproduce whatever it was I was doing.
> > >
> > > Thanks for the link, right that looks exactly like what I hit.
> > >
> > > I am curious if FIFO is guaranteed for RCU in general, or it is an impl
> > > detail only specific to QEMU.
> > >
> > > The other thing is I feel like it should be OK to reorder callbacks, if
> > > all
> > > the call_rcu() users can make sure the rcu-freed object is completely
> > > detached from the rest world, e.g. resetting all relevant pointers to
> > > NULL.
> > > With that, it seems the order won't matter too, because nobody will be
> > > able
> > > to reference the internal object anyway, so the two objects (after
> > > reseting
> > > all referers to NULL pointer of the inner object) are completely
> > > standalone.
> >
> > The specific ordering problem for cpu_address_space is that
> > there's a g_new allocated array of memory which contains
> > the AddressSpace objects (not pointers to them). The ASes need
> > to be RCU-deallocated first so they can clean up their internal
> > data structures; only once that has happened can we free the
> > memory that holds the AddressSpace structs themselves.
>
> If it's about cpu_address_space_destroy(), then IIUC it can also be done by
> providing a destroy_free() function so that instead of trying to serialize
> two rcu callbacks, we could easily serialize the operations in one
> callback. One sample patch attached to avoid relying on order of rcu
> enqueue.
I figured out what my problem was here: like the existing
cpu_address_space_destroy(), it wants to first destroy the AS
and then free the memory the AS is using. So it does the
obvious thing:
address_space_destroy(cpuas->as);
g_free_rcu(cpuas->as, rcu);
This doesn't work, because address_space_destroy() sets
up an RCU callback using the 'rcu' node in the AddressSpace
struct. But then g_free_rcu() tries to do exactly the same
thing and overwrites the info in the 'rcu' node: so we never
call the do_address_space_destroy() hook.
(1) Is there some way we can make this "tried to use the RCU
node twice" assert?
(2) I think the simplest fix here is something like the
patch you propose that does the "destroy + free" in one
RCU callback.
thanks
-- PMM