On Mon, Sep 29, 2025 at 01:45:05PM +0100, Peter Maydell wrote: > On Fri, 26 Sept 2025 at 16:16, Peter Xu <[email protected]> wrote: > > > > On Fri, Sep 26, 2025 at 10:09:29AM +0100, Peter Maydell wrote: > > > On Thu, 25 Sept 2025 at 21:06, Peter Xu <[email protected]> wrote: > > > > > > > > On Thu, Sep 25, 2025 at 10:03:45AM +0100, Peter Maydell wrote: > > > > > On Wed, 24 Sept 2025 at 22:14, Peter Xu <[email protected]> wrote: > > > > > > Side note: when I was trying to test hotplugs with i386/q35, > > > > > > unfortunately > > > > > > I didn't really see when the address space was destroyed, maybe > > > > > > there's a > > > > > > bug somewhere; I put that info into appendix at the end. > > > > > > > > > > This is https://gitlab.com/qemu-project/qemu/-/issues/2517 > > > > > > > > > > I got blocked on that because I ran into a weird "I have some > > > > > memory that needs to be freed by the RCU callback, but only > > > > > after the callback has freed some other RCU stuff". I see > > > > > Paolo made a reply on that bug -- I would need to get back > > > > > to it and reproduce whatever it was I was doing. > > > > > > > > Thanks for the link, right that looks exactly like what I hit. > > > > > > > > I am curious if FIFO is guaranteed for RCU in general, or it is an impl > > > > detail only specific to QEMU. > > > > > > > > The other thing is I feel like it should be OK to reorder callbacks, if > > > > all > > > > the call_rcu() users can make sure the rcu-freed object is completely > > > > detached from the rest world, e.g. resetting all relevant pointers to > > > > NULL. > > > > With that, it seems the order won't matter too, because nobody will be > > > > able > > > > to reference the internal object anyway, so the two objects (after > > > > reseting > > > > all referers to NULL pointer of the inner object) are completely > > > > standalone. > > > > > > The specific ordering problem for cpu_address_space is that > > > there's a g_new allocated array of memory which contains > > > the AddressSpace objects (not pointers to them). The ASes need > > > to be RCU-deallocated first so they can clean up their internal > > > data structures; only once that has happened can we free the > > > memory that holds the AddressSpace structs themselves. > > > > If it's about cpu_address_space_destroy(), then IIUC it can also be done by > > providing a destroy_free() function so that instead of trying to serialize > > two rcu callbacks, we could easily serialize the operations in one > > callback. One sample patch attached to avoid relying on order of rcu > > enqueue. > > I figured out what my problem was here: like the existing > cpu_address_space_destroy(), it wants to first destroy the AS > and then free the memory the AS is using. So it does the > obvious thing: > address_space_destroy(cpuas->as); > g_free_rcu(cpuas->as, rcu); > > This doesn't work, because address_space_destroy() sets > up an RCU callback using the 'rcu' node in the AddressSpace > struct. But then g_free_rcu() tries to do exactly the same > thing and overwrites the info in the 'rcu' node: so we never > call the do_address_space_destroy() hook. > > (1) Is there some way we can make this "tried to use the RCU > node twice" assert?
Good point. Maybe we should assert node->func==NULL in call_rcu1(). > > (2) I think the simplest fix here is something like the > patch you propose that does the "destroy + free" in one > RCU callback. Yes, I agree. Note that Akihiko has another series to QOMify Address space. This problem should be relevant there too. Currently I believe it's similarly broken in his series, but I think maybe we should fix this first on x86/arm hotplugs. Thanks, -- Peter Xu
