On 02/12/2015 12:34 PM, Paolo Bonzini wrote: > > > On 12/02/2015 17:21, Matthew Rosato wrote: >> Since 374f2981d1 "memory: protect current_map by RCU", >> address_space_update_topology unrefs the old_flatview twice, >> once by call_rcu and once by direct call. This patch removes >> the direct call in favor of the call_rcu. Fixes at least one >> assertion failure seen in s390, where a ref count for a memory >> region attempts to go negative during hot-unplug of guest memory. > > This is buggy: > > MemoryRegion *standby_ram = g_new(MemoryRegion, 1); > memory_region_init_ram(standby_ram, NULL, id, > this_subregion_size, &error_abort); > > ... > object_unparent(OBJECT(mr)); > g_free(mr); > > Memory might not be released after object_unparent (and will not be > released if the RCU callbacks haven't run yet). > > Your patch worked around it because it never freed the FlatView, and > thus the RCU callbacks never did anything on the memory regions. > > Instead, you have to do this: > > MemoryRegion *standby_ram = g_new(MemoryRegion, 1); > Object *obj = OBJECT(standby_ram); > > memory_region_init_ram(standby_ram, NULL, id, > this_subregion_size, &error_abort); > OBJECT(mr)->free = g_free; > ... > object_unparent(OBJECT(mr)); > > and the freeing will happen once the reference count goes to zero. > > Paolo >
Thanks Paolo, I like this change but it doesn't seem to help; still hitting the same assertion with it applied. Could it be that the order in which flatview_unref (and therefore memory_region_unref) vs object_unparent(mr) matters (ie, object_unparent should always happen last)? Prior to RCUification, seems like the old_view was always unreferenced prior to return from memory_region_del_subregion (so, memory_region_unref always occurred before the object_unparent(mr)). Now, the two could happen in either order -- and looking at memory_region_unref, it is specifically looking at the existence of obj->parent. Still investigating this, but I performed a litmus test: if I directly call flatview_unref twice in address_space_update_topology (rather than 1 direct & 1 via call_rcu), the assertion goes away. Matt >> >> Signed-off-by: Matthew Rosato <mjros...@linux.vnet.ibm.com> >> --- >> memory.c | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/memory.c b/memory.c >> index 130152c..d08abe5 100644 >> --- a/memory.c >> +++ b/memory.c >> @@ -755,7 +755,6 @@ static void address_space_update_topology(AddressSpace >> *as) >> >> /* Writes are protected by the BQL. */ >> atomic_rcu_set(&as->current_map, new_view); >> - call_rcu(old_view, flatview_unref, rcu); >> >> /* Note that all the old MemoryRegions are still alive up to this >> * point. This relieves most MemoryListeners from the need to >> @@ -763,7 +762,7 @@ static void address_space_update_topology(AddressSpace >> *as) >> * outside the iothread mutex, in which case precise reference >> * counting is necessary. >> */ >> - flatview_unref(old_view); >> + call_rcu(old_view, flatview_unref, rcu); >> >> address_space_update_ioeventfds(as); >> } >> >