On Tue, Oct 29, 2024 at 05:16:00PM -0400, Peter Xu wrote:
> v1: https://lore.kernel.org/r/20241024165627.1372621-1-pet...@redhat.com

> Meanwhile, migration has a long standing issue on current_migration
> pointer, where it can point to freed data after the migration object is
> finalized.  It is debatable that the pointer can be cleared after the main
> thread (1) join() the migration thread first, then (2) release the last
> refcount for the migration object and clear the pointer.  However there's
> still major challenges [1].  With singleton, we could have a slightly but
> hopefully working workaround to clear the pointer during finalize().

I'm still not entirely convinced that this singleton proposal is
fixing the migration problem correctly.

Based on discussions in v1, IIUC, the situation is that we have
migration_shutdown() being called from qemu_cleanup(). The former
will call object_unref(current_migration), but there may still
be background migration threads running that access 'current_migration',
and thus a potential use-after-free.

Based on what the 7th patch here does, the key difference is that
the finalize() method for MigrationState will set 'current_migration'
to NULL after free'ing it.

I don't believe that is safe.

Back to the current code, if there is a use-after-free today, that
implies that the background threads are *not* holding their own
reference on 'current_migration', allowing the object to be free'd
while they're still using it. If they held their own reference then
the object_unref in migration_shutdown would not have any use after
free risk.

The new code is not changing the ref counting done by any threads.
Therefore if there's a use-after-free in existing code, AFAICT, the
same use-after-free *must* still exist in the new code.

The 7th patch only fixes the use-after-free *if and only if* the
background thread tries to access 'current_migration' /after/
finalize has completed. The use-after-free in this case has been
turned into a NULL pointer dereference.

A background thread could be accessing the 'current_migration' pointer
*concurrently* with the finalize method executing though. In this case
we still have a use after free problem, only the time window in which
it exists has been narrowed a little.
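
To make that window concrete, here is a minimal single-threaded model
of one bad interleaving. Everything in it (FakeMigrationState,
current_migration_model, model_finalize, replay_race) is made up for
illustration and is not QEMU code. A 'finalized' flag stands in for
the real free, so the sketch itself has no undefined behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MigrationState; 'finalized' models free(). */
typedef struct {
    bool finalized;
    int state;
} FakeMigrationState;

/* Hypothetical stand-in for the 'current_migration' global. */
static FakeMigrationState *current_migration_model;

/* Models finalize(): tear the object down, then clear the global. */
static void model_finalize(FakeMigrationState *s)
{
    s->finalized = true;             /* stands in for freeing the object */
    current_migration_model = NULL;  /* the clearing step added in finalize */
}

/*
 * Replays one interleaving step by step on a single thread.
 * Returns true if the "background thread" ends up holding a pointer
 * to an object that has already been finalized.
 */
static bool replay_race(void)
{
    static FakeMigrationState obj;

    obj.finalized = false;
    obj.state = 0;
    current_migration_model = &obj;

    /* Background thread: loads the global; it is not NULL yet. */
    FakeMigrationState *s = current_migration_model;
    assert(s != NULL);

    /* Main thread: drops the last reference, so finalize runs now. */
    model_finalize(&obj);

    /* Background thread: carries on with its stale local copy. */
    return s->finalized;
}
```

In this model replay_race() returns true: the NULL store lands only
after the thread has already loaded the pointer, so it does not help
that thread at all.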

Shouldn't the problem with migration be solved by every migration thread
holding its own reference on current_migration, which the thread releases
when it exits, such that MigrationState is only finalized once every
thread has exited? That would not require any join() synchronization point.
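
A rough sketch of the pattern I have in mind, written as standalone C
with pthreads rather than against the real migration code. Obj,
obj_ref(), obj_unref(), worker() and run_demo() are hypothetical names
that only stand in for MigrationState and QEMU's object_ref() /
object_unref(); the finalize counter exists purely so the demo can
observe when finalization happens.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_THREADS 8

/* Hypothetical refcounted object standing in for MigrationState. */
typedef struct {
    atomic_int refcount;
    atomic_int *finalize_count;  /* demo only: counts finalizations */
} Obj;

static Obj *obj_new(atomic_int *finalize_count)
{
    Obj *o = malloc(sizeof(*o));
    assert(o != NULL);
    atomic_init(&o->refcount, 1);  /* the creator's reference */
    o->finalize_count = finalize_count;
    return o;
}

static void obj_ref(Obj *o)
{
    atomic_fetch_add(&o->refcount, 1);
}

static void obj_unref(Obj *o)
{
    /* Whoever drops the last reference finalizes the object. */
    if (atomic_fetch_sub(&o->refcount, 1) == 1) {
        atomic_fetch_add(o->finalize_count, 1);
        free(o);
    }
}

static void *worker(void *opaque)
{
    Obj *o = opaque;  /* this thread owns one reference, taken at spawn */
    /* ... the thread can use 'o' safely for its whole lifetime ... */
    obj_unref(o);     /* released only when the thread exits */
    return NULL;
}

/* Returns how many times the object was finalized (expected: once). */
static int run_demo(int nthreads)
{
    atomic_int finalized;
    pthread_t tids[MAX_THREADS];
    int started = 0;

    atomic_init(&finalized, 0);
    if (nthreads > MAX_THREADS) {
        nthreads = MAX_THREADS;
    }

    Obj *o = obj_new(&finalized);
    for (int i = 0; i < nthreads; i++) {
        obj_ref(o);  /* take the thread's reference before it starts */
        if (pthread_create(&tids[started], NULL, worker, o) != 0) {
            obj_unref(o);  /* thread never ran; give its reference back */
            continue;
        }
        started++;
    }

    /* "Shutdown": drop the creator's reference without joining first. */
    obj_unref(o);

    /* Joined here only so the demo can read the counter afterwards. */
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }
    return atomic_load(&finalized);
}
```

The unref at "shutdown" is safe regardless of whether the workers are
still running, because the object cannot be finalized until the last
of them has dropped its own reference.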

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|