On Fri, Feb 02, 2024 at 04:11:27PM -0300, Fabiano Rosas wrote:
> We currently have an unfavorable situation around multifd channel
> creation and the migration thread execution.
> 
> We create the multifd channels with qio_channel_socket_connect_async
> -> qio_task_run_in_thread, but only connect them at the
> multifd_new_send_channel_async callback, called from
> qio_task_complete, which is registered as a glib event.
> 
> So at multifd_save_setup() we create the channels, but they will only
> be actually usable after the whole multifd_save_setup() call stack
> returns to the main loop. This means that the migration thread
> is already up and running without any possibility for the multifd
> channels to be ready in time.
> 
> We currently rely on the channels-ready semaphore blocking
> multifd_send_sync_main() until channels start to come up and release
> it. However, bugs have recently been found when a channel's
> creation fails and multifd_save_cleanup() is allowed to run while
> other channels are still being created.
> 
> Let's start to organize this situation by moving the
> multifd_save_setup() call into the migration thread. That way we
> unblock the main-loop to dispatch the completion callbacks and
> actually have a chance of getting the multifd channels ready for when
> the migration thread needs them.
> 
> The next patches will deal with the synchronization aspects.
> 
> Note that this takes multifd_save_setup() out of the BQL.
> 
> Signed-off-by: Fabiano Rosas <faro...@suse.de>

Reviewed-by: Peter Xu <pet...@redhat.com>

-- 
Peter Xu

