On Fri, Sep 22, 2023 at 12:59:37PM -0300, Fabiano Rosas wrote: > Peter Xu <pet...@redhat.com> writes: > > > On Wed, Sep 20, 2023 at 05:04:11PM +0800, Li Zhijian wrote: > >> From: Li Zhijian <lizhij...@cn.fujitsu.com> > >> > >> Destination will fail with: > >> qemu-system-x86_64: rdma: Too many requests in this message > >> (3638950032).Bailing. > >> > >> migrate with RDMA is different from tcp. RDMA has its own control > >> message, and all traffic between RDMA_CONTROL_REGISTER_REQUEST and > >> RDMA_CONTROL_REGISTER_FINISHED should not be disturbed. > >> > >> find_dirty_block() will be called during RDMA_CONTROL_REGISTER_REQUEST > >> and RDMA_CONTROL_REGISTER_FINISHED, it will send a extra traffic to > >> destination and cause migration to fail. > >> > >> Since there's no existing subroutine to indicate whether it's migrated > >> by RDMA or not, and RDMA is not compatible with multifd, we use > >> migrate_multifd() here. > >> > >> Fixes: 294e5a4034 ("multifd: Only flush once each full round of memory") > >> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com> > >> --- > >> migration/ram.c | 3 ++- > >> 1 file changed, 2 insertions(+), 1 deletion(-) > >> > >> diff --git a/migration/ram.c b/migration/ram.c > >> index 9040d66e61..89ae28e21a 100644 > >> --- a/migration/ram.c > >> +++ b/migration/ram.c > >> @@ -1399,7 +1399,8 @@ static int find_dirty_block(RAMState *rs, > >> PageSearchStatus *pss) > >> pss->page = 0; > >> pss->block = QLIST_NEXT_RCU(pss->block, next); > >> if (!pss->block) { > >> - if (!migrate_multifd_flush_after_each_section()) { > >> + if (migrate_multifd() && > >> + !migrate_multifd_flush_after_each_section()) { > >> QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel; > >> int ret = multifd_send_sync_main(f); > >> if (ret < 0) { > >> -- > >> 2.31.1 > >> > > > > Maybe better to put that check at the entry of > > migrate_multifd_flush_after_each_section()? > > > > I also hope that some day there's no multifd function called in generic > > migration code paths.. > > I wonder what happened with that MigrationOps idea. We added the > ram_save_target_page pointer and nothing else. It seems like it could be > something in the direction of allowing different parts of the migration > code to provide different behavior without having to put these explicit > checks all over the place.
Yeah.. https://lore.kernel.org/qemu-devel/20230130080956.3047-12-quint...@redhat.com/ Juan should know better. Personally I think it'll be good we only introduce hook when there's a 2nd user already. I assume Juan merged it planning that'll land soon but it didn't. -- Peter Xu