On 20/09/2023 20:46, Fabiano Rosas wrote:
> Li Zhijian <lizhij...@fujitsu.com> writes:
>
>> From: Li Zhijian <lizhij...@cn.fujitsu.com>
>>
>> Destination will fail with:
>> qemu-system-x86_64: rdma: Too many requests in this message
>> (3638950032).Bailing.
>>
>> migrate with RDMA is different from tcp. RDMA has its own control
>> message, and all traffic between RDMA_CONTROL_REGISTER_REQUEST and
>> RDMA_CONTROL_REGISTER_FINISHED should not be disturbed.
>
> Yeah, this is really fragile. We need a long term solution to this. Any
> other change to multifd protocol as well as any other change to the
> migration ram handling might hit this issue again.

Yeah, it's a pain point. Another option is to let the RDMA control handler
recognize the RAM_SAVE_FLAG_MULTIFD_FLUSH message and do nothing with it.

>
> Perhaps commit 294e5a4034 ("multifd: Only flush once each full round of
> memory") should simply not have touched the stream at that point, but we
> don't have any explicit safeguards to avoid interleaving flags from
> different layers like that (assuming multifd is at another logical layer
> than the ram handling)
>
> I don't have any good suggestions at this moment, so for now:
>
> Reviewed-by: Fabiano Rosas <faro...@suse.de>
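
To make the alternative mentioned above a bit more concrete (the RDMA control handler recognizes RAM_SAVE_FLAG_MULTIFD_FLUSH inside the registration window and does nothing with it), here is a rough, self-contained toy model. It is not QEMU code: the Msg, MsgKind and WindowStats types and process_registration_window() are hypothetical names made up for this sketch, and the flag value is only a placeholder, so the real definition would have to come from the QEMU sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder value for this sketch; use the real definition from QEMU. */
#define RAM_SAVE_FLAG_MULTIFD_FLUSH 0x200u

/* Hypothetical message kinds seen inside the registration window. */
typedef enum {
    MSG_CONTROL,   /* an RDMA control message, e.g. a register request */
    MSG_RAM_FLAG   /* a RAM stream header word carrying flags */
} MsgKind;

typedef struct {
    MsgKind kind;
    uint64_t flags;   /* only meaningful for MSG_RAM_FLAG */
} Msg;

typedef struct {
    size_t handled;   /* control messages processed */
    size_t ignored;   /* multifd flush flags that were skipped */
    size_t rejected;  /* anything else that does not belong here */
} WindowStats;

/*
 * Walk the messages seen between the (hypothetical) start and end of
 * registration and classify each one.
 */
static WindowStats process_registration_window(const Msg *msgs, size_t n)
{
    WindowStats st = {0, 0, 0};

    assert(msgs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (msgs[i].kind == MSG_CONTROL) {
            st.handled++;
        } else if (msgs[i].flags & RAM_SAVE_FLAG_MULTIFD_FLUSH) {
            st.ignored++;    /* known flag: do nothing with it */
        } else {
            st.rejected++;   /* unknown traffic is still an error */
        }
    }
    return st;
}
```

In this model only the flush flag is tolerated; any other stray RAM flag in the window is still counted as an error, which keeps the "should not be disturbed" rule intact for everything else.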