Li Zhijian <lizhij...@fujitsu.com> writes:

> From: Li Zhijian <lizhij...@cn.fujitsu.com>
>
> Destination will fail with:
> qemu-system-x86_64: rdma: Too many requests in this message
> (3638950032).Bailing.
>
> migrate with RDMA is different from tcp. RDMA has its own control
> message, and all traffic between RDMA_CONTROL_REGISTER_REQUEST and
> RDMA_CONTROL_REGISTER_FINISHED should not be disturbed.

Yeah, this is really fragile. We need a long-term solution to this. Any
other change to the multifd protocol, as well as any other change to the
migration ram handling, might hit this issue again.

Perhaps commit 294e5a4034 ("multifd: Only flush once each full round of
memory") should simply not have touched the stream at that point, but we
don't have any explicit safeguards to avoid interleaving flags from
different layers like that (assuming multifd is at another logical layer
than the ram handling).

I don't have any good suggestions at this moment, so for now:

Reviewed-by: Fabiano Rosas <faro...@suse.de>