On Thu, Oct 24, 2019 at 09:53:24PM +0800, cenjiahui wrote:
> On 2019/10/24 17:52, Daniel P. Berrangé wrote:
> > On Wed, Oct 23, 2019 at 11:32:14AM +0800, cenjiahui wrote:
> >> From: Jiahui Cen <cenjia...@huawei.com>
> >>
> >> Multifd assumes the migration thread IOChannel is always established before
> >> the multifd IOChannels, but this assumption will be broken in many situations
> >> like network packet loss.
> >>
> >> For example:
> >> Step1: Source (migration thread IOChannel) --SYN--> Destination
> >> Step2: Source (migration thread IOChannel) <--SYNACK Destination
> >> Step3: Source (migration thread IOChannel, lost) --ACK-->X Destination
> >> Step4: Source (multifd IOChannel) --SYN--> Destination
> >> Step5: Source (multifd IOChannel) <--SYNACK Destination
> >> Step6: Source (multifd IOChannel, ESTABLISHED) --ACK--> Destination
> >> Step7: Destination accepts multifd IOChannel
> >> Step8: Source (migration thread IOChannel, ESTABLISHED) -ACK,DATA-> Destination
> >> Step9: Destination accepts migration thread IOChannel
> >>
> >> The above situation can be reproduced by creating a weak network environment,
> >> such as "tc qdisc add dev eth0 root netem loss 50%". The wrong acception order
> >> will cause magic check failure and thus lead to migration failure.
> >>
> >> This patch fixes this issue by sending a migration IOChannel initial packet with
> >> a unique id when using multifd migration. Since the multifd IOChannels will also
> >> send initial packets, the destination can judge whether the processing IOChannel
> >> belongs to multifd by checking the id in the initial packet. This mechanism can
> >> ensure that different IOChannels will go to correct branches in our test.
> >
> > Isn't this going to break back compatibility when new QEMU talks to old
> > QEMU with multifd enabled ? New QEMU will be sending a packet that old
> > QEMU isn't expecting IIUC.
>
> Yes, it actually breaks back compatibility. But since the old QEMU has bug with
> multifd, it may be not suitable to use multifd to migrate from new QEMU to old
> QEMU in my opinion.
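
For readers following the thread, the id-based dispatch described in the quoted patch can be illustrated roughly as below. This is only a sketch under stated assumptions, not the patch's code: the packet layout, `INIT_MAGIC`, `MAIN_CHANNEL_ID` and both function names are hypothetical and do not reflect QEMU's actual wire format.

```python
import struct

# Hypothetical layout for this sketch only (NOT QEMU's real wire format):
# a 4-byte big-endian magic followed by a 4-byte big-endian channel id.
INIT_FORMAT = ">II"
INIT_MAGIC = 0x4D494752       # assumed value, chosen for illustration
MAIN_CHANNEL_ID = 0xFFFFFFFF  # assumed reserved id for the migration thread channel


def make_initial_packet(channel_id):
    """Build the initial packet a channel sends right after connecting."""
    return struct.pack(INIT_FORMAT, INIT_MAGIC, channel_id)


def classify_channel(packet):
    """Decide which branch an accepted channel belongs to, from its id."""
    if len(packet) < struct.calcsize(INIT_FORMAT):
        raise ValueError("short initial packet")
    magic, channel_id = struct.unpack_from(INIT_FORMAT, packet)
    if magic != INIT_MAGIC:
        raise ValueError("bad magic in initial packet")
    if channel_id == MAIN_CHANNEL_ID:
        return "main"
    return "multifd-%d" % channel_id


# Accept order from Step7 and Step9: a multifd channel arrives first,
# the migration thread channel second.
arrivals = [make_initial_packet(0), make_initial_packet(MAIN_CHANNEL_ID)]
roles = [classify_channel(p) for p in arrivals]
```

In this sketch the destination no longer relies on accept order; each channel is routed by the id it announces.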

We declared multifd supported from v4.0.0 onwards, so changing the wire
protocol in non-backwards-compatible ways is not acceptable IMHO.

Ideally we'd change QEMU so that the src QEMU serializes the connections,
such that the migration thread I/O channel is established before we attempt
to establish the multifd channels.

If changing the wire protocol is unavoidable, then we'd need to invent a
new migration capability for the mgmt apps to detect & opt-in to when both
sides support it.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
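
The capability opt-in mentioned above could look roughly like the following from a mgmt app's point of view. This is a minimal sketch, assuming each side reports its supported capabilities as a set of names; the capability name `multifd-main-channel-id` and the `negotiate` helper are hypothetical and exist only for illustration.

```python
# Hypothetical capability name; no such capability is defined in this thread.
NEW_CAP = "multifd-main-channel-id"


def negotiate(src_caps, dst_caps, requested):
    """Return the requested capabilities that BOTH sides support.

    Anything only one side knows about is dropped, so an old peer
    never sees a wire format it does not expect.
    """
    return set(requested) & set(src_caps) & set(dst_caps)


old_qemu = {"multifd"}
new_qemu = {"multifd", NEW_CAP}
wanted = {"multifd", NEW_CAP}

new_to_new = negotiate(new_qemu, new_qemu, wanted)  # new wire format allowed
new_to_old = negotiate(new_qemu, old_qemu, wanted)  # falls back to plain multifd
```

With this shape, a new source talking to an old destination keeps the existing wire protocol, and the changed protocol is used only when both ends advertise it.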