Prasad Pandit <ppan...@redhat.com> writes: > From: Prasad Pandit <p...@fedoraproject.org> > > Hello, > > * This series (v7) adds 'MULTIFD_RECV_SYNC' migration command. It is used > to notify the destination migration thread to synchronise with the Multifd > threads. This allows Multifd ('mig/dst/recv_x') threads on the destination > to receive all their data, before they are shutdown. > > This series also updates the channel discovery function and qtests as > suggested in the previous review comments.
You forgot to split patch 2. We cannot have a commit that revamps the channel discovery logic and enables a new feature at the same time. Changing the channel discovery affects all the migration use-cases, that change cannot be smuggled along with multifd+postcopy enablement. Similarly, the multifd+postcopy enablement is a new feature that needs to be tested and reasoned upon in isolation, it cannot bring along a bunch of previously existing code that was shuffled around. We need to be able to understand clearly what is done _in preparation for_ the feature and what is done _as part of_ the feature. Not to mention bisect and backporting. Many people will be looking at this code in the future without any knowledge of migration, but as part of a bisect section or when searching for missing backports in the distros. I also suggested to move that logic into channel.c. The code is now well-contained enough that we don't need to be reading it every time someone is going over the migration flow, it becomes just a helper function. > === > 67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test > OK 147.84s 81 subtests passed I see postcopy/multifd/plain hanging from time to time. Probably due to the changes in patch 5. Please take a look. > === > > > v6: > https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppan...@redhat.com/T/#t > * This series (v6) shuts down Multifd threads before starting Postcopy > migration. It helps to avoid an issue of multifd pages arriving late > at the destination during Postcopy phase and corrupting the vCPU > state. It also reorders the qtest patches and does some refactoring > changes as suggested in previous review. > === > 67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test > OK 161.35s 73 subtests passed > === > > > v5: > https://lore.kernel.org/qemu-devel/20250205122712.229151-1-ppan...@redhat.com/T/#t > * This series (v5) consolidates migration capabilities setting in one > 'set_migration_capabilities()' function, thus simplifying test sources. > It passes all migration tests. > === > 66/66 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test > OK 143.66s 71 subtests passed > === > > > v4: > https://lore.kernel.org/qemu-devel/20250127120823.144949-1-ppan...@redhat.com/T/#t > * This series (v4) adds more 'multifd+postcopy' qtests which test > Precopy migration with 'postcopy-ram' attribute set. And run > Postcopy migrations with 'multifd' channels enabled. > === > $ ../qtest/migration-test --tap -k -r '/x86_64/migration/multifd+postcopy' | > grep -i 'slow test' > # slow test /x86_64/migration/multifd+postcopy/plain executed in 1.29 secs > # slow test /x86_64/migration/multifd+postcopy/recovery/tls/psk executed in > 2.48 secs > # slow test /x86_64/migration/multifd+postcopy/preempt/plain executed in 1.49 > secs > # slow test /x86_64/migration/multifd+postcopy/preempt/recovery/tls/psk > executed in 2.52 secs > # slow test /x86_64/migration/multifd+postcopy/tcp/tls/psk/match executed in > 3.62 secs > # slow test /x86_64/migration/multifd+postcopy/tcp/plain/zstd executed in > 1.34 secs > # slow test /x86_64/migration/multifd+postcopy/tcp/plain/cancel executed in > 2.24 secs > ... > 66/66 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test > OK 148.41s 71 subtests passed > === > > > v3: > https://lore.kernel.org/qemu-devel/20250121131032.1611245-1-ppan...@redhat.com/T/#t > * This series (v3) passes all existing 'tests/qtest/migration/*' tests > and adds a new one to enable multifd channels with postcopy migration. > > > v2: > https://lore.kernel.org/qemu-devel/20241129122256.96778-1-ppan...@redhat.com/T/#u > * This series (v2) further refactors the 'ram_save_target_page' > function to make it independent of the multifd & postcopy change. > > > v1: > https://lore.kernel.org/qemu-devel/20241126115748.118683-1-ppan...@redhat.com/T/#u > * This series removes magic value (4-bytes) introduced in the > previous series for the Postcopy channel. > > > v0: > https://lore.kernel.org/qemu-devel/20241029150908.1136894-1-ppan...@redhat.com/T/#u > * Currently Multifd and Postcopy migration can not be used together. > QEMU shows "Postcopy is not yet compatible with multifd" message. > > When migrating guests with large (100's GB) RAM, Multifd threads > help to accelerate migration, but inability to use it with the > Postcopy mode delays guest start up on the destination side. > > * This patch series allows to enable both Multifd and Postcopy > migration together. Precopy and Multifd threads work during > the initial guest (RAM) transfer. When migration moves to the > Postcopy phase, Multifd threads are restrained and the Postcopy > threads start to request pages from the source side. > > * This series introduces magic value (4-bytes) to be sent on the > Postcopy channel. It helps to differentiate channels and properly > setup incoming connections on the destination side. > > > Thank you. > --- > Prasad Pandit (5): > migration/multifd: move macros to multifd header > migration: enable multifd and postcopy together > tests/qtest/migration: consolidate set capabilities > tests/qtest/migration: add postcopy tests with multifd > migration: add MULTIFD_RECV_SYNC migration command > > migration/migration.c | 136 +++++++++++++--------- > migration/multifd-nocomp.c | 28 +++-- > migration/multifd.c | 17 +-- > migration/multifd.h | 6 + > migration/options.c | 5 - > migration/ram.c | 7 +- > migration/savevm.c | 13 +++ > migration/savevm.h | 1 + > tests/qtest/migration/compression-tests.c | 37 +++++- > tests/qtest/migration/cpr-tests.c | 6 +- > tests/qtest/migration/file-tests.c | 58 +++++---- > tests/qtest/migration/framework.c | 76 ++++++++---- > tests/qtest/migration/framework.h | 9 +- > tests/qtest/migration/misc-tests.c | 4 +- > tests/qtest/migration/postcopy-tests.c | 35 +++++- > tests/qtest/migration/precopy-tests.c | 48 +++++--- > tests/qtest/migration/tls-tests.c | 69 ++++++++++- > 17 files changed, 388 insertions(+), 167 deletions(-)