Prasad Pandit <ppan...@redhat.com> writes:

> From: Prasad Pandit <p...@fedoraproject.org>
>
> Hello,
>
> * This series (v7) adds 'MULTIFD_RECV_SYNC' migration command. It is used
>   to notify the destination migration thread to synchronise with the Multifd
>   threads. This allows Multifd ('mig/dst/recv_x') threads on the destination
>   to receive all their data, before they are shutdown.
>
>   This series also updates the channel discovery function and qtests as
>   suggested in the previous review comments.

You forgot to split patch 2. We cannot have a commit that revamps the
channel discovery logic and enables a new feature at the same
time. Changing the channel discovery affects all the migration
use-cases, that change cannot be smuggled along with multifd+postcopy
enablement.

Similarly, the multifd+postcopy enablement is a new feature that needs
to be tested and reasoned upon in isolation, it cannot bring along a
bunch of previously existing code that was shuffled around. We need to
be able to understand clearly what is done _in preparation for_ the
feature and what is done _as part of_ the feature.

Not to mention bisect and backporting. Many people will be looking at
this code in the future without any knowledge of migration, but as part
of a bisect section or when searching for missing backports in the
distros.

I also suggested to move that logic into channel.c. The code is now
well-contained enough that we don't need to be reading it every time
someone is going over the migration flow, it becomes just a helper
function.

> ===
> 67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 
> OK             147.84s   81 subtests passed

I see postcopy/multifd/plain hanging from time to time. Probably due to
the changes in patch 5. Please take a look.

> ===
>
>
> v6: 
> https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppan...@redhat.com/T/#t
> * This series (v6) shuts down Multifd threads before starting Postcopy
>   migration. It helps to avoid an issue of multifd pages arriving late
>   at the destination during Postcopy phase and corrupting the vCPU
>   state. It also reorders the qtest patches and does some refactoring
>   changes as suggested in previous review.
> ===
> 67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 
> OK             161.35s   73 subtests passed
> ===
>
>
> v5: 
> https://lore.kernel.org/qemu-devel/20250205122712.229151-1-ppan...@redhat.com/T/#t
> * This series (v5) consolidates migration capabilities setting in one
>   'set_migration_capabilities()' function, thus simplifying test sources.
>   It passes all migration tests.
> ===
> 66/66 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 
> OK             143.66s   71 subtests passed
> ===
>
>
> v4: 
> https://lore.kernel.org/qemu-devel/20250127120823.144949-1-ppan...@redhat.com/T/#t
> * This series (v4) adds more 'multifd+postcopy' qtests which test
>   Precopy migration with 'postcopy-ram' attribute set. And run
>   Postcopy migrations with 'multifd' channels enabled.
> ===
> $ ../qtest/migration-test --tap -k -r '/x86_64/migration/multifd+postcopy' | 
> grep -i 'slow test'
> # slow test /x86_64/migration/multifd+postcopy/plain executed in 1.29 secs
> # slow test /x86_64/migration/multifd+postcopy/recovery/tls/psk executed in 
> 2.48 secs
> # slow test /x86_64/migration/multifd+postcopy/preempt/plain executed in 1.49 
> secs
> # slow test /x86_64/migration/multifd+postcopy/preempt/recovery/tls/psk 
> executed in 2.52 secs
> # slow test /x86_64/migration/multifd+postcopy/tcp/tls/psk/match executed in 
> 3.62 secs
> # slow test /x86_64/migration/multifd+postcopy/tcp/plain/zstd executed in 
> 1.34 secs
> # slow test /x86_64/migration/multifd+postcopy/tcp/plain/cancel executed in 
> 2.24 secs
> ...
> 66/66 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 
> OK             148.41s   71 subtests passed
> ===
>
>
> v3: 
> https://lore.kernel.org/qemu-devel/20250121131032.1611245-1-ppan...@redhat.com/T/#t
> * This series (v3) passes all existing 'tests/qtest/migration/*' tests
>   and adds a new one to enable multifd channels with postcopy migration.
>
>
> v2: 
> https://lore.kernel.org/qemu-devel/20241129122256.96778-1-ppan...@redhat.com/T/#u
> * This series (v2) further refactors the 'ram_save_target_page'
>   function to make it independent of the multifd & postcopy change.
>
>
> v1: 
> https://lore.kernel.org/qemu-devel/20241126115748.118683-1-ppan...@redhat.com/T/#u
> * This series removes magic value (4-bytes) introduced in the
>   previous series for the Postcopy channel.
>
>
> v0: 
> https://lore.kernel.org/qemu-devel/20241029150908.1136894-1-ppan...@redhat.com/T/#u
> * Currently Multifd and Postcopy migration can not be used together.
>   QEMU shows "Postcopy is not yet compatible with multifd" message.
>
>   When migrating guests with large (100's GB) RAM, Multifd threads
>   help to accelerate migration, but inability to use it with the
>   Postcopy mode delays guest start up on the destination side.
>
> * This patch series allows to enable both Multifd and Postcopy
>   migration together. Precopy and Multifd threads work during
>   the initial guest (RAM) transfer. When migration moves to the
>   Postcopy phase, Multifd threads are restrained and the Postcopy
>   threads start to request pages from the source side.
>
> * This series introduces magic value (4-bytes) to be sent on the
>   Postcopy channel. It helps to differentiate channels and properly
>   setup incoming connections on the destination side.
>
>
> Thank you.
> ---
> Prasad Pandit (5):
>   migration/multifd: move macros to multifd header
>   migration: enable multifd and postcopy together
>   tests/qtest/migration: consolidate set capabilities
>   tests/qtest/migration: add postcopy tests with multifd
>   migration: add MULTIFD_RECV_SYNC migration command
>
>  migration/migration.c                     | 136 +++++++++++++---------
>  migration/multifd-nocomp.c                |  28 +++--
>  migration/multifd.c                       |  17 +--
>  migration/multifd.h                       |   6 +
>  migration/options.c                       |   5 -
>  migration/ram.c                           |   7 +-
>  migration/savevm.c                        |  13 +++
>  migration/savevm.h                        |   1 +
>  tests/qtest/migration/compression-tests.c |  37 +++++-
>  tests/qtest/migration/cpr-tests.c         |   6 +-
>  tests/qtest/migration/file-tests.c        |  58 +++++----
>  tests/qtest/migration/framework.c         |  76 ++++++++----
>  tests/qtest/migration/framework.h         |   9 +-
>  tests/qtest/migration/misc-tests.c        |   4 +-
>  tests/qtest/migration/postcopy-tests.c    |  35 +++++-
>  tests/qtest/migration/precopy-tests.c     |  48 +++++---
>  tests/qtest/migration/tls-tests.c         |  69 ++++++++++-
>  17 files changed, 388 insertions(+), 167 deletions(-)

Reply via email to