On Tue, Mar 04, 2025 at 03:17:14PM +0530, Prasad Pandit wrote:
> Hi,
> 
> On Mon, 3 Mar 2025 at 19:42, Peter Xu <pet...@redhat.com> wrote:
> > On Mon, Mar 03, 2025 at 04:17:53PM +0530, Prasad Pandit wrote:
> > > * I think we (you, me, Peter) are all looking at things differently.
> > >     - In my view Patch-2 is the minimal change _required_  to enable
> > > multifd & postcopy. In your view we are _revamping_ channel discovery
> > > parts while _sneaking_ in a feature of enabling multifd & postcopy
> > > together.
> > >     - In my view Patch-5 in this series is an isolated change because
> > > it adds a new migration command to allow multifd threads sync from
> > > source side. But Peter thinks without that 'flush and sync' Patch-2 is
> > > incomplete, so we should merge it back there.
> >
> > Just to mention, my suggestion does not conflict with splitting patch 2, as
> > long as you keep every patch complete on its own.
> >
> > Patch 5 needs to be squashed to either patch 2 or a split patch out of
> > patch 2, because current patch 2 (or any possible way to split it into
> > smaller ones, then one of them which enables the feature) is buggy.
> 
> * I'll try to segregate different parts, then we can discuss how to
> split them across patches:
> 
> Terminology:
>     _requires_  => is without which migration shall not work at all.
>     _essential_ => is without which there may be issues.
> 
> 1. Enable Multifd and Postcopy together
>     - It _requires_ removal of the Multifd capability check in
> migration/options.c
>     - It _requires_ identification of the CH_POSTCOPY connection when it 
> arrives
>         - so enum { CH_MAIN, CH_MULTIFD, CH_POSTCOPY } is defined
>         - To identify channels, related changes are made to the
> channel discovery part (if .. else conditionals) in
> migration_ioc_process_incoming() function.
>         - These changes help to establish all channel connections.
> After that, the migration proceeds as usual until it's time to start
> the Postcopy phase.
>     - When time comes to start Postcopy phase, we shutdown multifd channels.
>         - it _requires_ calling multifd_send_shutdown()
>         - It _requires_ moving
> file_cleanup_outgoing/socket_cleanup_outgoing calls to
> migration_cleanup() function.
>     - When Postcopy phase starts, we don't want ram_save_target
> _page() function to call ram_save_multifd_page() function, because
> migrate_multifd() is still true.
>         - It _requires_ adding the !migration_in_postcopy() checks.

IIUC Fabiano is not asking you to drop them, but split them.  Split still
"requires" them to be present, as long as before the enablement patch.

For example, if you want you can put the channel changes into a separate
patch, but without enabling the feature.  That single patch (after applied)
should keep migration working as before.

> 
> * Above changes are minimal _required_ to enable multifd and postcopy
> together, while also ensuring that migration continues to work when
> those options are not enabled together. With these changes, guest
> migration across two machines works without any observed failures.
> 
> 2. The multifd_ram_flush_and_sync() call/command and the
> assert(!migration_in_postcopy()) call in multifd_recv_thread()
>     - These calls help to ensure that no multifd data is left behind
> when the Postcopy phase starts.
>     - And that multifd_recv threads shall not be active when the
> Postcopy is running.
>     - They protect the guests from potential state corruption.
> 
> * It is up to us to decide whether (2) is _required_ OR _essential_
> for the feature. Individual opinions can vary here.
> 
> 3. Revamp of the channel discovery parts by moving those bits to
> connection.c or other places.
>     - This entails moving the channel discovery parts from
> migration_ioc_process_incoming() function to somewhere else because it
> makes more sense to move it there and maybe it reduces complexity and
> makes the sources easier to understand.*
> 
> * It is up to us to decide whether (3) is _required_  OR  _essential_
> for the feature. Individual opinions can vary here.
> 
> * IMHO (1) is _required_,  (2) is _essential_,  and (3) is neither
> _required_ nor _essential_ for the - Enable multifd and postcopy
> together - feature. (3) is a completely unrelated change to this
> feature.
> 
> Since it is an individual opinion, we all can think differently here
> and that is perfectly fine. Once we have some consensus, we can decide
> how to split or merge patches and move forward.
> 
> Hope it helps. Thank you.
> ---
>   - Prasad
> 

-- 
Peter Xu


Reply via email to