Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions
On Wed, Nov 27, 2024 at 11:12:09AM -0300, Fabiano Rosas wrote: > Prasad Pandit writes: > > > On Wed, 27 Nov 2024 at 02:49, Fabiano Rosas wrote: > >> This patch should be just the actual refactoring on top of master, with > >> no mention to postcopy at all. > > > > * Okay. We'll have to ensure that it is merged before multifd+postcopy > > change. > > > >> > +if (migrate_multifd() && !migration_in_postcopy() > >> > +&& (!migrate_multifd_flush_after_each_section() > >> > +|| migrate_mapped_ram())) { > >> > >> This is getting out of hand. We can't keep growing this condition every > >> time something new comes up. Any ideas? > > > > * In 'v0' this series, !migration_in_postcopy() was added to > > migrate_multifd(), which worked, but was not okay. > > > > * Another could be to define a new helper/macro to include above 3-4 > > checks. Because migrate_multifd(), > > migrate_multifd_flush_after_each_section() and migrate_mapped_ram() > > appear together at multiple points. But strangely the equation is not > > the same. Sometimes migrate_mapped_ram() is 'true' and sometimes it is > > 'false', so a combined helper may not work. It is all to accommodate > > different workings of multifd IIUC, if it and precopy used the same > > protocol/stream, maybe such conditionals would go away automatically. > > Maybe this would improve the situation. Peter, what do you think? > > -->8-- > From e9110360eb0efddf6945f37c518e3cc38d12b600 Mon Sep 17 00:00:00 2001 > From: Fabiano Rosas > Date: Wed, 27 Nov 2024 11:03:04 -0300 > Subject: [PATCH] migration: Rationalize multifd flushes from ram code > > We currently have a mess of conditionals to achieve the correct > combination of multifd local flushes, where we sync the local > (send/recv) multifd threads between themselves, and multifd remote > flushes, where we put a flag on the stream to inform the recv side to > perform a local flush. > > On top of that we also have the multifd_flush_after_each_section > property, which is there to provide backward compatibility from when > we used to flush after every vmstate section. > > And lastly, there's the mapped-ram feature which always wants the > non-backward compatible behavior and also does not support extraneous > flags on the stream (such as the MULTIFD_FLUSH flag). > > Move the conditionals into a separate function that encapsulates and > explains the whole situation. > > Signed-off-by: Fabiano Rosas > --- > migration/ram.c | 198 ++-- > 1 file changed, 157 insertions(+), 41 deletions(-) > > diff --git a/migration/ram.c b/migration/ram.c > index 05ff9eb328..ce6fdc 100644 > --- a/migration/ram.c > +++ b/migration/ram.c > @@ -1277,6 +1277,149 @@ static int ram_save_page(RAMState *rs, > PageSearchStatus *pss) > return pages; > } > > +enum RamMultifdFlushSpots { > +FLUSH_SAVE_SETUP, > +FLUSH_SAVE_ITER, > +FLUSH_DIRTY_BLOCK, > +FLUSH_SAVE_COMPLETE, > + > +FLUSH_LOAD_POSTCOPY_EOS, > +FLUSH_LOAD_POSTCOPY_FLUSH, > +FLUSH_LOAD_PRECOPY_EOS, > +FLUSH_LOAD_PRECOPY_FLUSH, POSTCOPY ones could be confusing, because we don't support them at all. We should remove them actually from ram_load_postcopy().. > +}; > + > +static int ram_multifd_flush(QEMUFile *f, enum RamMultifdFlushSpots spot) > +{ > +int ret; > +bool always_flush, do_local_flush, do_remote_flush; > +bool mapped_ram = migrate_mapped_ram(); > + > +if (!migrate_multifd()) { > +return 0; > +} > + > +/* > + * For backward compatibility, whether to flush multifd after each > + * section is sent. 
This is mutually exclusive with a > + * RAM_SAVE_FLAG_MULTIFD_FLUSH on the stream > + */ > +always_flush = migrate_multifd_flush_after_each_section(); > + > +/* > + * Save side flushes > + */ > + > +do_local_flush = false; > + > +switch (spot) { > +case FLUSH_SAVE_SETUP: > +assert(!bql_locked()); > +do_local_flush = true; > +break; > + > +case FLUSH_SAVE_ITER: > +/* > + * This flush is not necessary, only do for backward > + * compatibility. Mapped-ram assumes the new scheme. > + */ > +do_local_flush = always_flush && !mapped_ram; > +break; > + > +case FLUSH_DIRTY_BLOCK: > +/* > + * This is the flush that's actually required, always do it > + * unless we're emulating the old behavior. > + */ > +do_local_flush = !always_flush || mapped_ram; > +break; > + > +case FLUSH_SAVE_COMPLETE: > +do_local_flush = true; > +break; > + > +default: > +break; > +} > + > +if (do_local_flush) { > +ret = multifd_ram_flush_and_sync(); > +if (ret < 0) { > +return ret; > +} > +} > + > +/* > + * There's never a remote flush with mapped-ram because any flags > + * put on the strea
Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions
Hello Fabiano,

On Mon, 2 Dec 2024 at 19:42, Fabiano Rosas wrote:
> > ...multifd_send/_recv_sync_main();    <= do the 'flush' and
> > 'sync' mean the same thing here?
>
> No, that patch is indeed inconsistent in the terminology, good point.
> Well, flush and sync are not reserved terms, we can use them however we
> see fit. As long as it's consistent, of course.

* It would help to define what 'flush' does and what 'sync' does in the
Multifd context and how they differ from each other, either here, in code
comments, or both. Knowing their intended meaning will help while reading
code and reviewing patches, and will also avoid confusion.

Thank you.
---
  - Prasad
Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions
Hello Fabiano,

On Thu, 28 Nov 2024 at 18:50, Fabiano Rosas wrote:
>>> We currently have a mess of conditionals to achieve the correct
>>> combination of multifd local flushes, where we sync the local
>>> (send/recv) multifd threads between themselves, and multifd remote
>>> flushes, where we put a flag on the stream to inform the recv side to
>>> perform a local flush.
...
> >> +    if (do_local_flush) {
> >> +        ret = multifd_ram_flush_and_sync();
> >> +        if (ret < 0) {
> >> +            return ret;
> >> +        }
> >> +    }
> >> +
...
> >> +    /* Put a flag on the stream to trigger a remote flush */
> >> +    if (do_remote_flush) {
> >> +        qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> >> +        qemu_fflush(f);
> >> +    }
> >> +
> >> +    if (do_local_flush) {
> >> +        multifd_recv_sync_main();
> >> +    }
...
> These syncs are all related and we'd like to be able to reason about
> them in a single place, not in several little pieces all over the place.
> When a new feature comes up, like it did when mapped-ram was introduced,
> we don't want to go around having to squint at conditionals to know whether
> it applies to the new case or not.
>
> Also, ram.c is not the place for open-coded multifd code. The same
> mapped-ram example applies: having to add if (migrate_mapped_ram())
> throughout the ram code was a pain and we had some iterations of
> flipping the logic until we got it right.

* I see. If the concern is the larger complexity of how multifd threads do
flush/sync, and not just the long conditionals with 3-4 sub-expressions, I
think this can be addressed separately, instead of as part of this patch
series.

> There is no fflush() going on here. This code is about the multifd
> flush, which sends the remaining data from MultiFDPages_t and the
> multifd sync, which synchronizes the multifd threads. That qemu_fflush
> is just to make sure the destination sees the flag on the stream.

* Yes, there is no fflush(3) call. I mentioned fflush(3) as indicative of
the operation performed, i.e. the description above reads the same as what
fflush(3) does to streams: "...fflush() forces a write of all user-space
buffered data for the given output or update stream via the stream's
underlying write function." In the multifd case we are sending the
remaining data from the MultiFDPages_t buffers onto the respective
channels, IIUC. The multifd_send_sync_main() function sets the
'p->pending_sync' field, and when it is set the multifd_send_thread()
function calls

    ret = qio_channel_write_all(p->c, (void *)p->packet,
                                p->packet_len, &local_err);

multifd_send_sync_main() also has 'flush_zero_copy', but that only happens
when the --zerocopy option is used ->
    $ virsh migrate --zerocopy ...

> There is no flush on the receive side. The RAM_SAVE_FLAG_MULTIFD_FLUSH
> flag is there to indicate to the destination that at that point in the
> stream the source has done a flush + sync operation and the destination
> should sync its threads as well.

* The comment above where 'RAM_SAVE_FLAG_MULTIFD_FLUSH' gets written
says -> "...trigger a remote flush."

* We seem to use the terms 'flush' and 'sync' quite freely and
interchangeably, i.e. variables (ex: do_local_flush) and constants are
named with _FLUSH, while functions and fields are named _sync_main() and
&p->pending_sync.

    if (do_local_flush) {
        multifd_send/_recv_sync_main();    <= do 'flush' and 'sync'
    }                                         mean the same thing here?

Even the multifd_ram_flush_and_sync() routine is named with _flush_, and
eventually multifd_send() sets the '&p->pending_job' variable to true.
There is no field in the MultiFDSendParams structure named with 'flush';
it defines 'pending_sync' and 'sem_sync'.

* Such free usage of these terms is bound to create confusion, because
while reading code the reader may relate flush with fflush(3) and sync
with fsync(2) calls/operations. It will help if we use these terms more
judiciously.

* Coming back to the issue of simplifying the multifd threads' flush/sync
operation:
1. I think it can be done separately, outside the scope of this patch
series.
2. Must we tie the flush/sync operations to specific spots? Isn't there
another way, where we call multifd_*_sync() unconditionally, without any
side-effects? As I see it, we have those conditions because of the
side-effects.

Thank you.
---
  - Prasad
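To make the two terms concrete, here is a minimal sketch of the distinction
as described in this thread. The stand-in types and helpers below are
assumptions for illustration only (the real MultiFDPages_t and
MultiFDSendParams carry many more fields); this is not the actual QEMU
implementation:

    #include <stdbool.h>

    /* Simplified stand-ins for the QEMU structures named above. */
    typedef struct {
        int num;                /* pages currently queued locally */
    } MultiFDPages_t;

    typedef struct {
        bool pending_sync;      /* set by the main thread, consumed by the
                                 * multifd send thread, which then emits a
                                 * sync packet and posts sem_sync */
    } MultiFDSendParams;

    /*
     * "flush": drain the pages still sitting in the local buffer and hand
     * them to a multifd channel, so nothing is left queued on this side.
     */
    static void multifd_flush_sketch(MultiFDPages_t *pages)
    {
        if (pages->num > 0) {
            /* enqueue the remaining pages on a send channel */
            pages->num = 0;
        }
    }

    /*
     * "sync": bring the main migration thread and the multifd threads to a
     * common point; no page data is written here, only coordination.
     */
    static void multifd_sync_sketch(MultiFDSendParams *p)
    {
        p->pending_sync = true;
        /* ... the main thread then waits on the thread's sem_sync ... */
    }

Read this way, multifd_ram_flush_and_sync() is "flush the local buffer,
then sync the threads", which is likely why the two words end up being
used interchangeably in places.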
Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions
Prasad Pandit writes: > On Wed, 27 Nov 2024 at 19:42, Fabiano Rosas wrote: >> From e9110360eb0efddf6945f37c518e3cc38d12b600 Mon Sep 17 00:00:00 2001 >> From: Fabiano Rosas >> Date: Wed, 27 Nov 2024 11:03:04 -0300 >> Subject: [PATCH] migration: Rationalize multifd flushes from ram code >> >> We currently have a mess of conditionals to achieve the correct >> combination of multifd local flushes, where we sync the local >> (send/recv) multifd threads between themselves, and multifd remote >> flushes, where we put a flag on the stream to inform the recv side to >> perform a local flush. >> >> On top of that we also have the multifd_flush_after_each_section >> property, which is there to provide backward compatibility from when >> we used to flush after every vmstate section. >> >> And lastly, there's the mapped-ram feature which always wants the >> non-backward compatible behavior and also does not support extraneous >> flags on the stream (such as the MULTIFD_FLUSH flag). >> >> Move the conditionals into a separate function that encapsulates and >> explains the whole situation. >> >> Signed-off-by: Fabiano Rosas >> --- >> migration/ram.c | 198 ++-- >> 1 file changed, 157 insertions(+), 41 deletions(-) >> >> diff --git a/migration/ram.c b/migration/ram.c >> index 05ff9eb328..ce6fdc 100644 >> --- a/migration/ram.c >> +++ b/migration/ram.c >> @@ -1277,6 +1277,149 @@ static int ram_save_page(RAMState *rs, >> PageSearchStatus *pss) >> return pages; >> } >> >> +enum RamMultifdFlushSpots { >> +FLUSH_SAVE_SETUP, >> +FLUSH_SAVE_ITER, >> +FLUSH_DIRTY_BLOCK, >> +FLUSH_SAVE_COMPLETE, >> + >> +FLUSH_LOAD_POSTCOPY_EOS, >> +FLUSH_LOAD_POSTCOPY_FLUSH, >> +FLUSH_LOAD_PRECOPY_EOS, >> +FLUSH_LOAD_PRECOPY_FLUSH, >> +}; >> + >> +static int ram_multifd_flush(QEMUFile *f, enum RamMultifdFlushSpots spot) >> +{ >> +int ret; >> +bool always_flush, do_local_flush, do_remote_flush; >> +bool mapped_ram = migrate_mapped_ram(); >> + >> +if (!migrate_multifd()) { >> +return 0; >> +} >> + >> +/* >> + * For backward compatibility, whether to flush multifd after each >> + * section is sent. This is mutually exclusive with a >> + * RAM_SAVE_FLAG_MULTIFD_FLUSH on the stream >> + */ >> +always_flush = migrate_multifd_flush_after_each_section(); >> + >> +/* >> + * Save side flushes >> + */ >> + >> +do_local_flush = false; >> + >> +switch (spot) { >> +case FLUSH_SAVE_SETUP: >> +assert(!bql_locked()); >> +do_local_flush = true; >> +break; >> + >> +case FLUSH_SAVE_ITER: >> +/* >> + * This flush is not necessary, only do for backward >> + * compatibility. Mapped-ram assumes the new scheme. >> + */ >> +do_local_flush = always_flush && !mapped_ram; >> +break; >> + >> +case FLUSH_DIRTY_BLOCK: >> +/* >> + * This is the flush that's actually required, always do it >> + * unless we're emulating the old behavior. >> + */ >> +do_local_flush = !always_flush || mapped_ram; >> +break; >> + >> +case FLUSH_SAVE_COMPLETE: >> +do_local_flush = true; >> +break; >> + >> +default: >> +break; >> +} >> + >> +if (do_local_flush) { >> +ret = multifd_ram_flush_and_sync(); >> +if (ret < 0) { >> +return ret; >> +} >> +} >> + >> +/* >> + * There's never a remote flush with mapped-ram because any flags >> + * put on the stream (aside from RAM_SAVE_FLAG_MEM_SIZE and >> + * RAM_SAVE_FLAG_EOS) break mapped-ram's assumption that ram pages >> + * can be read contiguously from the stream. >> + * >> + * On the recv side, there's no local flush, even at EOS because >> + * mapped-ram does its own flush after loading the ramblock. 
>> + */ >> +if (mapped_ram) { >> +return 0; >> +} >> + >> +do_remote_flush = false; >> + >> +/* Save side remote flush */ >> +switch (spot) { >> +case FLUSH_SAVE_SETUP: >> +do_remote_flush = !always_flush; >> +break; >> + >> +case FLUSH_SAVE_ITER: >> +do_remote_flush = false; >> +break; >> + >> +case FLUSH_DIRTY_BLOCK: >> +do_remote_flush = do_local_flush; >> +break; >> + >> +case FLUSH_SAVE_COMPLETE: >> +do_remote_flush = false; >> +break; >> + >> +default: >> +break; >> +} >> + >> +/* Put a flag on the stream to trigger a remote flush */ >> +if (do_remote_flush) { >> +qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH); >> +qemu_fflush(f); >> +} >> + >> +/* >> + * Load side flushes. >> + */ >> + >> +do_local_flush = false; >> + >> +switch (spot) { >> +case FLUSH_LOAD_PRECOPY_EOS: >> +case FLUSH_LOAD_POSTCOPY_EOS: >> +do_local_flush = always_flush; >> +break; >> + >>
Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions
On Wed, 27 Nov 2024 at 17:49, Fabiano Rosas wrote:
> > On Wed, 27 Nov 2024 at 02:49, Fabiano Rosas wrote:
> >> This patch should be just the actual refactoring on top of master, with
> >> no mention to postcopy at all.
...
> It doesn't need to be a single patch submission, it could be a patch at
> the start of this series. It's just that it needs to logically do only
> one thing, which is to move the code around without adding new bits that
> don't exist in current master (a strict definition of refactoring). The
> multifd+postcopy changes come in a subsequent patch so it's clear that
> one patch was just shuffling code around while the rest is part of the
> feature enablement.

* Okay, will do.

Thank you.
---
  - Prasad
Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions
On Wed, 27 Nov 2024 at 19:42, Fabiano Rosas wrote: > From e9110360eb0efddf6945f37c518e3cc38d12b600 Mon Sep 17 00:00:00 2001 > From: Fabiano Rosas > Date: Wed, 27 Nov 2024 11:03:04 -0300 > Subject: [PATCH] migration: Rationalize multifd flushes from ram code > > We currently have a mess of conditionals to achieve the correct > combination of multifd local flushes, where we sync the local > (send/recv) multifd threads between themselves, and multifd remote > flushes, where we put a flag on the stream to inform the recv side to > perform a local flush. > > On top of that we also have the multifd_flush_after_each_section > property, which is there to provide backward compatibility from when > we used to flush after every vmstate section. > > And lastly, there's the mapped-ram feature which always wants the > non-backward compatible behavior and also does not support extraneous > flags on the stream (such as the MULTIFD_FLUSH flag). > > Move the conditionals into a separate function that encapsulates and > explains the whole situation. > > Signed-off-by: Fabiano Rosas > --- > migration/ram.c | 198 ++-- > 1 file changed, 157 insertions(+), 41 deletions(-) > > diff --git a/migration/ram.c b/migration/ram.c > index 05ff9eb328..ce6fdc 100644 > --- a/migration/ram.c > +++ b/migration/ram.c > @@ -1277,6 +1277,149 @@ static int ram_save_page(RAMState *rs, > PageSearchStatus *pss) > return pages; > } > > +enum RamMultifdFlushSpots { > +FLUSH_SAVE_SETUP, > +FLUSH_SAVE_ITER, > +FLUSH_DIRTY_BLOCK, > +FLUSH_SAVE_COMPLETE, > + > +FLUSH_LOAD_POSTCOPY_EOS, > +FLUSH_LOAD_POSTCOPY_FLUSH, > +FLUSH_LOAD_PRECOPY_EOS, > +FLUSH_LOAD_PRECOPY_FLUSH, > +}; > + > +static int ram_multifd_flush(QEMUFile *f, enum RamMultifdFlushSpots spot) > +{ > +int ret; > +bool always_flush, do_local_flush, do_remote_flush; > +bool mapped_ram = migrate_mapped_ram(); > + > +if (!migrate_multifd()) { > +return 0; > +} > + > +/* > + * For backward compatibility, whether to flush multifd after each > + * section is sent. This is mutually exclusive with a > + * RAM_SAVE_FLAG_MULTIFD_FLUSH on the stream > + */ > +always_flush = migrate_multifd_flush_after_each_section(); > + > +/* > + * Save side flushes > + */ > + > +do_local_flush = false; > + > +switch (spot) { > +case FLUSH_SAVE_SETUP: > +assert(!bql_locked()); > +do_local_flush = true; > +break; > + > +case FLUSH_SAVE_ITER: > +/* > + * This flush is not necessary, only do for backward > + * compatibility. Mapped-ram assumes the new scheme. > + */ > +do_local_flush = always_flush && !mapped_ram; > +break; > + > +case FLUSH_DIRTY_BLOCK: > +/* > + * This is the flush that's actually required, always do it > + * unless we're emulating the old behavior. > + */ > +do_local_flush = !always_flush || mapped_ram; > +break; > + > +case FLUSH_SAVE_COMPLETE: > +do_local_flush = true; > +break; > + > +default: > +break; > +} > + > +if (do_local_flush) { > +ret = multifd_ram_flush_and_sync(); > +if (ret < 0) { > +return ret; > +} > +} > + > +/* > + * There's never a remote flush with mapped-ram because any flags > + * put on the stream (aside from RAM_SAVE_FLAG_MEM_SIZE and > + * RAM_SAVE_FLAG_EOS) break mapped-ram's assumption that ram pages > + * can be read contiguously from the stream. > + * > + * On the recv side, there's no local flush, even at EOS because > + * mapped-ram does its own flush after loading the ramblock. 
> + */ > +if (mapped_ram) { > +return 0; > +} > + > +do_remote_flush = false; > + > +/* Save side remote flush */ > +switch (spot) { > +case FLUSH_SAVE_SETUP: > +do_remote_flush = !always_flush; > +break; > + > +case FLUSH_SAVE_ITER: > +do_remote_flush = false; > +break; > + > +case FLUSH_DIRTY_BLOCK: > +do_remote_flush = do_local_flush; > +break; > + > +case FLUSH_SAVE_COMPLETE: > +do_remote_flush = false; > +break; > + > +default: > +break; > +} > + > +/* Put a flag on the stream to trigger a remote flush */ > +if (do_remote_flush) { > +qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH); > +qemu_fflush(f); > +} > + > +/* > + * Load side flushes. > + */ > + > +do_local_flush = false; > + > +switch (spot) { > +case FLUSH_LOAD_PRECOPY_EOS: > +case FLUSH_LOAD_POSTCOPY_EOS: > +do_local_flush = always_flush; > +break; > + > +case FLUSH_LOAD_PRECOPY_FLUSH: > +case FLUSH_LOAD_POSTCOPY_FLUSH: > +do_local_flush = true; > +break; > + > +default: > +break; > +} > + > +if (d
Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions
Prasad Pandit writes: > On Wed, 27 Nov 2024 at 02:49, Fabiano Rosas wrote: >> This patch should be just the actual refactoring on top of master, with >> no mention to postcopy at all. > > * Okay. We'll have to ensure that it is merged before multifd+postcopy change. > >> > +if (migrate_multifd() && !migration_in_postcopy() >> > +&& (!migrate_multifd_flush_after_each_section() >> > +|| migrate_mapped_ram())) { >> >> This is getting out of hand. We can't keep growing this condition every >> time something new comes up. Any ideas? > > * In 'v0' this series, !migration_in_postcopy() was added to > migrate_multifd(), which worked, but was not okay. > > * Another could be to define a new helper/macro to include above 3-4 > checks. Because migrate_multifd(), > migrate_multifd_flush_after_each_section() and migrate_mapped_ram() > appear together at multiple points. But strangely the equation is not > the same. Sometimes migrate_mapped_ram() is 'true' and sometimes it is > 'false', so a combined helper may not work. It is all to accommodate > different workings of multifd IIUC, if it and precopy used the same > protocol/stream, maybe such conditionals would go away automatically. Maybe this would improve the situation. Peter, what do you think? -->8-- >From e9110360eb0efddf6945f37c518e3cc38d12b600 Mon Sep 17 00:00:00 2001 From: Fabiano Rosas Date: Wed, 27 Nov 2024 11:03:04 -0300 Subject: [PATCH] migration: Rationalize multifd flushes from ram code We currently have a mess of conditionals to achieve the correct combination of multifd local flushes, where we sync the local (send/recv) multifd threads between themselves, and multifd remote flushes, where we put a flag on the stream to inform the recv side to perform a local flush. On top of that we also have the multifd_flush_after_each_section property, which is there to provide backward compatibility from when we used to flush after every vmstate section. And lastly, there's the mapped-ram feature which always wants the non-backward compatible behavior and also does not support extraneous flags on the stream (such as the MULTIFD_FLUSH flag). Move the conditionals into a separate function that encapsulates and explains the whole situation. Signed-off-by: Fabiano Rosas --- migration/ram.c | 198 ++-- 1 file changed, 157 insertions(+), 41 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 05ff9eb328..ce6fdc 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -1277,6 +1277,149 @@ static int ram_save_page(RAMState *rs, PageSearchStatus *pss) return pages; } +enum RamMultifdFlushSpots { +FLUSH_SAVE_SETUP, +FLUSH_SAVE_ITER, +FLUSH_DIRTY_BLOCK, +FLUSH_SAVE_COMPLETE, + +FLUSH_LOAD_POSTCOPY_EOS, +FLUSH_LOAD_POSTCOPY_FLUSH, +FLUSH_LOAD_PRECOPY_EOS, +FLUSH_LOAD_PRECOPY_FLUSH, +}; + +static int ram_multifd_flush(QEMUFile *f, enum RamMultifdFlushSpots spot) +{ +int ret; +bool always_flush, do_local_flush, do_remote_flush; +bool mapped_ram = migrate_mapped_ram(); + +if (!migrate_multifd()) { +return 0; +} + +/* + * For backward compatibility, whether to flush multifd after each + * section is sent. This is mutually exclusive with a + * RAM_SAVE_FLAG_MULTIFD_FLUSH on the stream + */ +always_flush = migrate_multifd_flush_after_each_section(); + +/* + * Save side flushes + */ + +do_local_flush = false; + +switch (spot) { +case FLUSH_SAVE_SETUP: +assert(!bql_locked()); +do_local_flush = true; +break; + +case FLUSH_SAVE_ITER: +/* + * This flush is not necessary, only do for backward + * compatibility. Mapped-ram assumes the new scheme. 
+ */ +do_local_flush = always_flush && !mapped_ram; +break; + +case FLUSH_DIRTY_BLOCK: +/* + * This is the flush that's actually required, always do it + * unless we're emulating the old behavior. + */ +do_local_flush = !always_flush || mapped_ram; +break; + +case FLUSH_SAVE_COMPLETE: +do_local_flush = true; +break; + +default: +break; +} + +if (do_local_flush) { +ret = multifd_ram_flush_and_sync(); +if (ret < 0) { +return ret; +} +} + +/* + * There's never a remote flush with mapped-ram because any flags + * put on the stream (aside from RAM_SAVE_FLAG_MEM_SIZE and + * RAM_SAVE_FLAG_EOS) break mapped-ram's assumption that ram pages + * can be read contiguously from the stream. + * + * On the recv side, there's no local flush, even at EOS because + * mapped-ram does its own flush after loading the ramblock. + */ +if (mapped_ram) { +return 0; +} + +do_remote_flush = false; + +/* Save side remote flush */ +switch (spot) { +case FLUSH_SAVE_SETUP:
Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions
Prasad Pandit writes:

> On Wed, 27 Nov 2024 at 02:49, Fabiano Rosas wrote:
>> This patch should be just the actual refactoring on top of master, with
>> no mention to postcopy at all.
>
> * Okay. We'll have to ensure that it is merged before multifd+postcopy change.

That's ok, just put it at the start of the series.

>
>> > +        if (migrate_multifd() && !migration_in_postcopy()
>> > +            && (!migrate_multifd_flush_after_each_section()
>> > +                || migrate_mapped_ram())) {
>>
>> This is getting out of hand. We can't keep growing this condition every
>> time something new comes up. Any ideas?
>
> * In 'v0' this series, !migration_in_postcopy() was added to
> migrate_multifd(), which worked, but was not okay.
>
> * Another could be to define a new helper/macro to include above 3-4
> checks. Because migrate_multifd(),
> migrate_multifd_flush_after_each_section() and migrate_mapped_ram()
> appear together at multiple points. But strangely the equation is not
> the same. Sometimes migrate_mapped_ram() is 'true' and sometimes it is
> 'false', so a combined helper may not work. It is all to accommodate
> different workings of multifd IIUC, if it and precopy used the same
> protocol/stream, maybe such conditionals would go away automatically.
>
>> Yes, although I wonder if we should keep documenting this when we only
>> call this function for a single page and it always returns at most 1.
>> > if (page_dirty) {
>> >     /* Be strict to return code; it must be 1, or what else? */
>>
>> ... this^ comment, for instance.
>
> * Okay, can remove them.
>
> So to confirm: this refactoring patch should be a separate one by
> itself? And then in the following multifd+postcopy series we just add
> !migration_in_postcopy() check to it?

It doesn't need to be a single patch submission, it could be a patch at
the start of this series. It's just that it needs to logically do only
one thing, which is to move the code around without adding new bits that
don't exist in current master (a strict definition of refactoring). The
multifd+postcopy changes come in a subsequent patch so it's clear that
one patch was just shuffling code around while the rest is part of the
feature enablement.

>
> Thank you.
> ---
> - Prasad
Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions
On Wed, 27 Nov 2024 at 02:49, Fabiano Rosas wrote:
> This patch should be just the actual refactoring on top of master, with
> no mention to postcopy at all.

* Okay. We'll have to ensure that it is merged before the multifd+postcopy
change.

> > +        if (migrate_multifd() && !migration_in_postcopy()
> > +            && (!migrate_multifd_flush_after_each_section()
> > +                || migrate_mapped_ram())) {
>
> This is getting out of hand. We can't keep growing this condition every
> time something new comes up. Any ideas?

* In 'v0' of this series, !migration_in_postcopy() was added to
migrate_multifd(), which worked, but was not okay.

* Another option could be to define a new helper/macro covering the above
3-4 checks, because migrate_multifd(),
migrate_multifd_flush_after_each_section() and migrate_mapped_ram()
appear together at multiple points. But strangely the equation is not the
same everywhere: sometimes migrate_mapped_ram() is 'true' and sometimes it
is 'false', so a combined helper may not work. It is all to accommodate
the different workings of multifd, IIUC; if it and precopy used the same
protocol/stream, maybe such conditionals would go away automatically.

> Yes, although I wonder if we should keep documenting this when we only
> call this function for a single page and it always returns at most 1.
> > if (page_dirty) {
> >     /* Be strict to return code; it must be 1, or what else? */
>
> ... this^ comment, for instance.

* Okay, can remove them.

So to confirm: this refactoring patch should be a separate one by
itself? And then in the following multifd+postcopy series we just add
the !migration_in_postcopy() check to it?

Thank you.
---
  - Prasad
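As a rough illustration of the helper idea above (the name and the exact
combination of checks are only a guess here, not something taken from the
posted series), such a predicate for the find_dirty_block() call site might
look like:

    /*
     * Hypothetical helper, not part of the posted patches: collapse the
     * repeated multifd checks into a single predicate. The
     * !migration_in_postcopy() part is the new bit this series would add.
     */
    static bool multifd_ram_flush_needed(void)
    {
        return migrate_multifd() &&
               !migration_in_postcopy() &&
               (!migrate_multifd_flush_after_each_section() ||
                migrate_mapped_ram());
    }

As noted above, though, other call sites combine these checks with
migrate_mapped_ram() flipped, so one helper may not cover every spot.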
Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions
Prasad Pandit writes: > From: Prasad Pandit > > Refactor ram_save_target_page legacy and multifd > functions into one. Other than simplifying it, > it frees 'migration_ops' object from usage, so it > is expunged. > > When both Multifd and Postcopy modes are enabled, > to avoid errors, the Multifd threads are active until > migration reaches the Postcopy mode. This is done by > checking the state returned by migration_in_postcopy(). This patch should be just the actual refactoring on top of master, with no mention to postcopy at all. > > Signed-off-by: Prasad Pandit > --- > migration/multifd-nocomp.c | 3 +- > migration/ram.c| 74 -- > 2 files changed, 24 insertions(+), 53 deletions(-) > > v1: Further refactor ram_save_target_page() function to conflate > save_zero_page() calls. > > Add migration_in_postcopy() call to check migration state > instead of combining it with migrate_multifd(). > > v0: > https://lore.kernel.org/qemu-devel/20241029150908.1136894-1-ppan...@redhat.com/T/#u > > diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c > index 55191152f9..e92821e8f6 100644 > --- a/migration/multifd-nocomp.c > +++ b/migration/multifd-nocomp.c > @@ -14,6 +14,7 @@ > #include "exec/ramblock.h" > #include "exec/target_page.h" > #include "file.h" > +#include "migration.h" > #include "multifd.h" > #include "options.h" > #include "qapi/error.h" > @@ -345,7 +346,7 @@ retry: > > int multifd_ram_flush_and_sync(void) > { > -if (!migrate_multifd()) { > +if (!migrate_multifd() || migration_in_postcopy()) { > return 0; > } > > diff --git a/migration/ram.c b/migration/ram.c > index 05ff9eb328..9d1ec6209c 100644 > --- a/migration/ram.c > +++ b/migration/ram.c > @@ -467,13 +467,6 @@ void ram_transferred_add(uint64_t bytes) > } > } > > -struct MigrationOps { > -int (*ram_save_target_page)(RAMState *rs, PageSearchStatus *pss); > -}; > -typedef struct MigrationOps MigrationOps; > - > -MigrationOps *migration_ops; > - > static int ram_save_host_page_urgent(PageSearchStatus *pss); > > /* NOTE: page is the PFN not real ram_addr_t. */ > @@ -1323,9 +1316,9 @@ static int find_dirty_block(RAMState *rs, > PageSearchStatus *pss) > pss->page = 0; > pss->block = QLIST_NEXT_RCU(pss->block, next); > if (!pss->block) { > -if (migrate_multifd() && > -(!migrate_multifd_flush_after_each_section() || > - migrate_mapped_ram())) { > +if (migrate_multifd() && !migration_in_postcopy() > +&& (!migrate_multifd_flush_after_each_section() > +|| migrate_mapped_ram())) { This is getting out of hand. We can't keep growing this condition every time something new comes up. Any ideas? > QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel; > int ret = multifd_ram_flush_and_sync(); > if (ret < 0) { > @@ -1986,55 +1979,39 @@ int ram_save_queue_pages(const char *rbname, > ram_addr_t start, ram_addr_t len, > } > > /** > - * ram_save_target_page_legacy: save one target page > + * ram_save_target_page: save one target page to the precopy thread > + * OR to multifd workers. > * > - * Returns the number of pages written > + * Multifd mode: returns 1 if the page was queued, -1 otherwise. > + * Non-multifd mode: returns the number of pages written. Yes, although I wonder if we should keep documenting this when we only call this function for a single page and it always returns at most 1. 
> * > * @rs: current RAM state > * @pss: data about the page we want to send > */ > -static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss) > +static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss) > { > ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; > int res; > > +if (!migrate_multifd() > +|| migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) { > +if (save_zero_page(rs, pss, offset)) { > +return 1; > +} > +} > + > +if (migrate_multifd() && !migration_in_postcopy()) { > +RAMBlock *block = pss->block; > +return ram_save_multifd_page(block, offset); > +} > + > if (control_save_page(pss, offset, &res)) { > return res; > } > > -if (save_zero_page(rs, pss, offset)) { > -return 1; > -} > - > return ram_save_page(rs, pss); > } > > -/** > - * ram_save_target_page_multifd: send one target page to multifd workers > - * > - * Returns 1 if the page was queued, -1 otherwise. > - * > - * @rs: current RAM state > - * @pss: data about the page we want to send > - */ > -static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss) > -{ > -RAMBlock *block = pss->block; > -ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; > - > -/* > - * While using mu
[PATCH v1 3/4] migration: refactor ram_save_target_page functions
From: Prasad Pandit Refactor ram_save_target_page legacy and multifd functions into one. Other than simplifying it, it frees 'migration_ops' object from usage, so it is expunged. When both Multifd and Postcopy modes are enabled, to avoid errors, the Multifd threads are active until migration reaches the Postcopy mode. This is done by checking the state returned by migration_in_postcopy(). Signed-off-by: Prasad Pandit --- migration/multifd-nocomp.c | 3 +- migration/ram.c| 74 -- 2 files changed, 24 insertions(+), 53 deletions(-) v1: Further refactor ram_save_target_page() function to conflate save_zero_page() calls. Add migration_in_postcopy() call to check migration state instead of combining it with migrate_multifd(). v0: https://lore.kernel.org/qemu-devel/20241029150908.1136894-1-ppan...@redhat.com/T/#u diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index 55191152f9..e92821e8f6 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -14,6 +14,7 @@ #include "exec/ramblock.h" #include "exec/target_page.h" #include "file.h" +#include "migration.h" #include "multifd.h" #include "options.h" #include "qapi/error.h" @@ -345,7 +346,7 @@ retry: int multifd_ram_flush_and_sync(void) { -if (!migrate_multifd()) { +if (!migrate_multifd() || migration_in_postcopy()) { return 0; } diff --git a/migration/ram.c b/migration/ram.c index 05ff9eb328..9d1ec6209c 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -467,13 +467,6 @@ void ram_transferred_add(uint64_t bytes) } } -struct MigrationOps { -int (*ram_save_target_page)(RAMState *rs, PageSearchStatus *pss); -}; -typedef struct MigrationOps MigrationOps; - -MigrationOps *migration_ops; - static int ram_save_host_page_urgent(PageSearchStatus *pss); /* NOTE: page is the PFN not real ram_addr_t. */ @@ -1323,9 +1316,9 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss) pss->page = 0; pss->block = QLIST_NEXT_RCU(pss->block, next); if (!pss->block) { -if (migrate_multifd() && -(!migrate_multifd_flush_after_each_section() || - migrate_mapped_ram())) { +if (migrate_multifd() && !migration_in_postcopy() +&& (!migrate_multifd_flush_after_each_section() +|| migrate_mapped_ram())) { QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel; int ret = multifd_ram_flush_and_sync(); if (ret < 0) { @@ -1986,55 +1979,39 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len, } /** - * ram_save_target_page_legacy: save one target page + * ram_save_target_page: save one target page to the precopy thread + * OR to multifd workers. * - * Returns the number of pages written + * Multifd mode: returns 1 if the page was queued, -1 otherwise. + * Non-multifd mode: returns the number of pages written. 
* * @rs: current RAM state * @pss: data about the page we want to send */ -static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss) +static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss) { ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; int res; +if (!migrate_multifd() +|| migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) { +if (save_zero_page(rs, pss, offset)) { +return 1; +} +} + +if (migrate_multifd() && !migration_in_postcopy()) { +RAMBlock *block = pss->block; +return ram_save_multifd_page(block, offset); +} + if (control_save_page(pss, offset, &res)) { return res; } -if (save_zero_page(rs, pss, offset)) { -return 1; -} - return ram_save_page(rs, pss); } -/** - * ram_save_target_page_multifd: send one target page to multifd workers - * - * Returns 1 if the page was queued, -1 otherwise. - * - * @rs: current RAM state - * @pss: data about the page we want to send - */ -static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss) -{ -RAMBlock *block = pss->block; -ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; - -/* - * While using multifd live migration, we still need to handle zero - * page checking on the migration main thread. - */ -if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) { -if (save_zero_page(rs, pss, offset)) { -return 1; -} -} - -return ram_save_multifd_page(block, offset); -} - /* Should be called before sending a host page */ static void pss_host_page_prepare(PageSearchStatus *pss) { @@ -2121,7 +2098,7 @@ static int ram_save_host_page_urgent(PageSearchStatus *pss) if (page_dirty) { /* Be strict to return code; it must be 1, or what else? */ -if (migration_ops->