Re: [PATCH 14/52] migration/rdma: Use bool for two RDMAContext flags

2023-09-18 Thread Fabiano Rosas
Markus Armbruster writes: > @error_reported and @received_error are flags. The latter is even > assigned bool true. Change them from int to bool. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 18/52] migration/rdma: Fix qemu_rdma_broken_ipv6_kernel() to set error

2023-09-18 Thread Fabiano Rosas
neglects to set an Error when > ibv_open_device() fails. If a later address fails differently, we use > that Error instead, or else the generic one. Harmless enough, but > needs fixing all the same. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 05/52] migration/rdma: Consistently use uint64_t for work request IDs

2023-09-18 Thread Fabiano Rosas
gative enumeration constants. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 08/52] migration/rdma: Fix qemu_rdma_accept() to return failure on errors

2023-09-18 Thread Fabiano Rosas
Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 10/52] migration/rdma: Eliminate error_propagate()

2023-09-18 Thread Fabiano Rosas
Markus Armbruster writes: > When all we do with an Error we receive into a local variable is > propagating to somewhere else, we can just as well receive it there > right away. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

[PATCH 5/8] migration: Remove redundant cleanup of postcopy_qemufile_src

2023-09-18 Thread Fabiano Rosas
This file is owned by the return path thread which is already doing cleanup. Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas --- migration/migration.c | 6 -- 1 file changed, 6 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 6e09463466..4372b0fbbf 100644

[PATCH 7/8] migration: Replace the return path retry logic

2023-09-18 Thread Fabiano Rosas
by it is the 'from_dst_file' and it is only allowed to proceed after a migrate resume is issued and the semaphore released at migrate_fd_connect(). Move the retry logic to outside the thread by waiting for the thread to finish before pausing the migration. Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas

[PATCH 6/8] migration: Consolidate return path closing code

2023-09-18 Thread Fabiano Rosas
We'll start calling the await_return_path_close_on_source() function from other parts of the code, so move all of the related checks and tracepoints into it. Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas --- migration/migration.c | 29 ++--- 1 file changed, 14

[PATCH 8/8] migration: Move return path cleanup to main migration thread

2023-09-18 Thread Fabiano Rosas
Now that the return path thread is allowed to finish during a paused migration, we can move the cleanup of the QEMUFiles to the main migration thread. Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas --- migration/migration.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion

[PATCH 1/8] migration: Fix race that dest preempt thread close too early

2023-09-18 Thread Fabiano Rosas
field), so the main thread will always get a kick when that triggers correctly. Closes: https://gitlab.com/qemu-project/qemu/-/issues/1886 Debugged-by: Fabiano Rosas Signed-off-by: Peter Xu Signed-off-by: Fabiano Rosas --- migration/migration.c| 3 ++- migration/migration.h| 13

[PATCH 0/8] migration fixes

2023-09-18 Thread Fabiano Rosas
rror; CI run: https://gitlab.com/farosas/qemu/-/pipelines/1008652837 Fabiano Rosas (7): migration: Fix possible race when setting rp_state.error migration: Fix possible races when shutting down the return path migration: Fix possible race when shutting down to_dst_file migration: Rem

[PATCH 3/8] migration: Fix possible races when shutting down the return path

2023-09-18 Thread Fabiano Rosas
ars the to_dst_file pointer. Protect both accesses by taking the file lock. This was caught by inspection, it should be rare, but the next patches will start calling this code from other places, so let's do the correct thing. Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas --- migration/migration.c |

Re: [PATCH 13/52] migration/rdma: Make qemu_rdma_buffer_mergable() return bool

2023-09-18 Thread Fabiano Rosas
Markus Armbruster writes: > qemu_rdma_buffer_mergable() is semantically a predicate. It returns > int 0 or 1. Return bool instead. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 16/52] migration/rdma: Fix io_writev(), io_readv() methods to obey contract

2023-09-18 Thread Fabiano Rosas
y from the definition. > > I elected not to investigate how callers are impacted. > > Expand the two bad macro uses, so we can set an error and return -1. > The next commit will then get rid of the macro altogether. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 01/52] migration/rdma: Clean up qemu_rdma_poll()'s return type

2023-09-18 Thread Fabiano Rosas
by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 02/52] migration/rdma: Clean up qemu_rdma_data_init()'s return type

2023-09-18 Thread Fabiano Rosas
Markus Armbruster writes: > qemu_rdma_data_init() return type is void *. It actually returns > RDMAContext *, and all its callers assign the value to an > RDMAContext *. Unclean. > > Return RDMAContext * instead. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 04/52] migration/rdma: Drop fragile wr_id formatting

2023-09-18 Thread Fabiano Rosas
k_for_wrid() print wrid_desc[wr_id] > and passes print_wrid(wr_id) to tracepoints. Could conceivably crash > trying to format a null string. I believe access out of bounds is not > possible. > > Not worth cleaning up. Dumb down to show just numeric wr_id. > > Signed-off-by: M

Re: [PATCH 06/52] migration/rdma: Clean up two more harmless signed vs. unsigned issues

2023-09-18 Thread Fabiano Rosas
ecting. Actual arguments > for the latter are non-negative enumeration constants. Change both > parameters to uint32_t. > > In qio_channel_rdma_readv(), loop control variable @i is ssize_t, and > counts from 0 up to @niov, which is size_t. Change @i to size_t. > > Signed-off-by: Mark

[PATCH 4/8] migration: Fix possible race when shutting down to_dst_file

2023-09-18 Thread Fabiano Rosas
-by: Peter Xu Signed-off-by: Fabiano Rosas --- migration/migration.c | 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 15b7258bb2..6e09463466 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1246,7

[PATCH 2/8] migration: Fix possible race when setting rp_state.error

2023-09-18 Thread Fabiano Rosas
it. Reviewed-by: Peter Xu Signed-off-by: Fabiano Rosas --- migration/migration.c | 1 - 1 file changed, 1 deletion(-) diff --git a/migration/migration.c b/migration/migration.c index 3ee1e6b0d6..d426b69ada 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2074,7 +2074,6

Re: [PATCH 07/52] migration/rdma: Give qio_channel_rdma_source_funcs internal linkage

2023-09-18 Thread Fabiano Rosas
Markus Armbruster writes: > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 11/52] migration/rdma: Drop rdma_add_block() error handling

2023-09-18 Thread Fabiano Rosas
Markus Armbruster writes: > rdma_add_block() can't fail. Return void, and drop the unreachable > error handling. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 09/52] migration/rdma: Put @errp parameter last

2023-09-18 Thread Fabiano Rosas
conform. Clean it up. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 12/52] migration/rdma: Drop qemu_rdma_search_ram_block() error handling

2023-09-18 Thread Fabiano Rosas
Markus Armbruster writes: > qemu_rdma_search_ram_block() can't fail. Return void, and drop the > unreachable error handling. > > Signed-off-by: Markus Armbruster > --- > migration/rdma.c | 22 -- > 1 file changed, 8 insertions(+), 14 deletions(-) > > diff --git

Re: [PATCH 15/52] migration/rdma: Ditch useless numeric error codes in error messages

2023-09-18 Thread Fabiano Rosas
rting human-readable errno > information (which a number is not) can be useful, reporting an > error code that may or may not be an errno value is useless. > > Drop these error codes from the error messages. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 17/52] migration/rdma: Replace dangerous macro CHECK_ERROR_STATE()

2023-09-18 Thread Fabiano Rosas
Markus Armbruster writes: > Hiding return statements in macros is a bad idea. Use a function > instead, and open code the return part. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH] qtest/migration: Add a test for the analyze-migration script

2023-09-28 Thread Fabiano Rosas
Thomas Huth writes: > On 27/09/2023 23.47, Fabiano Rosas wrote: >> Add a smoke test that migrates to a file and gives it to the >> script. It should catch the most annoying errors such as changes in >> the ram flags. >> >> After code has been merged it beco

Re: [PATCH] qtest/migration: Add a test for the analyze-migration script

2023-09-28 Thread Fabiano Rosas
Thomas Huth writes: > On 28/09/2023 15.32, Fabiano Rosas wrote: >> Thomas Huth writes: >> >>> On 27/09/2023 23.47, Fabiano Rosas wrote: >>>> Add a smoke test that migrates to a file and gives it to the >>>> script. It should catch the most annoy

Re: [PATCH v2 06/53] migration/rdma: Fix unwanted integer truncation

2023-09-28 Thread Fabiano Rosas
nt, 0); > -done += ret; > -want -= ret; > +len = qemu_rdma_fill(rdma, data, want, 0); > +done += len; > +want -= len; > > /* Still didn't get enough, so lets just return */ > if (want) { Reviewed-by: Fabiano Rosas

Re: [PATCH v2 16/53] migration/rdma: Fix or document problematic uses of errno

2023-09-29 Thread Fabiano Rosas
Markus Armbruster writes: > We use errno after calling Libibverbs functions that are not > documented to set errno (manual page does not mention errno), or where > the documentation is unclear ("returns [...] the value of errno on > failure"). While this could be read as "sets errno and returns

Re: [RFC PATCH 1/3] migration/multifd: Move channels_ready semaphore

2023-09-29 Thread Fabiano Rosas
Elena Ufimtseva writes: > On Fri, Sep 22, 2023 at 11:53:17AM -0300, Fabiano Rosas wrote: >> Commit d2026ee117 ("multifd: Fix the number of channels ready") moved >> the "post" of channels_ready to the start of the multifd_send_thread() >> loop and added a

Re: [PATCH v2 22/53] migration/rdma: Drop dead qemu_rdma_data_init() code for !@host_port

2023-09-29 Thread Fabiano Rosas
Markus Armbruster writes: > qemu_rdma_data_init() neglects to set an Error when it fails because > @host_port is null. Fortunately, no caller passes null, so this is > merely a latent bug. Drop the flawed code handling null argument. > > Signed-off-by: Markus Armbruster Revie

Re: [PATCH v2 53/53] migration/rdma: Replace flawed device detail dump by tracing

2023-09-29 Thread Fabiano Rosas
r function > qemu_rdma_dump_id() was converted to tracing in commit > 733252deb8b (Tracify migration/rdma.c). > > Convert qemu_rdma_dump_id(), too. > > While there, touch up qemu_rdma_dump_gid()'s outdated comment. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH 0/8] migration fixes

2023-09-27 Thread Fabiano Rosas
Fabiano Rosas writes: > This series contains fixes for the two currently know failures that > show up in migration tests plus a set of fixes for some theoretical > race conditions around QEMUFile handling. > > Patch 1 addresses the issue found in the postcopy/preempt/plai

[PATCH] qtest/migration: Add a test for the analyze-migration script

2023-09-27 Thread Fabiano Rosas
to know right away what the problem is. Signed-off-by: Fabiano Rosas --- I know this adds a python dependency to qtests and I'm not sure how much we care about this script, but on the other hand it would be nice to catch these errors early on. This would also help with future work that touches

Re: [PATCH] qtest/migration: Add a test for the analyze-migration script

2023-09-27 Thread Fabiano Rosas
Fabiano Rosas writes: > Add a smoke test that migrates to a file and gives it to the > script. It should catch the most annoying errors such as changes in > the ram flags. > > After code has been merged it becomes way harder to figure out what is > causing the script to fail,

Re: [PATCH] analyze-migration: ignore RAM_SAVE_FLAG_MULTIFD_FLUSH

2023-09-27 Thread Fabiano Rosas
> section.read() > File "scripts/analyze-migration.py", line 214, in read > raise Exception("Unknown RAM flags: %x" % flags) > Exception: Unknown RAM flags: 200 > > See commit 77c259a4cb ("multifd: Create property > multifd-flush-af

[PATCH] MAINTAINERS: Add myself as a migration reviewer

2023-09-27 Thread Fabiano Rosas
I've been dedicating time to reviewing migration patches already for a while. I'll continue to do so for the foreseeable future. Signed-off-by: Fabiano Rosas --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 355b1960ce..2f2fa60311 100644

Re: [RFC PATCH 1/1] qtest/migration: Support more than one QEMU binary

2023-10-03 Thread Fabiano Rosas
Daniel P. Berrangé writes: > On Tue, Oct 03, 2023 at 05:24:50PM +0200, Philippe Mathieu-Daudé wrote: >> Hi Fabiano, >> >> [+Alex & Daniel] >> >> On 3/10/23 16:19, Fabiano Rosas wrote: >> > We have strict rules around migration compatibility between

[RFC PATCH 1/1] qtest/migration: Support more than one QEMU binary

2023-10-03 Thread Fabiano Rosas
\ QTEST_QEMU_BINARY=../build-8.1.0/qemu-system-x86_64 \ ./tests/qtest/migration-test This code also works for when debugging with GDB to pass the same binary, but different GDB options for src and dst. Signed-off-by: Fabiano Rosas --- tests/qtest/migration-helpers.c | 168

[RFC PATCH 0/1] tests/migration-test: Allow testing older machine types

2023-10-03 Thread Fabiano Rosas
) as migration source ... # Using ../build-8.1.0/qemu-system-x86_64 (v8.1.0-dirty) as migration destination Let me know what you think. Fabiano Rosas (1): qtest/migration: Support more than one QEMU binary tests/qtest/migration-helpers.c | 168 tests/qtest/migration

Re: [PATCH] migration: Add co-maintainers for migration

2023-10-03 Thread Fabiano Rosas
Peter Xu writes: > Per the qemu upstream call a few hours ago, proposing Fabiano and myself as > the co-maintainer for migration subsystem to help Juan. > > Cc: Fabiano Rosas > Cc: Juan Quintela > Signed-off-by: Peter Xu Acked-by: Fabiano Rosas

Re: [PATCH v2 52/53] migration/rdma: Use error_report() & friends instead of stderr

2023-10-04 Thread Fabiano Rosas
Markus Armbruster writes: > Fabiano Rosas writes: > >> Markus Armbruster writes: >> >>> error_report() obeys -msg, reports the current error location if any, >>> and reports to the current monitor if any. Reporting to stderr >>> directly with fpr

Re: [PATCH v11 00/10] migration: Modify 'migrate' and 'migrate-incoming' QAPI commands for migration

2023-10-04 Thread Fabiano Rosas
Het Gala writes: > On 04/10/23 7:03 pm, Fabiano Rosas wrote: >> Het Gala writes: >> >>> This is v11 patchset of modified 'migrate' and 'migrate-incoming' QAPI >>> design >>> for upstream review. >>> >>> Update: Daniel has revi

Re: [PATCH v11 02/10] migration: convert migration 'uri' into 'MigrateAddress'

2023-10-04 Thread Fabiano Rosas
Het Gala writes: > This patch parses 'migrate' and 'migrate-incoming' QAPI's 'uri' > string containing migration connection related information > and stores them inside well defined 'MigrateAddress' struct. > > Suggested-by: Aravind Retnakaran > Signed-off-by: Het Gala > Reviewed-by: Daniel P.

Re: [PATCH v11 08/10] migration: Implement MigrateChannelList to qmp migration flow.

2023-10-04 Thread Fabiano Rosas
Het Gala writes: > Integrate MigrateChannelList with all transport backends > (socket, exec and rdma) for both src and dest migration > endpoints for qmp migration. > > For current series, limit the size of MigrateChannelList > to single element (single interface) as runtime check. > >

Re: [PATCH v11 00/10] migration: Modify 'migrate' and 'migrate-incoming' QAPI commands for migration

2023-10-04 Thread Fabiano Rosas
Het Gala writes: > This is v11 patchset of modified 'migrate' and 'migrate-incoming' QAPI design > for upstream review. > > Update: Daniel has reviewed all patches and is okay with them. Markus has also > given Acked-by tag for patches related to QAPI syntax change. > Fabiano, Juan and

Re: [PULL 09/11] migration: file URI

2023-10-04 Thread Fabiano Rosas
Juan Quintela writes: > Fabiano Rosas wrote: >> Juan Quintela writes: >> >>> From: Steve Sistare >>> >>> Extend the migration URI to support file:. This can be used for >>> any migration scenario that does not require a reverse path. It ca

Re: [PATCH v11 05/10] migration: convert exec backend to accept MigrateAddress.

2023-10-04 Thread Fabiano Rosas
Het Gala writes: > Exec transport backend for 'migrate'/'migrate-incoming' QAPIs accept > new wire protocol of MigrateAddress struct. > > It is achived by parsing 'uri' string and storing migration parameters > required for exec connection into strList struct. > > Suggested-by: Aravind

Re: [PATCH v11 00/10] migration: Modify 'migrate' and 'migrate-incoming' QAPI commands for migration

2023-10-04 Thread Fabiano Rosas
Fabiano Rosas writes: > Het Gala writes: > >> On 04/10/23 7:03 pm, Fabiano Rosas wrote: >>> Het Gala writes: >>> >>>> This is v11 patchset of modified 'migrate' and 'migrate-incoming' QAPI >>>> design >>>> for upstream rev

Re: [RFC PATCH 1/1] qtest/migration: Support more than one QEMU binary

2023-10-04 Thread Fabiano Rosas
Juan Quintela writes: > Fabiano Rosas wrote: >> Daniel P. Berrangé writes: >> >>> On Tue, Oct 03, 2023 at 05:24:50PM +0200, Philippe Mathieu-Daudé wrote: > [...] > >>> $ cat myqemu.dkr >>> FROM fedora:38 >>> >>> RUN dnf -y

Re: [PULL 09/11] migration: file URI

2023-10-04 Thread Fabiano Rosas
he > "fixed-ram" patch series. > > Signed-off-by: Steve Sistare > Tested-by: Michael Galaxy > Reviewed-by: Michael Galaxy > Reviewed-by: Fabiano Rosas > Reviewed-by: Peter Xu > Reviewed-by: Daniel P. Berrangé > Reviewed-by: Juan Quintela > Signe

Re: [PATCH v11 09/10] migration: Implement MigrateChannelList to hmp migration flow.

2023-10-04 Thread Fabiano Rosas
Het Gala writes: > Integrate MigrateChannelList with all transport backends > (socket, exec and rdma) for both src and dest migration > endpoints for hmp migration. > > Suggested-by: Aravind Retnakaran > Signed-off-by: Het Gala > Reviewed-by: Daniel P. Berrangé > --- >

Re: [PATCH v11 10/10] migration: modify test_multifd_tcp_none() to use new QAPI syntax.

2023-10-04 Thread Fabiano Rosas
Het Gala writes: > modify multifd tcp common test to incorporate the new QAPI > syntax defined. > > Suggested-by: Aravind Retnakaran > Signed-off-by: Het Gala > Reviewed-by: Daniel P. Berrangé > --- > tests/qtest/migration-test.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-)

Re: [PATCH v2 02/10] migration/rdma: Unfold ram_control_before_iterate()

2023-10-04 Thread Fabiano Rosas
"sysemu/dirtylimit.h" > #include "sysemu/kvm.h" > @@ -3054,7 +3055,10 @@ static int ram_save_setup(QEMUFile *f, void *opaque) > } > } > > -ram_control_before_iterate(f, RAM_CONTROL_SETUP); > +ret = qemu_rdma_registration_start(f, RAM_CONTROL_SETUP); > +if (ret < 0) { > +qemu_file_set_error(f, ret); Markus' patch 23 will turn the return code from qemu_rdma_registration_start() from -EIO into -1. Any code that uses strerr to report it will now see an EPERM (-1). We should someday give an Error argument to savevm functions and stop using QEMUFile as an error carrier. Or perhaps add an Error pointer to RAMState? Anyway, out scope for this patch. Reviewed-by: Fabiano Rosas

Re: [PATCH v11 02/10] migration: convert migration 'uri' into 'MigrateAddress'

2023-10-04 Thread Fabiano Rosas
Daniel P. Berrangé writes: > On Wed, Oct 04, 2023 at 11:43:12AM -0300, Fabiano Rosas wrote: >> Het Gala writes: >> >> > This patch parses 'migrate' and 'migrate-incoming' QAPI's 'uri' >> > string containing migration connection related information >>

Re: [PATCH 0/4] multifd: various fixes

2023-09-22 Thread Fabiano Rosas
Elena Ufimtseva writes: > Hello > > While working and testing various live migration scenarios, > a few issues were found. > > This is my first patches in live migration and I will > appreciate the suggestions from the community if these > patches could be done differently. > > [PATCH 1/4]

[PATCH] optionrom: Remove build-id section

2023-09-26 Thread Fabiano Rosas
bss and COMMON sections so I'm addressing only the immediate issue here. Reported-by: Vasiliy Ulyanov Signed-off-by: Fabiano Rosas --- pc-bios/optionrom/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile index b1fff0ba6c..3

[RFC PATCH 1/3] migration/multifd: Move channels_ready semaphore

2023-09-22 Thread Fabiano Rosas
s_ready wait before the sem post to keep the sequence consistent. Also fix the error path to post to channels_ready and sem_sync in the correct order. Signed-off-by: Fabiano Rosas --- migration/multifd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/migration/multifd.c

[RFC PATCH 2/3] migration/multifd: Decouple control flow from the SYNC packet

2023-09-22 Thread Fabiano Rosas
f whether it will have more to send once it posts to channels_ready. Send it on an extra loop so it sees no pending_job and releases the semaphore. Signed-off-by: Fabiano Rosas --- migration/multifd.c| 89 -- migration/multifd.h| 8 ++-- migration/trace-ev

[RFC PATCH 0/3] migration/multifd: SYNC packet changes

2023-09-22 Thread Fabiano Rosas
too early to discuss that so I'm leaving it out. 1- https://lore.kernel.org/r/20230330180336.2791-1-faro...@suse.de Fabiano Rosas (3): migration/multifd: Move channels_ready semaphore migration/multifd: Decouple control flow from the SYNC packet migration/multifd: Extract sem_done waiting

[RFC PATCH 3/3] migration/multifd: Extract sem_done waiting into a function

2023-09-22 Thread Fabiano Rosas
This helps document the intent of the loop via the function name and we can reuse this in the future. Signed-off-by: Fabiano Rosas --- migration/multifd.c | 38 +- 1 file changed, 5 insertions(+), 33 deletions(-) diff --git a/migration/multifd.c b/migration

Re: [PATCH 1/2] migration: Fix rdma migration failed

2023-09-22 Thread Fabiano Rosas
Peter Xu writes: > On Wed, Sep 20, 2023 at 05:04:11PM +0800, Li Zhijian wrote: >> From: Li Zhijian >> >> Destination will fail with: >> qemu-system-x86_64: rdma: Too many requests in this message >> (3638950032).Bailing. >> >> migrate with RDMA is different from tcp. RDMA has its own control

Re: [PATCH 1/4] multifd: wait for channels_ready before sending sync

2023-09-22 Thread Fabiano Rosas
Elena Ufimtseva writes: > In multifd_send_sync_main we need to wait for channels_ready > before submitting sync packet as the threads may still be sending > their previous pages. > There is also no need to check for channels_ready in the loop > before the wait for sem_sync, next iteration of

Re: [PATCH v6 09/10] migration/yank: Keep track of registered yank instances

2023-09-25 Thread Fabiano Rosas
CC: Daniel for the QIOChannel discussion Lukas Straub writes: > On Thu, 14 Sep 2023 10:57:47 -0400 > Peter Xu wrote: > >> On Thu, Sep 14, 2023 at 10:23:38AM -0300, Fabiano Rosas wrote: >> > Peter Xu writes: >> > >> > > On Wed, Sep 13,

Re: [PATCH v2 51/53] migration/rdma: Downgrade qemu_rdma_cleanup() errors to warnings

2023-09-29 Thread Fabiano Rosas
over, qemu_rdma_cleanup() can't fail. It is called on error > paths, and QIOChannel close and finalization. Are the conditions it > reports really errors? I doubt it. > > Downgrade qemu_rdma_cleanup()'s errors to warnings. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH v2 29/53] migration/rdma: Check negative error values the same way everywhere

2023-09-29 Thread Fabiano Rosas
Markus Armbruster writes: > When a function returns 0 on success, negative value on error, > checking for non-zero suffices, but checking for negative is clearer. > So do that. > > Signed-off-by: Markus Armbruster Reviewed-by: Fabiano Rosas

Re: [PATCH v2 52/53] migration/rdma: Use error_report() & friends instead of stderr

2023-09-29 Thread Fabiano Rosas
Markus Armbruster writes: > error_report() obeys -msg, reports the current error location if any, > and reports to the current monitor if any. Reporting to stderr > directly with fprintf() or perror() is wrong, because it loses all > this. > > Fix the offenders. Bonus: resolves a FIXME about

Re: [PATCH v12 00/10] migration: Modify 'migrate' and 'migrate-incoming' QAPI commands for migration

2023-10-10 Thread Fabiano Rosas
Het Gala writes: > On 10/10/2023 2:34 AM, Fabiano Rosas wrote: >> Het Gala writes: >> >>> This is v12 patchset of modified 'migrate' and 'migrate-incoming' QAPI >>> design >>> for upstream review. >>> >>> Would like to thank al

Re: [PATCH v1] migration: fix RAMBlock add NULL check

2023-10-10 Thread Fabiano Rosas
(!rb) { > +error_report("RAM block not found"); > +return; > + } > + > if (migrate_ram_is_ignored(rb)) { > return; > } Reviewed-by: Fabiano Rosas

Re: [PATCH v11 02/10] migration: convert migration 'uri' into 'MigrateAddress'

2023-10-09 Thread Fabiano Rosas
Het Gala writes: > On 10/4/2023 11:42 PM, Fabiano Rosas wrote: >> Daniel P. Berrangé writes: >> >>> On Wed, Oct 04, 2023 at 11:43:12AM -0300, Fabiano Rosas wrote: >>>> Het Gala writes: >>>> >>>>> This patch parses 'migrate' and 'm

[RFC PATCH v2 6/6] migration/multifd: Bring back the 'ready' semaphore

2023-10-12 Thread Fabiano Rosas
if we're the channel qemu_sem_post(>ready); qemu_sem_wait(>sem); qemu_sem_post(>sem_done); - between ready and sem if we're not the channel qemu_sem_wait(>ready); qemu_sem_post(>sem); Signed-off-by: Fabiano Rosas --- One issue I can see with this is th

Re: [PATCH v3 02/13] migration/rdma: Unfold ram_control_before_iterate()

2023-10-12 Thread Fabiano Rosas
ed-off-by: Juan Quintela > > -- > > initilazize rioc after checknig that rdma is enabled. > > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas

[PATCH 1/3] migration/multifd: Remove direct "socket" references

2023-10-12 Thread Fabiano Rosas
We're about to enable support for other transports in multifd, so remove direct references to sockets. Signed-off-by: Fabiano Rosas --- migration/multifd.c | 22 -- 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index

[PATCH 3/3] migration/multifd: Clarify Error usage in multifd_channel_connect

2023-10-12 Thread Fabiano Rosas
into the function. Cc: Markus Armbruster Signed-off-by: Fabiano Rosas --- migration/multifd.c| 60 -- migration/trace-events | 3 ++- 2 files changed, 30 insertions(+), 33 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index

[PATCH 0/3] migration/multifd: General cleanups

2023-10-12 Thread Fabiano Rosas
the code more robust. Third patch is a refactoring to improve Error handling. We're currently doing some gymnastics that are not necessary. Thanks! Fabiano Rosas (3): migration/multifd: Remove direct "socket" references migration/multifd: Unify multifd_send_thread error paths migratio

[PATCH 2/3] migration/multifd: Unify multifd_send_thread error paths

2023-10-12 Thread Fabiano Rosas
this must be the case, otherwise the if (ret != 0) would be exiting the thread without calling multifd_send_terminate_threads() which is incorrect. Unify both paths to make it clear that both are taken when there's an error. Signed-off-by: Fabiano Rosas --- migration/multifd.c | 12 +++- 1

[RFC PATCH v2 0/6] migration/multifd: Locking changes

2023-10-12 Thread Fabiano Rosas
params lock. I'm also trying to show that sem_sync on the sending side could be made a bit more useful by using it to inform completion of any batch of packets, rather than only completion of the SYNC packet. I understand this is a disruptive series, so let's take it slow. Thanks! Fabiano Rosas (6

Re: [PATCH v3 12/13] migration/rdma: Declare for index variables local

2023-10-12 Thread Fabiano Rosas
Juan Quintela writes: > Declare all variables that are only used inside a for loop inside the > for statement. > > This makes clear that they are not used outside of the for loop. > > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas

Re: [PATCH v3 13/13] migration/rdma: Remove all "ret" variables that are used only once

2023-10-12 Thread Fabiano Rosas
nd of the line. The longer the list of arguments, the larger the chance of missing the < 0 when glancing over the code. Anyway: Reviewed-by: Fabiano Rosas

Re: [PATCH v3 11/13] migration/rdma: Use i as for index instead of idx

2023-10-12 Thread Fabiano Rosas
Juan Quintela writes: > Once there, all the uses are local to the for, so declare the variable > inside the for statement. > > Signed-off-by: Juan Quintela Reviewed-by: Fabiano Rosas

Re: [PATCH v2 3/9] tests/qtest: Allow qtest_get_machines to use an alternate QEMU binary

2023-10-12 Thread Fabiano Rosas
Thomas Huth writes: > On 06/10/2023 14.39, Fabiano Rosas wrote: >> We're adding support for using more than one QEMU binary in >> tests. Modify qtest_get_machines() to take an environment variable >> that contains the QEMU binary path. >> >> Since the function k

[RFC PATCH v2 4/6] migration/multifd: Extract sem_done waiting into a function

2023-10-12 Thread Fabiano Rosas
This helps document the intent of the loop via the function name and we can reuse this in the future. Signed-off-by: Fabiano Rosas --- migration/multifd.c | 47 + 1 file changed, 30 insertions(+), 17 deletions(-) diff --git a/migration/multifd.c b

[RFC PATCH v2 5/6] migration/multifd: Stop setting 'quit' outside of channels

2023-10-12 Thread Fabiano Rosas
lag and quit by itself. Signed-off-by: Fabiano Rosas --- migration/multifd.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 4ffa67339c..b7ba3fe0e6 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -497

[RFC PATCH v2 3/6] migration/multifd: Decouple control flow from the SYNC packet

2023-10-12 Thread Fabiano Rosas
. This is surprising and not documented. Clarify the above situation by renaming 'sem_sync' to 'sem_done' and making the #2 usage the main one. Post to 'sem_sync' only when there's no more pending_jobs. Signed-off-by: Fabiano Rosas --- I remove the parts about the receiving side. I wasn't sure about them and we

[RFC PATCH v2 1/6] migration/multifd: Remove channels_ready semaphore

2023-10-12 Thread Fabiano Rosas
: This usage is correct, but it is made redundant by the wait on sem_sync. What this piece of code is doing is making sure all channels have sent the SYNC packet and became idle afterwards. Signed-off-by: Fabiano Rosas --- migration/multifd.c | 10 -- 1 file changed, 10 deletions(-)

[RFC PATCH v2 2/6] migration/multifd: Stop checking p->quit in multifd_send_thread

2023-10-12 Thread Fabiano Rosas
ill only see it once it loops, so 'exiting' will always be seen first. Note that setting p->quit at multifd_send_terminate_threads() still makes sense because we need a way to inform multifd_send_pages() that the channel has stopped. Signed-off-by: Fabiano Rosas --- migration/multifd.c

Re: [PATCH v4] migration: hold the BQL during setup

2023-10-12 Thread Fabiano Rosas
is removed, because it referred to the qemu_mutex_lock_iothread() > call. > > Signed-off-by: Fiona Ebner Thanks for taking the time to explain stuff in the commit message. I dislike having unnecessary dependencies on the BQL throughout the migration code, but I see people preferred that over conditional locking in the previous versions, so in the name of consensus: Reviewed-by: Fabiano Rosas

Re: [PATCH v3 03/10] migration: Refactor error handling in source return path

2023-10-05 Thread Fabiano Rosas
Peter Xu writes: > @@ -1882,48 +1870,46 @@ static void *source_return_path_thread(void *opaque) > uint32_t tmp32, sibling_error; > ram_addr_t start = 0; /* =0 to silence warning */ > size_t len = 0, expected_len; > +Error *err = NULL; > int res; > >

Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER

2023-10-05 Thread Fabiano Rosas
Fabiano Rosas writes: > Peter Xu writes: > >> From: Fabiano Rosas >> >> To do so, create two paired sockets, but make them not providing real data. >> Feed those fake sockets to src/dst QEMUs for recovery to let them go into >> RECOVER stage without goin

Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER

2023-10-05 Thread Fabiano Rosas
Peter Xu writes: > From: Fabiano Rosas > > To do so, create two paired sockets, but make them not providing real data. > Feed those fake sockets to src/dst QEMUs for recovery to let them go into > RECOVER stage without going out. Test that we can always kick it out and

Re: [PATCH v3 08/10] migration: Allow network to fail even during recovery

2023-10-05 Thread Fabiano Rosas
> > After bouncing back to PAUSED stage, one can recover again. > > Reported-by: Xiaohui Li > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2111332 > Signed-off-by: Peter Xu Reviewed-by: Fabiano Rosas

Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER

2023-10-05 Thread Fabiano Rosas
Peter Xu writes: > On Thu, Oct 05, 2023 at 10:37:56AM -0300, Fabiano Rosas wrote: >> >> +/* >> >> + * Make sure both QEMU instances will go into RECOVER stage, then >> >> test >> >> + * kicking them out using migrate-pause. >

Re: [PATCH] qtest/migration: Add a test for the analyze-migration script

2023-10-05 Thread Fabiano Rosas
Peter Xu writes: > On Wed, Sep 27, 2023 at 06:47:56PM -0300, Fabiano Rosas wrote: >> I know this adds a python dependency to qtests and I'm not sure how >> much we care about this script, but on the other hand it would be nice >> to catch these errors early on. >&g

Re: [PATCH v3 10/10] tests/migration-test: Add a test for postcopy hangs during RECOVER

2023-10-05 Thread Fabiano Rosas
Peter Xu writes: > On Thu, Oct 05, 2023 at 06:10:20PM -0300, Fabiano Rosas wrote: >> Peter Xu writes: >> >> > On Thu, Oct 05, 2023 at 10:37:56AM -0300, Fabiano Rosas wrote: >> >> >> +/* >> >> >> + * Make sure both QE

Re: [PATCH] migration: Fix parse_ramblock() on overwritten retvals

2023-10-17 Thread Fabiano Rosas
Signed-off-by: Peter Xu Reviewed-by: Fabiano Rosas

Re: [PATCH v3 10/11] tests/qtest/migration: Support more than one QEMU binary

2023-10-18 Thread Fabiano Rosas
Juan Quintela writes: > Fabiano Rosas wrote: > D> We have strict rules around migration compatibility between different >> QEMU versions but no test to validate the migration state between >> different binaries. >> >> Add infrastructure to allow running the migr

[PATCH v3 02/11] tests/qtest: Introduce qtest_init_with_env

2023-10-18 Thread Fabiano Rosas
-off-by: Fabiano Rosas --- tests/qtest/libqtest.c | 26 +++--- tests/qtest/libqtest.h | 13 + 2 files changed, 32 insertions(+), 7 deletions(-) diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c index 03fa644663..9eebba8767 100644 --- a/tests/qtest

[PATCH v3 11/11] tests/qtest: Don't print messages from query instances

2023-10-18 Thread Fabiano Rosas
by: Thomas Huth Signed-off-by: Fabiano Rosas --- tests/qtest/libqtest.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c index fda163836c..1ac4f7d36a 100644 --- a/tests/qtest/libqtest.c +++ b/tests/qtest/libqtest.c @@ -

[PATCH v3 03/11] tests/qtest: Allow qtest_get_machines to use an alternate QEMU binary

2023-10-18 Thread Fabiano Rosas
changes. Reviewed-by: Juan Quintela Reviewed-by: Thomas Huth Signed-off-by: Fabiano Rosas --- tests/qtest/libqtest.c | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c index 9eebba8767..c52fab373f 100644

<    3   4   5   6   7   8   9   10   11   12   >