"Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote: > * Juan Quintela (quint...@redhat.com) wrote: >> "Dr. David Alan Gilbert (git)" <dgilb...@redhat.com> wrote: >> > From: "Dr. David Alan Gilbert" <dgilb...@redhat.com> >> > >> > My e4d633207 patch has an over zealous sanity check that checked >> > the lengths of the RAM Blocks on source/destination were the same. This >> > isn't true because of the 'used_length' trick for RAM blocks like the >> > ACPI table that vary in size. >> > >> > Prior to that patch RDMA would also fail in this case, but it should >> > now work with the changes in the set e4d633207 is in. >> > >> > Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com> >> > >> > Fixes: e4d633207c129dc5b7d145240ac4a1997ef3902f >> > --- >> > migration/rdma.c | 13 +++++++------ >> > 1 file changed, 7 insertions(+), 6 deletions(-) >> > >> > diff --git a/migration/rdma.c b/migration/rdma.c >> > index f106b2a..1d094b0 100644 >> > --- a/migration/rdma.c >> > +++ b/migration/rdma.c >> > @@ -3338,14 +3338,15 @@ static int qemu_rdma_registration_stop(QEMUFile >> > *f, void *opaque, >> > for (i = 0; i < nb_dest_blocks; i++) { >> > network_to_dest_block(&rdma->dest_blocks[i]); >> > >> > - /* We require that the blocks are in the same order */ >> > + /* We require that the blocks are in the same order, >> > + * but the used_length trick for acpi blocks means that >> > + * the destination can validly be larger than the source >> > + */ >> > if (rdma->dest_blocks[i].length != local->block[i].length) { >> >> Should we change the check to be that destination is bigger or equal >> than source? >> >> With your change, we only remove the check? > > I'm actually going to drop this change; so keep the error if they're > different. > > My argument works like this (I've not yet found a good way to test it): > > 1) The source sends to the destination a list of RAM blocks in the > qemu-file stream > 2) The destination performs a resize on the RAM blocks to match the source > so at this point the destination's block sizes should match.
Humm, I *thought* that what the resize does is getting it bigger, but if destination is bigger, it does nothing, no? > 3) The source sends a series of RDMA block registration requests for the > RAM > 4) The destination sends a list of RAM registrations back to the source > 5) This check is checking that this destination list matches the local list > 6) As long as (4) happens after (2) then the size that the destination sees > should always match the source. > 7) I think 4 is after 2 due to a qemu_fflush > > So keeping this check guards against 7 not really being true and/or > the destination populating it's list of blocks prior to (2) - which I have > a sneaky feeling might be happening, but am not sure yet. > > Dave > >> >> Thanks, Juan. >> >> >> > - ERROR(errp, "Block %s/%d has a different length %" PRIu64 >> > - "vs %" PRIu64, local->block[i].block_name, i, >> > - local->block[i].length, >> > + fprintf(stderr, "INFO: Block %s/%d has a different length >> > %" >> > + PRIu64 "vs %" PRIu64, >> > local->block[i].block_name, >> > + i, local->block[i].length, >> > rdma->dest_blocks[i].length); >> > - rdma->error_state = -EINVAL; >> > - return -EINVAL; >> > } >> > local->block[i].remote_host_addr = >> > rdma->dest_blocks[i].remote_host_addr; > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK