On Wed, Dec 03, 2025 at 01:51:23PM -0500, Ben Chaney wrote:
> From: Steve Sistare <[email protected]>
>
> Provide the cpr=on option to preserve TAP and vhost descriptors during
> cpr-transfer, so the management layer does not need to create a new
> device for the target.
>
> Save all tap fd's in canonical order, leveraging the index argument of
> cpr_save_fd. For the i'th queue, the tap device fd is saved at index 2*i,
> and the vhostfd (if any) at index 2*i+1.
This interleaving feels risky from the POV of future extensibility.
Although it's unlikely that we'll need a third type of FD per queue,
it would be easy to leave this possibility open.
IOW, IMHO we should save all tap FDs, then all vhostfds with no
interleaving. If we ever get further FDs to save in future, then
they can be set to follow the vhostfds.
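
To illustrate the grouped layout, here is a minimal sketch. The enum and
the helper name are hypothetical and not part of this patch; it only
assumes that the number of queues is known when the index is computed.

```c
#include <assert.h>

/* Hypothetical per-queue fd kinds; any future kind is appended last. */
enum tap_fd_kind {
    TAP_FD_KIND_TAP = 0,
    TAP_FD_KIND_VHOST = 1,
    TAP_FD_KIND__MAX
};

/*
 * Grouped (non-interleaved) layout: all tap fds first, then all
 * vhost fds. With nqueues queues, the fd of a given kind for queue i
 * is stored at index kind * nqueues + i.
 */
static int tap_cpr_fd_index(enum tap_fd_kind kind, int queue, int nqueues)
{
    assert(nqueues > 0);
    assert(queue >= 0 && queue < nqueues);
    return (int)kind * nqueues + queue;
}
```

With 4 queues the tap fds land at indexes 0..3 and the vhostfds at 4..7,
so a third kind would simply start at 8. Note that this layout relies on
both sides agreeing on the queue count.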
>
> tap and vhost fd's are passed by name to the monitor when a NIC is hot
> plugged, but the name is not known to qemu after cpr. Allow the manager
> to pass -1 for the fd "name" in the new qemu args to indicate that QEMU
> should search for a saved value. Example:
>
> -netdev tap,id=hostnet2,fds=-1:-1,vhostfds=-1:-1,cpr=on
This syntax feels redundant. If cpr=off, then "fds" must
always be valid, or not specified at all. If cpr=on, then
"fds" will always be -1. I don't see any point in setting
the 'fds' or 'vhostfds' arg at all. It should simply be:
  -netdev tap,id=hostnet2,cpr=on
This in turn avoids introducing special syntax for allowing
-1 in 'fds' or 'vhostfds', which Markus was concerned about.
> diff --git a/include/migration/cpr.h b/include/migration/cpr.h
> index d585fadc5b..68424b4b03 100644
> --- a/include/migration/cpr.h
> +++ b/include/migration/cpr.h
> @@ -48,7 +48,7 @@ void cpr_state_close(void);
> struct QIOChannel *cpr_state_ioc(void);
>
> bool cpr_incoming_needed(void *opaque);
> -int cpr_get_fd_param(const char *name, const char *fdname, int index,
> +int cpr_get_fd_param(const char *name, const char *fdname, int index, bool cpr,
> Error **errp);
>
> QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp);
> diff --git a/migration/cpr.c b/migration/cpr.c
> index c0bf93a7ba..19bd56339d 100644
> --- a/migration/cpr.c
> +++ b/migration/cpr.c
> @@ -316,6 +316,7 @@ bool cpr_incoming_needed(void *opaque)
> * @name: CPR name for the descriptor
> * @fdname: An integer-valued string, or a name passed to a getfd command
> * @index: CPR index of the descriptor
> + * @cpr: use cpr
This feels weirdly redundant too. The method name already implies
use of 'cpr', and yet we now have another parameter to ask whether
to use 'cpr'. At the very least these semantics deserve a much
better explanation than "@cpr: use cpr", as I don't know what the
intention is here.
> * @errp: returned error message
> *
> * If CPR is not being performed, then use @fdname to find the fd.
> @@ -325,22 +326,22 @@ bool cpr_incoming_needed(void *opaque)
> * On success returns the fd value, else returns -1.
> */
> int cpr_get_fd_param(const char *name, const char *fdname, int index,
> - Error **errp)
> + bool cpr, Error **errp)
> {
> ERRP_GUARD();
> int fd;
>
> - if (cpr_is_incoming()) {
> + if (cpr && cpr_is_incoming()) {
> fd = cpr_find_fd(name, index);
> if (fd < 0) {
> error_setg(errp, "cannot find saved value for fd %s", fdname);
> }
> } else {
> fd = monitor_fd_param(monitor_cur(), fdname, errp);
> - if (fd >= 0) {
> - cpr_save_fd(name, index, fd);
> - } else {
> + if (fd < 0) {
> error_prepend(errp, "Could not parse object fd %s:", fdname);
> + } else if (cpr) {
> + cpr_save_fd(name, index, fd);
> }
> }
> return fd;
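
Reading the hunk above literally, the new parameter appears to select
between three behaviours. The small model below captures that reading
so it can be checked against the intent; the enum and function names
are hypothetical, and only the branch conditions are taken from the diff.

```c
#include <stdbool.h>

/* What cpr_get_fd_param() would do, going by the quoted hunk. */
enum fd_action {
    FD_ACTION_FIND_SAVED,        /* cpr_find_fd(name, index) */
    FD_ACTION_MONITOR_AND_SAVE,  /* monitor_fd_param(), then cpr_save_fd() */
    FD_ACTION_MONITOR_ONLY       /* monitor_fd_param(), nothing saved */
};

/*
 * cpr:      the new per-call flag (e.g. from the netdev's cpr=on option)
 * incoming: stands in for cpr_is_incoming()
 */
static enum fd_action fd_param_action(bool cpr, bool incoming)
{
    if (cpr && incoming) {
        return FD_ACTION_FIND_SAVED;
    }
    return cpr ? FD_ACTION_MONITOR_AND_SAVE : FD_ACTION_MONITOR_ONLY;
}
```

If that reading is right, the doc comment could say something along the
lines of "@cpr: if true, save the fd for CPR on the outgoing side and
look it up by @name and @index on the incoming side; if false, always
resolve @fdname via the monitor and never save it".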
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|