Re: [PATCH V3 0/9] Live update: cpr-exec

Vladimir Sementsov-Ogievskiy Sat, 20 Sep 2025 19:23:22 -0700

On 14.08.25 20:17, Steve Sistare wrote:

This patch series adds the live migration cpr-exec mode.


The new user-visible interfaces are:
   * cpr-exec (MigMode migration parameter)
   * cpr-exec-command (migration parameter)

cpr-exec mode is similar in most respects to cpr-transfer mode, with the
primary difference being that old QEMU directly exec's new QEMU.  The user
specifies the command to exec new QEMU in the migration parameter
cpr-exec-command.
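
For illustration only (this is not code from the patch series): a minimal
sketch of the underlying mechanism, namely that a descriptor left inheritable
survives exec, so the new program can pick it up by number. The helper name
exec_keeping_fd and the tiny "new program" are hypothetical; a forked child
stands in for old QEMU, and Python stands in for QEMU's C code.

```python
import os
import sys

# Hypothetical stand-in for the "new binary": it is told which inherited
# descriptor holds the preserved state and simply echoes it to stdout.
NEW_PROGRAM = (
    "import os, sys\n"
    "fd = int(sys.argv[1])\n"
    "sys.stdout.buffer.write(os.read(fd, 4096))\n"
    "sys.stdout.flush()\n"
)


def exec_keeping_fd(payload: bytes) -> bytes:
    """Exec a new program that reads state from an inherited descriptor."""
    state_r, state_w = os.pipe()
    os.write(state_w, payload)
    os.close(state_w)

    out_r, out_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the old process that is about to exec.
        try:
            os.close(out_r)
            if out_w != 1:
                os.dup2(out_w, 1)  # route the new program's stdout to the parent
                os.close(out_w)
            os.set_inheritable(state_r, True)  # keep this fd open across exec
            os.execv(sys.executable,
                     [sys.executable, "-c", NEW_PROGRAM, str(state_r)])
        finally:
            os._exit(127)  # reached only if exec failed

    # Parent: only observes what the exec'd program printed.
    os.close(state_r)
    os.close(out_w)
    chunks = []
    while True:
        chunk = os.read(out_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

The process ID and the descriptor number stay the same across exec; only the
program image changes, which is why no second container or process is needed.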

Why?

In a containerized QEMU environment, cpr-exec reuses an existing QEMU
container and its assigned resources.  By contrast, cpr-transfer mode
requires a new container to be created on the same host as the target of
the CPR operation.  Resources must be reserved for the new container, while
the old container still reserves resources until the operation completes.
Avoiding over commitment requires extra work in the management layer.
This is one reason why a cloud provider may prefer cpr-exec.  A second reason
is that the container may include agents with their own connections to the
outside world, and such connections remain intact if the container is reused.


My two cents:

We considered switching to cpr-exec, and even went further: we thought about
loading the new version of the QEMU binary into the running QEMU process
(like a library) and switching to it. In the end we decided to keep our
current approach (starting the new QEMU in a separate process) and use CPR
transfer (which eventually led to my current on-list proposals to simply
migrate all fds over the main migration channel).
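
A minimal sketch of what passing an fd between two endpoints looks like,
assuming a Unix domain socket and SCM_RIGHTS ancillary data (Python 3.9+).
This is not the code from my series: pass_fd_over_socket is a hypothetical
name, and both ends live in one process here only to keep the example short.

```python
import os
import socket


def pass_fd_over_socket(payload: bytes) -> bytes:
    """Send a pipe's read end over a Unix socket and read through the copy."""
    src, dst = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    r, w = os.pipe()
    try:
        os.write(w, payload)
        os.close(w)

        # "Source" side: the fd travels as SCM_RIGHTS ancillary data next to
        # a small ordinary message.
        socket.send_fds(src, [b"fd"], [r])

        # "Target" side: the kernel installs a new descriptor that refers to
        # the same open file description.
        _msg, fds, _flags, _addr = socket.recv_fds(dst, 16, 1)
        os.close(r)  # the source's copy is no longer needed

        received = fds[0]
        try:
            return os.read(received, 4096)
        finally:
            os.close(received)
    finally:
        src.close()
        dst.close()
```

Unlike inheritance across exec, the receiver gets a different descriptor
number, so the sender has to tell it which device each fd belongs to.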

First, we don't run QEMU in Docker, so we probably don't encounter some of
the problems around it. The real problem for us is the migration downtime
for switching network and disk.

Still, why don't we want cpr-exec? Two reasons:

1. It seems that the current approach is safer against various errors during
migration: we have a better chance of simply saying "cont" on the source
process if something goes wrong.

2. It seems that with a second process we have more possibilities to minimize
downtime, as we can do some initialization in the new QEMU process _before_
migration (when the second process starts, the first is still running).

I also wondered: could we do a kind of "exec" but still avoid the downside
in [2]? This leads to the idea of loading the new QEMU binary into the
running process (like a library) and ... starting to execute it in parallel
with the old one? But that looks like trying to reinvent processes, which is
obviously a bad idea.


--
Best regards,
Vladimir
