On April 27, 2023 7:50 am, DERUMIER, Alexandre wrote:
> Hi,
> 
> On Wednesday, April 26, 2023 at 15:14 +0200, Fabian Grünbichler wrote:
>> On April 25, 2023 6:52 pm, Alexandre Derumier wrote:
>> > This patch adds support for remote migration when the target
>> > CPU model is different.
>> > 
>> > The target VM is restarted after the migration.
>> 
>> so this effectively introduces a new "hybrid" migration mode ;) the
>> changes are a bit smaller than I expected (in part thanks to patch
>> #1), which is good.
>> 
>> there are semi-frequent requests for another variant (also applicable
>> to containers) in the form of a two-phase migration:
>> - storage migrate
>> - stop guest
>> - incremental storage migrate
>> - start guest on target
>> 
> 
> But I'm not sure how to do an incremental storage migration without
> storage snapshot send|receive (so zfs && rbd could work):
> 
> - VM/CT is running
> - take a first snapshot + sync to the target with zfs|rbd send|receive
> - stop the guest
> - take a second snapshot + incremental sync to the target with zfs|rbd
>   send|receive
> - start the guest on the remote
> 
> (or maybe for VMs, without snapshots, with a dirty bitmap? But we
> would need to be able to write the dirty bitmap content to disk
> somewhere after VM stop, and re-read it for the last increment.)
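for reference, the snapshot-based flow quoted above could be sketched
roughly like this for ZFS - pool, dataset and host names here are made
up, and error handling/cleanup of the snapshots is omitted:

```shell
# phase 1: snapshot and full send while the guest keeps running
zfs snapshot rpool/data/vm-100-disk-0@migrate1
zfs send rpool/data/vm-100-disk-0@migrate1 \
    | ssh target-node zfs receive rpool/data/vm-100-disk-0

# phase 2: stop the guest, then send only the delta written since the
# first snapshot - this window is short, so downtime stays small
zfs snapshot rpool/data/vm-100-disk-0@migrate2
zfs send -i @migrate1 rpool/data/vm-100-disk-0@migrate2 \
    | ssh target-node zfs receive rpool/data/vm-100-disk-0

# then start the guest on the target node
```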
theoretically, we could support such a mode for non-snapshot storages
by using bitmaps+block-mirror, yes. either with a target VM, or with
qemu-storage-daemon on the target node exposing the target volumes

> - VM is running
> - create a dirty bitmap and start the sync with qemu-storage-daemon
> - stop the VM && save the dirty bitmap
> - re-read the dirty bitmap && do the incremental sync (with the new
>   qemu-storage-daemon, or starting the VM paused?)

stop here could also just mean stopping the guest OS, but leaving the
process around for the incremental sync, so it would not need
persistent bitmap support.

> And currently we don't support offline storage migration yet. (BTW,
> this is also breaking migration with unused disks.)
> I don't know if we can send a send|receive transfer through the
> tunnel? (I never tested it)

we do, but maybe you tested with RBD, which doesn't support storage
migration yet? within a cluster it doesn't need to, since it's a shared
storage, but between clusters we need to implement it (it's on my TODO
list and shouldn't be too hard, since there is 'rbd export/import').

>> given that, it might make sense to safe-guard this implementation
>> here, and maybe switch to a new "mode" parameter?
>>
>> online => switching CPU not allowed
>> offline or however-we-call-this-new-mode (or in the future,
>> two-phase-restart) => switching CPU allowed
>>
> 
> Yes, I was thinking about that too.
> Maybe not "offline", because maybe we want to implement a real
> offline mode later.
> But simply "restart"?

no, I meant moving the existing --online switch to a new mode
parameter, so we'd have "online" and "offline", then add your new mode
on top ("however-we-call-this-new-mode"), and in the future we could
also add "two-phase-restart" for the sync-twice mode I described :)

target-cpu would of course also be supported for the (existing) offline
mode, since it just needs to adapt the target-cpu in the config.
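to illustrate the bitmap idea, a possible QMP sequence could look like
the following - node, job and bitmap names are made up, and it uses
blockdev-backup's incremental sync (which has long-standing bitmap
support) rather than mirror, so treat it as a sketch, not the actual
implementation:

```
# create a (persistent) dirty bitmap on the source disk
{ "execute": "block-dirty-bitmap-add",
  "arguments": { "node": "drive-scsi0", "name": "migrate0",
                 "persistent": true } }

# full copy to the target volume (e.g. exported via NBD by
# qemu-storage-daemon on the target node); the bitmap records all
# writes happening during and after this copy
{ "execute": "blockdev-backup",
  "arguments": { "job-id": "full0", "device": "drive-scsi0",
                 "target": "target-nbd0", "sync": "full" } }

# after stopping the guest: copy only the blocks the bitmap marked
# dirty
{ "execute": "blockdev-backup",
  "arguments": { "job-id": "incr0", "device": "drive-scsi0",
                 "target": "target-nbd0", "sync": "incremental",
                 "bitmap": "migrate0" } }
```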
the main thing I'd want to avoid is somebody accidentally setting
"target-cpu", not knowing/noticing that it entails what amounts to a
reset of the VM as part of the migration..

there were a few things down below that might also be worthy of
discussion.

I also wonder whether the two variants of "freeze FS" and "suspend
without state" are enough - that only ensures that no more I/O happens,
so the volumes are bitwise identical, but shouldn't we also at least
have the option of doing a clean shutdown at that point, so that
applications can serialize/flush their state properly and that gets
synced across as well? else this is the equivalent of cutting the power
cord, which might not be a good fit for all use cases ;)

>> > Signed-off-by: Alexandre Derumier <[email protected]>
>> > ---
>> >  PVE/API2/Qemu.pm   | 18 ++++++++++++++++++
>> >  PVE/CLI/qm.pm      |  6 ++++++
>> >  PVE/QemuMigrate.pm | 25 +++++++++++++++++++++++++
>> >  3 files changed, 49 insertions(+)
>> > 
>> > diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
>> > index 587bb22..6703c87 100644
>> > --- a/PVE/API2/Qemu.pm
>> > +++ b/PVE/API2/Qemu.pm
>> > @@ -4460,6 +4460,12 @@ __PACKAGE__->register_method({
>> >  	    optional => 1,
>> >  	    default => 0,
>> >  	},
>> > +	'target-cpu' => {
>> > +	    optional => 1,
>> > +	    description => "Target Emulated CPU model. For online migration, the storage is live migrate, but the memory migration is skipped and the target vm is restarted.",
>> > +	    type => 'string',
>> > +	    format => 'pve-vm-cpu-conf',
>> > +	},
>> >  	'target-storage' => get_standard_option('pve-targetstorage', {
>> >  	    completion => \&PVE::QemuServer::complete_migration_storage,
>> >  	    optional => 0,
>> > @@ -4557,11 +4563,14 @@ __PACKAGE__->register_method({
>> >  	raise_param_exc({ 'target-bridge' => "failed to parse bridge map: $@" })
>> >  	    if $@;
>> >  
>> > +	my $target_cpu = extract_param($param, 'target-cpu');
>> 
>> this is okay
>> 
>> > +
>> >  	die "remote migration requires explicit storage mapping!\n"
>> >  	    if $storagemap->{identity};
>> >  
>> >  	$param->{storagemap} = $storagemap;
>> >  	$param->{bridgemap} = $bridgemap;
>> > +	$param->{targetcpu} = $target_cpu;
>> 
>> but this is a bit confusing with the variable/hash key naming ;)
>> 
>> >  	$param->{remote} = {
>> >  	    conn => $conn_args, # re-use fingerprint for tunnel
>> >  	    client => $api_client,
>> > @@ -5604,6 +5613,15 @@ __PACKAGE__->register_method({
>> >  	    PVE::QemuServer::nbd_stop($state->{vmid});
>> >  	    return;
>> >  	},
>> > +	'restart' => sub {
>> > +	    PVE::QemuServer::vm_stop(undef, $state->{vmid}, 1, 1);
>> > +	    my $info = PVE::QemuServer::vm_start_nolock(
>> > +		$state->{storecfg},
>> > +		$state->{vmid},
>> > +		$state->{conf},
>> > +	    );
>> > +	    return;
>> > +	},
>> >  	'resume' => sub {
>> >  	    if (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) {
>> >  		PVE::QemuServer::vm_resume($state->{vmid}, 1, 1);
>> > diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
>> > index c3c2982..06c74c1 100755
>> > --- a/PVE/CLI/qm.pm
>> > +++ b/PVE/CLI/qm.pm
>> > @@ -189,6 +189,12 @@ __PACKAGE__->register_method({
>> >  	    optional => 1,
>> >  	    default => 0,
>> >  	},
>> > +	'target-cpu' => {
>> > +	    optional => 1,
>> > +	    description => "Target Emulated CPU model. For online migration, the storage is live migrate, but the memory migration is skipped and the target vm is restarted.",
>> > +	    type => 'string',
>> > +	    format => 'pve-vm-cpu-conf',
>> > +	},
>> >  	'target-storage' => get_standard_option('pve-targetstorage', {
>> >  	    completion => \&PVE::QemuServer::complete_migration_storage,
>> >  	    optional => 0,
>> > diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
>> > index e182415..04f8053 100644
>> > --- a/PVE/QemuMigrate.pm
>> > +++ b/PVE/QemuMigrate.pm
>> > @@ -731,6 +731,11 @@ sub cleanup_bitmaps {
>> >  sub live_migration {
>> >      my ($self, $vmid, $migrate_uri, $spice_port) = @_;
>> >  
>> > +    if($self->{opts}->{targetcpu}){
>> > +	$self->log('info', "target cpu is different - skip live migration.");
>> > +	return;
>> > +    }
>> > +
>> >      my $conf = $self->{vmconf};
>> >  
>> >      $self->log('info', "starting online/live migration on $migrate_uri");
>> > @@ -995,6 +1000,7 @@ sub phase1_remote {
>> >      my $remote_conf = PVE::QemuConfig->load_config($vmid);
>> >      PVE::QemuConfig->update_volume_ids($remote_conf, $self->{volume_map});
>> >  
>> > +    $remote_conf->{cpu} = $self->{opts}->{targetcpu};
>> 
>> do we need permission checks here (or better, somewhere early on, for
>> doing this here)?
>> 
>> >      my $bridges = map_bridges($remote_conf, $self->{opts}->{bridgemap});
>> >      for my $target (keys $bridges->%*) {
>> >  	for my $nic (keys $bridges->{$target}->%*) {
>> > @@ -1354,6 +1360,21 @@ sub phase2 {
>> >      live_migration($self, $vmid, $migrate_uri, $spice_port);
>> >  
>> >      if ($self->{storage_migration}) {
>> > +
>> > +	#freeze source vm io/s if target cpu is different (no livemigration)
>> > +	if ($self->{opts}->{targetcpu}) {
>> > +	    my $agent_running = $self->{conf}->{agent} && PVE::QemuServer::qga_check_running($vmid);
>> > +	    if ($agent_running) {
>> > +		print "freeze filesystem\n";
>> > +		eval { mon_cmd($vmid, "guest-fsfreeze-freeze"); };
>> > +		die $@ if $@;
>> 
>> die here
>> 
>> > +	    } else {
>> > +		print "suspend vm\n";
>> > +		eval { PVE::QemuServer::vm_suspend($vmid, 1); };
>> > +		warn $@ if $@;
>> 
>> but warn here?
>> 
>> I'd like some more rationale for these two variants, what are the
>> pros and cons? should we make it configurable?
>> 
>> > +	    }
>> > +	}
>> > +
>> >  	# finish block-job with block-job-cancel, to disconnect source VM from NBD
>> >  	# to avoid it trying to re-establish it. We are in blockjob ready state,
>> >  	# thus, this command changes to it to blockjob complete (see qapi docs)
>> > @@ -1608,6 +1629,10 @@ sub phase3_cleanup {
>> >      # clear migrate lock
>> >      if ($tunnel && $tunnel->{version} >= 2) {
>> >  	PVE::Tunnel::write_tunnel($tunnel, 10, "unlock");
>> > +	if ($self->{opts}->{targetcpu}) {
>> > +	    $self->log('info', "target cpu is different - restart target vm.");
>> > +	    PVE::Tunnel::write_tunnel($tunnel, 10, 'restart');
>> > +	}
>> >  
>> >  	PVE::Tunnel::finish_tunnel($tunnel);
>> >      } else {
>> > -- 
>> > 2.30.2
_______________________________________________
pve-devel mailing list
[email protected]
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
