On November 16, 2021 12:12 pm, Thomas Lamprecht wrote:
> On 16.11.21 11:52, Fabian Grünbichler wrote:
>> the tpmstate volume is not available in the VM directly, but we do
>> migrate the state volume via a storage migration anyway if necessary.
>>
> 
> some context would be great to have in the commit message, iow. mentioning
> that QEMU is already migrating this as part of its memory/state migration.
I tried to get some understanding of how this works, and I don't think
that the state QEMU copies as part of the TPM emulator state covers
everything that is in the state volume. what happens is the following
(also sketched as a rough outline at the bottom of this mail):

- our migration code finds a tpmstate volume, it gets migrated via
  storage_migrate if on local storage (and replicated if that is enabled)
- the VM is started on the remote node with the initial swtpm setup part
  skipped, since we already have a volume with state
- the RAM migration happens (and the rest of the state, including 'tpm
  emulator state')

so there is a window between storage_migrate/replication happening and
the migration being finished where changes to the TPM state volume from
within the guest could potentially get lost (unless the state covered by
the migration stream includes ALL the state inside the state volume,
which I don't think it does, but maybe I am mistaken on that front).

but this is irrespective of this patch, which just fixes the wrong
attempt of setting up an NBD server for the replicated tpm state volume
(see the simplified sketch of the resulting check below the quoted
patch). even attaching the volume (like we do for backups) and setting
up that NBD server would not help, since changes to the state volume are
not tracked in the source VM on the block level, as QEMU doesn't access
the state volume directly, only swtpm does.

> 
> Also, how is "migrate -> stop -> start" affected, is the TPM synced out to
> the (previously replicated?) disk on the target side during stop?

I am not sure I understand this question. nothing changes about the flow
of migration with this patch, except that where the migration would
previously fall apart if replication was enabled, it now works. the
handling of the state volume is unchanged / identical to a VM that is
not replicated. in either case we only sync the state volume once,
before starting the VM on the target node, doing block mirror, and the
ram/state migration.

swtpm probably syncs it whenever state-changing operations are issued
from within the VM - but that is not something that we can control when
shutting down the VM. AFAIU, the 'raw' state of the TPM is not even
available to QEMU directly, that's the whole point of the swtpm
component after all?

> 
>> this code path was only triggered for replicated VMs with TPM.
>> 
>> Signed-off-by: Fabian Grünbichler <[email protected]>
>> ---
>>  PVE/QemuServer.pm | 1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
>> index 580af9e..76d45a2 100644
>> --- a/PVE/QemuServer.pm
>> +++ b/PVE/QemuServer.pm
>> @@ -5238,6 +5238,7 @@ sub vm_migrate_get_nbd_disks {
>>      my ($ds, $drive) = @_;
>>  
>>      return if drive_is_cdrom($drive);
>> +    return if $ds eq 'tpmstate0';
>>  
>>      my $volid = $drive->{file};
>>  
>> 
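to make the window described above a bit more concrete, here is a very
rough outline of the ordering in pseudo-Perl - the helper names are made
up for illustration, this is not the actual PVE::QemuMigrate code:

# hypothetical helpers, only the ordering matters
sub migrate_with_tpmstate_outline {
    my ($vmid, $target) = @_;

    # 1. tpmstate0 (like any other local volume) is copied or replicated
    #    to the target *before* the VM migration itself starts
    sync_local_volume($vmid, 'tpmstate0', $target);

    # 2. the VM is started on the target node; the initial swtpm setup is
    #    skipped because a state volume already exists there
    start_target_vm($vmid, $target, skip_swtpm_setup => 1);

    # 3. RAM + device state (including the 'tpm emulator state') are
    #    migrated; TPM state changes the guest triggers during this phase
    #    are only written to the source state volume by swtpm and are not
    #    mirrored on the block level
    migrate_ram_and_device_state($vmid, $target);
}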
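and for reference, a simplified sketch of what the drive-iteration
callback in vm_migrate_get_nbd_disks ends up doing with this patch
applied - only the two early returns and the $volid line match the
actual hunk, the accumulator and the rest are condensed for illustration:

my $local_volumes = {};    # illustrative accumulator, not the real structure
my $check_drive = sub {
    my ($ds, $drive) = @_;

    return if drive_is_cdrom($drive);    # CD-ROMs never get an NBD export
    return if $ds eq 'tpmstate0';        # new: swtpm state is not accessed by
                                         # QEMU, so it must not be NBD-mirrored

    my $volid = $drive->{file};
    $local_volumes->{$ds} = $volid;      # remaining local drives are NBD candidates
};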
