On 2025/09/19 0:23, Arun Menon wrote:
Hello,

Currently, when a migration of a VM with an encrypted vTPM
fails on the destination host (e.g., due to a mismatch in secret values),
the error message displayed on the source host is generic and unhelpful.

For example, a typical error looks like this:
"operation failed: job 'migration out' failed: Sibling indicated error 1.
operation failed: job 'migration in' failed: load of migration failed:
Input/output error"

This message does not provide any specific indication of a vTPM failure.
Such generic errors are logged using error_report(), which prints to
the console/monitor but does not make the detailed error accessible via
the QMP query-migrate command.

This series addresses the issue by ensuring that specific TPM error
messages are propagated via the QEMU Error object.
To make this possible:
- A set of functions in the call stack is changed
   to incorporate an Error object as an additional parameter.
- Also, the TPM backend makes use of a new hook called post_load_errp()
   that explicitly passes an Error object.
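
The two changes above can be pictured with a minimal, self-contained
sketch. It is not the code from this series: the Error type, the
sketch_* helpers, the struct layouts, the error text, and the exact
signature of post_load_errp() are all assumptions made for illustration,
loosely modelled on the Error **errp convention QEMU uses elsewhere.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Error object; NOT the real type. */
typedef struct Error {
    char msg[256];
} Error;

/* Simplified analogue of error_setg(): store a formatted message in *errp. */
static void sketch_error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (!errp) {
        return;                 /* caller is not interested in details */
    }
    assert(*errp == NULL);      /* never overwrite an earlier error */

    err = calloc(1, sizeof(*err));
    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

static void sketch_error_free(Error *err)
{
    free(err);
}

/* Hypothetical device description offering both hook styles. */
typedef struct SketchVMSD {
    const char *name;
    /* Old style: only an integer return code reaches the caller. */
    int (*post_load)(void *opaque, int version_id);
    /* New style (assumed signature): details travel in the Error object. */
    bool (*post_load_errp)(void *opaque, int version_id, Error **errp);
} SketchVMSD;

/* Hypothetical backend state: does the destination secret match? */
typedef struct SketchTPM {
    bool secret_matches;
} SketchTPM;

static bool sketch_tpm_post_load_errp(void *opaque, int version_id,
                                      Error **errp)
{
    SketchTPM *tpm = opaque;

    (void)version_id;
    if (!tpm->secret_matches) {
        /* Illustrative message only, not an actual QEMU log line. */
        sketch_error_setg(errp, "vTPM state could not be decrypted "
                                "(secret mismatch)");
        return false;
    }
    return true;
}

/* A caller in the load path: prefers the errp hook, falls back to the old. */
static int sketch_load_state(const SketchVMSD *vmsd, void *opaque,
                             int version_id, Error **errp)
{
    if (vmsd->post_load_errp) {
        if (!vmsd->post_load_errp(opaque, version_id, errp)) {
            return -1;
        }
    } else if (vmsd->post_load) {
        int ret = vmsd->post_load(opaque, version_id);
        if (ret < 0) {
            /* Only a generic message is possible with the old hook. */
            sketch_error_setg(errp, "%s: post_load failed: %d",
                              vmsd->name, ret);
            return ret;
        }
    }
    return 0;
}
```

In this sketch, calling sketch_load_state() with secret_matches set to
false returns a negative value and leaves a device-specific message in
the Error object, which a caller could then hand to whatever reports
migration status instead of only printing it.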

It is organized as follows:
  - Patches 1-23 focus on pushing an Error object into the functions
    that are important in the call stack where TPM errors are observed.
    We still need to make changes in the rest of the functions in savevm.c
    such that they also incorporate the errp object for propagating errors.
  - Patches 12, 13, and 20 are minor refactoring changes.
  - Patch 24 removes the error variant of the vmstate_save_state() function.
  - Patch 25 renames post_save() to cleanup_save().
  - Patch 26 introduces the new variants of the hooks in VMStateDescription
    structure. These hooks should be used in future implementations.
  - Patch 27 focuses on changing the TPM backend such that the errors are
    set in the Error object.

While this series focuses specifically on TPM error reporting during
live migration, it lays the groundwork for broader improvements.
Many functions in savevm.c that previously returned an integer now capture
errors in the Error object, enabling other modules to adopt the
post_load_errp hook in the future.

One such change was previously attempted:
https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg01727.html

Resolves: https://issues.redhat.com/browse/RHEL-82826

Signed-off-by: Arun Menon <arme...@redhat.com>

All comments I had are addressed and this series looks good to me.
For the whole series:

Reviewed-by: Akihiko Odaki <od...@rsg.ci.i.u-tokyo.ac.jp>

Regards,
Akihiko Odaki
