Because of a systemd issue [0], when a service that's 'partOf' a scope fails, the scope itself might be left over, even after all processes in the scope have exited. In particular, this can happen for the '$vmid.scope' when the 'pve-dbus-vmstate@$vmid.service' fails.

Doing a 'reset-failed' of the failed 'partOf' service leads to the left-over scope being cleaned up too. Without that users in that situation would get a difficult-to-make-sense-of "timeout waiting on systemd" error message.

[0]: https://github.com/systemd/systemd/issues/39141

Signed-off-by: Fiona Ebner <[email protected]>
---
 src/PVE/QemuServer.pm | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
index 7d5ab718..8e2f03dc 100644
--- a/src/PVE/QemuServer.pm
+++ b/src/PVE/QemuServer.pm
@@ -5802,6 +5802,12 @@ sub vm_start_nolock {
     }
 
     my %silence_std_outs = (outfunc => sub { }, errfunc => sub { });
+    eval { # See systemd GH #39141, need to reset failed PartOf units too, or scope might be blocked
+        run_command(
+            ['/bin/systemctl', 'reset-failed', "pve-dbus-vmstate\@$vmid.service"],
+            %silence_std_outs,
+        );
+    };
     eval { run_command(['/bin/systemctl', 'reset-failed', "$vmid.scope"], %silence_std_outs) };
     eval { run_command(['/bin/systemctl', 'stop', "$vmid.scope"], %silence_std_outs) };
     # Issues with the above 'stop' not being fully completed are extremely rare, a very low
-- 
2.47.3

_______________________________________________
pve-devel mailing list
[email protected]
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
