Hello,

TL;DR: the engine stops talking to a host once it has been rebooted after an upgrade.


[oVirt 4.2.3.5-1.el7.centos]

- From the web GUI, I upgrade a host, leaving the reboot checkbox checked
- the upgrade is OK (/var/log/yum.log shows successful updates, and the Ansible host deploy log is also OK)
- the reboot is OK (clean, SSH OK...)
- the host eventually appears as "Install failed"
- engine.log shows:

2018-06-19 10:02:24,896+02 ERROR [org.ovirt.engine.core.bll.SshHostRebootCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] SSH reboot command failed on host 'serv-hv-prds06': SSH session timeout host 'root@serv-hv-prds06' Stdout: Stderr:
2018-06-19 10:02:25,028+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] EVENT_ID: SYSTEM_FAILED_SSH_HOST_RESTART(198), A restart using SSH initiated by the engine to Host serv-hv-prds06 has failed.
2018-06-19 10:02:25,185+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] START, SetVdsStatusVDSCommand(HostName = serv-hv-prds06, SetVdsStatusVDSCommandParameters:{hostId='9c1566a4-8432-4de6-b30d-fd3b8e5fafca', status='InstallFailed', nonOperationalReason='NONE', stopSpmFailureLogged='false', maintenanceReason='null'}), log id: 833f9bd
2018-06-19 10:02:25,191+02 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] FINISH, SetVdsStatusVDSCommand, log id: 833f9bd
2018-06-19 10:02:25,191+02 ERROR [org.ovirt.engine.core.bll.hostdeploy.UpgradeHostInternalCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [6e32b3ac] Engine failed to restart via ssh host 'serv-hv-prds06' ('9c1566a4-8432-4de6-b30d-fd3b8e5fafca') after upgrade
2018-06-19 10:02:25,256+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [8b7c6e7d-1a22-407c-818b-849e67b94051] EVENT_ID: HOST_UPGRADE_FAILED(841), Failed to upgrade Host serv-hv-prds06 (User: necar...@sdis.isere.fr@SDIS38-authz).
2018-06-19 10:02:30,755+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-69) [8b7c6e7d-1a22-407c-818b-849e67b94051] EVENT_ID: HOST_UPGRADE_FAILED(841), Failed to upgrade Host serv-hv-prds06 (User: necar...@sdis.isere.fr@SDIS38-authz).

- Manually activating the host puts it back on track without issue
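
Log excerpts pasted into mail often get re-wrapped, with several entries run together on the same lines. As an aid for reading such an excerpt, here is a minimal Python sketch that splits the text back into one entry per timestamp and picks out the severity of each entry. It is not part of oVirt: the function names split_entries and level_of are hypothetical, and it assumes every entry starts with a timestamp of the form seen in the excerpt above (for example 2018-06-19 10:02:24,896+02) and that no such timestamp occurs inside a message.

```python
import re

# Timestamp format seen in the excerpt, e.g. "2018-06-19 10:02:24,896+02".
# \s+ between date and time tolerates a line break inserted by the mailer.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2},\d{3}[+-]\d{2}")


def split_entries(text):
    """Split re-wrapped log text into one string per log entry.

    Assumption: each entry begins with a timestamp matching TIMESTAMP.
    Anything before the first timestamp is discarded.
    """
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    starts = [m.start() for m in TIMESTAMP.finditer(flat)]
    bounds = starts + [len(flat)]
    return [flat[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def level_of(entry):
    """Return the token right after the timestamp (ERROR, INFO, ...)."""
    m = TIMESTAMP.match(entry)
    if m is None:
        return ""
    rest = entry[m.end():].split(None, 1)
    return rest[0] if rest else ""


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < pasted_excerpt.txt
    for entry in split_entries(sys.stdin.read()):
        if level_of(entry) == "ERROR":
            print(entry)
```

Run on a pasted excerpt, this prints only the ERROR entries, one per line, which makes the order of the failing steps easier to follow.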

The SSH communications between the engine and the host are usually very sound (VM migrations, maintenance...).

On this oVirt DC, I reproduced this issue twice on 2 different hosts.

In the engine log above, you can see that I'm using my own account to manage this engine, as I have been doing for years with no issue. I'll try the exact same path with admin@internal to see whether anything changes, but I don't see the link.

What other logs could I give you to debug this?

Regards,

--
Nicolas ECARNOT