Hello,
I have a 4.4.8 host that is shown as Non Responsive.
The DC is FC based.
I tried restarting some daemons (vdsmd, mom-vdsmd, wdmd) without effect.
Then I ran an SSH host reboot, but the host seems to stay in the same state
after rebooting.

From the storage and network point of view, everything seems OK on the host.

In vdsm.log of the host I see every 5 seconds:

2021-12-23 18:54:53,053+0100 INFO  (vmrecovery) [vdsm.api] START
getConnectedStoragePoolsList() from=internal,
task_id=916bc455-ce37-4b50-9f38-b69e3b03807f (api:48)
2021-12-23 18:54:53,053+0100 INFO  (vmrecovery) [vdsm.api] FINISH
getConnectedStoragePoolsList return={'poollist': []} from=internal,
task_id=916bc455-ce37-4b50-9f38-b69e3b03807f (api:54)
2021-12-23 18:54:53,053+0100 INFO  (vmrecovery) [vds] recovery: waiting for
storage pool to go up (clientIF:735)
2021-12-23 18:54:53,444+0100 INFO  (periodic/0) [vdsm.api] START
repoStats(domains=()) from=internal,
task_id=eb5540e0-0f90-4996-bc9a-7c73949f390f (api:48)
2021-12-23 18:54:53,445+0100 INFO  (periodic/0) [vdsm.api] FINISH repoStats
return={} from=internal, task_id=eb5540e0-0f90-4996-bc9a-7c73949f390f
(api:54)
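
To double-check the 5-second cadence, here is a minimal Python sketch that
measures the gap between consecutive "waiting for storage pool to go up"
lines. It assumes each log entry is on a single line in the real file (the
excerpt above is wrapped by the mail client) and that the timestamp layout
matches the excerpt; the function name recovery_intervals is just something
I made up for illustration.

```python
import re
from datetime import datetime

# Timestamp layout as it appears in the vdsm.log excerpt,
# e.g. "2021-12-23 18:54:53,053+0100" at the start of a line.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})([+-]\d{4})")

# Message text copied from the excerpt.
MARKER = "recovery: waiting for storage pool to go up"


def recovery_intervals(lines):
    """Return the seconds elapsed between consecutive MARKER lines."""
    stamps = []
    for line in lines:
        if MARKER not in line:
            continue
        match = TS_RE.match(line)
        if match is None:
            continue  # line does not start with the expected timestamp
        date_time, millis, tz = match.groups()
        stamps.append(
            datetime.strptime(f"{date_time}.{millis}{tz}", "%Y-%m-%d %H:%M:%S.%f%z")
        )
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]
```

It can be fed an open file object for the host's vdsm.log (or any iterable of
lines); a list of values close to 5.0 would confirm that the recovery thread
is simply polling and never sees a connected storage pool.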

In engine.log:

2021-12-23 18:54:38,745+01 INFO
 [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService]
(EE-ManagedScheduledExecutorService-engineThreadMonitoringThreadPool-Thread-1)
[] Thread pool 'hostUpdatesChecker' is using 0 threads out of 5, 5 threads
waiting for tasks.
2021-12-23 18:55:27,479+01 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-73) []
EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM ov300 command Get Host
Capabilities failed: Message timeout which can be caused by communication
issues
2021-12-23 18:55:27,479+01 ERROR
[org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
(EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-73) []
Unable to RefreshCapabilities: VDSNetworkException: VDSGenericException:
VDSNetworkException: Message timeout which can be caused by communication
issues

I would like to try putting the host into maintenance and then activating
it, or reinstalling it, but a power action has been in progress for an hour
(since my SSH host reboot attempt, which rebooted the host but apparently
did not get it reconnected) and that prevents it. What is its timeout?

What can I check to understand the source of these supposed communication
problems?
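
One basic thing I can rule out myself is plain TCP reachability from the
engine to the host. Below is a minimal Python sketch, assuming VDSM listens
on its default TCP port 54321 and that it is run on the engine machine; the
helper name tcp_reachable is hypothetical. A successful connect would only
show that the port is open, not that VDSM answers requests in time, so it
would not by itself explain the "Message timeout" errors.

```python
import socket


def tcp_reachable(host, port=54321, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    54321 is assumed to be the VDSM port (its default); adjust if the
    setup uses a different one.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

For example, tcp_reachable("ov300") would test the host named in the
engine.log messages, provided that name resolves from the engine.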

Thanks,
Gianluca