Hi,
I am having a couple of issues with a fresh oVirt 4.3.7 HCI setup with 3 nodes.

------------------------------------------------------------------------------------------------------------------------------------------------------------
1.- vdsm is showing the following errors on HOST1 and HOST2 (HOST3 seems to be OK):
------------------------------------------------------------------------------------------------------------------------------------------------------------
     service vdsmd status
Redirecting to /bin/systemctl status vdsmd.service
● vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2020-02-11 18:50:28 PST; 28min ago
  Process: 25457 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS)
 Main PID: 25549 (vdsmd)
    Tasks: 76
   CGroup: /system.slice/vdsmd.service
           ├─25549 /usr/bin/python2 /usr/share/vdsm/vdsmd
           ├─25707 /usr/libexec/ioprocess --read-pipe-fd 52 --write-pipe-fd 51 --max-threads 10 --max-queued-requests 10
           ├─26314 /usr/libexec/ioprocess --read-pipe-fd 92 --write-pipe-fd 86 --max-threads 10 --max-queued-requests 10
           ├─26325 /usr/libexec/ioprocess --read-pipe-fd 96 --write-pipe-fd 93 --max-threads 10 --max-queued-requests 10
           └─26333 /usr/libexec/ioprocess --read-pipe-fd 102 --write-pipe-fd 101 --max-threads 10 --max-queued-requests 10

Feb 11 18:50:28 tij-059-ovirt1.grupolucerna.local vdsmd_init_common.sh[25457]: vdsm: Running test_space
Feb 11 18:50:28 tij-059-ovirt1.grupolucerna.local vdsmd_init_common.sh[25457]: vdsm: Running test_lo
Feb 11 18:50:28 tij-059-ovirt1.grupolucerna.local systemd[1]: Started Virtual Desktop Server Manager.
Feb 11 18:50:29 tij-059-ovirt1.grupolucerna.local vdsm[25549]: WARN MOM not available.
Feb 11 18:50:29 tij-059-ovirt1.grupolucerna.local vdsm[25549]: WARN MOM not available, KSM stats will be missing.
Feb 11 18:51:25 tij-059-ovirt1.grupolucerna.local vdsm[25549]: ERROR failed to retrieve Hosted Engine HA score
        Traceback (most recent call last):
          File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 182, in _getHaInfo...
Feb 11 18:51:34 tij-059-ovirt1.grupolucerna.local vdsm[25549]: ERROR failed to retrieve Hosted Engine HA score
        Traceback (most recent call last):
          File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 182, in _getHaInfo...
Feb 11 18:51:35 tij-059-ovirt1.grupolucerna.local vdsm[25549]: ERROR failed to retrieve Hosted Engine HA score
        Traceback (most recent call last):
          File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 182, in _getHaInfo...
Feb 11 18:51:43 tij-059-ovirt1.grupolucerna.local vdsm[25549]: ERROR failed to retrieve Hosted Engine HA score
        Traceback (most recent call last):
          File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 182, in _getHaInfo...
Feb 11 18:56:32 tij-059-ovirt1.grupolucerna.local vdsm[25549]: WARN ping was deprecated in favor of ping2 and confirmConnectivity
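
Since the HA score errors look like they come from the hosted-engine HA daemons rather than vdsm itself, my next step (unless someone suggests otherwise) was going to be checking and, if needed, restarting those services on HOST1 and HOST2; just standard systemctl calls:

```shell
# vdsm gets the HA score from the ovirt-ha-agent / ovirt-ha-broker pair,
# so check their state on the hosts reporting the error
systemctl status ovirt-ha-agent ovirt-ha-broker

# If the broker is wedged, restart both (broker first), then re-check the score
systemctl restart ovirt-ha-broker ovirt-ha-agent
```

Is that safe to do while host3 is running the engine VM?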

------------------------------------------------------------------------------------------------------------------------------------------------------------
2.- "gluster vol heal engine info" shows the following, and the heal never finishes:
------------------------------------------------------------------------------------------------------------------------------------------------------------
[root@host2 ~]# gluster vol heal engine info
Brick host1:/gluster_bricks/engine/engine
/7a68956e-3736-46d1-8932-8576f8ee8882/images/86196e10-8103-4b00-bd3e-0f577a8bb5b2/98d64fb4-df01-4981-9e5e-62be6ca7e07c.meta
 
/7a68956e-3736-46d1-8932-8576f8ee8882/images/b8ce22c5-8cbd-4d7f-b544-9ce930e04dcd/ed569aed-005e-40fd-9297-dd54a1e4946c.meta
 
Status: Connected
Number of entries: 2

Brick host2:/gluster_bricks/engine/engine
/7a68956e-3736-46d1-8932-8576f8ee8882/images/86196e10-8103-4b00-bd3e-0f577a8bb5b2/98d64fb4-df01-4981-9e5e-62be6ca7e07c.meta
 
/7a68956e-3736-46d1-8932-8576f8ee8882/images/b8ce22c5-8cbd-4d7f-b544-9ce930e04dcd/ed569aed-005e-40fd-9297-dd54a1e4946c.meta
 
Status: Connected
Number of entries: 2

Brick host3:/gluster_bricks/engine/engine
Status: Connected
Number of entries: 0
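
Only those two .meta files are pending on host1/host2, and host3 reports zero entries, so I was considering forcing a full self-heal and then watching whether the entry count drops. Is that safe on the engine volume while the hosted engine is running? This is what I intended to run:

```shell
# Ask glusterd to schedule a full self-heal of the engine volume
gluster volume heal engine full

# Then re-check whether the two pending .meta entries clear
gluster volume heal engine info
```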

------------------------------------------------------------------------------------------------------------------------------------------------------------
3.- Every hour I see the following entries/errors:
------------------------------------------------------------------------------------------------------------------------------------------------------------
VDSM command SetVolumeDescriptionVDS failed: Could not acquire resource. Probably resource factory threw an exception.: ()

------------------------------------------------------------------------------------------------------------------------------------------------------------
4.- I am also seeing the following error pertaining to the engine volume:
------------------------------------------------------------------------------------------------------------------------------------------------------------
Failed to update OVF disks 86196e10-8103-4b00-bd3e-0f577a8bb5b2, OVF data isn't updated on those OVF stores (Data Center Default, Storage Domain hosted_storage).
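
I can attach the engine-side log around those events if that helps; I was planning to pull the relevant lines on the hosted engine VM like this (assuming the default engine.log location):

```shell
# On the hosted engine VM: show recent OVF update / SetVolumeDescription failures
grep -E "SetVolumeDescriptionVDS|OVF" /var/log/ovirt-engine/engine.log | tail -n 40
```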

------------------------------------------------------------------------------------------------------------------------------------------------------------
5.-hosted-engine --vm-status
------------------------------------------------------------------------------------------------------------------------------------------------------------
--== Host host1 (id: 1) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : host1
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : be592659
local_conf_timestamp               : 480218
Host timestamp                     : 480217
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=480217 (Tue Feb 11 19:22:20 2020)
        host-id=1
        score=3400
        vm_conf_refresh_time=480218 (Tue Feb 11 19:22:21 2020)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineDown
        stopped=False


--== Host host3 (id: 2) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : host3
Host ID                            : 2
Engine status                      : {"health": "good", "vm": "up", "detail": "Up"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 1f4a8597
local_conf_timestamp               : 436681
Host timestamp                     : 436681
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=436681 (Tue Feb 11 19:22:18 2020)
        host-id=2
        score=3400
        vm_conf_refresh_time=436681 (Tue Feb 11 19:22:18 2020)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineUp
        stopped=False


--== Host host2 (id: 3) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : host2
Host ID                            : 3
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down_missing", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : ca5c1918
local_conf_timestamp               : 479644
Host timestamp                     : 479644
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=479644 (Tue Feb 11 19:22:21 2020)
        host-id=3
        score=3400
        vm_conf_refresh_time=479644 (Tue Feb 11 19:22:22 2020)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineDown
        stopped=False



------------------------------------------------------------------------------------------------------------------------------------------------------------

Any ideas on what might be going on?
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/ZIHE4E7RIVLPA3Y5JQ7LI5SXXM474CU4/
