[ovirt-users] 4.4 HCI Install Failure - Missing /etc/pki/CA/cacert.pem

2020-05-21 Thread Stephen Panicho
Hi all! I'm using Cockpit to perform an HCI install, and it fails at the
hosted engine deploy. Libvirtd can't restart because of a missing
/etc/pki/CA/cacert.pem file.

The log (tasks seemingly from
/usr/share/ansible/roles/ovirt.hosted_engine_setup/tasks/initial_clean.yml):
[ INFO ] TASK [ovirt.hosted_engine_setup : Stop libvirt service]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Drop vdsm config statements]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Restore initial abrt config
files]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Restart abrtd service]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Drop libvirt sasl2 configuration
by vdsm]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Stop and disable services]
[ INFO ] ok: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Restore initial libvirt default
network configuration]
[ INFO ] changed: [localhost]
[ INFO ] TASK [ovirt.hosted_engine_setup : Start libvirt]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Unable
to start service libvirtd: Job for libvirtd.service failed because the
control process exited with error code.\nSee \"systemctl status
libvirtd.service\" and \"journalctl -xe\" for details.\n"}

journalctl -u libvirtd:
May 22 04:33:25 node1 libvirtd[26392]: libvirt version: 5.6.0, package:
10.el8 (CBS , 2020-02-27-01:09:46, )
May 22 04:33:25 node1 libvirtd[26392]: hostname: node1
May 22 04:33:25 node1 libvirtd[26392]: Cannot read CA certificate
'/etc/pki/CA/cacert.pem': No such file or directory
May 22 04:33:25 node1 systemd[1]: libvirtd.service: Main process exited,
code=exited, status=6/NOTCONFIGURED
May 22 04:33:25 node1 systemd[1]: libvirtd.service: Failed with result
'exit-code'.
May 22 04:33:25 node1 systemd[1]: Failed to start Virtualization daemon.

From a fresh CentOS 8.1 minimal install, I've installed the following:
- The 4.4 repo
- cockpit
- ovirt-cockpit-dashboard
- vdsm-gluster (providing glusterfs-server and allowing the Gluster Wizard
to complete)
- gluster-ansible-roles (only on the bootstrap host)

I'm not exactly sure what that initial bit of the playbook does. Comparing
the bootstrap node with another that has yet to be touched, both
/etc/libvirt/libvirtd.conf and /etc/sysconfig/libvirtd are the same on both
hosts. Yet the bootstrap host can no longer start libvirtd while the other
host can. Neither host has the /etc/pki/CA/cacert.pem file.
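
One way to double-check that comparison is to compare only the active (uncommented) settings of the two files rather than the raw text. This is a minimal sketch, assuming plain `key = value` lines; the helper names and the sample contents are made up for illustration and are not taken from either host. `listen_tls` and `ca_file` are real libvirtd.conf keys.

```python
def active_settings(conf_text):
    """Return {key: value} for the uncommented 'key = value' lines of a config file."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def diff_settings(a, b):
    """Keys whose active value differs between two hosts, mapped to both values."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Hypothetical sample contents, only to show the shape of the result.
host_a = '# listen_tcp = 1\nlisten_tls = 1\nca_file = "/etc/pki/CA/cacert.pem"\n'
host_b = '# listen_tcp = 1\n#listen_tls = 1\n'
print(diff_settings(active_settings(host_a), active_settings(host_b)))
```

An empty result would confirm that the two files really are equivalent, which points the search at something outside them.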

Please let me know if I can provide any more information. Thanks!
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/XNW4HWUQUTN44VMATT4B6ARSEYVURDP7/


[ovirt-users] Re: Ovirt 4.4 Migration assistance needed.

2020-05-21 Thread Strahil Nikolov via Users
On May 22, 2020 12:01:04 AM GMT+03:00, "Vinícius Ferrão via Users" wrote:
>I think OVN is broken due to this:
>
>Some of the features included in the oVirt 4.4.0 release require
>content that will be available in CentOS Linux 8.2 but cannot be tested
>on RHEL 8.2 yet due to some incompatibility in the openvswitch package
>that is shipped in CentOS Virt SIG, which requires rebuilding
>openvswitch on top of CentOS 8.2. The cluster switch type OVS is not
>implemented for CentOS 8 hosts.
>
>https://blogs.ovirt.org/2020/05/ovirt-44-available/
>
>But I may be wrong.
>
>On 21 May 2020, at 12:06, Strahil Nikolov via Users <users@ovirt.org> wrote:
>
>Hello All,
>
>I would like to ask for some assistance with the planning of the
>upgrade to 4.4.
>
>I have issues with OVN (it doesn't work at all), so I would like
>to start fresh with the HE.
>
>The plan so far (downtime is not an issue):
>
>1. Reinstall the nodes one by one and rejoin them to the Gluster TSP
>2. Wipe the HostedEngine's gluster volume
>3. Deploy a fresh hosted engine
>4. Import the storage domains (gluster) back into the engine and import
>the VMs
>
>Do you see any issues with the plan?
>Are any problems expected if the VMs have snapshots? What about the
>storage domain version?
>
>Thanks in advance.
>
>Best Regards,
>Strahil Nikolov
>List Archives:
>https://lists.ovirt.org/archives/list/users@ovirt.org/message/QNPOH55AAAYOX5GX3EN5H5ZMOZHKYELI/

I meant that my current OVN (on 4.3.9) is broken, not that it broke after the
update. Sorry I didn't make that clearer.

Best Regards,
Strahil Nikolov
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/JC6BMZR23XY7AMB5ZTIOALFJLIIRHJE6/


[ovirt-users] Re: oVirt 4.4.0 Release is now generally available

2020-05-21 Thread Strahil Nikolov via Users
On May 21, 2020 6:08:19 PM GMT+03:00, Derek Atkins  wrote:
>Nir,
>
>Nir Soffer  writes:
>
>> Why not open RFE to add the feature you need?
>
>I did -- about 3-4 years ago.  SOME of them have been implemented, some
>have been partially implemented, but I am still waiting for ovirt to
>support the full VM startup functionality that I had in vmware-server
>from like 2007 (or earlier).
>
>Part of the issue here is that I suspect most ovirt users have multiple
>hosts and therefore rarely have to worry about how host-system
>maintenance affects the VMs, and probably live in data centers with
>redundant power supplies, UPSes, and backup generators.
>
>I, on the other hand, have a single system, so when I need to
>perform any maintenance I need to take down everything, or if I have a
>power outage that outlasts my UPS, or...  I want the VMs to come back
>up automatically -- and in a particular order (e.g., I need my DNS and
>KDC servers to come up before others).
>
>I filed these RFEs during the 4.0 days, which is when I first started
>using ovirt and put it into deployment.
>
>> You can use the python SDK to do anything supported by oVirt API.
>> Did you look here?
>> https://github.com/oVirt/ovirt-engine-sdk/tree/master/sdk/examples
>
>I have looked there, but I stopped reading after seeing "python".  ;)
>Frankly I detest python.  I think it's an abomination.  There are so
>many other, better languages out there and I don't understand why so
>many people like it (and worse, force it down everyone else's throats).
>But I'll step off my soap-box (and get off my lawn!)  lol.
>
>Honestly, I already spent the time to build a tool to do what I need.
>I even had to update the tool going from 4.1 to 4.3 because some startup
>assumptions changed.  I really don't want to spend the time again, time
>I frankly don't have right now, to re-implement what I've already got.
>It's easier for me to just stay put on 4.3.x.
>
>Yes, I realize that in about 2 years or so I will need to do so.  I'll
>worry about that then.
>
>Of course, since the (partial?) functionality is only in 4.4, I really
>have no way to test it to make sure it does what I need, to see what
>I'm missing.  I don't have a testbed to play with it, just my one system.
>
>Thanks,
>
>-derek

Actually,
you can use Ansible and its 'uri' module to communicate with the engine via the
API. The 'uri' module was most probably written in Python, but you don't have
to deal with Python code, just Ansible.
Also, it's worth checking the Ansible oVirt modules, as they are kept up to
date even when the API endpoint changes.

I think it won't be too hard to get a list of the VMs and then create some
logic for how to order them for the 'ignition'.
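
The ordering logic itself is small. Below is a minimal sketch of it, written in Python only for illustration; the same idea could be expressed in an Ansible playbook. The VM names and the dependency map are hypothetical, and fetching the VM list from the engine and actually starting the VMs through the API are not shown.

```python
def start_order(vms, depends_on):
    """Return VM names ordered so that every VM comes after the VMs it depends on.

    vms        -- iterable of VM names
    depends_on -- dict mapping a VM name to the names that must start before it
    """
    remaining = {vm: set(depends_on.get(vm, ())) for vm in vms}
    order = []
    while remaining:
        started = set(order)
        # VMs whose dependencies have all been scheduled already (sorted for stable output).
        ready = sorted(vm for vm, deps in remaining.items() if deps <= started)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency among: %s" % sorted(remaining))
        order.extend(ready)
        for vm in ready:
            del remaining[vm]
    return order


# Hypothetical inventory: DNS and KDC have to be up before the rest.
vms = ["web", "mail", "dns", "kdc"]
depends_on = {"web": ["dns"], "mail": ["dns", "kdc"]}
print(start_order(vms, depends_on))  # ['dns', 'kdc', 'mail', 'web']
```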

Best Regards,
Strahil Nikolov
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/DQAGPB64SN33OOGYO3MBG4NS4ZBEUNSU/


[ovirt-users] Getting the same bug in 4.4 as I did in 4.3.. brand new install 100% repeatable for me.

2020-05-21 Thread dan . creed
Brand new install. I get past the gluster setup and setting up the hosted VM,
get to the finalize step, and hit this lockspace issue every time.



[ INFO ] TASK [ovirt.hosted_engine_setup : Initialize lockspace volume]
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 5, "changed": true, 
"cmd": ["hosted-engine", "--reinitialize-lockspace", "--force"], "delta": 
"0:00:00.302914", "end": "2020-05-21 14:23:50.413353", "msg": "non-zero return 
code", "rc": 1, "start": "2020-05-21 14:23:50.110439", "stderr": "Traceback 
(most recent call last):\n File \"/usr/lib64/python3.6/runpy.py\", line 193, in 
_run_module_as_main\n \"__main__\", mod_spec)\n File 
\"/usr/lib64/python3.6/runpy.py\", line 85, in _run_code\n exec(code, 
run_globals)\n File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/reinitialize_lockspace.py\",
 line 30, in <module>\n ha_cli.reset_lockspace(force)\n File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/client/client.py\", 
line 286, in reset_lockspace\n stats = broker.get_stats_from_storage()\n File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py\", 
line 148, in get_stats_from_storage\n result = self._proxy.get_stats()\n File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1112, in __call__\n return 
self.__send(self.__name, args)\n File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1452, in __request\n 
verbose=self.__verbose\n File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 
1154, in request\n return self.single_request(host, handler, request_body, 
verbose)\n File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1166, in 
single_request\n http_conn = self.send_request(host, handler, request_body, 
verbose)\n File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1279, in 
send_request\n self.send_content(connection, request_body)\n File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1309, in send_content\n 
connection.endheaders(request_body)\n File 
\"/usr/lib64/python3.6/http/client.py\", line 1249, in endheaders\n 
self._send_output(message_body, encode_chunked=encode_chunked)\n File 
\"/usr/lib64/python3.6/http/client.py\", line 1036, in _send_output\n 
self.send(msg)\n File \"/usr/lib64/python3.6/http/client.py\", line 974, in send\n self.connect()\n File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py\", 
line 74, in connect\n 
self.sock.connect(base64.b16decode(self.host))\nFileNotFoundError: [Errno 2] No 
such file or directory", "stderr_lines": ["Traceback (most recent call last):", 
" File \"/usr/lib64/python3.6/runpy.py\", line 193, in _run_module_as_main", " 
\"__main__\", mod_spec)", " File \"/usr/lib64/python3.6/runpy.py\", line 85, in 
_run_code", " exec(code, run_globals)", " File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/reinitialize_lockspace.py\",
 line 30, in <module>", " ha_cli.reset_lockspace(force)", " File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/client/client.py\", 
line 286, in reset_lockspace", " stats = broker.get_stats_from_storage()", " 
File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py\", 
line 148, in get_stats_from_storage", " result = self._proxy.get_stats()", " 
File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1112, in __call__", " return 
self.__send(self.__name, args)", " File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1452, in __request", " 
verbose=self.__verbose", " File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 
1154, in request", " return self.single_request(host, handler, request_body, 
verbose)", " File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1166, in 
single_request", " http_conn = self.send_request(host, handler, request_body, 
verbose)", " File \"/usr/lib64/python3.6/xmlrpc/client.py\", line 1279, in 
send_request", " self.send_content(connection, request_body)", " File 
\"/usr/lib64/python3.6/xmlrpc/client.py\", line 1309, in send_content", " 
connection.endheaders(request_body)", " File 
\"/usr/lib64/python3.6/http/client.py\", line 1249, in endheaders", " 
self._send_output(message_body, encode_chunked=encode_chunked)", " File 
\"/usr/lib64/python3.6/http/client.py\", line 1036, in _send_output", " 
self.send(msg)", " File \"/usr/lib64/python3.6/http/client.py\", line 974, in send", " self.connect()", " File 
\"/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py\", 
line 74, in connect", " self.sock.connect(base64.b16decode(self.host))", 
"FileNotFoundError: [Errno 2] No such file or directory"], "stdout": "", 
"stdout_lines": []}
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IB3CRX3IAM2GXMPISU3DG45UVDVKUW3Q/


[ovirt-users] Re: 4.4 bug? Seems 100% repeatable

2020-05-21 Thread dan . creed
Never mind... resolv.conf was picking up two IPv6 DNS servers because IPv6 was
set to auto; only IPv4 was set to static.
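
A quick way to spot this condition before deploying is to count the nameservers per address family. The sketch below only mirrors the wording of the error message (three or more nameservers with IPv4 and IPv6 mixed); it is not VDSM's actual check. The function names and sample addresses are made up for illustration.

```python
import ipaddress


def nameserver_families(resolv_text):
    """Return the IP version (4 or 6) of each 'nameserver' entry, in file order."""
    families = []
    for raw in resolv_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            # Drop an IPv6 zone id such as '%eth0' before parsing.
            address = parts[1].split("%", 1)[0]
            families.append(ipaddress.ip_address(address).version)
    return families


def mixed_with_three_or_more(resolv_text):
    """True when there are three or more nameservers and both families are present."""
    families = nameserver_families(resolv_text)
    return len(families) >= 3 and len(set(families)) > 1


# Hypothetical content: one static IPv4 server plus two autoconfigured IPv6 servers.
sample = "nameserver 192.0.2.1\nnameserver 2001:db8::1\nnameserver 2001:db8::2\n"
print(nameserver_families(sample), mixed_with_three_or_more(sample))
```

On a host this could be fed with the contents of /etc/resolv.conf as seen at deploy time, which may differ from what was configured statically.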
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/B3PUR5ZATA7FJJ2NQHCA2AYOW6BTDJ47/


[ovirt-users] 4.4 bug? Seems 100% repeatable

2020-05-21 Thread dan . creed
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "The host 
has been set in non_operational status, deployment errors: code 505: Host 
quigonn installation failed. Failed to configure management network on the 
host., code 1120: Failed to configure management network on host quigonn due to 
setup networks failure., code 9000: Failed to verify Power Management 
configuration for Host quigonn., code 10802: VDSM quigonn command 
HostSetupNetworksVDS failed: Internal JSON-RPC error: {'reason': 'Three or more 
nameservers are only supported when using either IPv4 or IPv6 nameservers but 
not both.'}, fix accordingly and re-deploy."}

This has me utterly confused: I only have one name server specified in
/etc/resolv.conf.

Not sure where it's getting this from.
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PEX4GCFJWTB3CCQKXJ6BN2MHL73NE5NA/


[ovirt-users] Re: oVirt 4.4.0 Release is now generally available

2020-05-21 Thread Derek Atkins
Nir,

Nir Soffer  writes:

> Why not open RFE to add the feature you need?

I did -- about 3-4 years ago.  SOME of them have been implemented, some
have been partially implemented, but I am still waiting for ovirt to
support the full VM startup functionality that I had in vmware-server
from like 2007 (or earlier).

Part of the issue here is that I suspect most ovirt users have multiple
hosts and therefore rarely have to worry about how host-system
maintenance affects the VMs, and probably live in data centers with
redundant power supplies, UPSes, and backup generators.

I, on the other hand, have a single system, so when I need to
perform any maintenance I need to take down everything, or if I have a
power outage that outlasts my UPS, or...  I want the VMs to come back up
automatically -- and in a particular order (e.g., I need my DNS and KDC
servers to come up before others).

I filed these RFEs during the 4.0 days, which is when I first started
using ovirt and put it into deployment.

> You can use the python SDK to do anything supported by oVirt API.
> Did you look here?
> https://github.com/oVirt/ovirt-engine-sdk/tree/master/sdk/examples

I have looked there, but I stopped reading after seeing "python".  ;)
Frankly I detest python.  I think it's an abomination.  There are so
many other, better languages out there and I don't understand why so
many people like it (and worse, force it down everyone else's throats).
But I'll step off my soap-box (and get off my lawn!)  lol.

Honestly, I already spent the time to build a tool to do what I need.  I
even had to update the tool going from 4.1 to 4.3 because some startup
assumptions changed.  I really don't want to spend the time again, time
I frankly don't have right now, to re-implement what I've already got.
It's easier for me to just stay put on 4.3.x.

Yes, I realize that in about 2 years or so I will need to do so.  I'll
worry about that then.

Of course, since the (partial?) functionality is only in 4.4, I really
have no way to test it to make sure it does what I need, to see what I'm
missing.  I don't have a testbed to play with it, just my one system.

Thanks,

-derek
-- 
   Derek Atkins 617-623-3745
   de...@ihtfp.com www.ihtfp.com
   Computer and Internet Security Consultant
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/X7LYSJ6M2YUUKSRT3H4A5RR4MUOTNYOS/


[ovirt-users] Ovirt 4.4 Migration assistance needed.

2020-05-21 Thread Strahil Nikolov via Users
Hello All,

I would like to ask for some assistance with the planning of the upgrade
to 4.4.

I have issues with OVN (it doesn't work at all), so I would like to
start fresh with the HE.

The plan so far (downtime is not an issue):

1. Reinstall the nodes one by one and rejoin them to the Gluster TSP
2. Wipe the HostedEngine's gluster volume
3. Deploy a fresh hosted engine
4. Import the storage domains (gluster) back into the engine and import the VMs

Do you see any issues with the plan?
Are any problems expected if the VMs have snapshots? What about the storage
domain version?

Thanks in advance.

Best Regards,
Strahil Nikolov
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QNPOH55AAAYOX5GX3EN5H5ZMOZHKYELI/


[ovirt-users] Re: Non storage nodes erronously included in quota calculations for HCI?

2020-05-21 Thread Strahil Nikolov via Users
On May 21, 2020 12:29:24 PM GMT+03:00, Strahil Nikolov via Users wrote:
>On May 20, 2020 5:12:05 PM GMT+03:00, tho...@hoberg.net wrote:
>>OK ;-)
>>
>>3 node HCI 2+1 data/arbiter
>>added 3 compute-only nodes via host install without HE support which
>>add no storage to the gluster (install still adds them as peers).
>>
>>With 2 compute-only nodes inactive/down I updated the third compute
>>node (no contributing bricks) and saw all VMs pausing and glusterd on
>>the HCI nodes "lost quorum on brick engine/vmstore/data" when it
>>rebooted to activate the new kernel.
>>
>>Had to launch additional compute-only node to let glusterd on HCI nodes
>>recover quorum.
>>Seems glusterd computes quorum based on total peers (6) not on
>>redundancy (2+1).
>>
>>With the gluster volumes down, running VMs remain paused according to
>>virsh, HE and UI aren't there, hosted-engine --vm-status reports "not
>>retrieved from storage"
>>List Archives:
>>https://lists.ovirt.org/archives/list/users@ovirt.org/message/F6QOGNZVPMCRAW4KP3MSMHOXSSRA4IMY/
>
>Hi Thomas,
>
>Quite strange.
>Log in to one of the gluster TSP nodes and provide some data:
>
>gluster volume list
>gluster pool list
>for i in $(gluster volume list); do gluster volume status $i; echo; gluster volume status $i; echo; echo; echo; done
>
>Best Regards,
>Strahil Nikolov
>List Archives:
>https://lists.ovirt.org/archives/list/users@ovirt.org/message/DPXTHW6WMAYKWJHPA2VPUF3BIUC7OKTE/

Yeah...
The for loop should use 'status' and 'info', so it should be something like:

gluster volume list
gluster pool list
for i in $(gluster volume list); do gluster volume status $i; echo; gluster volume info $i; echo; echo; echo; done
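
For anyone who wants to capture this output programmatically, the same sequence can be sketched in Python. The `gluster` subcommands are the ones shown above; the function names are made up, and the script assumes it runs as root on a node of the trusted storage pool.

```python
import subprocess


def diagnostic_commands(volumes):
    """Build the argv lists for the pool-wide and per-volume diagnostics."""
    commands = [["gluster", "volume", "list"], ["gluster", "pool", "list"]]
    for volume in volumes:
        commands.append(["gluster", "volume", "status", volume])
        commands.append(["gluster", "volume", "info", volume])
    return commands


def list_volumes():
    """Return the volume names reported by 'gluster volume list'."""
    result = subprocess.run(
        ["gluster", "volume", "list"], check=True, capture_output=True, text=True
    )
    return result.stdout.split()


def collect():
    """Run every diagnostic command and print its output."""
    for command in diagnostic_commands(list_volumes()):
        print("$", " ".join(command))
        # No check=True here: a failing 'status' call is itself useful information.
        result = subprocess.run(command, capture_output=True, text=True)
        print(result.stdout or result.stderr)
```

Calling `collect()` on a storage node prints the same information as the shell loop, with each command echoed before its output.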

Best Regards,
Strahil Nikolov
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/AVROC45QWHFDVRX7YSHHTWL2TCBPTL7X/


[ovirt-users] Re: Non storage nodes erronously included in quota calculations for HCI?

2020-05-21 Thread Strahil Nikolov via Users
On May 20, 2020 5:12:05 PM GMT+03:00, tho...@hoberg.net wrote:
>OK ;-)
>
>3 node HCI 2+1 data/arbiter
>added 3 compute-only nodes via host install without HE support which
>add no storage to the gluster (install still adds them as peers).
>
>With 2 compute-only nodes inactive/down I updated the third compute
>node (no contributing bricks) and saw all VMs pausing and glusterd on
>the HCI nodes "lost quorum on brick engine/vmstore/data" when it
>rebooted to activate the new kernel.
>
>Had to launch additional compute-only node to let glusterd on HCI nodes
>recover quorum.
>Seems glusterd computes quorum based on total peers (6) not on
>redundancy (2+1).
>
>With the gluster volumes down, running VMs remain paused according to
>virsh, HE and UI aren't there, hosted-engine --vm-status reports "not
>retrieved from storage"
>List Archives:
>https://lists.ovirt.org/archives/list/users@ovirt.org/message/F6QOGNZVPMCRAW4KP3MSMHOXSSRA4IMY/

Hi Thomas,

Quite strange.
Log in to one of the gluster TSP nodes and provide some data:

gluster volume list
gluster pool list
for i in $(gluster volume list); do gluster volume status $i; echo; gluster volume status $i; echo; echo; echo; done
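
The quorum observation in the quoted report (six peers in the pool, only three of them holding bricks) can be checked with simple arithmetic. The sketch below assumes that Gluster server-side quorum is enabled on the volumes and uses the default "more than half of the peers" rule; whether that matches this cluster is exactly what the output requested above should show. The function name is made up for illustration.

```python
def server_quorum_met(active_peers, total_peers):
    """More-than-half rule: quorum holds only if active peers exceed 50% of the pool."""
    return active_peers * 2 > total_peers


# Six peers in the pool; two compute-only nodes down and the third one rebooting.
print(server_quorum_met(3, 6))  # False: exactly half is not more than half
# One compute-only node brought back up.
print(server_quorum_met(4, 6))  # True
# The same storage nodes if the compute-only hosts were not peers at all.
print(server_quorum_met(2, 3))  # True
```

Under that assumption the three storage nodes alone are exactly half of the pool, which would explain why quorum only returned once a fourth peer came up.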

Best Regards,
Strahil Nikolov
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/DPXTHW6WMAYKWJHPA2VPUF3BIUC7OKTE/


[ovirt-users] Re: HostedEngine HA Engine Network Status

2020-05-21 Thread Asaf Rachmani
In addition to what Didi suggested, you can enable DEBUG level in order to
get more details in broker.log:
1. Edit /etc/ovirt-hosted-engine-ha/broker-log.conf
2. In [logger_root] section change the level parameter to level=DEBUG
3. Restart the service: systemctl restart ovirt-ha-broker
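
Step 2 can also be scripted. This is a minimal sketch, assuming the file follows the usual Python logging INI layout with a `[logger_root]` section as described above; the function name and the sample content are made up for illustration. Rewriting the file through `configparser` drops comments and normalises spacing, so keep a backup of the original.

```python
import configparser
import io


def set_root_level(conf_text, level="DEBUG"):
    """Return the config text with [logger_root] level set to the given value."""
    # RawConfigParser avoids interpolation, so '%(asctime)s'-style formats stay untouched.
    parser = configparser.RawConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    parser.set("logger_root", "level", level)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical minimal content, not the file shipped with the package.
sample = "[loggers]\nkeys=root\n\n[logger_root]\nlevel=INFO\nhandlers=syslog\n"
print(set_root_level(sample))
```

After writing the result back to /etc/ovirt-hosted-engine-ha/broker-log.conf, the restart in step 3 is still required.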

Regards,
Asaf

On Thu, May 21, 2020 at 10:01 AM Yedidyah Bar David  wrote:

> On Thu, May 21, 2020 at 8:55 AM Joseph Goldman 
> wrote:
> >
> > Hi List,
> >
> >   Running a 3 node setup for a client, I'm constantly having the
> > HostedEngine move itself around; whatever node it's on ends up penalizing
> > its score so low that it forces a migration to the other node.
> >
> >   Looking at /var/log/ovirt-hosted-engine-ha/agent.log shows a decent
> > amount of:
> >
> > MainThread::INFO::2020-05-21
> > 15:47:54,742::states::135::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
> > Penalizing score by 319 due to network status
> >
> >   What I want to know is how do I get more debug out of this to know
> > what network status it's concerned about, so I can go about stabilising it.
>
> You can see some more info in broker.log in same log dir. Search for
> "network".
>
> If it's not enough to understand why it penalizes, you might want to add some
> logging to the code, which is:
>
>
> https://github.com/oVirt/ovirt-hosted-engine-ha/blob/master/ovirt_hosted_engine_ha/broker/submonitors/network.py
>
> or, on your machine, in
>
>
> /usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/submonitors/network.py
>
> (or python2.7, for <= 4.3).
>
> See also the git log ("History" button in github) for recent changes, including
> adding logging to the dns tester.
>
> >
> >   The system is heavily monitored with ping checks, never drops link and
> > never drops ICMP. None of its VMs falter accessing shared NFS space for
> > disk storage so I'm not sure what the concern is. The node will
> > literally over time penalise itself down to ~2000 and then HA agent will
> > want it to swap nodes. It's not necessarily a bad thing but generates a
> > heap of status emails multiple times a day which is just garbage - and
> > makes the HE unavailable sometimes when mid-admin task.
>
> Understood.
>
> Which network tester do you use? If it's dns (IIRC the default now), perhaps
> it's a problem with your dns server(s).
>
> Good luck and best regards,
> --
> Didi
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/JHZPC7IHU3LRPNWXMKKEIRHB6ZJ2V2I3/
>
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/QDI6XQTGEPTXUGSCVZ7XTRI2WPUIZIF7/


[ovirt-users] Re: HostedEngine HA Engine Network Status

2020-05-21 Thread Yedidyah Bar David
On Thu, May 21, 2020 at 8:55 AM Joseph Goldman  wrote:
>
> Hi List,
>
>   Running a 3 node setup for a client, I'm constantly having the
> HostedEngine move itself around; whatever node it's on ends up penalizing
> its score so low that it forces a migration to the other node.
>
>   Looking at /var/log/ovirt-hosted-engine-ha/agent.log shows a decent
> amount of:
>
> MainThread::INFO::2020-05-21
> 15:47:54,742::states::135::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
> Penalizing score by 319 due to network status
>
>   What I want to know is how do I get more debug out of this to know
> what network status it's concerned about, so I can go about stabilising it.

You can see some more info in broker.log in same log dir. Search for "network".
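
To line up the penalties in agent.log with the "network" entries in broker.log, it can help to pull the timestamps and amounts out of the agent log first. The sketch below is based only on the single log line quoted above, so the regular expression is an assumption about the format and the function name is made up.

```python
import re

# Matches e.g. '...::2020-05-21 15:47:54,742::...Penalizing score by 319 due to network status'
PENALTY_RE = re.compile(
    r"::(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+::"
    r".*Penalizing score by (?P<points>\d+) due to (?P<reason>.+)$"
)


def penalties(lines):
    """Return (timestamp, points, reason) for every score-penalty line."""
    found = []
    for line in lines:
        match = PENALTY_RE.search(line)
        if match:
            found.append((match.group("ts"), int(match.group("points")), match.group("reason").strip()))
    return found


sample = (
    "MainThread::INFO::2020-05-21 15:47:54,742::states::135::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) "
    "Penalizing score by 319 due to network status"
)
print(penalties([sample]))  # [('2020-05-21 15:47:54', 319, 'network status')]
```

Feeding it the lines of /var/log/ovirt-hosted-engine-ha/agent.log gives a list of times to look for in broker.log.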

If it's not enough to understand why it penalizes, you might want to add some
logging to the code, which is:

https://github.com/oVirt/ovirt-hosted-engine-ha/blob/master/ovirt_hosted_engine_ha/broker/submonitors/network.py

or, on your machine, in

/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/submonitors/network.py

(or python2.7, for <= 4.3).

See also the git log ("History" button in github) for recent changes, including
adding logging to the dns tester.

>
>   The system is heavily monitored with ping checks, never drops link and
> never drops ICMP. None of its VMs falter accessing shared NFS space for
> disk storage so I'm not sure what the concern is. The node will
> literally over time penalise itself down to ~2000 and then HA agent will
> want it to swap nodes. It's not necessarily a bad thing but generates a
> heap of status emails multiple times a day which is just garbage - and
> makes the HE unavailable sometimes when mid-admin task.

Understood.

Which network tester do you use? If it's dns (IIRC the default now), perhaps
it's a problem with your dns server(s).

Good luck and best regards,
-- 
Didi
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/JHZPC7IHU3LRPNWXMKKEIRHB6ZJ2V2I3/