[ovirt-users] Re: Weird problem starting VMs in oVirt-4.4
Krutika Dhananjay wrote:

> Yes, so the bug has been fixed upstream and the backports to release-7 and release-8 of gluster are pending merge. The fix should be available in the next .x release of gluster-7 and 8. Until then, like Nir suggested, please turn off performance.stat-prefetch on your volumes.

It looks like I ran into exactly this bug when I wrote this: https://lists.ovirt.org/archives/list/users@ovirt.org/message/3BQMCIGCEOLIOV3LSW47GVPKSMOOK7IL/

During my tests, the deployment went through on the third attempt, only for me to discover that the problem persists: sure enough, it came back to haunt me when I rebooted the hosted engine. I'm not entirely sure I fully understand the problem. What I did, of course, was this:

# gluster volume set engine performance.stat-prefetch off

It doesn't help with my currently deployed HE. It gets stuck at the graphical BIOS screen, which I can interact with using "hosted-engine --console", but the best outcome there is to "Reset", which turns the whole VM off.

Assuming something got lost while the stat-prefetch setting was still turned on: Is there any way to fix this? Will a redeployment surely fix it?

Bonus question: I'm using oVirt Node for the VM and Gluster hosts. Will a fix be coming by way of package updates for this in the foreseeable future?

Thank you
Oliver
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/EFDKZOTY6I4QBGTXB275J2XCXOMOTBI6/
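The workaround in this message was applied to a single volume ("engine"), while the advice was to turn the option off "on your volumes". As a rough sketch only, the following Python helper prints one "gluster volume set" command line per volume so that none is forgotten. The helper name and the second volume name "data" are assumptions for illustration; only "engine" appears in this thread.

```python
import shlex


def stat_prefetch_off_commands(volumes):
    """Build one 'gluster volume set' command line per volume name."""
    return [
        shlex.join(
            ["gluster", "volume", "set", vol, "performance.stat-prefetch", "off"]
        )
        for vol in volumes
    ]


if __name__ == "__main__":
    # "engine" is the volume from this thread; "data" is a hypothetical second volume.
    for cmd in stat_prefetch_off_commands(["engine", "data"]):
        print(cmd)
```

The script only prints the commands; running them on a Gluster host is left to the administrator.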
[ovirt-users] Re: oVirt 4.4.0 HE deployment on GlusterFS fails during health check
Hi,

> Your gluster mount option is not correct.
> You need 'backup-volfile-servers=storagehost2:storagehost3' (without the
> volume name as they all have that volume).

yes, of course. I'm sorry, but the appended volume name was a mistake I made in the email and not during deployment, where I only specified the FQDNs without the volname.

As mentioned, the mount generally seems to work, as data is written during deployment. It fails later during the health check :-(

Best regards
Oliver

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/SP7ZNIIXGHQ73452J4D6FYWQT2DQIKHR/
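To make the corrected option format concrete, here is a small Python sketch that builds the "backup-volfile-servers" value from a list of host names and rejects entries that carry a volume path. The function name is hypothetical, and the check is deliberately simple (it assumes plain host names or FQDNs, as used in this thread, so it would also reject IPv6 literals).

```python
def backup_volfile_servers(hosts):
    """Join backup host names into a 'backup-volfile-servers=' mount option.

    Entries must be bare host names: no ':/volume' suffix, as pointed out above.
    """
    for host in hosts:
        if "/" in host or ":" in host:
            raise ValueError(f"expected a bare host name, got {host!r}")
    return "backup-volfile-servers=" + ":".join(hosts)


if __name__ == "__main__":
    print(backup_volfile_servers(["storagehost2", "storagehost3"]))
```

Passing "storagehost2:/engine" raises a ValueError, which is the mistake discussed in this message.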
[ovirt-users] oVirt 4.4.0 HE deployment on GlusterFS fails during health check
Hi,

I have the following two components:

1.) A freshly installed VM host (oVirt Node 4.4.0 release ISO)
2.) 3 storage hosts, also freshly installed from oVirt Node 4.4.0 release ISO

The storage hosts have been successfully installed with Gluster (through Cockpit). They have two volumes, both of which I can mount and read/write from a client.

On the VM host, I ran "hosted-engine --deploy" (no backups imported). When prompted for storage, I answered "glusterfs" and specified "storagehost1:/engine" as storage for the HE deployment. For mount options, I specified "backup-volfile-servers=storagehost2:/engine:storagehost3:/engine" (not the real hostnames, but all of them are resolvable via internal DNS).

Everything seems to work fine; I also see the "engine" volume become populated with data. At some point I could ping the HE and log in to it via SSH. When the setup proceeded to the health check, it failed and the whole process was aborted :-(

"hosted-engine --vm-status" reported "failed liveliness check" while the HE was reachable via SSH. At some point the engine went down and, to my surprise, shows a grub prompt after the restart when doing a "hosted-engine --console".

[ INFO ] TASK [ovirt.hosted_engine_setup : Check engine VM health]
[ ERROR ] fatal: [localhost]: FAILED!
=> {"attempts": 180, "changed": true, "cmd": ["hosted-engine", "--vm-status", "--json"], "delta": "0:00:00.160595", "end": "2020-06-12 17:50:05.675774", "rc": 0, "start": "2020-06-12 17:50:05.515179", "stderr": "", "stderr_lines": [], "stdout": "{\"1\": {\"host-id\": 1, \"host-ts\": 11528, \"score\": 3400, \"engine-status\": {\"vm\": \"up\", \"health\": \"bad\", \"detail\": \"Powering down\", \"reason\": \"failed liveliness check\"}, \"hostname\": \"vmhost\", \"maintenance\": false, \"stopped\": false, \"crc32\": \"2c447835\", \"conf_on_shared_storage\": true, \"local_conf_timestamp\": 11528, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=11528 (Fri Jun 12 17:49:57 2020)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=11528 (Fri Jun 12 17:49:57 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStop\\nstopped=False\\ntimeout=Thu Jan 1 04:12:48 1970\\n\", \"live-data\": true}, \"global_maintenance\": false}", "stdout_lines": ["{\"1\": {\"host-id\": 1, \"host-ts\": 11528, \"score\": 3400, \"engine-status\": {\"vm\": \"up\", \"health\": \"bad\", \"detail\": \"Powering down\", \"reason\": \"failed liveliness check\"}, \"hostname\": \"vmhost\", \"maintenance\": false, \"stopped\": false, \"crc32\": \"2c447835\", \"conf_on_shared_storage\": true, \"local_conf_timestamp\": 11528, \"extra\": \"metadata_parse_version=1\\nmetadata_feature_version=1\\ntimestamp=11528 (Fri Jun 12 17:49:57 2020)\\nhost-id=1\\nscore=3400\\nvm_conf_refresh_time=11528 (Fri Jun 12 17:49:57 2020)\\nconf_on_shared_storage=True\\nmaintenance=False\\nstate=EngineStop\\nstopped=False\\ntimeout=Thu Jan 1 04:12:48 1970\\n\", \"live-data\": true}, \"global_maintenance\": false}"]} A second attempt failed at exactly the same stage. 
I can see the following in the setup log:

ovirt-hosted-engine-setup-20200612151212-j9zwd2.log:
2020-06-12 17:33:18,314+0200 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:103 {'msg': 'non-zero return code', 'cmd': ['hosted-engine', '--reinitialize-lockspace', '--force'], 'stdout': '', 'stderr': 'Traceback (most recent call last):\n File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main\n "__main__", mod_spec)\n File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code\n exec(code, run_globals)\n File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/reinitialize_lockspace.py", line 30, in \n ha_cli.reset_lockspace(force)\n File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/client/client.py", line 286, in reset_lockspace\n stats = broker.get_stats_from_storage()\n File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 148, in get_stats_from_storage\n result = self._proxy.get_stats()\n File "/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in __call__\n return self.__send(self.__name, args)\n File "/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request\n verbose=self.__verbose\n File "/usr/lib64/python3.6/xmlrpc/client.py", line 1154, in request\n return self.single_request(host, handler, request_body, verbose)\n File "/usr/lib64/python3.6/xmlrpc/client.py", line 1166, in single_request\n http_conn = self.send_request(host, handler, request_body, verbose)\n File "/usr/lib64/python3.6/xmlrpc/client.py", line 1279, in send_request\n self.send_content(connection, request_body)\n File "/usr/lib64/python3.6/xmlrpc/client.py", line 1309, in send_content\n connection.endheaders(request_body)\n File "/usr/lib64/python3.6/http/client.py", line 1249, in endheaders\n self._send_output(message_body,
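The health-check output quoted earlier in this message is hard to read as a single escaped JSON string. As a minimal sketch, the following Python function pulls the VM state, health and reason per host out of "hosted-engine --vm-status --json" output. It assumes the structure seen in this thread (per-host entries are objects with an "engine-status" object, next to top-level flags such as "global_maintenance"); the function name and the shortened sample are illustrative only.

```python
import json


def engine_health(vm_status_json):
    """Map host key -> (vm state, health, reason) from the --json status output."""
    status = json.loads(vm_status_json)
    result = {}
    for key, host in status.items():
        if not isinstance(host, dict):
            # Skip top-level flags such as "global_maintenance".
            continue
        engine = host.get("engine-status", {})
        result[key] = (engine.get("vm"), engine.get("health"), engine.get("reason"))
    return result


if __name__ == "__main__":
    # Shortened sample based on the output quoted in this thread.
    sample = (
        '{"1": {"host-id": 1, "engine-status": {"vm": "up", "health": "bad", '
        '"detail": "Powering down", "reason": "failed liveliness check"}, '
        '"hostname": "vmhost"}, "global_maintenance": false}'
    )
    print(engine_health(sample))
```

For the sample, the entry for host "1" reports the VM as "up" with health "bad" and reason "failed liveliness check", which matches what the deployment saw.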
[ovirt-users] Re: oVirt 4.4: Self-hosted engine deployment fails with backup restore from 4.3 engine
Hi,

> I think I know (it's hard to tell without more logs, but anyway): It's because your PKI was expired and thus renewed. If you used the command line to restore/deploy, you were also asked:
>
> 'Renew engine CA on restore if needed? Please notice ' 'that if you choose Yes, all hosts will have to be ' 'later manually reinstalled from the engine. ' '(@VALUES@)[@DEFAULT@]: '
>
> and probably replied Yes. You have two options:
>
> 1. Try again, and reply No.
> 2. Run first engine-setup (can add --offline to prevent it from upgrading) on your old engine. You should be prompted there, and reply Yes, and then take a backup after it finishes and try again to restore with that backup.
>
> In any case, that's a b

your guess was right, I think (btw: I checked the ca.pem in the old HE - this one is valid until 2028). I took the easy way and replied with "No". I will open a bug (my first one :-)). Is this one for the ovirt-hosted-engine-setup category?

Anyway, setup ran much farther but still did not complete. It now fails after this error:

[ ERROR ] ovirtsdk4.AuthError: Error during SSO authentication server_error : PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
[ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed": false, "msg": "Error during SSO authentication server_error : PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target"}

I can see this in /var/log/engine.log in the HE.
2020-05-27 16:10:43,695+02 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-8) [] OAuthException server_error: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
2020-05-27 16:10:53,962+02 INFO [org.ovirt.engine.extension.aaa.jdbc.core.Authentication] (default task-8) [] locking user: admin due to interval failures
2020-05-27 16:10:58,956+02 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-8) [] OAuthException server_error: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
2020-05-27 16:11:09,222+02 INFO [org.ovirt.engine.extension.aaa.jdbc.core.Authentication] (default task-8) [] locking user: admin due to interval failures
2020-05-27 16:11:14,217+02 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-8) [] OAuthException server_error: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
2020-05-27 16:11:24,484+02 INFO [org.ovirt.engine.extension.aaa.jdbc.core.Authentication] (default task-8) [] locking user: admin due to interval failures
2020-05-27 16:11:29,480+02 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-8) [] OAuthException server_error: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

I tried to dig around a bit more in /var/log of the HE to get more details but can't seem to find anything there.

Thanks in advance!
Best regards
Oli

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/PXPI2Z65JFXR7VCWNVSIPOAIEC4EZX3I/
[ovirt-users] Re: oVirt 4.4: Self-hosted engine deployment fails with backup restore from 4.3 engine
Hi,

Yedidyah Bar David wrote:

> In any case (perhaps not relevant to you right now, if indeed engine-setup succeeded), usually the engine vm is left running at the end of a failed deploy. If it's still the local vm, you can find its IP address by searching the ansible logs for local_vm_ip, then you can ssh to it from the host.
>
> For fixing the "empty engine-logs dirs", now pushed this: https://github.com/oVirt/ovirt-ansible-hosted-engine-setup/pull/325
> Didn't test yet, it's just a guess.

I followed up on your remark that the engine may indeed be running. And it is, sorry for not seeing this earlier. I was thus able to take a look in /var/log/ovirt-engine/setup in the HE VM and found the following error (I found a couple more "suspicious" lines, but this one sticks out).

2020-05-27 00:17:09,660+0200 DEBUG otopi.context context._executeMethod:145 method exception
Traceback (most recent call last):
  File "/usr/lib64/python3.6/site-packages/M2Crypto/BIO.py", line 279, in openfile
    f = open(filename, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/etc/pki/ovirt-engine/qemu-ca.pem'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/pki/ca.py", line 699, in _miscUpgrade
    if self._expired(self._x509_load_cert(ca_file)):
  File "/usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/pki/ca.py", line 94, in _x509_load_cert
    res = X509.load_cert(f)
  File "/usr/lib64/python3.6/site-packages/M2Crypto/X509.py", line 802, in load_cert
    with BIO.openfile(file) as bio:
  File "/usr/lib64/python3.6/site-packages/M2Crypto/BIO.py", line 281, in openfile
    raise BIOError(ex.args)
M2Crypto.BIO.BIOError: (2, 'No such file or directory')
2020-05-27 00:17:09,663+0200 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Misc configuration': (2, 'No such file or directory')

Best regards
Oli

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/IU2N3PECOV6VFGBWKXMHXDSEKCTVNZTB/
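The hint about searching the ansible logs for local_vm_ip can be sketched in a few lines of Python. The exact layout of those log lines is not shown in this thread, so the sketch makes a loud assumption: the IPv4 address appears on the same line as the string "local_vm_ip". The function name and the sample line are invented for illustration.

```python
import re

# Loose IPv4 pattern; good enough for picking addresses out of log text.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_local_vm_ips(log_text):
    """Return unique IPv4 addresses found on lines that mention local_vm_ip."""
    ips = []
    for line in log_text.splitlines():
        if "local_vm_ip" not in line:
            continue
        for ip in IPV4.findall(line):
            if ip not in ips:
                ips.append(ip)
    return ips


if __name__ == "__main__":
    # Invented sample line; the real log format may differ.
    sample = "2020-05-27 00:00:00,000+0200 DEBUG var changed: local_vm_ip = 192.0.2.10"
    print(find_local_vm_ips(sample))
```

If the address is logged on a different line than the variable name, the matching would need to be widened to the following lines.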
[ovirt-users] Re: oVirt 4.4: Self-hosted engine deployment fails with backup restore from 4.3 engine
Hi there,

> You should also see one or more ERROR messages, can you check/post them?

There is one error message that immediately follows, if that helps:

2020-05-27 00:17:12,397+0200 ERROR otopi.context context._executeMethod:154 Failed to execute stage 'Closing up': Failed executing ansible-playbook
2020-05-27 00:17:12,397+0200 DEBUG otopi.context context.dumpEnvironment:765 ENVIRONMENT DUMP - BEGIN
2020-05-27 00:17:12,398+0200 DEBUG otopi.context context.dumpEnvironment:775 ENV BASE/error=bool:'True'
2020-05-27 00:17:12,398+0200 DEBUG otopi.context context.dumpEnvironment:775 ENV BASE/exceptionInfo=list:'[('RuntimeError'>, RuntimeError('Failed executing ansible-playbook',), )]'
2020-05-27 00:17:12,398+0200 DEBUG otopi.context context.dumpEnvironment:779 ENVIRONMENT DUMP - END
2020-05-27 00:17:12,398+0200 INFO otopi.context context.runSequence:616 Stage: Clean up
2020-05-27 00:17:12,399+0200 DEBUG otopi.context context.runSequence:620 STAGE cleanup

Other than that, there is nothing that looks like an error (or contains the word "error").

> Also, if possible, please try to check/share the engine-setup log. If you can access the engine VM, it's there, in: /var/log/ovirt-engine/setup

The engine is not running after that; "hosted-engine --vm-status" gives me the following error:

It seems like a previous attempt to deploy hosted-engine failed or it's still in progress. Please clean it up before trying again

> Otherwise, you might find it in the host doing the deployment, in: /var/log/ovirt-hosted-engine-setup/engine-logs*

I have 4 directories like this (from my failed deployment attempts ;-)), but all of them are empty. The last attempt was with a new backup, just in case. The production oVirt is 4.3.9; the host I'm installing from is a clean install from the oVirt Node 4.4.0 release ISO with the latest available package upgrades.

Best regards
Oliver

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/3KXFKTXLHR3DQH6JHEGBIBURCLYGF3GB/
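Checking a long setup log for ERROR entries, as requested in this message, can be done with a short filter. The following Python sketch assumes the otopi line layout visible in the excerpts of this thread (timestamp, level, logger, location, message); the function name is hypothetical, and continuation lines such as tracebacks are simply ignored.

```python
import re

# Layout taken from the excerpts in this thread:
# "<date> <time>,<ms><tz> <LEVEL> <logger> <location> <message>"
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{4}) (\w+) (\S+) (\S+) (.*)$"
)


def error_entries(log_text):
    """Return (timestamp, location, message) for every ERROR-level line."""
    entries = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group(2) == "ERROR":
            entries.append((match.group(1), match.group(4), match.group(5)))
    return entries


if __name__ == "__main__":
    sample = (
        "2020-05-27 00:17:12,397+0200 ERROR otopi.context context._executeMethod:154 "
        "Failed to execute stage 'Closing up': Failed executing ansible-playbook\n"
        "2020-05-27 00:17:12,398+0200 INFO otopi.context context.runSequence:616 "
        "Stage: Clean up"
    )
    for entry in error_entries(sample):
        print(entry)
```

Matching on the level field rather than on the word "error" avoids hits in DEBUG lines that merely mention an error, such as the environment dump above.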
[ovirt-users] oVirt 4.4: Self-hosted engine deployment fails with backup restore from 4.3 engine
Hi there,

I'm a bit puzzled about the possible upgrade paths from a 4.3 cluster to version 4.4 in a self-hosted engine environment. My idea was: Set up a new host with a clean oVirt Node 4.4 installation, then deploy the hosted engine on this with a restored backup from the production cluster and go from there. This, however, fails with the following error:

2020-05-27 00:17:08,886+0200 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils._process_output:103 {'msg': 'non-zero return code', 'cmd': ['engine-setup', '--accept-defaults', '--config-append=/root/ovirt-engine-answers'], 'stdout': "[ INFO ] Stage: Initializing\n[ INFO ] Stage: Environment setup\n Configuration files: /etc/ovirt-engine-setup.conf.d/10-packaging-jboss.conf, /etc/ovirt-engine-setup.conf.d/10-packaging.conf, /etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf, /root/ovirt-engine-answers\n Log file: /var/log/ovirt-engine/setup/ovirt-engine-setup-20200527001657-fyeueu.log\n Version: otopi-1.9.1 (otopi-1.9.1-1.el8)\n[ INFO ] DNF Downloading 1 files, 0.00KB\n[ INFO ] DNF Downloaded CentOS-8 - AppStream\n[ INFO ] DNF Downloading 1 files, 0.00KB\n[ INFO ] DNF Downloaded CentOS-8 - Base\n[ INFO ] DNF Downloading 1 files, 0.00KB\n [...]

... answers from backup config follow [...]

2020-05-27 00:17:12,396+0200 DEBUG otopi.context context._executeMethod:145 method exception
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/gr-he-ansiblesetup/core/misc.py", line 403, in _closeup
    r = ah.run()
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/ansible_utils.py", line 229, in run
    raise RuntimeError(_('Failed executing ansible-playbook'))

Is this approach (restoring from 4.3) generally supposed to work? If not, what is the appropriate upgrade path?

Thank you!

Regards
Oli

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/CY6UZZKQEQJVHA73W3ODHDY3D3VI3WHE/