On Sun, Nov 18, 2018 at 1:13 PM Alex K <[email protected]> wrote:
> On Sun, Nov 18, 2018 at 11:30 AM Alex K <[email protected]> wrote:
>
>> On Sun, Nov 18, 2018 at 8:53 AM Alex K <[email protected]> wrote:
>>
>>> On Sat, Nov 17, 2018, 19:32 Gianluca Cecchi <[email protected]>
>>> wrote:
>>>
>>>> On Sat, Nov 17, 2018 at 14:07, Alex K <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I had a setup with oVirt 4.2.0 where at some point the engine stopped
>>>>> responding, due to some split-brain issues.
>>>>>
>>>>> Since I was not able to resolve the split brain, I proceeded to redeploy
>>>>> the engine.
>>>>>
>>>>> The steps I followed:
>>>>> 1. upgraded the servers (yum update)
>>>>> 2. ran ovirt-hosted-engine-cleanup
>>>>> 3. deployed the engine (now 4.2.7)
>>>>>
>>>>> The deploy was successful and I was able to add a new data domain.
>>>>> The issue is that at this point I would expect the engine storage
>>>>> domain and VM to be imported automatically, but they are not. In the
>>>>> HA agent logs on the server I see:
>>>>>
>>>>> MainThread::INFO::2018-11-17
>>>>> 12:55:51,856::states::444::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
>>>>> Engine vm running on localhost
>>>>> MainThread::WARNING::2018-11-17
>>>>> 12:55:52,145::ovf_store::140::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan)
>>>>> Unable to find OVF_STORE
>>>>> MainThread::ERROR::2018-11-17
>>>>> 12:55:52,146::config_ovf::84::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm::(_get_vm_conf_content_from_ovf_store)
>>>>> Unable to identify the OVF_STORE volume, falling back to initial vm.conf.
>>>>> Please ensure you already added your first data domain for regular VMs
>>>>> MainThread::INFO::2018-11-17
>>>>> 12:55:52,246::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
>>>>> Current state EngineUp (score: 3400)
>>>>>
>>>>> While in engine.log on the engine VM I see:
>>>>>
>>>>> 2018-11-17 12:47:14,748Z INFO
>>>>> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
>>>>> (EE-ManagedThreadFactory-engineScheduled-Thread-51) [] VM
>>>>> '88dacb07-45f1-4bc1-80a0-9434d530eaaa' was discovered as 'Up' on VDS
>>>>> '6eff2018-516d-4af1-807d-ecc31d024f4d'(v0.maya)
>>>>> 2018-11-17 12:47:14,773Z INFO
>>>>> [org.ovirt.engine.core.bll.AddUnmanagedVmsCommand]
>>>>> (EE-ManagedThreadFactory-engineScheduled-Thread-51) [51c593c1] Running
>>>>> command: AddUnmanagedVmsCommand internal: true.
>>>>> 2018-11-17 12:47:14,775Z INFO
>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand]
>>>>> (EE-ManagedThreadFactory-engineScheduled-Thread-51) [51c593c1] START,
>>>>> DumpXmlsVDSCommand(HostName = v0.maya,
>>>>> Params:{hostId='6eff2018-516d-4af1-807d-ecc31d024f4d',
>>>>> vmIds='[88dacb07-45f1-4bc1-80a0-9434d530eaaa]'}), log id: 44bb4e0a
>>>>> 2018-11-17 12:47:14,779Z INFO
>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand]
>>>>> (EE-ManagedThreadFactory-engineScheduled-Thread-51) [51c593c1] FINISH,
>>>>> DumpXmlsVDSCommand, return: {88dacb07-45f1-4bc1-80a0-9434d530eaaa=<domain
>>>>> type='kvm' id='7'>
>>>>> ...
>>>>> <some kind of XML>
>>>>> ...
>>>>> 2018-11-17 12:47:14,793Z WARN
>>>>> [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder]
>>>>> (EE-ManagedThreadFactory-engineScheduled-Thread-51) [51c593c1] null
>>>>> architecture type, replacing with x86_64, VM [HostedEngine]
>>>>>
>>>>> Something is preventing the engine from being imported.
>>>>> I tried running hosted-engine --reinitialize-lockspace, since I was
>>>>> getting some lockspace errors, but there was no change.
>>>>>
>>>>> Any idea what could be causing this?
>>>>> I have little time left, since the site is in production. Any idea
>>>>> is appreciated.
>>>>>
>>>>> Thanks,
>>>>> Alex
>>>>>
>>>>> _______________________________________________
>>>>> Users mailing list -- [email protected]
>>>>> To unsubscribe send an email to [email protected]
>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>>> oVirt Code of Conduct:
>>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>>> List Archives:
>>>>> https://lists.ovirt.org/archives/list/[email protected]/message/M4DXHOUQ45QY77P5VVG4AZKYYYGHBFOT/
>>>>
>>>> In step 3, how did you deploy the engine?
>>>> I had the same problem some days ago, and it was due to a bug when using
>>>> the command line and excluding ansible (option --no-ansible).
>>>> I solved it by redeploying with the default, which uses ansible.
>>>>
>>> I deployed with the --no-ansible flag since the ansible way was giving me
>>> an error (something with localhost). I can try ansible again to check what
>>> the error was.
>>>
>> The error I am getting when trying to deploy with ansible is the
>> following:
>>
>> 2018-11-17 09:03:50,378+0000 DEBUG
>> otopi.ovirt_hosted_engine_setup.ansible_utils
>> ansible_utils._process_output:94 hostname_resolution_output:
>> {'stderr_lines': [], u'changed': True, u'end': u'2018-11-17
>> 09:03:48.572863', u'stdout': u'', u'cmd': u'getent ahostsv4 v0.maya | grep
>> v0.maya', u'failed': True, u'delta': u'0:00:00.005712', u'stderr': u'',
>> u'rc': 1, u'msg': u'non-zero return code', 'stdout_lines': [], u'start':
>> u'2018-11-17 09:03:48.567151'}
>>
>> 2018-11-17 09:03:51,280+0000 INFO
>> otopi.ovirt_hosted_engine_setup.ansible_utils
>> ansible_utils._process_output:100 TASK [Check address resolution]
>>
>> 2018-11-17 09:03:52,082+0000 DEBUG
>> otopi.ovirt_hosted_engine_setup.ansible_utils
>> ansible_utils._process_output:94 {u'msg': u'Unable to resolve address\n',
>> u'changed': False, u'_ansible_no_log': False}
>>
>> 2018-11-17 09:03:52,182+0000 ERROR
>> otopi.ovirt_hosted_engine_setup.ansible_utils
>> ansible_utils._process_output:98 fatal: [localhost]: FAILED!
>> => {"changed": false, "msg": "Unable to resolve address\n"}
>>
>> 2018-11-17 09:03:52,784+0000 DEBUG
>> otopi.ovirt_hosted_engine_setup.ansible_utils
>> ansible_utils._process_output:94 PLAY RECAP [localhost] : ok: 16 changed: 3
>> unreachable: 0 skipped: 4 failed: 1
>>
>> 2018-11-17 09:03:52,884+0000 DEBUG
>> otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:180
>> ansible-playbook rc: 2
>>
>> 2018-11-17 09:03:52,884+0000 DEBUG
>> otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:187
>> ansible-playbook stdout:
>>
>> --
>>
>> File
>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/ansible_utils.py",
>> line 194, in run
>>
>> raise RuntimeError(_('Failed executing ansible-playbook'))
>>
>> RuntimeError: Failed executing ansible-playbook
>>
>> 2018-11-17 09:03:52,886+0000 ERROR otopi.context
>> context._executeMethod:152 Failed to execute stage 'Closing up': Failed
>> executing ansible-playbook
>>
>> 2018-11-17 09:03:52,887+0000 DEBUG otopi.context
>> context.dumpEnvironment:859 ENVIRONMENT DUMP - BEGIN
>>
>> 2018-11-17 09:03:52,887+0000 DEBUG otopi.context
>> context.dumpEnvironment:869 ENV BASE/error=bool:'True'
>>
>> 2018-11-17 09:03:52,887+0000 DEBUG otopi.context
>> context.dumpEnvironment:869 ENV BASE/exceptionInfo=list:'[(<type
>> 'exceptions.RuntimeError'>, RuntimeError('Failed executing
>> ansible-playbook',), <traceback object at 0x7fefb0248f38>)]'
>>
>> How can I overcome this? I recall seeing this on past attempts as well,
>> when I was only able to proceed the traditional Python (--no-ansible)
>> way.
>>
> I was able to overcome this by amending the /etc/hosts file on the server,
> which had some erroneous entries.
> The deployment was then able to proceed and the engine is up, though right
> at the end it gave the following error:
>
> [ INFO ] TASK [Wait for the local bootstrap VM to be down at engine eyes]
> [ ERROR ] fatal: [localhost]: FAILED!
> => {"ansible_facts": {"ovirt_vms":
> [{"affinity_labels": [], "applications": [], "bios": {"boot_menu":
> {"enabled": false}}, "cdroms": [], "cluster": {"href":
> "/ovirt-engine/api/clusters/a407b02c-eb1a-11e8-a5a5-00163e445490", "id":
> "a407b02c-eb1a-11e8-a5a5-00163e445490"}, "comment": "", "cpu":
> {"architecture": "x86_64", "topology": {"cores": 1, "sockets": 4,
> "threads": 1}}, "cpu_profile": {"href":
> "/ovirt-engine/api/cpuprofiles/58ca604e-01a7-003f-01de-000000000250", "id":
> "58ca604e-01a7-003f-01de-000000000250"}, "cpu_shares": 0, "creation_time":
> "2018-11-18 10:17:45.351000+00:00", "delete_protected": false,
> "description": "", "disk_attachments": [], "display": {"address":
> "127.0.0.1", "allow_override": false, "copy_paste_enabled": true,
> "disconnect_action": "LOCK_SCREEN", "file_transfer_enabled": true,
> "monitors": 1, "port": 5900, "single_qxl_pci": false, "smartcard_enabled":
> false, "type": "vnc"}, "fqdn": "engine.maya", "graphics_consoles": [],
> "guest_operating_system": {"architecture": "x86_64", "codename": "",
> "distribution": "CentOS Linux", "family": "Linux", "kernel": {"version":
> {"build": 0, "full_version": "3.10.0-862.14.4.el7.x86_64", "major": 3,
> "minor": 10, "revision": 862}}, "version": {"full_version": "7", "major":
> 7}}, "guest_time_zone": {"name": "UTC", "utc_offset": "+00:00"},
> "high_availability": {"enabled": false, "priority": 0}, "host": {"href":
> "/ovirt-engine/api/hosts/4250ef49-969c-4cd5-8a4f-30b7755a7d36", "id":
> "4250ef49-969c-4cd5-8a4f-30b7755a7d36"}, "host_devices": [], "href":
> "/ovirt-engine/api/vms/a7872048-030e-4991-be23-43283794d650", "id":
> "a7872048-030e-4991-be23-43283794d650", "io": {"threads": 1},
> "katello_errata": [], "large_icon": {"href":
> "/ovirt-engine/api/icons/c444caf0-5750-9602-f4b4-62db210b133b", "id":
> "c444caf0-5750-9602-f4b4-62db210b133b"}, "memory": 10737418240,
> "memory_policy": {"guaranteed": 10737418240, "max": 10737418240},
> "migration": {"auto_converge": "inherit", "compressed": "inherit"},
> "migration_downtime": -1, "multi_queues_enabled": true, "name":
> "external-HostedEngineLocal", "next_run_configuration_exists": false,
> "nics": [], "numa_nodes": [], "numa_tune_mode": "interleave", "origin":
> "external", "original_template": {"href":
> "/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000", "id":
> "00000000-0000-0000-0000-000000000000"}, "os": {"boot": {"devices":
> ["hd"]}, "type": "other"}, "permissions": [], "placement_policy":
> {"affinity": "migratable"}, "quota": {"id":
> "b4232eb4-eb1a-11e8-9bc5-00163e445490"}, "reported_devices": [],
> "run_once": false, "sessions": [], "small_icon": {"href":
> "/ovirt-engine/api/icons/4a0580c6-11ba-bc2e-6c82-211666f323e9", "id":
> "4a0580c6-11ba-bc2e-6c82-211666f323e9"}, "snapshots": [], "sso":
> {"methods": [{"id": "guest_agent"}]}, "start_paused": false, "stateless":
> false, "statistics": [], "status": "unknown",
> "storage_error_resume_behaviour": "auto_resume", "tags": [], "template":
> {"href":
> "/ovirt-engine/api/templates/00000000-0000-0000-0000-000000000000", "id":
> "00000000-0000-0000-0000-000000000000"}, "time_zone": {"name": "Etc/GMT"},
> "type": "desktop", "usb": {"enabled": false}, "watchdogs": []}]},
> "attempts": 24, "changed": false, "deprecations": [{"msg": "The
> 'ovirt_vms_facts' module is being renamed 'ovirt_vm_facts'", "version":
> 2.8}]}
> [ ERROR ] Failed to execute stage 'Closing up': Failed executing
> ansible-playbook
>
> The engine is up and I can SSH into it. However, when I try to log in
> through the GUI I get "The redirection URI for client is not registered",
> even though I have set its IP address in SSO_ALTERNATE_ENGINE_FQDNS through
> a config file. What could be the issue now? Thanks
>
OK, I managed to resolve this one as well by adding 127.0.0.1 to
SSO_ALTERNATE_ENGINE_FQDNS, since I am using port forwarding through SSH to
access the engine GUI. I am now in the process of importing the data SD and
will hopefully complete my restoration... :) crossing fingers

>>>> HIH,
>>>> Gianluca
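
For anyone hitting the same "Unable to resolve address" failure: the task in the quoted ansible log runs `getent ahostsv4 v0.maya | grep v0.maya` and fails because it returns nothing. Below is a minimal Python sketch of a similar pre-check that can be run before deploying. The helper names `resolve_ipv4` and `hosts_file_entries` are made up for illustration and are not part of oVirt or hosted-engine setup; the lookup only approximates what `getent ahostsv4` does.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses for hostname.

    An empty list means the name does not resolve, which is roughly the
    condition the deploy's address-resolution task trips on.
    """
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def hosts_file_entries(hosts_text, hostname):
    """Return (line_number, ip) pairs from hosts-file text that name hostname."""
    matches = []
    for number, line in enumerate(hosts_text.splitlines(), start=1):
        # Drop trailing comments, then split into "ip name [alias ...]".
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            matches.append((number, fields[0]))
    return matches
```

Calling `resolve_ipv4("v0.maya")` on the host and passing the contents of /etc/hosts to `hosts_file_entries(text, "v0.maya")` should show whether the name resolves at all and which hosts-file lines claim it. Several lines mapping the same name to different addresses, or none at all, would be one plausible form of the erroneous entries mentioned above.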

