Hi,

> hosted-engine1 : 192.168.122.66
> hosted-engine2 : 192.168.122.223

But you said in an earlier email that:

> I have two nodes, A: 192.168.122.65 and B: 192.168.122.66

Make sure your names resolve properly.

So far it does exactly what it is supposed to do - when the engine is unreachable, it tries restarting it.

Did you really use hosted-engine.ovirt.com as the fqdn? Are you sure it resolves to whatever IP the VM has (192.168.122.91)? Maybe you used /etc/hosts to configure the name on the first host and in the VM, but missed the record on the second host? What does $(host hosted-engine.ovirt.com) show you?

> I cannot reach the web UI, but my engine VM is running and I can log in to it.
> The engine has some errors:
>
> VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})'
> execution failed: java.net.NoRouteToHostException: No route to host

I told you before. This is normal as it is trying to figure out whether the host is up.

Best regards

Martin Sivak

On Thu, Apr 26, 2018 at 4:14 AM, <dhy...@sina.com> wrote:
> engine VM: 192.168.122.91
> hosted-engine1 : 192.168.122.66
> hosted-engine2 : 192.168.122.223
>
> I cannot reach the web UI, but my engine VM is running and I can log in to it.
> The engine has some errors:
>
> 2018-04-25 18:35:03,401+08 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to hosted-engine1/192.168.122.66
> 2018-04-25 18:35:06,411+08 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Command 'GetAllVmStatsVDSCommand(HostName = hosted-engine1, VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})' execution failed: java.net.NoRouteToHostException: No route to host
> ------------------------------------------------------------------------
> [root@hosted-engine2 ~]# hosted-engine --check-liveliness
> Hosted Engine is not up!
> ------------------------------------------------------------------------
> [root@hosted-engine2 ~]# curl http://hosted-engine.ovirt.com/ovirt-engine/services/health
> <html><head><title>Error</title></head><body>404 - Not Found</body></html>
>
> Note: this command blocks; it takes 5 minutes.
> ------------------------------------------------------------------------
> --== Host 1 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : False
> Hostname                           : hosted-engine1
> Host ID                            : 1
> Engine status                      : unknown stale-data
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 1eae8968
> local_conf_timestamp               : 48907
> Host timestamp                     : 48907
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=48907 (Thu Apr 26 01:57:14 2018)
>     host-id=1
>     score=3400
>     vm_conf_refresh_time=48907 (Thu Apr 26 01:57:15 2018)
>     conf_on_shared_storage=True
>     maintenance=False
>     state=EngineUp
>     stopped=False
>
>
> --== Host 2 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : True
> Hostname                           : hosted-engine2
> Host ID                            : 2
> Engine status                      : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
> Score                              : 3000
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 1b92756d
> local_conf_timestamp               : 44057
> Host timestamp                     : 44057
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=44057 (Thu Apr 26 02:00:57 2018)
>     host-id=2
>     score=3000
>     vm_conf_refresh_time=44057 (Thu Apr 26 02:00:57 2018)
>     conf_on_shared_storage=True
>     maintenance=False
>     state=EngineStarting
>     stopped=False
>
>
> ----- Original Message -----
> From: Martin Sivak <msi...@redhat.com>
> To: dhy336 <dhy...@sina.com>
> Cc: users <users@ovirt.org>
> Subject: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
> Date: 2018-04-25 20:41
>
>> 2018-04-25 18:35:06,411+08 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Command 'GetAllVmStatsVDSCommand(HostName = hosted-engine1, VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})' execution failed: java.net.NoRouteToHostException: No route to host
>
> This is expected and normal. The ovirt-engine service is trying to
> find out whether host A is still unreachable or not. This is not the
> issue you are looking for.
>
>> 192.168.122.66 has been powered off and the hosted-engine VM runs on
>> 192.168.122.223, so I think the engine should connect to 192.168.122.223.
>
> You are mixing the IP of the engine VM and the IP of a host. The
> engine runs in a VM with stable .122.223 (independent of which host the
> VM runs on) and manages two hosts .122.65 and .122.66. The engine
> constantly monitors all its hosts and that means it is trying to
> connect to them every now and then.
>
> Please execute the two following commands on Host B and show us the
> results (use the proper fqdn):
>
> $(hosted-engine --check-liveliness)
> $(curl http://{fqdn}/ovirt-engine/services/health)
>
> Best regards
>
> Martin Sivak
>
> On Wed, Apr 25, 2018 at 2:34 PM, <dhy...@sina.com> wrote:
>> I logged in to the engine VM (# hosted-engine --console) and found the
>> ovirt-engine process. I also found some errors in
>> /var/log/ovirt-engine/engine.log.
>>
>> 192.168.122.66 has been powered off and the hosted-engine VM runs on
>> 192.168.122.223, so I think the engine should connect to 192.168.122.223.
>>
>> 2018-04-25 18:35:03,401+08 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to hosted-engine1/192.168.122.66
>> 2018-04-25 18:35:06,411+08 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Command 'GetAllVmStatsVDSCommand(HostName = hosted-engine1, VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})' execution failed: java.net.NoRouteToHostException: No route to host
>> 2018-04-25 18:35:06,411+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Failed to fetch vms info for host 'hosted-engine1' - skipping VMs monitoring.
>> 2018-04-25 18:35:21,420+08 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to hosted-engine1/192.168.122.66
>> 2018-04-25 18:35:24,430+08 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-1) [] Command 'GetAllVmStatsVDSCommand(HostName = hosted-engine1, VdsIdVDSCommandParametersBase:{hostId='1b5f799a-125d-4f4e-8aef-cb2ecdd63136'})' execution failed: java.net.NoRouteToHostException: No route to host
>> 2018-04-25 18:35:24,431+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-1) [] Failed to fetch vms info for host 'hosted-engine1' - skipping VMs monitoring.
>> 2018-04-25 18:35:39,438+08 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to hosted-engine1/192.168.122.66
>>
>> ----- Original Message -----
>> From: Martin Sivak <msi...@redhat.com>
>> To: dhy336 <dhy...@sina.com>
>> Cc: users <users@ovirt.org>
>> Subject: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>> Date: 2018-04-25 20:27
>>
>> The engine will try connecting to all registered hosts all the time.
>> That is normal.
>>
>> If your host can reach the engine then check whether it can reach
>> http://{fqdn}/ovirt-engine/services/health as that is what is used to
>> make sure the engine is alive.
>>
>> Best regards
>>
>> Martin Sivak
>>
>> On Wed, Apr 25, 2018 at 2:15 PM, <dhy...@sina.com> wrote:
>>> Hi Martin,
>>>
>>> Thank you for the answer.
>>> My host can reach the engine. I am confused about why the engine
>>> connects to the other host, which I have powered off.
>>>
>>> ----- Original Message -----
>>> From: Martin Sivak <msi...@redhat.com>
>>> To: dhy336 <dhy...@sina.com>, users <users@ovirt.org>
>>> Subject: Re: Re: Re: Re: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>>> Date: 2018-04-25 19:12
>>>
>>> It is as I expected:
>>>
>>> Engine status : {"reason": "failed liveliness check"
>>>
>>> The host can't talk to the ovirt-engine service. Please make sure the
>>> host can reach the engine fqdn as configured in
>>> /etc/ovirt-hosted-engine/hosted-engine.conf on the fqdn= line.
>>>
>>> You can check it manually by executing $(hosted-engine
>>> --check-liveliness) from the host.
>>>
>>> Best regards
>>>
>>> Martin Sivak
>>>
>>> On Wed, Apr 25, 2018 at 12:51 PM, <dhy...@sina.com> wrote:
>>>> Hi,
>>>>
>>>> Two nodes:
>>>> 192.168.122.66 hosted-engine1
>>>> 192.168.122.223 hosted-engine2
>>>>
>>>> I powered off hosted-engine1, so I am not attaching hosted-engine1's log.
>>>>
>>>> [root@hosted-engine2 ~]# hosted-engine --vm-status
>>>>
>>>> --== Host 1 status ==--
>>>>
>>>> conf_on_shared_storage             : True
>>>> Status up-to-date                  : False
>>>> Hostname                           : hosted-engine1
>>>> Host ID                            : 1
>>>> Engine status                      : unknown stale-data
>>>> Score                              : 3400
>>>> stopped                            : False
>>>> Local maintenance                  : False
>>>> crc32                              : a7af0afa
>>>> local_conf_timestamp               : 11485
>>>> Host timestamp                     : 11485
>>>> Extra metadata (valid at timestamp):
>>>>     metadata_parse_version=1
>>>>     metadata_feature_version=1
>>>>     timestamp=11485 (Wed Apr 25 10:08:34 2018)
>>>>     host-id=1
>>>>     score=3400
>>>>     vm_conf_refresh_time=11485 (Wed Apr 25 10:08:34 2018)
>>>>     conf_on_shared_storage=True
>>>>     maintenance=False
>>>>     state=EngineUp
>>>>     stopped=False
>>>>
>>>>
>>>> --== Host 2 status ==--
>>>>
>>>> conf_on_shared_storage             : True
>>>> Status up-to-date                  : True
>>>> Hostname                           : hosted-engine2
>>>> Host ID                            : 2
>>>> Engine status                      : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
>>>> Score                              : 3000
>>>> stopped                            : False
>>>> Local maintenance                  : False
>>>> crc32                              : a2e82883
>>>> local_conf_timestamp               : 6278
>>>> Host timestamp                     : 6278
>>>> Extra metadata (valid at timestamp):
>>>>     metadata_parse_version=1
>>>>     metadata_feature_version=1
>>>>     timestamp=6278 (Wed Apr 25 10:37:44 2018)
>>>>     host-id=2
>>>>     score=3000
>>>>     vm_conf_refresh_time=6278 (Wed Apr 25 10:37:44 2018)
>>>>     conf_on_shared_storage=True
>>>>     maintenance=False
>>>>     state=EngineStop
>>>>     stopped=False
>>>>     timeout=Thu Jan 1 09:49:38 1970
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: Martin Sivak <msi...@redhat.com>
>>>> To: dhy336 <dhy...@sina.com>, users <users@ovirt.org>
>>>> Subject: Re: Re: Re: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>>>> Date: 2018-04-25 17:41
>>>>
>>>> Please attach the output of hosted-engine --vm-status and the
>>>> /var/log/ovirt-hosted-engine-ha/agent.log file from both hosts.
>>>>
>>>> The VM will restart if the ovirt-engine service does not become
>>>> available within the timeout. And that might mean a couple of things -
>>>> the FQDN of the engine is wrong, the engine needs something that was
>>>> only available on the dead host (A) like some storage, host B cannot
>>>> ping the gateway...
>>>>
>>>> Best regards
>>>>
>>>> Martin Sivak
>>>>
>>>> On Wed, Apr 25, 2018 at 11:33 AM, <dhy...@sina.com> wrote:
>>>>> Sorry, I misrepresented the setup.
>>>>>
>>>>> I have two nodes, A: 192.168.122.65 and B: 192.168.122.66, with
>>>>> hosted-engine.
>>>>>
>>>>> Testing engine HA:
>>>>>
>>>>> First both nodes are up and the hosted-engine VM runs on A. Then I
>>>>> power off A, and after 3 minutes B starts its hosted-engine VM.
>>>>> But its ovirt-engine keeps connecting to host A for about 10 minutes,
>>>>> and then the hosted-engine VM restarts.
>>>>>
>>>>> ----- Original Message -----
>>>>> From: Martin Sivak <msi...@redhat.com>
>>>>> To: dhy336 <dhy...@sina.com>
>>>>> Subject: Re: Re: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>>>>> Date: 2018-04-25 17:11
>>>>>
>>>>> Your hosted engine VM has its own address that does not depend on
>>>>> which host it is currently running on. So it should be available on the
>>>>> same address no matter where the VM is running.
>>>>>
>>>>> Best regards
>>>>>
>>>>> Martin Sivak
>>>>>
>>>>> On Wed, Apr 25, 2018 at 9:07 AM, <dhy...@sina.com> wrote:
>>>>>>>> I deployed two nodes for hosted engine. At first the hosted-engine VM
>>>>>>>> ran on 192.168.122.65. I powered off this host and the hosted-engine
>>>>>>>> VM switched to another host, but ovirt-engine still connects to
>>>>>>>> 192.168.122.65. If I restart the ovirt-engine server, it works.
>>>>>>
>>>>>> I think this is an error, because the hosted-engine VM has powered up
>>>>>> on another host (192.168.122.66), so the hosted engine should
>>>>>> connect to host 192.168.122.66, not to host 192.168.122.65?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: Martin Sivak <msi...@redhat.com>
>>>>>> To: dhy336 <dhy...@sina.com>
>>>>>> Cc: users <users@ovirt.org>
>>>>>> Subject: Re: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>>>>>> Date: 2018-04-20 18:28
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> No, this is not an error. You killed the host without moving it to
>>>>>> maintenance first. The engine has no way to distinguish this from a
>>>>>> temporary network failure, for example. Give it some time and the host
>>>>>> will move its status to one of the error states and handle the highly
>>>>>> available VMs on it (if fencing is properly configured).
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>> Martin Sivak
>>>>>>
>>>>>> On Fri, Apr 20, 2018 at 12:13 PM, <dhy...@sina.com> wrote:
>>>>>>> So this behaviour is not an error?
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>> From: Martin Sivak <msi...@redhat.com>
>>>>>>> To: dhy336 <dhy...@sina.com>
>>>>>>> Cc: users <users@ovirt.org>
>>>>>>> Subject: Re: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>>>>>>> Date: 2018-04-20 18:05
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> the engine does not know you killed the host. It will notice
>>>>>>> eventually and handle the situation. Just give it time (5 minutes or
>>>>>>> so).
>>>>>>>
>>>>>>> Best regards
>>>>>>>
>>>>>>> --
>>>>>>> Martin Sivak
>>>>>>> SLA / oVirt
>>>>>>>
>>>>>>> On Fri, Apr 20, 2018 at 12:00 PM, <dhy...@sina.com> wrote:
>>>>>>>> Hi, thanks for your feedback. I have another question.
>>>>>>>>
>>>>>>>> I deployed two nodes for hosted engine. At first the hosted-engine VM
>>>>>>>> ran on 192.168.122.65. I powered off this host and the hosted-engine
>>>>>>>> VM switched to another host, but ovirt-engine still connects to
>>>>>>>> 192.168.122.65. If I restart the ovirt-engine server, it works.
>>>>>>>>
>>>>>>>> 2018-04-20 17:13:04,692+08 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-98) [] Command 'GetAllVmStatsVDSCommand(HostName = hosted-engine2, VdsIdVDSCommandParametersBase:{hostId='a5428ef7-9df6-4a86-91de-7e36fda340fa'})' execution failed: java.net.NoRouteToHostException: No route to host
>>>>>>>> 2018-04-20 17:13:04,693+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-98) [] Failed to fetch vms info for host 'hosted-engin2' - skipping VMs monitoring.
>>>>>>>> 2018-04-20 17:13:19,710+08 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to hosted-engine2/192.168.122.65
>>>>>>>> 2018-04-20 17:13:22,730+08 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] Command 'GetAllVmStatsVDSCommand(HostName = hosted-engine-tchyp2, VdsIdVDSCommandParametersBase:{hostId='a5428ef7-9df6-4a86-91de-7e36fda340fa'})' execution failed: java.net.NoRouteToHostException: No route to host
>>>>>>>> 2018-04-20 17:13:22,732+08 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-45) [] Failed to fetch vms info for host 'hosted-engine2' - skipping VMs monitoring.
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>> From: Martin Sivak <msi...@redhat.com>
>>>>>>>> To: dhy336 <dhy...@sina.com>
>>>>>>>> Cc: users <users@ovirt.org>
>>>>>>>> Subject: Re: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>>>>>>>> Date: 2018-04-20 16:40
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> your ovirt-hosted-engine-ha package is too old. You need at least
>>>>>>>> 2.1.9 to properly support the 4.2 engine. The same applies to vdsm.
>>>>>>>> Please upgrade the node.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>> Martin Sivak
>>>>>>>>
>>>>>>>> On Fri, Apr 20, 2018 at 3:58 AM, <dhy...@sina.com> wrote:
>>>>>>>>> Hi, I found some error logs in
>>>>>>>>> /var/log/ovirt-hosted-engine-ha/broker.
>>>>>>>>>
>>>>>>>>> [root@hosted-engine2 ~]# ll /rhev/data-center/mnt
>>>>>>>>> total 0
>>>>>>>>> drwxr-xr-x. 3 vdsm kvm 76 Apr 18 22:28 192.168.122.218:_exports_data
>>>>>>>>> drwxr-xr-x. 3 vdsm kvm 76 Apr 18 22:12 192.168.122.218:_exports_hosted-engine-test1
>>>>>>>>> [root@hosted-engine2 ~]# ll /rhev/data-center/mnt/192.168.122.218\:_exports_hosted-engine-test1/
>>>>>>>>> total 0
>>>>>>>>> drwxr-xr-x. 5 vdsm kvm 50 Apr 18 22:14 8a734205-65b7-4801-b7f0-d380eb45dbae
>>>>>>>>> -rwxr-xr-x. 1 vdsm kvm 0 Apr 20 09:54 __DIRECT_IO_TEST__
>>>>>>>>>
>>>>>>>>> The UUID 8a734205-65b7-4801-b7f0-d380eb45dbae is in
>>>>>>>>> /rhev/data-center/mnt/192.168.122.218\:_exports_hosted-engine-test1/
>>>>>>>>> but the broker looks for it in /rhev/data-center/mnt. Is my version
>>>>>>>>> wrong?
>>>>>>>>> My ovirt-hosted-engine-ha version is 2.1.5, vdsm is 4.20.5,
>>>>>>>>> ovirt-engine is 4.2.
>>>>>>>>>
>>>>>>>>> MainThread::INFO::2018-04-19 19:26:31,479::listener::41::ovirt_hosted_engine_ha.broker.listener.Listener::(__init__) Initializing SocketServer
>>>>>>>>> MainThread::INFO::2018-04-19 19:26:31,480::listener::56::ovirt_hosted_engine_ha.broker.listener.Listener::(__init__) SocketServer ready
>>>>>>>>> Thread-1::INFO::2018-04-19 19:26:31,558::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>>>>>>>>> Thread-1::ERROR::2018-04-19 19:26:31,559::listener::192::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Error handling request, data: 'set-storage-domain FilesystemBackend dom_type=nfs3 sd_uuid=8a734205-65b7-4801-b7f0-d380eb45dbae'
>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 166, in handle
>>>>>>>>>     data)
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/listener.py", line 299, in _dispatch
>>>>>>>>>     .set_storage_domain(client, sd_type, **options)
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 66, in set_storage_domain
>>>>>>>>>     self._backends[client].connect()
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 462, in connect
>>>>>>>>>     self._dom_type)
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py", line 107, in get_domain_path
>>>>>>>>>     " in {1}".format(sd_uuid, parent))
>>>>>>>>> BackendFailureException: path to storage domain 8a734205-65b7-4801-b7f0-d380eb45dbae not found in /rhev/data-center/mnt
>>>>>>>>> Thread-1::INFO::2018-04-19 19:26:31,563::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
>>>>>>>>> Thread-2::INFO::2018-04-19 19:26:44,601::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>>>>>>>>>
>>>>>>>>> ----- Original Message -----
>>>>>>>>> From: <dhy...@sina.com>
>>>>>>>>> To: "Martin Sivak" <msi...@redhat.com>
>>>>>>>>> Cc: users <users@ovirt.org>
>>>>>>>>> Subject: [ovirt-users] 回复:Re: Hosted-engine can not_switch
>>>>>>>>> Date: 2018-04-20 09:30
>>>>>>>>>
>>>>>>>>> libvirt has no error logs. I only found some errors for vdsm.
>>>>>>>>> The vdsm log is:
>>>>>>>>> 2018-04-20 09:24:52,610+0800 INFO (jsonrpc/1) [vdsm.api] FINISH getVolumeInfo return={'info': {'status': 'OK', 'domain': '8a734205-65b7-4801-b7f0-d380eb45dbae', 'voltype': 'LEAF', 'description': 'hosted-engine.lockspace', 'parent': '00000000-0000-0000-0000-000000000000', 'format': 'RAW', 'generation': 0, 'image': '611272bd-c2cc-42bc-94e2-9aa52e754c35', 'ctime': '1524032037', 'disktype': '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '1048576', 'children': [], 'pool': '', 'capacity': '1048576', 'uuid': u'7037aac6-7c8e-4efd-82f7-ca618c953fe6', 'truesize': '1048576', 'type': 'PREALLOCATED', 'lease': {'owners': [], 'version': None}}} from=::1,48306, task_id=03a7938e-8afb-4b16-b8dd-126c2b1f5d52 (api:52)
>>>>>>>>> 2018-04-20 09:24:52,611+0800 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Volume.getInfo succeeded in 0.03 seconds (__init__:630)
>>>>>>>>> 2018-04-20 09:24:54,113+0800 ERROR (periodic/3) [virt.periodic.Operation] <vdsm.virt.sampling.VMBulkstatsMonitor object at 0x1e92f90> operation failed (periodic:215)
>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 213, in __call__
>>>>>>>>>     self._func()
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 522, in __call__
>>>>>>>>>     self._send_metrics()
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 538, in _send_metrics
>>>>>>>>>     vm_sample.interval)
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 45, in produce
>>>>>>>>>     networks(vm, stats, first_sample, last_sample, interval)
>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vmstats.py", line 322, in networks
>>>>>>>>>     if nic.name.startswith('hostdev'):
>>>>>>>>> AttributeError: name
>>>>>>>>> 2018-04-20 09:24:54,800+0800 INFO (Reactor thread) [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:48308 (protocoldetector:61)
>>>>>>>>> 2018-04-20 09:24:54,810+0800 INFO (Reactor thread) [ProtocolDetector.Detector] Detected protocol stomp from ::1:48308 (protocoldetector:125)
>>>>>>>>> 2018-04-20 09:24:54,810+0800 INFO (Reactor thread) [Broker.StompAdapter] Processing CONNECT request (stompreactor:103)
>>>>>>>>> 2018-04-20 09:24:54,818+0800 INFO (JsonRpc (StompReactor)) [Broker.StompAdapter] Subscribe command received (stompreactor:132)
>>>>>>>>> 2018-04-20 09:24:55,119+0800 INFO (jsonrpc/6) [api.host] START getHardwareInfo() from=::1,48308 (api:46)
>>>>>>>>>
>>>>>>>>> ----- Original Message -----
>>>>>>>>> From: Martin Sivak <msi...@redhat.com>
>>>>>>>>> To: dhy336 <dhy...@sina.com>
>>>>>>>>> Cc: users <users@ovirt.org>
>>>>>>>>> Subject: Re: [ovirt-users] Hosted-engine can not switch
>>>>>>>>> Date: 2018-04-19 20:16
>>>>>>>>>
>>>>>>>>> We need more than just this small log snippet. Please check the
>>>>>>>>> vdsm and libvirt logs as well.
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>>
>>>>>>>>> Martin Sivak
>>>>>>>>>
>>>>>>>>> On Thu, Apr 19, 2018 at 2:05 PM, <dhy...@sina.com> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> I deployed three nodes with hosted engine. I forced a shutdown of
>>>>>>>>>> the node on which the hosted-engine VM was running, but the
>>>>>>>>>> hosted-engine VM cannot start on the other nodes.
>>>>>>>>>>
>>>>>>>>>> I found some errors in /var/log/ovirt-hosted-engine-ha/agent.log:
>>>>>>>>>>
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:35,787::hosted_engine::1192::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state) Cleaning state for non-running VM
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:42,587::hosted_engine::1176::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state) Vdsm state for VM clean
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:42,589::hosted_engine::1125::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) Starting vm using `/usr/sbin/hosted-engine --vm-start`
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,599::hosted_engine::1131::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) stdout:
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,600::hosted_engine::1132::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) stderr: Virtual machine does not exist: {'vmId': u'08bbd680-a8a7-4267-82e7-89f36e87e930'}
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,600::hosted_engine::1144::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) Engine VM started on localhost
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,609::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1524139007.61 type=state_transition detail=EngineStart-EngineStarting hostname='hosted-engine2'
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,670::brokerlink::121::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineStart-EngineStarting) sent? sent
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:47,670::hosted_engine::604::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:50,095::hosted_engine::630::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:50,096::storage_server::220::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(validate_storage_server) Validating storage server
>>>>>>>>>> MainThread::INFO::2018-04-19 19:56:52,449::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Storage domain reported as valid and reconnect is not forced.
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Users mailing list
>>>>>>>>>> Users@ovirt.org
>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users
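
On the name-resolution question from the latest reply (does the engine FQDN resolve to the engine VM's address on every host, or is an /etc/hosts record missing on one of them?), a minimal Python sketch of that check could look like the following. It is only an illustration and not part of oVirt: the helper names `hosts_file_ips` and `resolves_to` are made up here, and it assumes the usual hosts-file layout of an address followed by one or more names per line.

```python
import socket


def hosts_file_ips(hosts_text, name):
    """Return every IP that the given hosts-file text maps to `name`.

    Hypothetical helper for illustration; `hosts_text` is the content of a
    hosts file (for example /etc/hosts), passed in as a string.
    """
    ips = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if name in names:
            ips.append(ip)
    return ips


def resolves_to(name, expected_ip):
    """Return True if the system resolver maps `name` to `expected_ip`.

    Hypothetical helper; any lookup failure is reported as False.
    """
    try:
        _, _, addresses = socket.gethostbyname_ex(name)
    except OSError:
        return False
    return expected_ip in addresses
```

Run on each host with the values quoted in this thread, `hosts_file_ips(open("/etc/hosts").read(), "hosted-engine.ovirt.com")` would list the local records for the engine name, and `resolves_to("hosted-engine.ovirt.com", "192.168.122.91")` would show whether that host's resolver agrees with the engine VM's address. A host that returns an empty list and `False` would match the "missing record on the second host" case described above.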