Hello,

I also put the host into maintenance and restarted vdsm while
ovirt-ha-agent was running. I can mount the Gluster volume "engine"
manually on the host.

I get this repeatedly in /var/log/vdsm.log:

2017-02-03 15:29:28,891 INFO  (MainThread) [vds] Exiting (vdsm:167)
2017-02-03 15:29:30,974 INFO  (MainThread) [vds] (PID: 11456) I am the
actual vdsm 4.19.4-1.el7.centos microcloud27 (3.10.0-514.6.1.el7.x86_64)
(vdsm:145)
2017-02-03 15:29:30,974 INFO  (MainThread) [vds] VDSM will run with cpu
affinity: frozenset([1]) (vdsm:251)
2017-02-03 15:29:31,013 INFO  (MainThread) [storage.check] Starting
check service (check:91)
2017-02-03 15:29:31,017 INFO  (MainThread) [storage.Dispatcher] Starting
StorageDispatcher... (dispatcher:47)
2017-02-03 15:29:31,017 INFO  (check/loop) [storage.asyncevent] Starting
<EventLoop running=True closed=False at 0x37480464> (asyncevent:122)
2017-02-03 15:29:31,156 INFO  (MainThread) [dispatcher] Run and protect:
registerDomainStateChangeCallback(callbackFunc=<functools.partial object
at 0x2881fc8>) (logUtils:49)
2017-02-03 15:29:31,156 INFO  (MainThread) [dispatcher] Run and protect:
registerDomainStateChangeCallback, Return response: None (logUtils:52)
2017-02-03 15:29:31,160 INFO  (MainThread) [MOM] Preparing MOM interface
(momIF:49)
2017-02-03 15:29:31,161 INFO  (MainThread) [MOM] Using named unix socket
/var/run/vdsm/mom-vdsm.sock (momIF:58)
2017-02-03 15:29:31,162 INFO  (MainThread) [root] Unregistering all
secrets (secret:91)
2017-02-03 15:29:31,164 INFO  (MainThread) [vds] Setting channels'
timeout to 30 seconds. (vmchannels:223)
2017-02-03 15:29:31,165 INFO  (MainThread) [vds.MultiProtocolAcceptor]
Listening at :::54321 (protocoldetector:185)
2017-02-03 15:29:31,354 INFO  (vmrecovery) [vds] recovery: completed in
0s (clientIF:495)
2017-02-03 15:29:31,371 INFO  (BindingXMLRPC) [vds] XMLRPC server
running (bindingxmlrpc:63)
2017-02-03 15:29:31,471 INFO  (periodic/1) [dispatcher] Run and protect:
repoStats(options=None) (logUtils:49)
2017-02-03 15:29:31,472 INFO  (periodic/1) [dispatcher] Run and protect:
repoStats, Return response: {} (logUtils:52)
2017-02-03 15:29:31,472 WARN  (periodic/1) [MOM] MOM not available.
(momIF:116)
2017-02-03 15:29:31,473 WARN  (periodic/1) [MOM] MOM not available, KSM
stats will be missing. (momIF:79)
2017-02-03 15:29:31,474 ERROR (periodic/1) [root] failed to retrieve
Hosted Engine HA info (api:252)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 231, in
_getHaInfo
    stats = instance.get_all_stats()
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
line 103, in get_all_stats
    self._configure_broker_conn(broker)
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
line 180, in _configure_broker_conn
    dom_type=dom_type)
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
line 177, in set_storage_domain
    .format(sd_type, options, e))
RequestError: Failed to set storage domain FilesystemBackend, options
{'dom_type': 'glusterfs', 'sd_uuid':
'7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96'}: Request failed:
<class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'>
2017-02-03 15:29:35,920 INFO  (Reactor thread)
[ProtocolDetector.AcceptorImpl] Accepted connection from ::1:49506
(protocoldetector:72)
2017-02-03 15:29:35,929 INFO  (Reactor thread)
[ProtocolDetector.Detector] Detected protocol stomp from ::1:49506
(protocoldetector:127)
2017-02-03 15:29:35,930 INFO  (Reactor thread) [Broker.StompAdapter]
Processing CONNECT request (stompreactor:102)
2017-02-03 15:29:35,930 INFO  (JsonRpc (StompReactor))
[Broker.StompAdapter] Subscribe command received (stompreactor:129)
2017-02-03 15:29:36,067 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC
call Host.ping succeeded in 0.00 seconds (__init__:515)
2017-02-03 15:29:36,071 INFO  (jsonrpc/1) [throttled] Current
getAllVmStats: {} (throttledlog:105)
2017-02-03 15:29:36,071 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC
call Host.getAllVmStats succeeded in 0.00 seconds (__init__:515)
2017-02-03 15:29:46,435 INFO  (periodic/0) [dispatcher] Run and protect:
repoStats(options=None) (logUtils:49)
2017-02-03 15:29:46,435 INFO  (periodic/0) [dispatcher] Run and protect:
repoStats, Return response: {} (logUtils:52)
2017-02-03 15:29:46,439 ERROR (periodic/0) [root] failed to retrieve
Hosted Engine HA info (api:252)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 231, in
_getHaInfo
    stats = instance.get_all_stats()
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
line 103, in get_all_stats
    self._configure_broker_conn(broker)
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
line 180, in _configure_broker_conn
    dom_type=dom_type)
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
line 177, in set_storage_domain
    .format(sd_type, options, e))
RequestError: Failed to set storage domain FilesystemBackend, options
{'dom_type': 'glusterfs', 'sd_uuid':
'7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96'}: Request failed:
<class 'ovirt_hosted_engine_ha.lib.storage_backends.BackendFailureException'>
2017-02-03 15:29:51,095 INFO  (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC
call Host.getAllVmStats succeeded in 0.00 seconds (__init__:515)
2017-02-03 15:29:51,219 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC
call Host.setKsmTune succeeded in 0.00 seconds (__init__:515)
2017-02-03 15:30:01,444 INFO  (periodic/1) [dispatcher] Run and protect:
repoStats(options=None) (logUtils:49)
2017-02-03 15:30:01,444 INFO  (periodic/1) [dispatcher] Run and protect:
repoStats, Return response: {} (logUtils:52)
2017-02-03 15:30:01,448 ERROR (periodic/1) [root] failed to retrieve
Hosted Engine HA info (api:252)
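
To make "repeatedly" concrete, and to look for the startMonitoringDomain /
stopMonitoringDomain calls asked about in the quoted mail below, something
like the following could be run against the log. This is only a rough
sketch: the regular expression is inferred from the log lines pasted above,
and the function name summarize and its parameters are made up for
illustration, not part of vdsm.

```python
import re
from collections import Counter

# Header layout inferred from the excerpt above:
#   <date> <time>,<ms> <LEVEL> (<thread>) [<logger>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+)\s+\((?P<thread>.*?)\) \[(?P<logger>[^\]]+)\] "
    r"(?P<msg>.*)$"
)


def summarize(lines, needles=("startMonitoringDomain", "stopMonitoringDomain")):
    """Count distinct ERROR messages and note when each needle shows up.

    Returns (errors, hits): errors maps (logger, message) to a count,
    hits maps each needle to the timestamps of the entries mentioning it.
    """
    errors = Counter()
    hits = {needle: [] for needle in needles}
    last_ts = None  # timestamp of the most recent header line
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            last_ts = match.group("ts")
            if match.group("level") == "ERROR":
                errors[(match.group("logger"), match.group("msg"))] += 1
        # Traceback and continuation lines have no header of their own,
        # so they are attributed to the last timestamp seen.
        for needle in needles:
            if needle in line:
                hits[needle].append(last_ts)
    return errors, hits


# Usage (the path is whatever your vdsm.log actually is):
#   with open(path_to_vdsm_log) as fh:
#       errors, hits = summarize(fh)
#   for (logger, msg), count in errors.most_common():
#       print(count, logger, msg)
```

It has to run on the real log file, where each entry is on one line, not on
the wrapped copy in this mail.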



On 03.02.2017 at 13:39, Simone Tiraboschi wrote:
> I see an ERROR on stopMonitoringDomain there, but I cannot see the
> corresponding startMonitoringDomain; could you please look for it?
>
> On Fri, Feb 3, 2017 at 1:16 PM, Ralf Schenk <r...@databay.de> wrote:
>
>     Hello,
>
>     attached is my vdsm.log, covering the time frame of the agent
>     timeout, from the host whose hosted-engine HA is no longer working
>     (the host itself works in oVirt and is active). It simply stopped
>     working for engine-ha after the update.
>
>     At 2017-02-02 19:25:34,248 you'll find an error corresponding to
>     the agent timeout error.
>
>     Bye
>
>
>
>     On 03.02.2017 at 11:28, Simone Tiraboschi wrote:
>>
>>             3. Three of my hosts have the hosted engine deployed for
>>             HA. At first all three were marked by a crown (the running
>>             one was gold and the others were silver). After upgrading,
>>             the hosted-engine HA deployed on the 3rd host is not active
>>             anymore.
>>
>>             I can't get this host back with a working
>>             ovirt-ha-agent/broker. I already rebooted and manually
>>             restarted the services, but it isn't able to get the cluster
>>             state according to
>>             "hosted-engine --vm-status". The other hosts report the
>>             host status as "unknown stale-data".
>>
>>             I already shut down all agents on all hosts and issued a
>>             "hosted-engine --reinitialize-lockspace", but that didn't
>>             help.
>>
>>             The agent stops working after a timeout error, according
>>             to the log:
>>
>>             MainThread::INFO::2017-02-02 19:24:52,040::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
>>             VDSM domain monitor status: PENDING
>>             MainThread::INFO::2017-02-02 19:24:59,185::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
>>             VDSM domain monitor status: PENDING
>>             MainThread::INFO::2017-02-02 19:25:06,333::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
>>             VDSM domain monitor status: PENDING
>>             MainThread::INFO::2017-02-02 19:25:13,554::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
>>             VDSM domain monitor status: PENDING
>>             MainThread::INFO::2017-02-02 19:25:20,710::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
>>             VDSM domain monitor status: PENDING
>>             MainThread::INFO::2017-02-02 19:25:27,865::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
>>             VDSM domain monitor status: PENDING
>>             MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::815::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor)
>>             Failed to start monitoring domain
>>             (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96,
>>             host_id=3): timeout during domain acquisition
>>             MainThread::WARNING::2017-02-02 19:25:27,866::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>             Error while monitoring engine: Failed to start monitoring
>>             domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96,
>>             host_id=3): timeout during domain acquisition
>>             MainThread::WARNING::2017-02-02 19:25:27,866::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>             Unexpected error
>>             Traceback (most recent call last):
>>               File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>             line 443, in start_monitoring
>>                 self._initialize_domain_monitor()
>>               File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
>>             line 816, in _initialize_domain_monitor
>>                 raise Exception(msg)
>>             Exception: Failed to start monitoring domain
>>             (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96,
>>             host_id=3): timeout during domain acquisition
>>             MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
>>             Shutting down the agent because of 3 failures in a row!
>>             MainThread::INFO::2017-02-02 19:25:32,087::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status)
>>             VDSM domain monitor status: PENDING
>>             MainThread::INFO::2017-02-02 19:25:34,250::hosted_engine::769::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor)
>>             Failed to stop monitoring domain
>>             (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96): Storage
>>             domain is member of pool:
>>             u'domain=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96'
>>             MainThread::INFO::2017-02-02 19:25:34,254::agent::143::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
>>             Agent shutting down
>>
>>         Simone, Martin, can you please follow up on this?
>>
>>
>>     Ralf, could you please attach vdsm logs from one of your hosts
>>     for the relevant time frame?
>
>     -- 
>
>
>     *Ralf Schenk*
>     fon +49 (0) 24 05 / 40 83 70
>     fax +49 (0) 24 05 / 40 83 759
>     mail *r...@databay.de*
>               
>     *Databay AG*
>     Jens-Otto-Krag-Straße 11
>     D-52146 Würselen
>     *www.databay.de* <http://www.databay.de>
>
>     Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
>     Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari,
>     Dipl.-Kfm. Philipp Hermanns
>     Aufsichtsratsvorsitzender: Wilhelm Dohmen
>
>     ------------------------------------------------------------------------
>
>

-- 


*Ralf Schenk*
fon +49 (0) 24 05 / 40 83 70
fax +49 (0) 24 05 / 40 83 759
mail *r...@databay.de*
                
*Databay AG*
Jens-Otto-Krag-Straße 11
D-52146 Würselen
*www.databay.de* <http://www.databay.de>

Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.-Kfm.
Philipp Hermanns
Aufsichtsratsvorsitzender: Wilhelm Dohmen
