Ping works fine in both directions (engine to host and host to engine).

While troubleshooting further, I found the following errors in various log files:

###################
VDSM <host> command Get Host Capabilities failed: Not enough resources: {'reason': 'Too many tasks', 'resource': 'jsonrpc', 'current_tasks': 80}

############
May  9 03:54:04 <host> vdsm[26934]: WARN Worker blocked: <Worker name=jsonrpc/4 running <Task <JsonRpcTask {'params': {u'volumeName': u'vm_gv0'}, 'jsonrpc': '2.0', 'method': u'GlusterVolume.healInfo', 'id': u'f4e56ab9-6916-4938-821a-1b9aab2ef162'} at 0x7fb886fd8dd0> timeout=60, duration=7980 at 0x7fb886edc910> task#=14247 at 0x7fb8a4035450>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 194, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 523, in __call__
  self._handler(self._ctx, self._req)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 566, in _serveRequest
  response = self._handle_request(req, ctx)
File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 606, in _handle_request
  res = method(**params)
File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 197, in _dynamicMethod
  result = fn(*methodArgs)
File: "/usr/lib/python2.7/site-packages/vdsm/gluster/apiwrapper.py", line 129, in healInfo
  return self._gluster.volumeHealInfo(volumeName)
File: "/usr/lib/python2.7/site-packages/vdsm/gluster/api.py", line 90, in wrapper
  rv = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/gluster/api.py", line 776, in volumeHealInfo
  return {'healInfo': self.svdsmProxy.glusterVolumeHealInfo(volumeName)}
File: "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 55, in __call__
  return callMethod()
File: "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 53, in <lambda>
  **kwargs)
File: "<string>", line 2, in glusterVolumeHealInfo
File: "/usr/lib64/python2.7/multiprocessing/managers.py", line 759, in _callmethod
  kind, result = conn.recv()
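
In case it helps anyone going through many of these warnings, here is a minimal sketch (Python 3) that pulls the worker name, RPC method, timeout and duration out of a "Worker blocked" line. It assumes the warning header is on a single line. The name parse_blocked_worker and the regular expression are my own illustration, not part of VDSM. The sample string is the warning header quoted above.

```python
import re

# Hypothetical helper (not part of VDSM): extracts the key fields from a
# "Worker blocked" warning so that long-running tasks are easy to spot.
BLOCKED_RE = re.compile(
    r"Worker blocked: <Worker name=(?P<worker>\S+)"
    r".*?'method': u?'(?P<method>[^']+)'"
    r".*?timeout=(?P<timeout>\d+),\s+duration=(?P<duration>\d+)",
    re.DOTALL,
)


def parse_blocked_worker(line):
    """Return a dict with worker, method, timeout and duration, or None."""
    match = BLOCKED_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    # Both values are reported in seconds in the log line.
    info["timeout"] = int(info["timeout"])
    info["duration"] = int(info["duration"])
    return info


# The warning header from the log excerpt, joined onto one line.
sample = (
    "May  9 03:54:04 <host> vdsm[26934]: WARN Worker blocked: "
    "<Worker name=jsonrpc/4 running <Task <JsonRpcTask "
    "{'params': {u'volumeName': u'vm_gv0'}, 'jsonrpc': '2.0', "
    "'method': u'GlusterVolume.healInfo', 'id': "
    "u'f4e56ab9-6916-4938-821a-1b9aab2ef162'} at 0x7fb886fd8dd0> "
    "timeout=60, duration=7980 at 0x7fb886edc910> "
    "task#=14247 at 0x7fb8a4035450>"
)

info = parse_blocked_worker(sample)
print(info)
```

For the line above this reports GlusterVolume.healInfo on worker jsonrpc/4 with a duration of 7980 seconds against a timeout of 60 seconds.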

#########
cat /var/log/messages | grep 'database connection failed'
May  9 07:25:59 <host> ovs-vsctl: ovs|00001|db_ctl_base|ERR|unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)

#######
/var/log/ovirt-hosted-engine-ha/agent.log
MainThread::ERROR::2020-05-09 11:32:33,089::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
MainThread::INFO::2020-05-09 11:32:33,089::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
MainThread::INFO::2020-05-09 11:32:43,926::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.2.16 started
MainThread::INFO::2020-05-09 11:32:43,984::hosted_engine::244::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: <hostname>
MainThread::ERROR::2020-05-09 11:33:49,369::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
    return action(he)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 412, in start_monitoring
    self._initialize_vdsm()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 569, in _initialize_vdsm
    logger=self._log
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 468, in connect_vdsm_json_rpc
    __vdsm_json_rpc_connect(logger, timeout)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/util.py", line 411, in __vdsm_json_rpc_connect
    timeout=VDSM_MAX_RETRY * VDSM_DELAY
RuntimeError: Couldn't  connect to VDSM within 60 seconds
MainThread::ERROR::2020-05-09 11:33:49,371::agent::145::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
MainThread::INFO::2020-05-09 11:33:49,371::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
MainThread::INFO::2020-05-09 11:34:00,216::agent::67::ovirt_hosted_engine_ha.agent.agent.Agent::(run) ovirt-hosted-engine-ha agent 2.2.16 started
MainThread::INFO::2020-05-09 11:34:00,326::hosted_engine::244::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname) Found certificate common name: <hostname>
