On 21-03-2019 17:47, Simone Tiraboschi wrote:

> On Thu, Mar 21, 2019 at 3:47 PM Arif Ali <m...@arif-ali.co.uk> wrote: 
> 
>> Hi all,
>> 
>> Recently deployed oVirt version 4.3.1
>> 
>> It's in a self-hosted engine environment
>> 
>> Used the steps via cockpit to install the engine, and was able to add 
>> the rest of the oVirt nodes without any specific problems
>> 
>> We tested the HA of the hosted-engine without a problem, and then at one 
>> point turned off the machine that was hosting the engine, to mimic a 
>> failure and see how it goes; the VM was able to move over successfully, 
>> but some of the oVirt hosts started to go into Unassigned. From a total 
>> of 6 oVirt hosts, I have 4 of them in this state.
>> 
>> Clicking on the host, I see the following message in the events. I can 
>> get to the hosts via the engine, and ping the machine, so I am not sure 
>> why it is no longer working
>> 
>> VDSM <snip> command Get Host Capabilities failed: Message timeout which 
>> can be caused by communication issues
>> 
>> Mind you, I have been trying to resolve this issue since Monday, and 
>> have tried various things, like rebooting and re-installing the oVirt 
>> hosts, without having much luck
>> 
>> So any assistance on this would be gratefully received; maybe I've 
>> missed something really simple, and I am overlooking it
> 
> Can you please check that VDSM is correctly running on those nodes? 
> Are you able to correctly reach those nodes from the engine VM?

So, I have gone back and re-installed the whole solution again, this
time with 4.3.2, and I have the same issue again
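
On the reachability question, ping succeeding does not show that the
VDSM port itself can be reached. Below is a minimal sketch of a TCP
check that could be run from the engine VM. It assumes Python 3 and
that VDSM listens on its usual TCP port 54321; the helper name
can_connect is made up for illustration. A successful connect only
shows the port is open, not that VDSM answers requests correctly.

```python
import socket

# Assumption: VDSM listens on its usual default TCP port.
VDSM_PORT = 54321


def can_connect(host, port=VDSM_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        # create_connection resolves the name and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, can_connect("scvirt02") would test the host named in the
log further down.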

Checking the vdsm logs, I see the warning below. The host is either
Unassigned or Connecting, and I don't have the option to Activate it or
put it into Maintenance mode. I have tried rebooting the node, with no
luck

Mar 22 10:53:27 scvirt02 vdsm[32481]: WARN Worker blocked: <Worker name=periodic/2 running <Task <Operation action=<vdsm.virt.sampling.HostMonitor object at 0x7efed4180610> at 0x7efed4180650> timeout=15, duration=30.00 at 0x7efed4180810> task#=2 at 0x7efef41987d0>, traceback:
  File "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
    self.__bootstrap_inner()
  File "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.7/threading.py", line 765, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
    self._execute_task()
  File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
    task()
  File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
    self._callable()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 186, in __call__
    self._func()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 481, in __call__
    stats = hostapi.get_stats(self._cif, self._samples.stats())
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 79, in get_stats
    ret['haStats'] = _getHaInfo()
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 177, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 94, in get_all_stats
    stats = broker.get_stats_from_storage()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 143, in get_stats_from_storage
    result = self._proxy.get_stats()
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1303, in single_request
    response = h.getresponse(buffering=True)
  File "/usr/lib64/python2.7/httplib.py", line 1113, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 444, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 400, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib64/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
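
The last frames of this traceback sit in a socket read underneath
get_stats_from_storage(), so the periodic worker appears to be waiting
on a reply from the hosted-engine HA broker. To see how often and for
how long workers are blocked, the summary part of the warning can be
pulled out of the log. The following is only a sketch: the regular
expression is inferred from the single warning above, and the function
name parse_blocked_worker is made up for illustration.

```python
import re

# Pattern inferred from the one warning quoted above; matching ignores case.
BLOCKED_RE = re.compile(
    r"worker blocked: <worker name=(?P<name>\S+).*?"
    r"timeout=(?P<timeout>[\d.]+), duration=(?P<duration>[\d.]+)",
    re.IGNORECASE | re.DOTALL,
)


def parse_blocked_worker(text):
    """Return name, timeout and duration of the first blocked-worker warning."""
    match = BLOCKED_RE.search(text)
    if match is None:
        return None
    return {
        "name": match.group("name").lower(),
        "timeout": float(match.group("timeout")),
        "duration": float(match.group("duration")),
    }


sample = (
    "Mar 22 10:53:27 scvirt02 vdsm[32481]: WARN Worker blocked: "
    "<Worker name=periodic/2 running <Task <Operation "
    "action=<vdsm.virt.sampling.HostMonitor object at 0x7efed4180610> "
    "at 0x7efed4180650> timeout=15, duration=30.00 at 0x7efed4180810> "
    "task#=2 at 0x7efef41987d0>, traceback:"
)
info = parse_blocked_worker(sample)
print(info)
```

In the warning above the duration (30.00) is twice the timeout (15),
which is what such a parser would make easy to spot across many lines.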

On the engine host, I continuously get the following messages too 

Mar 22 11:02:32 <snip> ovsdb-server[4724]: ovs|01900|jsonrpc|WARN|Dropped 3 log messages in last 14 seconds (most recently, 7 seconds ago) due to excessive rate
Mar 22 11:02:32 <snip> ovsdb-server[4724]: ovs|01901|jsonrpc|WARN|ssl:[::ffff:192.168.203.205]:55658: send error: Protocol error
Mar 22 11:02:32 <snip> ovsdb-server[4724]: ovs|01902|reconnect|WARN|ssl:[::ffff:192.168.203.205]:55658: connection dropped (Protocol error)
Mar 22 11:02:34 <snip> ovsdb-server[4724]: ovs|01903|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:02:34 <snip> ovsdb-server[4724]: ovs|01904|reconnect|WARN|ssl:[::ffff:192.168.203.202]:49504: connection dropped (Protocol error)
Mar 22 11:02:40 <snip> ovsdb-server[4724]: ovs|01905|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:02:40 <snip> ovsdb-server[4724]: ovs|01906|jsonrpc|WARN|Dropped 1 log messages in last 5 seconds (most recently, 5 seconds ago) due to excessive rate
Mar 22 11:02:40 <snip> ovsdb-server[4724]: ovs|01907|jsonrpc|WARN|ssl:[::ffff:192.168.203.203]:34114: send error: Protocol error
Mar 22 11:02:40 <snip> ovsdb-server[4724]: ovs|01908|reconnect|WARN|ssl:[::ffff:192.168.203.203]:34114: connection dropped (Protocol error)
Mar 22 11:02:41 <snip> ovsdb-server[4724]: ovs|01909|reconnect|WARN|ssl:[::ffff:192.168.203.204]:52034: connection dropped (Protocol error)
Mar 22 11:02:48 <snip> ovsdb-server[4724]: ovs|01910|stream_ssl|WARN|Dropped 1 log messages in last 7 seconds (most recently, 7 seconds ago) due to excessive rate
Mar 22 11:02:48 <snip> ovsdb-server[4724]: ovs|01911|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:02:48 <snip> ovsdb-server[4724]: ovs|01912|reconnect|WARN|ssl:[::ffff:192.168.203.205]:55660: connection dropped (Protocol error)
Mar 22 11:02:50 <snip> ovsdb-server[4724]: ovs|01913|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:02:50 <snip> ovsdb-server[4724]: ovs|01914|jsonrpc|WARN|Dropped 2 log messages in last 9 seconds (most recently, 2 seconds ago) due to excessive rate
Mar 22 11:02:50 <snip> ovsdb-server[4724]: ovs|01915|jsonrpc|WARN|ssl:[::ffff:192.168.203.202]:49506: send error: Protocol error
Mar 22 11:02:50 <snip> ovsdb-server[4724]: ovs|01916|reconnect|WARN|ssl:[::ffff:192.168.203.202]:49506: connection dropped (Protocol error)
Mar 22 11:02:56 <snip> ovsdb-server[4724]: ovs|01917|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:02:56 <snip> ovsdb-server[4724]: ovs|01918|reconnect|WARN|ssl:[::ffff:192.168.203.203]:34116: connection dropped (Protocol error)
Mar 22 11:02:57 <snip> ovsdb-server[4724]: ovs|01919|reconnect|WARN|ssl:[::ffff:192.168.203.204]:52036: connection dropped (Protocol error)
Mar 22 11:03:04 <snip> ovsdb-server[4724]: ovs|01920|stream_ssl|WARN|Dropped 1 log messages in last 7 seconds (most recently, 7 seconds ago) due to excessive rate
Mar 22 11:03:04 <snip> ovsdb-server[4724]: ovs|01921|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:03:04 <snip> ovsdb-server[4724]: ovs|01922|jsonrpc|WARN|Dropped 2 log messages in last 9 seconds (most recently, 7 seconds ago) due to excessive rate
Mar 22 11:03:04 <snip> ovsdb-server[4724]: ovs|01923|jsonrpc|WARN|ssl:[::ffff:192.168.203.205]:55662: send error: Protocol error
Mar 22 11:03:04 <snip> ovsdb-server[4724]: ovs|01924|reconnect|WARN|ssl:[::ffff:192.168.203.205]:55662: connection dropped (Protocol error)
Mar 22 11:03:06 <snip> ovsdb-server[4724]: ovs|01925|reconnect|WARN|ssl:[::ffff:192.168.203.202]:49508: connection dropped (Protocol error)
Mar 22 11:03:12 <snip> ovsdb-server[4724]: ovs|01926|stream_ssl|WARN|Dropped 1 log messages in last 5 seconds (most recently, 5 seconds ago) due to excessive rate
Mar 22 11:03:12 <snip> ovsdb-server[4724]: ovs|01927|stream_ssl|WARN|SSL_accept: unexpected SSL connection close
Mar 22 11:03:12 <snip> ovsdb-server[4724]: ovs|01928|reconnect|WARN|ssl:[::ffff:192.168.203.203]:34118: connection dropped (Protocol error)
Mar 22 11:03:13 <snip> ovsdb-server[4724]: ovs|01929|reconnect|WARN|ssl:[::ffff:192.168.203.204]:52038: connection dropped (Protocol error)
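
To see which client addresses are behind these drops, the "connection
dropped" entries can be tallied per IP. This is a rough sketch only:
the regular expression is inferred from the excerpt above, the helper
name count_dropped is made up, and the sample is a few entries copied
from this log rather than the full file.

```python
import re
from collections import Counter

# Matches ovsdb-server "reconnect" warnings for dropped SSL clients.
# \s+ tolerates entries that were wrapped across lines by the mailer.
DROPPED_RE = re.compile(
    r"reconnect\|WARN\|ssl:\[::ffff:(?P<ip>[\d.]+)\]:\d+:"
    r"\s+connection\s+dropped"
)


def count_dropped(log_text):
    """Count 'connection dropped' warnings per client IP address."""
    return Counter(m.group("ip") for m in DROPPED_RE.finditer(log_text))


sample_log = """\
Mar 22 11:02:32 <snip> ovsdb-server[4724]:
ovs|01902|reconnect|WARN|ssl:[::ffff:192.168.203.205]:55658: connection
dropped (Protocol error)
Mar 22 11:02:34 <snip> ovsdb-server[4724]:
ovs|01904|reconnect|WARN|ssl:[::ffff:192.168.203.202]:49504: connection
dropped (Protocol error)
Mar 22 11:02:40 <snip> ovsdb-server[4724]:
ovs|01907|jsonrpc|WARN|ssl:[::ffff:192.168.203.203]:34114: send error:
Protocol error
Mar 22 11:02:48 <snip> ovsdb-server[4724]:
ovs|01912|reconnect|WARN|ssl:[::ffff:192.168.203.205]:55660: connection
dropped (Protocol error)
"""

counts = count_dropped(sample_log)
for ip, n in sorted(counts.items()):
    print(ip, n)
```

Run against the full journal output, a tally like this would show
whether every address is dropping at a similar rate or only some are.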
-- 
regards,

Arif Ali
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PHNGLLQM2Q53PT2JRUUWRBHYNHV37R3G/
