Hi Dominik

When these commands are run on the ovirt-engine host, the output matches
the one shown in your email.
For reference, see below:

[root@ath01-ovirt01 certs]# ovn-nbctl get-ssl
Private key: /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
Certificate: /etc/pki/ovirt-engine/certs/ovn-ndb.cer
CA Certificate: /etc/pki/ovirt-engine/ca.pem
Bootstrap: false
[root@ath01-ovirt01 certs]# ovn-nbctl get-connection
ptcp:6641
[root@ath01-ovirt01 certs]# ovn-sbctl get-ssl
Private key: /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
Certificate: /etc/pki/ovirt-engine/certs/ovn-sdb.cer
CA Certificate: /etc/pki/ovirt-engine/ca.pem
Bootstrap: false
[root@ath01-ovirt01 certs]# ovn-sbctl get-connection
read-write role="" ptcp:6642
[root@ath01-ovirt01 certs]# ls -l /etc/pki/ovirt-engine/keys/ovn-*
-rw-r-----. 1 root hugetlbfs 1828 Jun 25 11:08
/etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
-rw-------. 1 root root      2893 Jun 25 11:08
/etc/pki/ovirt-engine/keys/ovn-ndb.p12
-rw-r-----. 1 root hugetlbfs 1828 Jun 25 11:08
/etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
-rw-------. 1 root root      2893 Jun 25 11:08
/etc/pki/ovirt-engine/keys/ovn-sdb.p12

When I try the above commands on the node hosts, the following happens:

ovn-nbctl get-ssl / get-connection
ovn-nbctl: unix:/var/run/openvswitch/ovnnb_db.sock: database connection
failed (No such file or directory)

I believe this is expected, since no northbound connections should be
established from the host nodes.

ovn-sbctl get-ssl / get-connection
Both commands hang until I terminate them.

As for the requested logs, the following appears in ovsdb-server-sb.log:

2020-09-14T07:18:38.187Z|219636|reconnect|WARN|tcp:DC02-host01:33146:
connection dropped (Protocol error)
2020-09-14T07:18:41.946Z|219637|reconnect|WARN|tcp:DC01-host01:51188:
connection dropped (Protocol error)
2020-09-14T07:18:43.033Z|219638|reconnect|WARN|tcp:DC01-host02:37044:
connection dropped (Protocol error)
2020-09-14T07:18:46.198Z|219639|reconnect|WARN|tcp:DC02-host01:33148:
connection dropped (Protocol error)
2020-09-14T07:18:50.069Z|219640|jsonrpc|WARN|Dropped 4 log messages in last
12 seconds (most recently, 4 seconds ago) due to excessive rate
2020-09-14T07:18:50.069Z|219641|jsonrpc|WARN|tcp:DC01-host01:51190: error
parsing stream: line 0, column 0, byte 0: invalid character U+0016
2020-09-14T07:18:50.069Z|219642|jsonrpc|WARN|Dropped 4 log messages in last
12 seconds (most recently, 4 seconds ago) due to excessive rate
2020-09-14T07:18:50.069Z|219643|jsonrpc|WARN|tcp:DC01-host01:51190:
received SSL data on JSON-RPC channel
2020-09-14T07:18:50.070Z|219644|reconnect|WARN|tcp:DC01-host01:51190:
connection dropped (Protocol error)
2020-09-14T07:18:51.147Z|219645|reconnect|WARN|tcp:DC01-host02:37046:
connection dropped (Protocol error)
2020-09-14T07:18:54.209Z|219646|reconnect|WARN|tcp:DC02-host01:33150:
connection dropped (Protocol error)
2020-09-14T07:18:58.192Z|219647|reconnect|WARN|tcp:DC01-host01:51192:
connection dropped (Protocol error)
2020-09-14T07:18:59.262Z|219648|jsonrpc|WARN|Dropped 3 log messages in last
8 seconds (most recently, 1 seconds ago) due to excessive rate
2020-09-14T07:18:59.262Z|219649|jsonrpc|WARN|tcp:DC01-host02:37048: error
parsing stream: line 0, column 0, byte 0: invalid character U+0016
2020-09-14T07:18:59.263Z|219650|jsonrpc|WARN|Dropped 3 log messages in last
8 seconds (most recently, 1 seconds ago) due to excessive rate
2020-09-14T07:18:59.263Z|219651|jsonrpc|WARN|tcp:DC01-host02:37048:
received SSL data on JSON-RPC channel
2020-09-14T07:18:59.263Z|219652|reconnect|WARN|tcp:DC01-host02:37048:
connection dropped (Protocol error)
2020-09-14T07:19:02.220Z|219653|reconnect|WARN|tcp:DC02-host01:33152:
connection dropped (Protocol error)
2020-09-14T07:19:06.316Z|219654|reconnect|WARN|tcp:DC01-host01:51194:
connection dropped (Protocol error)
2020-09-14T07:19:07.386Z|219655|reconnect|WARN|tcp:DC01-host02:37050:
connection dropped (Protocol error)
2020-09-14T07:19:10.232Z|219656|reconnect|WARN|tcp:DC02-host01:33154:
connection dropped (Protocol error)
2020-09-14T07:19:14.439Z|219657|jsonrpc|WARN|Dropped 4 log messages in last
12 seconds (most recently, 4 seconds ago) due to excessive rate
2020-09-14T07:19:14.439Z|219658|jsonrpc|WARN|tcp:DC01-host01:51196: error
parsing stream: line 0, column 0, byte 0: invalid character U+0016
2020-09-14T07:19:14.439Z|219659|jsonrpc|WARN|Dropped 4 log messages in last
12 seconds (most recently, 4 seconds ago) due to excessive rate
2020-09-14T07:19:14.439Z|219660|jsonrpc|WARN|tcp:DC01-host01:51196:
received SSL data on JSON-RPC channel
2020-09-14T07:19:14.440Z|219661|reconnect|WARN|tcp:DC01-host01:51196:
connection dropped (Protocol error)
2020-09-14T07:19:15.505Z|219662|reconnect|WARN|tcp:DC01-host02:37052:
connection dropped (Protocol error)
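As an aside, the "invalid character U+0016" in the parse errors above looks
like the first byte of a TLS ClientHello (record type 0x16, handshake),
which would mean the hosts are speaking SSL to a plain TCP listener. A
quick check of that byte value (my own observation, not from the logs):

```shell
# The "invalid character U+0016" in the parse errors is byte 0x16,
# the TLS record type for a handshake (a ClientHello starts with it),
# i.e. SSL traffic arriving on a plain TCP JSON-RPC socket.
byte=$(printf '\026')               # U+0016 as a raw byte
printf 'U+0016 = 0x%x\n' "'$byte"   # prints: U+0016 = 0x16
```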


How can we fix these SSL errors?
I thought vdsm handled the certificate provisioning on the host nodes so
that they can communicate with the engine host.
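Comparing my get-connection output above (ptcp:6641 and ptcp:6642) with
your expected output below (pssl:6641 and pssl:6642), could it be that the
engine's OVN databases are listening on plain TCP while the hosts connect
over SSL? If so, would something along these lines (an untested guess on my
side) be the right way to switch the listeners, or is this normally handled
by engine-setup?

```shell
# Untested guess: switch the OVN DB listeners from plain TCP back to SSL,
# mirroring the pssl:* values in the expected output quoted below.
ovn-nbctl set-connection pssl:6641
ovn-sbctl set-connection pssl:6642
```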

On Fri, Sep 11, 2020 at 6:39 PM Dominik Holler <[email protected]> wrote:

> It still looks like the ovn-controller on the host has problems
> communicating with the OVN southbound database.
>
> Are there any hints in /var/log/openvswitch/*.log,
> especially in /var/log/openvswitch/ovsdb-server-sb.log ?
>
> Can you please check the output of
>
> ovn-nbctl get-ssl
> ovn-nbctl get-connection
> ovn-sbctl get-ssl
> ovn-sbctl get-connection
> ls -l /etc/pki/ovirt-engine/keys/ovn-*
>
> it should be similar to
>
> [root@ovirt-43 ~]# ovn-nbctl get-ssl
> Private key: /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
> Certificate: /etc/pki/ovirt-engine/certs/ovn-ndb.cer
> CA Certificate: /etc/pki/ovirt-engine/ca.pem
> Bootstrap: false
> [root@ovirt-43 ~]# ovn-nbctl get-connection
> pssl:6641:[::]
> [root@ovirt-43 ~]# ovn-sbctl get-ssl
> Private key: /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
> Certificate: /etc/pki/ovirt-engine/certs/ovn-sdb.cer
> CA Certificate: /etc/pki/ovirt-engine/ca.pem
> Bootstrap: false
> [root@ovirt-43 ~]# ovn-sbctl get-connection
> read-write role="" pssl:6642:[::]
> [root@ovirt-43 ~]# ls -l /etc/pki/ovirt-engine/keys/ovn-*
> -rw-r-----. 1 root hugetlbfs 1828 Oct 14  2019
> /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
> -rw-------. 1 root root      2709 Oct 14  2019
> /etc/pki/ovirt-engine/keys/ovn-ndb.p12
> -rw-r-----. 1 root hugetlbfs 1828 Oct 14  2019
> /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
> -rw-------. 1 root root      2709 Oct 14  2019
> /etc/pki/ovirt-engine/keys/ovn-sdb.p12
>
>
>
>
> On Fri, Sep 11, 2020 at 1:10 PM Konstantinos Betsis <[email protected]>
> wrote:
>
>> I restarted the ovn-controller; this is the output in
>> ovn-controller.log:
>>
>> 2020-09-11T10:54:07.566Z|00001|vlog|INFO|opened log file
>> /var/log/openvswitch/ovn-controller.log
>> 2020-09-11T10:54:07.568Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
>> connecting...
>> 2020-09-11T10:54:07.568Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
>> connected
>> 2020-09-11T10:54:07.570Z|00004|main|INFO|OVS IDL reconnected, force
>> recompute.
>> 2020-09-11T10:54:07.571Z|00005|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642:
>> connecting...
>> 2020-09-11T10:54:07.571Z|00006|main|INFO|OVNSB IDL reconnected, force
>> recompute.
>> 2020-09-11T10:54:07.685Z|00007|stream_ssl|WARN|SSL_connect: unexpected
>> SSL connection close
>> 2020-09-11T10:54:07.685Z|00008|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642:
>> connection attempt failed (Protocol error)
>> 2020-09-11T10:54:08.685Z|00009|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642:
>> connecting...
>> 2020-09-11T10:54:08.800Z|00010|stream_ssl|WARN|SSL_connect: unexpected
>> SSL connection close
>> 2020-09-11T10:54:08.800Z|00011|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642:
>> connection attempt failed (Protocol error)
>> 2020-09-11T10:54:08.800Z|00012|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642:
>> waiting 2 seconds before reconnect
>> 2020-09-11T10:54:10.802Z|00013|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642:
>> connecting...
>> 2020-09-11T10:54:10.917Z|00014|stream_ssl|WARN|SSL_connect: unexpected
>> SSL connection close
>> 2020-09-11T10:54:10.917Z|00015|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642:
>> connection attempt failed (Protocol error)
>> 2020-09-11T10:54:10.917Z|00016|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642:
>> waiting 4 seconds before reconnect
>> 2020-09-11T10:54:14.921Z|00017|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642:
>> connecting...
>> 2020-09-11T10:54:15.036Z|00018|stream_ssl|WARN|SSL_connect: unexpected
>> SSL connection close
>> 2020-09-11T10:54:15.036Z|00019|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642:
>> connection attempt failed (Protocol error)
>> 2020-09-11T10:54:15.036Z|00020|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642:
>> continuing to reconnect in the background but suppressing further logging
>>
>>
>> I have also run vdsm-tool ovn-config OVIRT_ENGINE_IP
>> OVIRTMGMT_NETWORK_DC.
>> This is how the OVIRT_ENGINE_IP is provided to the ovn-controller; I can
>> redo it if you want.
>>
>> After the restart of the ovn-controller, the oVirt Engine still shows
>> only two geneve connections, one with DC01-host02 and one with
>> DC02-host01.
>> Chassis "c4b23834-aec7-4bf8-8be7-aa94a50a6144"
>>     hostname: "dc02-host01"
>>     Encap geneve
>>         ip: "DC02-host01_IP"
>>         options: {csum="true"}
>> Chassis "be3abcc9-7358-4040-a37b-8d8a782f239c"
>>     hostname: "DC01-host02"
>>     Encap geneve
>>         ip: "DC01-host02"
>>         options: {csum="true"}
>>
>> I've re-run the vdsm-tool command and, again, nothing changed, with the
>> same errors as after systemctl restart ovn-controller.
>>
>> On Fri, Sep 11, 2020 at 1:49 PM Dominik Holler <[email protected]>
>> wrote:
>>
>>> Please include ovirt-users list in your reply, to share the knowledge
>>> and experience with the community!
>>>
>>> On Fri, Sep 11, 2020 at 12:12 PM Konstantinos Betsis <[email protected]>
>>> wrote:
>>>
>>>> Ok below the output per node and DC
>>>> DC01
>>>> node01
>>>>
>>>> [root@dc01-node01 ~]# ovs-vsctl --no-wait get open .
>>>> external-ids:ovn-remote
>>>> "ssl:*OVIRT_ENGINE_IP*:6642"
>>>> [root@ dc01-node01 ~]# ovs-vsctl --no-wait get open .
>>>> external-ids:ovn-encap-type
>>>> geneve
>>>> [root@ dc01-node01 ~]# ovs-vsctl --no-wait get open .
>>>> external-ids:ovn-encap-ip
>>>>
>>>> "*OVIRTMGMT_IP_DC01-NODE01*"
>>>>
>>>> node02
>>>>
>>>> [root@dc01-node02 ~]# ovs-vsctl --no-wait get open .
>>>> external-ids:ovn-remote
>>>> "ssl:*OVIRT_ENGINE_IP*:6642"
>>>> [root@ dc01-node02 ~]# ovs-vsctl --no-wait get open .
>>>> external-ids:ovn-encap-type
>>>> geneve
>>>> [root@ dc01-node02 ~]# ovs-vsctl --no-wait get open .
>>>> external-ids:ovn-encap-ip
>>>>
>>>> "*OVIRTMGMT_IP_DC01-NODE02*"
>>>>
>>>> DC02
>>>> node01
>>>>
>>>> [root@dc02-node01 ~]# ovs-vsctl --no-wait get open .
>>>> external-ids:ovn-remote
>>>> "ssl:*OVIRT_ENGINE_IP*:6642"
>>>> [root@ dc02-node01 ~]# ovs-vsctl --no-wait get open .
>>>> external-ids:ovn-encap-type
>>>> geneve
>>>> [root@ dc02-node01 ~]# ovs-vsctl --no-wait get open .
>>>> external-ids:ovn-encap-ip
>>>>
>>>> "*OVIRTMGMT_IP_DC02-NODE01*"
>>>>
>>>>
>>> Looks good.
>>>
>>>
>>>> DC01 node01 and node02 share the same VM networks and VMs deployed on
>>>> top of them cannot talk to VM on the other hypervisor.
>>>>
>>>
>>> Maybe there is a hint on ovn-controller.log on dc01-node02 ? Maybe
>>> restarting ovn-controller creates more helpful log messages?
>>>
>>> You can also try restarting the OVN configuration on all hosts by
>>> executing
>>> vdsm-tool ovn-config OVIRT_ENGINE_IP LOCAL_OVIRTMGMT_IP
>>> on each host; this would trigger
>>>
>>> https://github.com/oVirt/ovirt-provider-ovn/blob/master/driver/scripts/setup_ovn_controller.sh
>>> internally.
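>>> Roughly speaking, that script applies the same kind of settings shown
>>> by the get commands earlier in this thread, along these lines
>>> (simplified sketch, not the exact script; OVIRT_ENGINE_IP and
>>> LOCAL_OVIRTMGMT_IP are placeholders):

```shell
# Simplified sketch of the effect of vdsm-tool ovn-config
# (placeholders, not the literal script contents):
ovs-vsctl --no-wait set open . \
    external-ids:ovn-remote="ssl:OVIRT_ENGINE_IP:6642" \
    external-ids:ovn-encap-type=geneve \
    external-ids:ovn-encap-ip="LOCAL_OVIRTMGMT_IP"
systemctl restart ovn-controller
```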
>>>
>>>
>>>> So I would expect node01 to have a geneve tunnel to node02, and vice
>>>> versa.
>>>>
>>>>
>>> Me too.
>>>
>>>
>>>> On Fri, Sep 11, 2020 at 12:14 PM Dominik Holler <[email protected]>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Sep 11, 2020 at 10:53 AM Konstantinos Betsis <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi Dominik
>>>>>>
>>>>>> OVN is selected as the default network provider on the clusters and
>>>>>> the hosts.
>>>>>>
>>>>>>
>>>>> sounds good.
>>>>> This configuration is required already when the host is added to
>>>>> oVirt Engine, because OVN is configured during that step.
>>>>>
>>>>>
>>>>>> "ovn-sbctl show" works on the oVirt engine and shows only two
>>>>>> hosts, one per DC.
>>>>>>
>>>>>> Chassis "c4b23834-aec7-4bf8-8be7-aa94a50a6144"
>>>>>>     hostname: "dc01-node02"
>>>>>>     Encap geneve
>>>>>>         ip: "X.X.X.X"
>>>>>>         options: {csum="true"}
>>>>>> Chassis "be3abcc9-7358-4040-a37b-8d8a782f239c"
>>>>>>     hostname: "dc02-node1"
>>>>>>     Encap geneve
>>>>>>         ip: "A.A.A.A"
>>>>>>         options: {csum="true"}
>>>>>>
>>>>>>
>>>>>> The new node is not listed (dc01-node1).
>>>>>>
>>>>>> When executed on the nodes themselves, the same command (ovn-sbctl
>>>>>> show) times out on every node.
>>>>>>
>>>>>> On all nodes, /var/log/openvswitch/ovn-controller.log repeatedly
>>>>>> logs:
>>>>>>
>>>>>> 2020-09-11T08:46:55.197Z|07361|stream_ssl|WARN|SSL_connect:
>>>>>> unexpected SSL connection close
>>>>>>
>>>>>>
>>>>>>
>>>>> Can you please compare the output of
>>>>>
>>>>> ovs-vsctl --no-wait get open . external-ids:ovn-remote
>>>>> ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
>>>>> ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
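>>>>> To grab all three at once on each host, a small loop like this (the
>>>>> same commands, just wrapped) may be convenient:

```shell
# Print the three OVN-related external-ids; run on each host and
# diff the output between the working and the failing host.
for key in ovn-remote ovn-encap-type ovn-encap-ip; do
    printf '%s = ' "$key"
    ovs-vsctl --no-wait get open . "external-ids:$key"
done
```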
>>>>>
>>>>> of the working hosts, e.g. dc01-node02, and the failing host
>>>>> dc01-node1?
>>>>> This should point us to the relevant difference in the configuration.
>>>>>
>>>>> Please include the ovirt-users list in your reply, to share the
>>>>> knowledge and experience with the community.
>>>>>
>>>>>
>>>>>
>>>>>> Thank you
>>>>>> Best regards
>>>>>> Konstantinos Betsis
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 11, 2020 at 11:01 AM Dominik Holler <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Sep 10, 2020 at 6:26 PM Konstantinos B <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi all
>>>>>>>>
>>>>>>>> We have a small installation based on OVIRT 4.3.
>>>>>>>> One cluster is based on CentOS 7 and the other on the oVirt NG
>>>>>>>> Node image.
>>>>>>>>
>>>>>>>> The environment was stable till an upgrade took place a couple of
>>>>>>>> months ago.
>>>>>>>> As such we had to re-install one of the CentOS 7 nodes and start
>>>>>>>> from scratch.
>>>>>>>>
>>>>>>>
>>>>>>> To trigger the automatic configuration of the host, it is required
>>>>>>> to configure ovirt-provider-ovn as the default network provider for the
>>>>>>> cluster before adding the host to oVirt.
>>>>>>>
>>>>>>>
>>>>>>>> Even though the installation completed successfully and VMs are
>>>>>>>> created, the following are not working as expected:
>>>>>>>> 1. ovn geneve tunnels are not established with the other Centos 7
>>>>>>>> node in the cluster.
>>>>>>>> 2. The CentOS 7 node is configured by oVirt Engine; however, no
>>>>>>>> geneve tunnel is established when "ovn-sbctl show" is issued on
>>>>>>>> the engine.
>>>>>>>>
>>>>>>>
>>>>>>> Does "ovn-sbctl show" list the hosts?
>>>>>>>
>>>>>>>
>>>>>>>> 3. no flows are shown on the engine on port 6642 for the ovs db.
>>>>>>>>
>>>>>>>> Does anyone have any experience on how to troubleshoot OVN on ovirt?
>>>>>>>>
>>>>>>>>
>>>>>>> /var/log/openvswitch/ovn-controller.log on the host should contain
>>>>>>> a helpful hint.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Thank you
>>>>>>>> _______________________________________________
>>>>>>>> Users mailing list -- [email protected]
>>>>>>>> To unsubscribe send an email to [email protected]
>>>>>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>>>>>> oVirt Code of Conduct:
>>>>>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>>>>>> List Archives:
>>>>>>>> https://lists.ovirt.org/archives/list/[email protected]/message/LBVGLQJBWJF3EKFITPR72LBPA5A43WWW/
>>>>>>>>
>>>>>>>