Can you try again with:

[OVN REMOTE]
ovn-remote=ssl:127.0.0.1:6641
[SSL]
https-enabled=false
ssl-cacert-file=/etc/pki/ovirt-engine/ca.pem
ssl-cert-file=/etc/pki/ovirt-engine/certs/ovirt-provider-ovn.cer
ssl-key-file=/etc/pki/ovirt-engine/keys/ovirt-provider-ovn.key.nopass
[OVIRT]
ovirt-sso-client-secret=*random_test*
ovirt-host=https://dc02-ovirt01.testdomain.com:443
ovirt-sso-client-id=ovirt-provider-ovn
ovirt-ca-file=/etc/pki/ovirt-engine/apache-ca.pem
[NETWORK]
port-security-enabled-default=True
[PROVIDER]
provider-host=dc02-ovirt01.testdomain.com

Please note that the https-enabled setting should match the protocol (HTTP or HTTPS) configured for ovirt-provider-ovn in oVirt Engine.
So if the ovirt-provider-ovn entity in Engine is on HTTP, the config file should use
https-enabled=false

On Tue, Sep 15, 2020 at 5:56 PM Konstantinos Betsis <[email protected]> wrote:

> This is the updated one:
>
> # This file is automatically generated by engine-setup. Please do not edit manually
> [OVN REMOTE]
> ovn-remote=ssl:127.0.0.1:6641
> [SSL]
> https-enabled=true
> ssl-cacert-file=/etc/pki/ovirt-engine/ca.pem
> ssl-cert-file=/etc/pki/ovirt-engine/certs/ovirt-provider-ovn.cer
> ssl-key-file=/etc/pki/ovirt-engine/keys/ovirt-provider-ovn.key.nopass
> [OVIRT]
> ovirt-sso-client-secret=*random_text*
> ovirt-host=https://dc02-ovirt01.testdomain.com:443
> ovirt-sso-client-id=ovirt-provider-ovn
> ovirt-ca-file=/etc/pki/ovirt-engine/apache-ca.pem
> [NETWORK]
> port-security-enabled-default=True
> [PROVIDER]
> provider-host=dc02-ovirt01.testdomain.com
> [AUTH]
> auth-plugin=auth.plugins.static_token:NoAuthPlugin
>
> However, it still does not connect.
> It prompts for the certificate but then fails and prompts to see the log,
> but the ovirt-provider-ovn.log does not list anything.
>
> Yes we've got ovirt for about a year now from about version 4.1
>

This might explain the trouble. Upgrade of ovirt-provider-ovn should work flawlessly starting from oVirt 4.2.

> On Tue, Sep 15, 2020 at 6:44 PM Dominik Holler <[email protected]> wrote:
>
>> On Tue, Sep 15, 2020 at 5:34 PM Konstantinos Betsis <[email protected]> wrote:
>>
>>> There is a file with the below entries
>>
>> Impressive, do you know when this config file was created and if it was manually modified?
>> Is this an upgrade from oVirt 4.1?
>>
>>> [root@dc02-ovirt01 log]# cat /etc/ovirt-provider-ovn/conf.d/10-setup-ovirt-provider-ovn.conf
>>> # This file is automatically generated by engine-setup. Please do not edit manually
>>> [OVN REMOTE]
>>> ovn-remote=tcp:127.0.0.1:6641
>>> [SSL]
>>> https-enabled=false
>>> ssl-cacert-file=/etc/pki/ovirt-engine/ca.pem
>>> ssl-cert-file=/etc/pki/ovirt-engine/certs/ovirt-provider-ovn.cer
>>> ssl-key-file=/etc/pki/ovirt-engine/keys/ovirt-provider-ovn.key.nopass
>>> [OVIRT]
>>> ovirt-sso-client-secret=*random_test*
>>> ovirt-host=https://dc02-ovirt01.testdomain.com:443
>>> ovirt-sso-client-id=ovirt-provider-ovn
>>> ovirt-ca-file=/etc/pki/ovirt-engine/apache-ca.pem
>>> [NETWORK]
>>> port-security-enabled-default=True
>>> [PROVIDER]
>>> provider-host=dc02-ovirt01.testdomain.com
>>>
>>> The only entry missing is the [AUTH] and under [SSL] the https-enabled is false.
>>> Should I edit this in this file or is this going to break everything?
>>>
>> Changing the file should improve things, but it is better to create a backup in another
>> directory before modifying it.
>> The only required change is
>> from
>> ovn-remote=tcp:127.0.0.1:6641
>> to
>> ovn-remote=ssl:127.0.0.1:6641
>>
>>> On Tue, Sep 15, 2020 at 6:27 PM Dominik Holler <[email protected]> wrote:
>>>
>>>> On Tue, Sep 15, 2020 at 5:11 PM Konstantinos Betsis <[email protected]> wrote:
>>>>
>>>>> Hi Dominik
>>>>>
>>>>> That immediately fixed the geneve tunnels between all hosts.
>>>>>
>>>> thanks for the feedback.
>>>>
>>>>> However, the ovn provider is now broken.
>>>>> After fixing the networks we tried to move a VM to the DC01-host01 so
>>>>> we powered it down and simply configured it to run on dc01-node01.
>>>>>
>>>>> While checking the logs on the ovirt engine i noticed the below:
>>>>> Failed to synchronize networks of Provider ovirt-provider-ovn.
>>>>>
>>>>> The ovn-provider configuration on the engine is the following:
>>>>> Name: ovirt-provider-ovn
>>>>> Description: oVirt network provider for OVN
>>>>> Type: External Network Provider
>>>>> Network Plugin: oVirt Network Provider for OVN
>>>>> Automatic Synchronization: Checked
>>>>> Unmanaged: Unchecked
>>>>> Provider URL: http:localhost:9696
>>>>> Requires Authentication: Checked
>>>>> Username: admin@internal
>>>>> Password: "The admin password"
>>>>> Protocol: HTTP
>>>>> Host Name: dc02-ovirt01
>>>>> API Port: 35357
>>>>> API Version: v2.0
>>>>> Tenant Name: "Empty"
>>>>>
>>>>> In the past this was deleted by an engineer and recreated as per the
>>>>> documentation, and it worked. Do we need to update something due to the SSL on the ovn?
>>>>>
>>>> Is there a file in /etc/ovirt-provider-ovn/conf.d/ ?
>>>> engine-setup should have created one.
>>>> If the file is missing, for testing purposes, you can create a file
>>>> /etc/ovirt-provider-ovn/conf.d/00-setup-ovirt-provider-ovn-test.conf :
>>>> [PROVIDER]
>>>> provider-host=REPLACE_WITH_FQDN
>>>> [SSL]
>>>> ssl-cert-file=/etc/pki/ovirt-engine/certs/ovirt-provider-ovn.cer
>>>> ssl-key-file=/etc/pki/ovirt-engine/keys/ovirt-provider-ovn.key.nopass
>>>> ssl-cacert-file=/etc/pki/ovirt-engine/ca.pem
>>>> https-enabled=true
>>>> [OVN REMOTE]
>>>> ovn-remote=ssl:127.0.0.1:6641
>>>> [AUTH]
>>>> auth-plugin=auth.plugins.static_token:NoAuthPlugin
>>>> [NETWORK]
>>>> port-security-enabled-default=True
>>>>
>>>> and restart the ovirt-provider-ovn service.
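Before restarting the service, it can help to double-check which transport the config file actually selects. The following is only a minimal sketch: it assumes the conf.d file is plain INI that Python's standard configparser can read, and the function name summarize and the shortened SAMPLE text are made up for illustration. It is not how ovirt-provider-ovn itself loads its configuration.

```python
import configparser

# Shortened copy of the test config from the thread (illustrative only).
SAMPLE = """
[PROVIDER]
provider-host=REPLACE_WITH_FQDN
[SSL]
https-enabled=true
[OVN REMOTE]
ovn-remote=ssl:127.0.0.1:6641
"""


def summarize(text):
    """Return the ovn-remote scheme and the https-enabled flag from INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    remote = parser.get("OVN REMOTE", "ovn-remote")
    # "ssl:127.0.0.1:6641" -> "ssl", "tcp:127.0.0.1:6641" -> "tcp"
    scheme = remote.split(":", 1)[0]
    https_enabled = parser.getboolean("SSL", "https-enabled")
    return {"ovn_remote_scheme": scheme, "https_enabled": https_enabled}


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

To check a real file, pass its contents to summarize() and confirm that the scheme is ssl once ovn-central listens on pssl, and that https-enabled agrees with the protocol configured for the provider in Engine.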
>>>>
>>>>> From the ovn-provider logs the below is generated after a service
>>>>> restart and when the start VM is triggered
>>>>>
>>>>> 2020-09-15 15:07:33,579 root Starting server
>>>>> 2020-09-15 15:07:33,579 root Version: 1.2.29-1
>>>>> 2020-09-15 15:07:33,579 root Build date: 20191217125241
>>>>> 2020-09-15 15:07:33,579 root Githash: cb5a80d
>>>>> 2020-09-15 15:08:26,582 root From: ::ffff:127.0.0.1:59980 Request: GET /v2.0/ports
>>>>> 2020-09-15 15:08:26,582 root Could not retrieve schema from tcp:127.0.0.1:6641: Unknown error -1
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/share/ovirt-provider-ovn/handlers/base_handler.py", line 138, in _handle_request
>>>>>     method, path_parts, content
>>>>>   File "/usr/share/ovirt-provider-ovn/handlers/selecting_handler.py", line 175, in handle_request
>>>>>     return self.call_response_handler(handler, content, parameters)
>>>>>   File "/usr/share/ovirt-provider-ovn/handlers/neutron.py", line 35, in call_response_handler
>>>>>     with NeutronApi() as ovn_north:
>>>>>   File "/usr/share/ovirt-provider-ovn/neutron/neutron_api.py", line 95, in __init__
>>>>>     self.ovsidl, self.idl = ovn_connection.connect()
>>>>>   File "/usr/share/ovirt-provider-ovn/ovn_connection.py", line 46, in connect
>>>>>     ovnconst.OVN_NORTHBOUND
>>>>>   File "/usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 127, in from_server
>>>>>     helper = idlutils.get_schema_helper(connection_string, schema_name)
>>>>>   File "/usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 128, in get_schema_helper
>>>>>     'err': os.strerror(err)})
>>>>> Exception: Could not retrieve schema from tcp:127.0.0.1:6641: Unknown error -1
>>>>>
>>>>> When i update the ovn provider from the GUI to have
>>>>> https://localhost:9696/ and HTTPS as the protocol the test fails.
>>>>>
>>>>> On Tue, Sep 15, 2020 at 5:35 PM Dominik Holler <[email protected]> wrote:
>>>>>
>>>>>> On Mon, Sep 14, 2020 at 9:25 AM Konstantinos Betsis <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Dominik
>>>>>>>
>>>>>>> When these commands are used on the ovirt-engine host the output is
>>>>>>> the one depicted in your email.
>>>>>>> For your reference see also below:
>>>>>>>
>>>>>>> [root@ath01-ovirt01 certs]# ovn-nbctl get-ssl
>>>>>>> Private key: /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
>>>>>>> Certificate: /etc/pki/ovirt-engine/certs/ovn-ndb.cer
>>>>>>> CA Certificate: /etc/pki/ovirt-engine/ca.pem
>>>>>>> Bootstrap: false
>>>>>>> [root@ath01-ovirt01 certs]# ovn-nbctl get-connection
>>>>>>> ptcp:6641
>>>>>>>
>>>>>>> [root@ath01-ovirt01 certs]# ovn-sbctl get-ssl
>>>>>>> Private key: /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
>>>>>>> Certificate: /etc/pki/ovirt-engine/certs/ovn-sdb.cer
>>>>>>> CA Certificate: /etc/pki/ovirt-engine/ca.pem
>>>>>>> Bootstrap: false
>>>>>>> [root@ath01-ovirt01 certs]# ovn-sbctl get-connection
>>>>>>> read-write role="" ptcp:6642
>>>>>>>
>>>>>> ^^^ the line above points to the problem: ovn-central is configured
>>>>>> to use plain TCP without SSL.
>>>>>> engine-setup usually configures ovn-central to use SSL. The fact that the
>>>>>> files /etc/pki/ovirt-engine/keys/ovn-* exist shows that engine-setup was
>>>>>> triggered correctly. It looks like the ovn db was dropped somehow; this should not happen.
>>>>>> This can be fixed manually by executing the following commands on
>>>>>> engine's machine:
>>>>>> ovn-nbctl set-ssl /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass /etc/pki/ovirt-engine/certs/ovn-ndb.cer /etc/pki/ovirt-engine/ca.pem
>>>>>> ovn-nbctl set-connection pssl:6641
>>>>>> ovn-sbctl set-ssl /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass /etc/pki/ovirt-engine/certs/ovn-sdb.cer /etc/pki/ovirt-engine/ca.pem
>>>>>> ovn-sbctl set-connection pssl:6642
>>>>>>
>>>>>> The /var/log/openvswitch/ovn-controller.log on the hosts should tell
>>>>>> that br-int.mgmt is connected now.
>>>>>>
>>>>>>> [root@ath01-ovirt01 certs]# ls -l /etc/pki/ovirt-engine/keys/ovn-*
>>>>>>> -rw-r-----. 1 root hugetlbfs 1828 Jun 25 11:08 /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
>>>>>>> -rw-------. 1 root root 2893 Jun 25 11:08 /etc/pki/ovirt-engine/keys/ovn-ndb.p12
>>>>>>> -rw-r-----. 1 root hugetlbfs 1828 Jun 25 11:08 /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
>>>>>>> -rw-------. 1 root root 2893 Jun 25 11:08 /etc/pki/ovirt-engine/keys/ovn-sdb.p12
>>>>>>>
>>>>>>> When i try the above commands on the node hosts the following happens:
>>>>>>> ovn-nbctl get-ssl / get-connection
>>>>>>> ovn-nbctl: unix:/var/run/openvswitch/ovnnb_db.sock: database connection failed (No such file or directory)
>>>>>>> The above i believe is expected since no northbound connections
>>>>>>> should be established from the host nodes.
>>>>>>>
>>>>>>> ovn-sbctl get-ssl / get-connection
>>>>>>> The output is stuck till i terminate it.
>>>>>>>
>>>>>> Yes, the ovn-* commands work only on engine's machine, which has the
>>>>>> role ovn-central.
>>>>>> On the hosts, there is only the ovn-controller, which connects the
>>>>>> ovn southbound to openvswitch on the host.
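The mismatch described above (hosts dialing ssl:...:6642 while ovn-central listens on ptcp:6642) can be written down as a small check. This is only an illustrative sketch: the helper names listener_scheme and remote_matches_listener are made up here, and the input strings are the get-connection outputs and ovn-remote values quoted in this thread, pasted in by hand.

```python
def listener_scheme(get_connection_output):
    """Extract 'ptcp' or 'pssl' from `ovn-nbctl/ovn-sbctl get-connection` output."""
    for token in get_connection_output.split():
        if token.startswith(("ptcp:", "pssl:")):
            return token.split(":", 1)[0]
    return None


def remote_matches_listener(remote, get_connection_output):
    """True if a client remote such as 'ssl:HOST:6642' fits the passive listener."""
    expected = {"tcp": "ptcp", "ssl": "pssl"}
    wanted = expected.get(remote.split(":", 1)[0])
    return wanted is not None and wanted == listener_scheme(get_connection_output)


if __name__ == "__main__":
    # Broken state reported in the thread: SSL client, plain TCP listener.
    print(remote_matches_listener("ssl:OVIRT_ENGINE_IP:6642", 'read-write role="" ptcp:6642'))
    # Expected state after `ovn-sbctl set-connection pssl:6642`.
    print(remote_matches_listener("ssl:OVIRT_ENGINE_IP:6642", 'read-write role="" pssl:6642:[::]'))
```

The same check applies to the northbound side, where the provider's ovn-remote value has to agree with the output of ovn-nbctl get-connection.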
>>>>>>
>>>>>>> For the requested logs the below are found in the ovsdb-server-sb.log
>>>>>>>
>>>>>>> 2020-09-14T07:18:38.187Z|219636|reconnect|WARN|tcp:DC02-host01:33146: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:18:41.946Z|219637|reconnect|WARN|tcp:DC01-host01:51188: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:18:43.033Z|219638|reconnect|WARN|tcp:DC01-host02:37044: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:18:46.198Z|219639|reconnect|WARN|tcp:DC02-host01:33148: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:18:50.069Z|219640|jsonrpc|WARN|Dropped 4 log messages in last 12 seconds (most recently, 4 seconds ago) due to excessive rate
>>>>>>> 2020-09-14T07:18:50.069Z|219641|jsonrpc|WARN|tcp:DC01-host01:51190: error parsing stream: line 0, column 0, byte 0: invalid character U+0016
>>>>>>> 2020-09-14T07:18:50.069Z|219642|jsonrpc|WARN|Dropped 4 log messages in last 12 seconds (most recently, 4 seconds ago) due to excessive rate
>>>>>>> 2020-09-14T07:18:50.069Z|219643|jsonrpc|WARN|tcp:DC01-host01:51190: received SSL data on JSON-RPC channel
>>>>>>> 2020-09-14T07:18:50.070Z|219644|reconnect|WARN|tcp:DC01-host01:51190: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:18:51.147Z|219645|reconnect|WARN|tcp:DC01-host02:37046: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:18:54.209Z|219646|reconnect|WARN|tcp:DC02-host01:33150: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:18:58.192Z|219647|reconnect|WARN|tcp:DC01-host01:51192: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:18:59.262Z|219648|jsonrpc|WARN|Dropped 3 log messages in last 8 seconds (most recently, 1 seconds ago) due to excessive rate
>>>>>>> 2020-09-14T07:18:59.262Z|219649|jsonrpc|WARN|tcp:DC01-host02:37048: error parsing stream: line 0, column 0, byte 0: invalid character U+0016
>>>>>>> 2020-09-14T07:18:59.263Z|219650|jsonrpc|WARN|Dropped 3 log messages in last 8 seconds (most recently, 1 seconds ago) due to excessive rate
>>>>>>> 2020-09-14T07:18:59.263Z|219651|jsonrpc|WARN|tcp:DC01-host02:37048: received SSL data on JSON-RPC channel
>>>>>>> 2020-09-14T07:18:59.263Z|219652|reconnect|WARN|tcp:DC01-host02:37048: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:19:02.220Z|219653|reconnect|WARN|tcp:DC02-host01:33152: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:19:06.316Z|219654|reconnect|WARN|tcp:DC01-host01:51194: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:19:07.386Z|219655|reconnect|WARN|tcp:DC01-host02:37050: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:19:10.232Z|219656|reconnect|WARN|tcp:DC02-host01:33154: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:19:14.439Z|219657|jsonrpc|WARN|Dropped 4 log messages in last 12 seconds (most recently, 4 seconds ago) due to excessive rate
>>>>>>> 2020-09-14T07:19:14.439Z|219658|jsonrpc|WARN|tcp:DC01-host01:51196: error parsing stream: line 0, column 0, byte 0: invalid character U+0016
>>>>>>> 2020-09-14T07:19:14.439Z|219659|jsonrpc|WARN|Dropped 4 log messages in last 12 seconds (most recently, 4 seconds ago) due to excessive rate
>>>>>>> 2020-09-14T07:19:14.439Z|219660|jsonrpc|WARN|tcp:DC01-host01:51196: received SSL data on JSON-RPC channel
>>>>>>> 2020-09-14T07:19:14.440Z|219661|reconnect|WARN|tcp:DC01-host01:51196: connection dropped (Protocol error)
>>>>>>> 2020-09-14T07:19:15.505Z|219662|reconnect|WARN|tcp:DC01-host02:37052: connection dropped (Protocol error)
>>>>>>>
>>>>>>> How can we fix these SSL errors?
>>>>>>
>>>>>> I addressed this above.
>>>>>>
>>>>>>> I thought vdsm did the certificate provisioning on the host nodes so
>>>>>>> that they can communicate with the engine host.
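A note on the "invalid character U+0016" warnings in the ovsdb-server-sb.log excerpt above: 0x16 is the TLS record content type for a handshake message, so it is the first byte an SSL client sends, whereas a JSON-RPC message normally starts with "{". That fits the "received SSL data on JSON-RPC channel" lines. The sketch below only illustrates this; classify_first_byte is a hypothetical helper, not part of OVN or Open vSwitch.

```python
TLS_HANDSHAKE_CONTENT_TYPE = 0x16  # TLS record type "handshake" (decimal 22)


def classify_first_byte(data):
    """Guess what kind of peer sent `data` from its first byte only."""
    if not data:
        return "empty"
    if data[0] == TLS_HANDSHAKE_CONTENT_TYPE:
        return "tls-handshake"
    if data.lstrip()[:1] in (b"{", b"["):
        return "json"
    return "unknown"


if __name__ == "__main__":
    # First bytes of a TLS record: type 0x16, then a protocol version field.
    print(classify_first_byte(b"\x16\x03\x01\x02\x00"))
    # A JSON-RPC request as a plain TCP client would send it.
    print(classify_first_byte(b'{"method":"echo","params":[],"id":0}'))
```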
>>>>>>>
>>>>>> Yes, this seems to work in your scenario, just the SSL configuration
>>>>>> on the ovn-central was lost.
>>>>>>
>>>>>>> On Fri, Sep 11, 2020 at 6:39 PM Dominik Holler <[email protected]> wrote:
>>>>>>>
>>>>>>>> It still looks like the ovn-controller on the host has problems
>>>>>>>> communicating with ovn-southbound.
>>>>>>>>
>>>>>>>> Are there any hints in /var/log/openvswitch/*.log,
>>>>>>>> especially in /var/log/openvswitch/ovsdb-server-sb.log ?
>>>>>>>>
>>>>>>>> Can you please check the output of
>>>>>>>>
>>>>>>>> ovn-nbctl get-ssl
>>>>>>>> ovn-nbctl get-connection
>>>>>>>> ovn-sbctl get-ssl
>>>>>>>> ovn-sbctl get-connection
>>>>>>>> ls -l /etc/pki/ovirt-engine/keys/ovn-*
>>>>>>>>
>>>>>>>> it should be similar to
>>>>>>>>
>>>>>>>> [root@ovirt-43 ~]# ovn-nbctl get-ssl
>>>>>>>> Private key: /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
>>>>>>>> Certificate: /etc/pki/ovirt-engine/certs/ovn-ndb.cer
>>>>>>>> CA Certificate: /etc/pki/ovirt-engine/ca.pem
>>>>>>>> Bootstrap: false
>>>>>>>> [root@ovirt-43 ~]# ovn-nbctl get-connection
>>>>>>>> pssl:6641:[::]
>>>>>>>> [root@ovirt-43 ~]# ovn-sbctl get-ssl
>>>>>>>> Private key: /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
>>>>>>>> Certificate: /etc/pki/ovirt-engine/certs/ovn-sdb.cer
>>>>>>>> CA Certificate: /etc/pki/ovirt-engine/ca.pem
>>>>>>>> Bootstrap: false
>>>>>>>> [root@ovirt-43 ~]# ovn-sbctl get-connection
>>>>>>>> read-write role="" pssl:6642:[::]
>>>>>>>> [root@ovirt-43 ~]# ls -l /etc/pki/ovirt-engine/keys/ovn-*
>>>>>>>> -rw-r-----. 1 root hugetlbfs 1828 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-ndb.key.nopass
>>>>>>>> -rw-------. 1 root root 2709 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-ndb.p12
>>>>>>>> -rw-r-----. 1 root hugetlbfs 1828 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-sdb.key.nopass
>>>>>>>> -rw-------. 1 root root 2709 Oct 14 2019 /etc/pki/ovirt-engine/keys/ovn-sdb.p12
>>>>>>>>
>>>>>>>> On Fri, Sep 11, 2020 at 1:10 PM Konstantinos Betsis <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I did a restart of the ovn-controller, this is the output of the
>>>>>>>>> ovn-controller.log
>>>>>>>>>
>>>>>>>>> 2020-09-11T10:54:07.566Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovn-controller.log
>>>>>>>>> 2020-09-11T10:54:07.568Z|00002|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connecting...
>>>>>>>>> 2020-09-11T10:54:07.568Z|00003|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
>>>>>>>>> 2020-09-11T10:54:07.570Z|00004|main|INFO|OVS IDL reconnected, force recompute.
>>>>>>>>> 2020-09-11T10:54:07.571Z|00005|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
>>>>>>>>> 2020-09-11T10:54:07.571Z|00006|main|INFO|OVNSB IDL reconnected, force recompute.
>>>>>>>>> 2020-09-11T10:54:07.685Z|00007|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
>>>>>>>>> 2020-09-11T10:54:07.685Z|00008|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
>>>>>>>>> 2020-09-11T10:54:08.685Z|00009|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
>>>>>>>>> 2020-09-11T10:54:08.800Z|00010|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
>>>>>>>>> 2020-09-11T10:54:08.800Z|00011|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
>>>>>>>>> 2020-09-11T10:54:08.800Z|00012|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: waiting 2 seconds before reconnect
>>>>>>>>> 2020-09-11T10:54:10.802Z|00013|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
>>>>>>>>> 2020-09-11T10:54:10.917Z|00014|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
>>>>>>>>> 2020-09-11T10:54:10.917Z|00015|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
>>>>>>>>> 2020-09-11T10:54:10.917Z|00016|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: waiting 4 seconds before reconnect
>>>>>>>>> 2020-09-11T10:54:14.921Z|00017|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connecting...
>>>>>>>>> 2020-09-11T10:54:15.036Z|00018|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
>>>>>>>>> 2020-09-11T10:54:15.036Z|00019|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: connection attempt failed (Protocol error)
>>>>>>>>> 2020-09-11T10:54:15.036Z|00020|reconnect|INFO|ssl:OVIRT_ENGINE_IP:6642: continuing to reconnect in the background but suppressing further logging
>>>>>>>>>
>>>>>>>>> I have also done the vdsm-tool ovn-config OVIRT_ENGINE_IP OVIRTMGMT_NETWORK_DC
>>>>>>>>> This is how the OVIRT_ENGINE_IP is provided in the ovn controller,
>>>>>>>>> i can redo it if you want.
>>>>>>>>>
>>>>>>>>> After the restart of the ovn-controller the OVIRT ENGINE still
>>>>>>>>> shows only two geneve connections, one with DC01-host02 and one with DC02-host01.
>>>>>>>>> Chassis "c4b23834-aec7-4bf8-8be7-aa94a50a6144"
>>>>>>>>>     hostname: "dc02-host01"
>>>>>>>>>     Encap geneve
>>>>>>>>>         ip: "DC02-host01_IP"
>>>>>>>>>         options: {csum="true"}
>>>>>>>>> Chassis "be3abcc9-7358-4040-a37b-8d8a782f239c"
>>>>>>>>>     hostname: "DC01-host02"
>>>>>>>>>     Encap geneve
>>>>>>>>>         ip: "DC01-host02"
>>>>>>>>>         options: {csum="true"}
>>>>>>>>>
>>>>>>>>> I've re-done the vdsm-tool command and nothing changed, again, with
>>>>>>>>> the same errors as after the systemctl restart ovn-controller.
>>>>>>>>>
>>>>>>>>> On Fri, Sep 11, 2020 at 1:49 PM Dominik Holler <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Please include ovirt-users list in your reply, to share
>>>>>>>>>> the knowledge and experience with the community!
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 11, 2020 at 12:12 PM Konstantinos Betsis <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Ok below the output per node and DC
>>>>>>>>>>> DC01
>>>>>>>>>>> node01
>>>>>>>>>>>
>>>>>>>>>>> [root@dc01-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-remote
>>>>>>>>>>> "ssl:*OVIRT_ENGINE_IP*:6642"
>>>>>>>>>>> [root@dc01-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
>>>>>>>>>>> geneve
>>>>>>>>>>> [root@dc01-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
>>>>>>>>>>> "*OVIRTMGMT_IP_DC01-NODE01*"
>>>>>>>>>>>
>>>>>>>>>>> node02
>>>>>>>>>>>
>>>>>>>>>>> [root@dc01-node02 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-remote
>>>>>>>>>>> "ssl:*OVIRT_ENGINE_IP*:6642"
>>>>>>>>>>> [root@dc01-node02 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
>>>>>>>>>>> geneve
>>>>>>>>>>> [root@dc01-node02 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
>>>>>>>>>>> "*OVIRTMGMT_IP_DC01-NODE02*"
>>>>>>>>>>>
>>>>>>>>>>> DC02
>>>>>>>>>>> node01
>>>>>>>>>>>
>>>>>>>>>>> [root@dc02-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-remote
>>>>>>>>>>> "ssl:*OVIRT_ENGINE_IP*:6642"
>>>>>>>>>>> [root@dc02-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
>>>>>>>>>>> geneve
>>>>>>>>>>> [root@dc02-node01 ~]# ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
>>>>>>>>>>> "*OVIRTMGMT_IP_DC02-NODE01*"
>>>>>>>>>>>
>>>>>>>>>> Looks good.
>>>>>>>>>>
>>>>>>>>>>> DC01 node01 and node02 share the same VM networks, and VMs
>>>>>>>>>>> deployed on top of them cannot talk to VMs on the other hypervisor.
>>>>>>>>>>
>>>>>>>>>> Maybe there is a hint in ovn-controller.log on dc01-node02 ?
>>>>>>>>>> Maybe restarting ovn-controller creates more helpful log messages?
>>>>>>>>>>
>>>>>>>>>> You can also try restarting the ovn configuration on all hosts by executing
>>>>>>>>>> vdsm-tool ovn-config OVIRT_ENGINE_IP LOCAL_OVIRTMGMT_IP
>>>>>>>>>> on each host, this would trigger
>>>>>>>>>> https://github.com/oVirt/ovirt-provider-ovn/blob/master/driver/scripts/setup_ovn_controller.sh
>>>>>>>>>> internally.
>>>>>>>>>>
>>>>>>>>>>> So I would expect to see the same output for node01 to have a
>>>>>>>>>>> geneve tunnel to node02 and vice versa.
>>>>>>>>>>
>>>>>>>>>> Me too.
>>>>>>>>>>
>>>>>>>>>>> On Fri, Sep 11, 2020 at 12:14 PM Dominik Holler <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Sep 11, 2020 at 10:53 AM Konstantinos Betsis <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Dominik
>>>>>>>>>>>>>
>>>>>>>>>>>>> OVN is selected as the default network provider on the
>>>>>>>>>>>>> clusters and the hosts.
>>>>>>>>>>>>
>>>>>>>>>>>> sounds good.
>>>>>>>>>>>> This configuration is already required when the host is added
>>>>>>>>>>>> to oVirt Engine, because OVN is configured during this step.
>>>>>>>>>>>>
>>>>>>>>>>>>> The "ovn-sbctl show" works on the ovirt engine and shows only
>>>>>>>>>>>>> two hosts, 1 per DC.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chassis "c4b23834-aec7-4bf8-8be7-aa94a50a6144"
>>>>>>>>>>>>>     hostname: "dc01-node02"
>>>>>>>>>>>>>     Encap geneve
>>>>>>>>>>>>>         ip: "X.X.X.X"
>>>>>>>>>>>>>         options: {csum="true"}
>>>>>>>>>>>>> Chassis "be3abcc9-7358-4040-a37b-8d8a782f239c"
>>>>>>>>>>>>>     hostname: "dc02-node1"
>>>>>>>>>>>>>     Encap geneve
>>>>>>>>>>>>>         ip: "A.A.A.A"
>>>>>>>>>>>>>         options: {csum="true"}
>>>>>>>>>>>>>
>>>>>>>>>>>>> The new node is not listed (dc01-node1).
>>>>>>>>>>>>>
>>>>>>>>>>>>> When executed on the nodes, the same command (ovn-sbctl show)
>>>>>>>>>>>>> times out on all nodes.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The /var/log/openvswitch/ovn-controller.log on all nodes lists:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2020-09-11T08:46:55.197Z|07361|stream_ssl|WARN|SSL_connect: unexpected SSL connection close
>>>>>>>>>>>>>
>>>>>>>>>>>> Can you please compare the output of
>>>>>>>>>>>>
>>>>>>>>>>>> ovs-vsctl --no-wait get open . external-ids:ovn-remote
>>>>>>>>>>>> ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
>>>>>>>>>>>> ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
>>>>>>>>>>>>
>>>>>>>>>>>> of the working hosts, e.g. dc01-node02, and the failing host dc01-node1?
>>>>>>>>>>>> This should point us to the relevant difference in the configuration.
>>>>>>>>>>>>
>>>>>>>>>>>> Please include ovirt-users list in your reply, to share
>>>>>>>>>>>> the knowledge and experience with the community.
>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you
>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>> Konstantinos Betsis
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Sep 11, 2020 at 11:01 AM Dominik Holler <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Sep 10, 2020 at 6:26 PM Konstantinos B <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We have a small installation based on OVIRT 4.3.
>>>>>>>>>>>>>>> 1 Cluster is based on Centos 7 and the other on OVIRT NG Node image.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The environment was stable till an upgrade took place a
>>>>>>>>>>>>>>> couple of months ago.
>>>>>>>>>>>>>>> As such we had to re-install one of the Centos 7 nodes and
>>>>>>>>>>>>>>> start from scratch.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To trigger the automatic configuration of the host, it is
>>>>>>>>>>>>>> required to configure ovirt-provider-ovn as the default network
>>>>>>>>>>>>>> provider for the cluster before adding the host to oVirt.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Even though the installation completed successfully and VMs
>>>>>>>>>>>>>>> are created, the following are not working as expected:
>>>>>>>>>>>>>>> 1. ovn geneve tunnels are not established with the other
>>>>>>>>>>>>>>> Centos 7 node in the cluster.
>>>>>>>>>>>>>>> 2. Centos 7 node is configured by ovirt engine however no
>>>>>>>>>>>>>>> geneve tunnel is established when "ovn-sbctl show" is issued on
>>>>>>>>>>>>>>> the engine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Does "ovn-sbctl show" list the hosts?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3. no flows are shown on the engine on port 6642 for the ovs db.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Does anyone have any experience on how to troubleshoot OVN
>>>>>>>>>>>>>>> on ovirt?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> /var/log/openvswitch/ovn-controller.log on the host should
>>>>>>>>>>>>>> contain a helpful hint.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Users mailing list -- [email protected]
>>>>>>>>>>>>>>> To unsubscribe send an email to [email protected]
>>>>>>>>>>>>>>> Privacy Statement: https://www.ovirt.org/privacy-policy.html
>>>>>>>>>>>>>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>>>>>>>>>>>>>> List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/LBVGLQJBWJF3EKFITPR72LBPA5A43WWW/
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/5S4RSFWX7URNWPQPKGH3U2RBTNWKJU4P/

