On Tue, Oct 6, 2020 at 12:25 PM Konstantinos Betsis <[email protected]> wrote:
> Hi Dominik
>
> That fixed it.
>

Thanks for letting us know and your patience.

> VMs have full connectivity and I don't see any errors on the nodes' ovn
> controller.
>
> Thanks for the help and quick responses, I really appreciate it.
>
> In summary for future reference:
>

Thanks for this nice summary, I am sure this will help others in the community.

> If certificate errors are encountered, review:
>
> ovs-vsctl --no-wait get open . external-ids:ovn-remote
> ovs-vsctl --no-wait get open . external-ids:ovn-encap-type
> ovs-vsctl --no-wait get open . external-ids:ovn-encap-ip
>
> The ovn-remote will state if the OVN connection is using TCP or TLS.
>
> We then do:
>
> ovn-nbctl get-ssl
> ovn-nbctl get-connection
> ovn-sbctl get-ssl
> ovn-sbctl get-connection
> ls -l /etc/pki/ovirt-engine/keys/ovn-*
>
> This checks the OVN northbound and southbound configuration, the listening
> ports, and whether TCP or TLS is used.
>
> If TLS is used we must update the nodes with:
>
> ovn-nbctl set-ssl "ovn northbound interface certificate key" "ovn northbound interface certificate file"
> ovn-nbctl set-connection pssl:6641
> ovn-sbctl set-ssl "ovn southbound interface certificate key" "ovn southbound interface certificate file"
> ovn-sbctl set-connection pssl:6642
>
> The certificates must reside within nodes through the VDSM client.
>
> Finally, we check that all tunnels are established and working ok.
>
> If we get to a stuck chassis we simply stop the ovn service on the node
> and delete the chassis from the southbound database through:
>
> ovn-sbctl chassis-del "chassis_ID"
>
> Thank you
> Best Regards
> Konstantinos Betsis
>
>
> On Tue, Oct 6, 2020 at 11:37 AM Dominik Holler <[email protected]> wrote:
>
>>
>>
>> On Tue, Oct 6, 2020 at 10:31 AM Konstantinos Betsis <[email protected]>
>> wrote:
>>
>>> Hi guys
>>>
>>> Sorry to disturb you, but I am pretty much stuck at this point with the
>>> ovn southbound interface.
>>>
>>> Is there a way I can flush it and have it reconfigured from ovirt?
>>>
>>>
>> Can you please delete the chassis via
>>
>> ovn-sbctl chassis-del 32cd0eb4-d763-4036-bbc9-a4d3a4013ee6
>>
>> while 32cd0eb4-d763-4036-bbc9-a4d3a4013ee6 should be replaced with the
>> id of the suspicious chassis shown by
>> ovn-sbctl show
>>
>> The ovn-controller will add the chassis again in a few seconds, but I
>> hope that this would remove the inconsistency in the db.
>>
>>
>>
>>> Thank you
>>> Best Regards
>>> Konstantinos Betsis
>>>
>>> On Thu, Oct 1, 2020 at 6:52 PM Konstantinos Betsis <[email protected]>
>>> wrote:
>>>
>>>> Regarding the ovn-controller logs....
>>>>
>>>> 2020-10-01T15:51:03.156Z|14143|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:03.220Z|14144|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:03.284Z|14145|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:03.347Z|14146|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:03.411Z|14147|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:03.474Z|14148|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:03.538Z|14149|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:03.601Z|14150|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:03.664Z|14151|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:03.727Z|14152|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:08.792Z|14153|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:08.855Z|14154|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:08.919Z|14155|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:08.982Z|14156|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:09.046Z|14157|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:09.109Z|14158|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:09.173Z|14159|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:09.236Z|14160|main|INFO|OVNSB commit failed, force recompute next time.
>>>> 2020-10-01T15:51:09.299Z|14161|main|INFO|OVNSB commit failed, force recompute next time.
>>>>
>>>>
>>>> I don't think we can see anything more from these.
>>>>
>>>>
>>>>
>>>> On Thu, Oct 1, 2020 at 6:12 PM Konstantinos Betsis <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Dumitru
>>>>>
>>>>> I've seen that as well.....
>>>>> I've deleted the dc01-node2 (ams03-hypersec02) from ovirt.
>>>>> I've also issued ovs-vsctl emer-reset.
>>>>>
>>>>> But ovn-sbctl list chassis still depicts the node twice.
>>>>> The ovs-sbctl show still depicts 3 geneve tunnels from dc01-node2....
>>>>>
>>>>> How can we fix this?
>>>>>
>>>>> On Thu, Oct 1, 2020 at 9:59 AM Dumitru Ceara <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> On 9/30/20 3:41 PM, Konstantinos Betsis wrote:
>>>>>> > From the configuration I can see only three nodes.....
>>>>>> > "Encap":{
>>>>>> > #dc01-node02
>>>>>> > "da8fb1dc-f832-4d62-a01d-2e5aef018c8d":{"ip":"10.137.156.56","chassis_name":"be3abcc9-7358-4040-a37b-8d8a782f239c","options":["map",[["csum","true"]]],"type":"geneve"},
>>>>>> > #dc01-node01
>>>>>> > "4808bd8f-7e46-4f29-9a96-046bb580f0c5":{"ip":"10.137.156.55","chassis_name":"95ccb04a-3a08-4a62-8bc0-b8a7a42956f8","options":["map",[["csum","true"]]],"type":"geneve"},
>>>>>> > #dc02-node01
>>>>>> > "f20b33ae-5a6b-456c-b9cb-2e4d8b54d8be":{"ip":"192.168.121.164","chassis_name":"c4b23834-aec7-4bf8-8be7-aa94a50a6144","options":["map",[["csum","true"]]],"type":"geneve"}}
>>>>>> >
>>>>>> > So I don't understand why the dc01-node02 tries to establish a tunnel
>>>>>> > with itself.....
>>>>>> >
>>>>>> > Is there a way for ovn to refresh according to Ovirt network database as
>>>>>> > to not affect VM networks?
>>>>>> >
>>>>>> > On Wed, Sep 30, 2020 at 2:33 PM Konstantinos Betsis <[email protected]> wrote:
>>>>>> >
>>>>>> >     Sure
>>>>>> >
>>>>>> >     I've attached it for easier reference.
>>>>>> >
>>>>>> >     On Wed, Sep 30, 2020 at 2:21 PM Dominik Holler <[email protected]> wrote:
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >         On Wed, Sep 30, 2020 at 1:16 PM Konstantinos Betsis
>>>>>> >         <[email protected]> wrote:
>>>>>> >
>>>>>> >             Hi Dominik
>>>>>> >
>>>>>> >             The DC01-node02 was formatted and reinstalled and then
>>>>>> >             attached to ovirt environment.
>>>>>> >             Unfortunately we exhibit the same issue.
>>>>>> >             The new DC01-node02 tries to establish geneve tunnels to his
>>>>>> >             own IP.
>>>>>> >
>>>>>> >             [root@dc01-node02 ~]# ovs-vsctl show
>>>>>> >             eff2663e-cb10-41b0-93ba-605bb5c7bd78
>>>>>> >                 Bridge br-int
>>>>>> >                     fail_mode: secure
>>>>>> >                     Port "ovn-95ccb0-0"
>>>>>> >                         Interface "ovn-95ccb0-0"
>>>>>> >                             type: geneve
>>>>>> >                             options: {csum="true", key=flow, remote_ip="dc01-node01_IP"}
>>>>>> >                     Port "ovn-be3abc-0"
>>>>>> >                         Interface "ovn-be3abc-0"
>>>>>> >                             type: geneve
>>>>>> >                             options: {csum="true", key=flow, remote_ip="dc01-node02_IP"}
>>>>>> >                     Port "ovn-c4b238-0"
>>>>>> >                         Interface "ovn-c4b238-0"
>>>>>> >                             type: geneve
>>>>>> >                             options: {csum="true", key=flow, remote_ip="dc02-node01_IP"}
>>>>>> >                     Port br-int
>>>>>> >                         Interface br-int
>>>>>> >                             type: internal
>>>>>> >                 ovs_version: "2.11.0"
>>>>>> >
>>>>>> >
>>>>>> >             Is there a way to fix this on the Ovirt engine since this is
>>>>>> >             where the information resides?
>>>>>> >             Something is broken there.
>>>>>> >
>>>>>> >
>>>>>> >         I suspect that there is an inconsistency in the OVN SB DB.
>>>>>> >         Is there a way to share your /var/lib/openvswitch/ovnsb_db.db
>>>>>> >         with us?
>>>>>> >
>>>>>> >
>>>>>>
>>>>>> Hi Konstantinos,
>>>>>>
>>>>>> One of the things I noticed in the SB DB you attached is that two of the
>>>>>> chassis records have the same hostname:
>>>>>>
>>>>>> $ ovn-sbctl list chassis | grep ams03-hypersec02
>>>>>> hostname : ams03-hypersec02
>>>>>> hostname : ams03-hypersec02
>>>>>>
>>>>>> This shouldn't be a major issue but shows a potential misconfiguration
>>>>>> on the nodes. Could you please double check the hostname configuration
>>>>>> of the nodes?
>>>>>>
>>>>>> Would it also be possible to attach the openvswitch conf.db from the
>>>>>> three nodes? It should be in /var/lib/openvswitch/conf.db
>>>>>>
>>>>>> Thanks,
>>>>>> Dumitru
>>>>>>
>>>>>>
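
For anyone repeating the duplicate-hostname check from the oldest message above on a larger deployment, here is a minimal Python sketch that finds hostnames registered under more than one chassis. It assumes the output of "ovn-sbctl list chassis" is a series of "column : value" records separated by blank lines; the function name, the chassis names chassis-a/b/c and the hostname ams03-hypersec01 in the sample are hypothetical, only ams03-hypersec02 comes from the thread.

```python
import re
from collections import defaultdict
from typing import Dict, List


def duplicate_hostnames(list_chassis_output: str) -> Dict[str, List[str]]:
    """Return {hostname: [chassis names]} for hostnames seen on 2+ chassis.

    Assumes the text has the record layout described above: one
    "column : value" pair per line, records separated by blank lines.
    """
    by_host = defaultdict(list)
    for record in re.split(r"\n\s*\n", list_chassis_output.strip()):
        fields = {}
        for line in record.splitlines():
            # Split on the first colon only, so values may contain colons.
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip().strip('"')
        if "hostname" in fields and "name" in fields:
            by_host[fields["hostname"]].append(fields["name"])
    return {host: names for host, names in by_host.items() if len(names) > 1}


# Hypothetical sample in the assumed layout (chassis names are made up).
SAMPLE = """\
hostname            : ams03-hypersec02
name                : "chassis-a"

hostname            : ams03-hypersec02
name                : "chassis-b"

hostname            : ams03-hypersec01
name                : "chassis-c"
"""

if __name__ == "__main__":
    for host, names in duplicate_hostnames(SAMPLE).items():
        print(host, "is registered by chassis:", ", ".join(names))
```

In practice the text would come from running the command on the host that serves the southbound database and passing its output to the function. The chassis names it reports are candidates for the "ovn-sbctl chassis-del" step described earlier in the thread, after confirming which record is stale.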

