I changed the title for this specific problem. I looked into the logs and have more findings. The problem was happening when the sb-db leader switched.
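For reference, which sb-db member is currently the raft leader, the current term, and the configured election timer can be checked on each member with ovsdb-server's cluster/status command. A sketch (the ctl socket path is an assumption; it varies by packaging):

    ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound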
For an OVSDB cluster, what may trigger a leader switch? Given this log entry:

2020-07-21T01:08:38.119Z|00074|raft|INFO|term 2: 1135 ms timeout expired, starting election

the election was started by a follower node. Is that because the connection from the follower to the leader timed out, so the follower assumed the leader was dead and started an election?

As for ovn-northd (3 instances): they all connect to the sb-db leader, and whoever holds the lock is the master. When the sb-db leader switches, all ovn-northd instances look for the new leader. In that case there is no guarantee that the old ovn-northd master keeps its role; another ovn-northd instance may find the new leader and acquire the lock first. So an sb-db leader switch may also cause an ovn-northd master switch. Such a switch may happen in the middle of an ovn-northd transaction; in that case, is there any guarantee of transaction completeness? My guess is that the old master created a datapath-binding for a logical switch, the switch-over happened before that transaction completed, and then the new master created another datapath-binding for the same logical switch. Does that make any sense?

From the logs, when the sb-db leader switched, the ovn-northd master connected to the new leader and lost the lock, but it immediately re-acquired the lock and became master again. Not sure how this happened.
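A note on the election timeout question above: a raft follower starts an election whenever it has not heard from the leader within the election timer, and the old leader's log below shows an "Unreasonably long 1435ms poll interval" at exactly that moment, i.e. the leader was apparently blocked for longer than the follower's 1135 ms timeout. If that is what is happening, one mitigation is to raise the sb-db election timer on the current leader. A sketch (the ctl socket path is an assumption; it varies by packaging):

    ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/change-election-timer OVN_Southbound 4000

As far as I know the command only accepts up to roughly double the current value per call, so reaching a larger value takes a few invocations.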
Here are some log excerpts.

==== .84 sb-db leader =================
2020-07-21T01:08:20.221Z|01408|raft|INFO|current entry eid 639238ba-bc00-4efe-bb66-6ac766bb5f4b does not match prerequisite 78e8e167-8b4c-4292-8e25-d9975631b010 in execute_command_request
2020-07-21T01:08:38.450Z|01409|timeval|WARN|Unreasonably long 1435ms poll interval (1135ms user, 43ms system)
2020-07-21T01:08:38.451Z|01410|timeval|WARN|faults: 5942 minor, 0 major
2020-07-21T01:08:38.451Z|01411|timeval|WARN|disk: 0 reads, 50216 writes
2020-07-21T01:08:38.452Z|01412|timeval|WARN|context switches: 60 voluntary, 25 involuntary
2020-07-21T01:08:38.453Z|01413|coverage|INFO|Skipping details of duplicate event coverage for hash=45329980
2020-07-21T01:08:38.455Z|01414|raft|WARN|ignoring vote request received as leader
2020-07-21T01:08:38.456Z|01415|raft|INFO|server 1f9e is leader for term 2
2020-07-21T01:08:38.457Z|01416|raft|INFO|rejected append_reply (not leader)
2020-07-21T01:08:38.471Z|01417|raft|INFO|rejected append_reply (not leader)
2020-07-21T01:23:00.890Z|01418|timeval|WARN|Unreasonably long 1336ms poll interval (1102ms user, 20ms system)

==== .85 sb-db ==================
2020-07-21T01:05:25.734Z|00061|coverage|INFO|Skipping details of duplicate event coverage for hash=a1e16f5b
2020-07-21T01:08:38.166Z|00062|raft|INFO|server 1f9e is leader for term 2
2020-07-21T01:08:38.455Z|00063|raft|INFO|rejecting term 1 < current term 2 received in append_request message from server 39a9
2020-07-21T01:23:44.500Z|00064|timeval|WARN|Unreasonably long 1447ms poll interval (1118ms user, 33ms system)

==== .86 sb-db =================
2020-07-21T01:02:35.607Z|00073|coverage|INFO|70 events never hit
2020-07-21T01:08:38.119Z|00074|raft|INFO|term 2: 1135 ms timeout expired, starting election
2020-07-21T01:08:38.165Z|00075|raft|INFO|term 2: elected leader by 2+ of 3 servers
2020-07-21T01:08:38.455Z|00076|raft|INFO|rejecting term 1 < current term 2 received in append_request message from server 39a9
2020-07-21T01:08:38.456Z|00077|raft|INFO|rejecting term 1 < current term 2 received in vote_reply message from server 39a9
2020-07-21T01:12:51.363Z|00078|timeval|WARN|Unreasonably long 1466ms poll interval (1186ms user, 27ms system)

==== .85 ovn-northd master ==============
2020-07-21T01:08:25.925Z|00479|poll_loop|INFO|wakeup due to [POLLIN] on fd 8 (10.6.20.85:45478<->10.6.20.84:6641) at lib/stream-fd.c:157 (54% CPU usage)
2020-07-21T01:08:38.464Z|00480|ovsdb_idl|INFO|tcp:10.6.20.84:6642: clustered database server is not cluster leader; trying another server
2020-07-21T01:08:38.465Z|00481|reconnect|INFO|tcp:10.6.20.84:6642: connection attempt timed out
2020-07-21T01:08:38.465Z|00482|reconnect|INFO|tcp:10.6.20.86:6642: connecting...
2020-07-21T01:08:38.465Z|00483|reconnect|INFO|tcp:10.6.20.86:6642: connected
2020-07-21T01:08:38.465Z|00484|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2020-07-21T01:08:38.465Z|00485|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2020-07-21T01:08:38.966Z|00486|ovn_northd|INFO|deleting Datapath_Binding 108cf745-db82-43c0-a9d3-afe27a41e4aa with duplicate external-ids:logical-switch/router 8a5d1d3c-e9fc-4cbe-a461-98ff838e6473
2020-07-20T21:32:48.803Z|00019|reconnect|INFO|tcp:10.6.20.84:6641: connected

==== .86 ovn-northd ==============
2020-07-21T01:08:38.463Z|00020|ovsdb_idl|INFO|tcp:10.6.20.84:6642: clustered database server is not cluster leader; trying another server
2020-07-21T01:08:38.464Z|00021|reconnect|INFO|tcp:10.6.20.84:6642: connection attempt timed out
2020-07-21T01:08:38.464Z|00022|reconnect|INFO|tcp:10.6.20.85:6642: connecting...
2020-07-21T01:08:38.464Z|00023|reconnect|INFO|tcp:10.6.20.85:6642: connected
2020-07-21T01:08:38.464Z|00024|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2020-07-21T01:08:38.465Z|00025|ovsdb_idl|INFO|tcp:10.6.20.85:6642: clustered database server is not cluster leader; trying another server
2020-07-21T01:08:38.466Z|00026|reconnect|INFO|tcp:10.6.20.85:6642: connection attempt timed out
2020-07-21T01:08:39.467Z|00027|reconnect|INFO|tcp:10.6.20.86:6642: connecting...
2020-07-21T01:08:39.468Z|00028|reconnect|INFO|tcp:10.6.20.86:6642: connected
2020-07-21T01:08:39.468Z|00029|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2020-07-21T02:04:22.718Z|00030|ovsdb_idl|INFO|tcp:10.6.20.86:6642: clustered database server is not cluster leader; trying another server

Thanks!
Tony

________________________________
From: Numan Siddique <[email protected]>
Sent: July 23, 2020 03:15 AM
To: Tony Liu <[email protected]>
Cc: ovs-dev <[email protected]>; [email protected]
Subject: Re: [ovs-discuss] OVN nb-db and sb-db out of sync

On Thu, Jul 23, 2020 at 8:22 AM Tony Liu <[email protected]> wrote:

Hi,

I see why sb-db broke at the 1568th port-binding. The 1568th datapath-binding in sb-db references the same logical-switch as the 1567th:

_uuid               : 108cf745-db82-43c0-a9d3-afe27a41e4aa
external_ids        : {logical-switch="8a5d1d3c-e9fc-4cbe-a461-98ff838e6473", name=neutron-e907dc17-f1e8-4217-a37d-86e9a98c86c2, name2=net-97-192}
tunnel_key          : 1567

_uuid               : d934ed92-2f3c-4b31-8a76-2a5047a3bb46
external_ids        : {logical-switch="8a5d1d3c-e9fc-4cbe-a461-98ff838e6473", name=neutron-e907dc17-f1e8-4217-a37d-86e9a98c86c2, name2=net-97-192}
tunnel_key          : 1568

I don't believe this is supposed to happen. Any idea how it could happen?

ovn-northd is then stuck trying to delete this duplicate, and it ignores all following updates. That is what caused nb-db and sb-db to go out of sync. Is there any way I can fix it manually, e.g. by deleting it with ovn-sbctl?

If you delete the ovn sb db resources manually, ovn-northd should sync it up. But I'm surprised that ovn-northd didn't sync up earlier. There's something wrong related to raft going on here; not sure what.

Thanks
Numan
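For the manual cleanup, a sketch of what it could look like with ovn-sbctl's generic database commands, using the duplicate row from the excerpt above (please double-check which of the two rows ovn-northd is actually trying to remove, per its "deleting Datapath_Binding ..." log line, before destroying anything):

    # show all datapath bindings that reference this logical switch
    ovn-sbctl find Datapath_Binding 'external_ids:logical-switch="8a5d1d3c-e9fc-4cbe-a461-98ff838e6473"'

    # remove the duplicate row
    ovn-sbctl destroy Datapath_Binding 108cf745-db82-43c0-a9d3-afe27a41e4aa

After the duplicate is gone, the active ovn-northd should be able to make progress again.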
Thanks!
Tony

________________________________
From: dev <[email protected]> on behalf of Tony Liu <[email protected]>
Sent: July 22, 2020 11:33 AM
To: ovs-dev <[email protected]>
Subject: [ovs-dev] OVN nb-db and sb-db out of sync

Hi,

During a scaling test where 4000 networks are created from OpenStack, I see that nb-db and sb-db are out of sync. All 4000 logical switches and 8000 LS ports (a GW port and a service port for each network) are created in nb-db. In sb-db, there are only 1567 port-bindings, while 4000 are expected.

[root@ovn-db-2 ~]# ovn-nbctl list nb_global
_uuid               : b7b3aa05-f7ed-4dbc-979f-10445ac325b8
connections         : []
external_ids        : {"neutron:liveness_check_at"="2020-07-22 04:03:17.726917+00:00"}
hv_cfg              : 312
ipsec               : false
name                : ""
nb_cfg              : 2636
options             : {mac_prefix="ca:e8:07", svc_monitor_mac="4e:d0:3a:80:d4:b7"}
sb_cfg              : 2005
ssl                 : []

[root@ovn-db-2 ~]# ovn-sbctl list sb_global
_uuid               : 3720bc1d-b0da-47ce-85ca-96fa8d398489
connections         : []
external_ids        : {}
ipsec               : false
nb_cfg              : 312
options             : {mac_prefix="ca:e8:07", svc_monitor_mac="4e:d0:3a:80:d4:b7"}
ssl                 : []

Is there any way to force ovn-northd to rebuild sb-db so that it syncs with nb-db, for example by manipulating nb_cfg or anything else? Note that it is a 3-node RAFT cluster for both nb-db and sb-db.

Is the "incremental update" work implemented in 20.03? If not, in which release is it going to be available?

Thanks!
Tony
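One way to watch whether ovn-northd is catching up after a cleanup like the one above is to compare the sequence numbers from the two records shown here. A sketch, relying on the "." single-row shorthand that both ctl tools support:

    ovn-nbctl get NB_Global . nb_cfg
    ovn-sbctl get SB_Global . nb_cfg

Once ovn-northd has fully processed the northbound contents, nb_cfg in SB_Global should catch up to nb_cfg in NB_Global; if your ovn-nbctl has the "sync" command, "ovn-nbctl --wait=sb sync" should block until that happens.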
