Hi Numan: You need to use --db since you are now running the db in a cluster; you can access the data from any of the three dbs.
So if the leader crashes, it re-elects from the other two. Below are example commands: # export remote="tcp:192.168.220.103:6641,tcp:192.168.220.102:6641,tcp:192.168.220.101:6641" # kill -9 3985 # ovn-nbctl --db=$remote show switch 1d86ab4e-c8bf-4747-a716-8832a285d58c (ls1) # ovn-nbctl --db=$remote ls-del ls1 Hope it helps! Regards, On Tue, Mar 27, 2018 at 10:01 AM, Numan Siddique <[email protected]> wrote: > Hi Aliasgar, > > In your setup, if you kill the leader what is the behaviour ? Are you > still able to create or delete any resources ? Is a new leader elected ? > > In my setup, the command "ovn-nbctl ls-add" for example blocks until I > restart the ovsdb-server in node 1. And I don't see any other ovsdb-server > becoming leader. Maybe I have configured it wrongly. > Could you please test this scenario if you haven't yet, and let me know your > observations if possible. > > Thanks > Numan > > > On Thu, Mar 22, 2018 at 12:28 PM, Han Zhou <[email protected]> wrote: > >> Sounds good. >> >> Just checked the patch: by default the C IDL has "leader_only" as true, >> which ensures that the connection is to the leader only. This is the case for >> northd. So the lock works for northd's active-standby purpose if all the >> ovsdb endpoints of a cluster are specified to northd, since all northds are >> then connecting to the same DB, the leader. >> >> For neutron networking-ovn, this may not work yet, since I didn't see >> such logic in the python IDL in the current patch series. It would be good if >> we add similar logic to the python IDL. (@ben/numan, correct me if I am wrong) >> >> >> On Wed, Mar 21, 2018 at 6:49 PM, aginwala <[email protected]> wrote: >> >>> Hi: >>> >>> Just sorted out the correct settings, and northd also works in HA with raft. >>> >>> There were 2 issues in the setup: >>> 1. I had started the nb db without --db-nb-create-insecure-remote. >>> 2. I also started northd locally on all 3 nodes without a remote, which meant >>> all three northds were trying to take the ovsdb lock locally. >>> >>> Hence, duplicate entries were populated in the southbound datapath table due >>> to multiple northds writing to their local copies. >>> >>> So, I now start the nb db with --db-nb-create-insecure-remote and northd on >>> all 3 nodes using the command below: >>> >>> ovn-northd -vconsole:emer -vsyslog:err -vfile:info --ovnnb-db="tcp:10.169.125.152:6641,tcp:10.169.125.131:6641,tcp:10.148.181.162:6641" >>> --ovnsb-db="tcp:10.169.125.152:6642,tcp:10.169.125.131:6642,tcp:10.148.181.162:6642" --no-chdir >>> --log-file=/var/log/openvswitch/ovn-northd.log >>> --pidfile=/var/run/openvswitch/ovn-northd.pid --detach --monitor >>> >>> >>> # At start, northd went active on the leader node and standby on the other >>> two nodes. >>> >>> # After the old leader crashed and a new leader got elected, northd goes active >>> on one of the remaining 2 nodes, as per the sample logs below from a non-leader >>> node: >>> 2018-03-22T00:20:30.732Z|00023|ovn_northd|INFO|ovn-northd lock lost. >>> This ovn-northd instance is now on standby. >>> 2018-03-22T00:20:30.743Z|00024|ovn_northd|INFO|ovn-northd lock >>> acquired. This ovn-northd instance is now active.
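A quick, unofficial way to confirm which of the three instances currently holds the lock is to look at the most recent lock transition in each node's northd log (the path below is the --log-file value from the command above):

# run on each of the three nodes; the active instance is the one whose
# latest "ovn-northd lock" message says "acquired"
grep 'ovn-northd lock' /var/log/openvswitch/ovn-northd.log | tail -n 1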
>>> >>> # Also ovn-controller works in a similar way if the leader goes down; it connects >>> to any of the remaining 2 nodes: >>> 2018-03-22T01:21:56.250Z|00029|ovsdb_idl|INFO|tcp:10.148.181.162:6642: >>> clustered database server is disconnected from cluster; trying another >>> server >>> 2018-03-22T01:21:56.250Z|00030|reconnect|INFO|tcp:10.148.181.162:6642: >>> connection attempt timed out >>> 2018-03-22T01:21:56.250Z|00031|reconnect|INFO|tcp:10.148.181.162:6642: >>> waiting 4 seconds before reconnect >>> 2018-03-22T01:23:52.417Z|00043|reconnect|INFO|tcp:10.148.181.162:6642: >>> connected >>> >>> >>> >>> The above settings will also work if we put all the nodes behind a vip and >>> update the ovn configs to use vips. So we don't need pacemaker explicitly >>> for northd HA :). >>> >>> Since the setup is complete now, I will populate the same in the scale test >>> env and see how it behaves. >>> >>> @Numan: We can try the same with the networking-ovn integration and see if >>> we find anything weird there too. Not sure if you have any exclusive >>> findings for this case. >>> >>> Let me know if something else is missed here. >>> >>> >>> >>> >>> Regards, >>> >>> On Wed, Mar 21, 2018 at 2:50 PM, Han Zhou <[email protected]> wrote: >>> >>>> Ali, sorry if I misunderstand what you are saying, but pacemaker here >>>> is for northd HA. pacemaker itself won't point to any ovsdb cluster node. >>>> All northds can point to an LB VIP for the ovsdb cluster, so if a member of the >>>> ovsdb cluster is down it won't have an impact on northd. >>>> >>>> Without clustering support of the ovsdb lock, I think this is what we >>>> have now for northd HA. Please suggest if anyone has any other idea. Thanks >>>> :) >>>> >>>> On Wed, Mar 21, 2018 at 1:12 PM, aginwala <[email protected]> wrote: >>>> >>>>> :) The only thing is, while using pacemaker, if the node that pacemaker >>>>> is pointing to is down, all the active/standby northd nodes have to be >>>>> updated to a new node from the cluster. But will dig in more to see what >>>>> else >>>>> I can find. >>>>> >>>>> @Ben: Any suggestions further? >>>>> >>>>> >>>>> Regards, >>>>> >>>>> On Wed, Mar 21, 2018 at 10:22 AM, Han Zhou <[email protected]> wrote: >>>>> >>>>>> >>>>>> >>>>>> On Wed, Mar 21, 2018 at 9:49 AM, aginwala <[email protected]> wrote: >>>>>> >>>>>>> Thanks Numan: >>>>>>> >>>>>>> Yup, agree with the locking part. For now, yes, I am running northd on >>>>>>> one node. I might write a script to monitor northd in the cluster so that >>>>>>> if >>>>>>> the node where it's running goes down, the script can spin up northd on one of the >>>>>>> other active nodes as a dirty hack. >>>>>>> >>>>>>> The "dirty hack" is pacemaker :) >>>>>> >>>>>> >>>>>>> Sure, will await the inputs from Ben too on this and see how >>>>>>> complex it would be to roll out this feature. >>>>>>> >>>>>>> >>>>>>> Regards, >>>>>>> >>>>>>> >>>>>>> On Wed, Mar 21, 2018 at 5:43 AM, Numan Siddique <[email protected] >>>>>>> > wrote: >>>>>>> >>>>>>>> Hi Aliasgar, >>>>>>>> >>>>>>>> ovsdb-server maintains locks per connection and not across the >>>>>>>> db. A workaround for you now would be to configure all the ovn-northd >>>>>>>> instances to connect to one ovsdb-server if you want to have >>>>>>>> active/standby. >>>>>>>> >>>>>>>> Probably Ben can answer if there is a plan to support ovsdb locks >>>>>>>> across the db. We also need this support in networking-ovn as it also >>>>>>>> uses >>>>>>>> ovsdb locks.
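A minimal sketch of that workaround, reusing the addresses from earlier in this thread: point every ovn-northd instance at a single ovsdb-server endpoint (10.169.125.152 here, but it could equally be the LB VIP Han mentions) instead of listing all three members, so the ovsdb lock is contended on one server only:

# same command on all three northd nodes
ovn-northd -vconsole:emer -vsyslog:err -vfile:info \
    --ovnnb-db="tcp:10.169.125.152:6641" \
    --ovnsb-db="tcp:10.169.125.152:6642" \
    --log-file=/var/log/openvswitch/ovn-northd.log \
    --pidfile=/var/run/openvswitch/ovn-northd.pid --detach --monitor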
>>>>>>>> Thanks >>>>>>>> Numan >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Mar 21, 2018 at 1:40 PM, aginwala <[email protected]> wrote: >>>>>>>> >>>>>>>>> Hi Numan: >>>>>>>>> >>>>>>>>> Just figured out that ovn-northd is running as active on all 3 >>>>>>>>> nodes instead of one active instance as I continued to test further, >>>>>>>>> which >>>>>>>>> results in db errors as per the logs. >>>>>>>>> >>>>>>>>> >>>>>>>>> # on node 3, I run ovn-nbctl ls-add ls2 ; it populates the logs below >>>>>>>>> in ovn-northd >>>>>>>>> 2018-03-21T06:01:59.442Z|00007|ovsdb_idl|WARN|transaction error: >>>>>>>>> {"details":"Transaction causes multiple rows in \"Datapath_Binding\" >>>>>>>>> table >>>>>>>>> to have identical values (1) for index on column \"tunnel_key\". >>>>>>>>> First >>>>>>>>> row, with UUID 8c5d9342-2b90-4229-8ea1-001a733a915c, was inserted >>>>>>>>> by this transaction. Second row, with UUID >>>>>>>>> 8e06f919-4cc7-4ffc-9a79-20ce6663b683, >>>>>>>>> existed in the database before this transaction and was not modified >>>>>>>>> by the >>>>>>>>> transaction.","error":"constraint violation"} >>>>>>>>> >>>>>>>>> In the southbound datapath list, 2 duplicate records get created for the >>>>>>>>> same switch. >>>>>>>>> >>>>>>>>> # ovn-sbctl list Datapath >>>>>>>>> _uuid : b270ae30-3458-445f-95d2-b14e8ebddd01 >>>>>>>>> external_ids : >>>>>>>>> {logical-switch="4d6674e3-ff9f-4f38-b050-0fa9bec9e34d", >>>>>>>>> name="ls2"} >>>>>>>>> tunnel_key : 2 >>>>>>>>> >>>>>>>>> _uuid : 8e06f919-4cc7-4ffc-9a79-20ce6663b683 >>>>>>>>> external_ids : >>>>>>>>> {logical-switch="4d6674e3-ff9f-4f38-b050-0fa9bec9e34d", >>>>>>>>> name="ls2"} >>>>>>>>> tunnel_key : 1 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> # on nodes 1 and 2 where northd is running, it gives the error below: >>>>>>>>> 2018-03-21T06:01:59.437Z|00008|ovsdb_idl|WARN|transaction error: >>>>>>>>> {"details":"cannot delete Datapath_Binding row >>>>>>>>> 8e06f919-4cc7-4ffc-9a79-20ce6663b683 because of 17 remaining >>>>>>>>> reference(s)","error":"referential integrity violation"} >>>>>>>>> >>>>>>>>> As per the commit message, for northd I re-tried setting >>>>>>>>> --ovnnb-db="tcp:10.169.125.152:6641,tcp:10.169.125.131:6641,tcp:10.148.181.162:6641" and >>>>>>>>> --ovnsb-db="tcp:10.169.125.152:6642,tcp:10.169.125.131:6642,tcp:10.148.181.162:6642" and it did not help >>>>>>>>> either. >>>>>>>>> >>>>>>>>> There is no issue if I keep running only one instance of northd on >>>>>>>>> any of these 3 nodes. Hence, I wanted to know: is there something >>>>>>>>> else missing here to make only one northd instance active and the rest >>>>>>>>> as >>>>>>>>> standby? >>>>>>>>> >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> >>>>>>>>> On Thu, Mar 15, 2018 at 3:09 AM, Numan Siddique < >>>>>>>>> [email protected]> wrote: >>>>>>>>> >>>>>>>>>> That's great >>>>>>>>>> >>>>>>>>>> Numan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Mar 15, 2018 at 2:57 AM, aginwala <[email protected]> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Numan: >>>>>>>>>>> >>>>>>>>>>> I tried on new nodes (kernel: 4.4.0-104-generic, Ubuntu >>>>>>>>>>> 16.04) with a fresh installation and it worked super fine for both >>>>>>>>>>> the sb and nb dbs. Seems like it was some kernel issue on the previous >>>>>>>>>>> nodes when I re-installed the raft patch, as I was running a different ovs >>>>>>>>>>> version >>>>>>>>>>> on those nodes before. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> For 2 HVs, I now set ovn-remote="tcp:10.169.125.152:6642,tcp:10.169.125.131:6642,tcp:10.148.181.162:6642" and started the >>>>>>>>>>> controller and it works super fine.
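For reference, setting that on each hypervisor might look like the sketch below (the inner double quotes keep the comma-separated list as a single string value for ovs-vsctl; adjust the quoting if your shell or ovs-vsctl version handles it differently):

# run on each HV before (re)starting ovn-controller
ovs-vsctl set Open_vSwitch . \
    external_ids:ovn-remote='"tcp:10.169.125.152:6642,tcp:10.169.125.131:6642,tcp:10.148.181.162:6642"'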
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Did some failover testing by rebooting/killing the leader >>>>>>>>>>> (10.169.125.152) and bringing it back up, and it works as >>>>>>>>>>> expected. Nothing weird noted so far. >>>>>>>>>>> >>>>>>>>>>> # check-cluster gives the data below on one of the nodes (10.148.181.162) >>>>>>>>>>> post >>>>>>>>>>> leader failure >>>>>>>>>>> >>>>>>>>>>> ovsdb-tool check-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>>>> ovsdb-tool: leader /etc/openvswitch/ovnsb_db.db for term 2 has >>>>>>>>>>> log entries only up to index 18446744073709551615, but index 9 was >>>>>>>>>>> committed in a previous term (e.g. by /etc/openvswitch/ovnsb_db.db) >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> For check-cluster, are we planning to add more output showing >>>>>>>>>>> which node is active (leader), etc. in upcoming versions? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Thanks a ton for helping sort this out. I think the patch looks >>>>>>>>>>> good to be merged once the comments by Justin are addressed, along >>>>>>>>>>> with the >>>>>>>>>>> man page details for ovsdb-tool. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I will do some more crash testing for the cluster along with the >>>>>>>>>>> scale test and keep you posted if something unexpected is noted. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Mar 13, 2018 at 11:07 PM, Numan Siddique < >>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Mar 14, 2018 at 7:51 AM, aginwala <[email protected]> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Sure. >>>>>>>>>>>>> >>>>>>>>>>>>> To add on, I also ran it for the nb db using a different port, and >>>>>>>>>>>>> Node2 crashes with the same error: >>>>>>>>>>>>> # Node 2 >>>>>>>>>>>>> /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>>>>> --db-nb-addr=10.99.152.138 --db-nb-port=6641 >>>>>>>>>>>>> --db-nb-cluster-remote-addr="tcp:10.99.152.148:6645" --db-nb-cluster-local-addr="tcp:10.99.152.138:6645" start_nb_ovsdb >>>>>>>>>>>>> ovsdb-server: ovsdb error: /etc/openvswitch/ovnnb_db.db: >>>>>>>>>>>>> cannot identify file type >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> Hi Aliasgar, >>>>>>>>>>>> >>>>>>>>>>>> It worked for me. Can you delete the old db files in >>>>>>>>>>>> /etc/openvswitch/ and try running the commands again ? >>>>>>>>>>>> >>>>>>>>>>>> Below are the commands I ran in my setup.
>>>>>>>>>>>> >>>>>>>>>>>> Node 1 >>>>>>>>>>>> ------- >>>>>>>>>>>> sudo /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>>>> --db-sb-addr=192.168.121.91 --db-sb-port=6642 >>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>> --db-sb-cluster-local-addr=tcp:192.168.121.91:6644 >>>>>>>>>>>> start_sb_ovsdb >>>>>>>>>>>> >>>>>>>>>>>> Node 2 >>>>>>>>>>>> --------- >>>>>>>>>>>> sudo /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>>>> --db-sb-addr=192.168.121.87 --db-sb-port=6642 >>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:192.168.121.87:6644" >>>>>>>>>>>> --db-sb-cluster-remote-addr="tcp:192.168.121.91:6644" >>>>>>>>>>>> start_sb_ovsdb >>>>>>>>>>>> >>>>>>>>>>>> Node 3 >>>>>>>>>>>> --------- >>>>>>>>>>>> sudo /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>>>> --db-sb-addr=192.168.121.78 --db-sb-port=6642 >>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:192.168.121.78:6644" >>>>>>>>>>>> --db-sb-cluster-remote-addr="tcp:192.168.121.91:6644" >>>>>>>>>>>> start_sb_ovsdb >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks >>>>>>>>>>>> Numan >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Tue, Mar 13, 2018 at 9:40 AM, Numan Siddique < >>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Mar 13, 2018 at 9:46 PM, aginwala <[email protected]> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks Numan for the response. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> There is no command start_cluster_sb_ovsdb in the source >>>>>>>>>>>>>>> code too. Is that in a separate commit somewhere? Hence, I used >>>>>>>>>>>>>>> start_sb_ovsdb >>>>>>>>>>>>>>> which I think would not be a right choice? >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Sorry, I meant start_sb_ovsdb. Strange that it didn't work >>>>>>>>>>>>>> for you. Let me try it out again and update this thread. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>> Numan >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # Node1 came up as expected. >>>>>>>>>>>>>>> ovn-ctl --db-sb-addr=10.99.152.148 --db-sb-port=6642 >>>>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.148:6644" >>>>>>>>>>>>>>> start_sb_ovsdb. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # verifying its a clustered db with ovsdb-tool >>>>>>>>>>>>>>> db-local-address /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>> tcp:10.99.152.148:6644 >>>>>>>>>>>>>>> # ovn-sbctl show works fine and chassis are being populated >>>>>>>>>>>>>>> correctly. 
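Following Numan's suggestion above, the "cannot identify file type" failure on a joining node usually just means a stale standalone database file is already sitting in /etc/openvswitch. A minimal cleanup sketch for node 2 of his setup (assuming nothing in the old file is worth keeping):

# after killing the ovsdb-server that was serving the old standalone file,
# remove it so ovn-ctl can create a fresh clustered database when joining
rm -f /etc/openvswitch/ovnsb_db.db
/usr/share/openvswitch/scripts/ovn-ctl --db-sb-addr=192.168.121.87 --db-sb-port=6642 \
    --db-sb-create-insecure-remote=yes \
    --db-sb-cluster-local-addr="tcp:192.168.121.87:6644" \
    --db-sb-cluster-remote-addr="tcp:192.168.121.91:6644" start_sb_ovsdb
# sanity check: a clustered file reports its local RAFT address
ovsdb-tool db-local-address /etc/openvswitch/ovnsb_db.db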
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # Node 2 fails with error: >>>>>>>>>>>>>>> /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>>>>>>> --db-sb-addr=10.99.152.138 --db-sb-port=6642 >>>>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644" >>>>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.138:6644" >>>>>>>>>>>>>>> start_sb_ovsdb >>>>>>>>>>>>>>> ovsdb-server: ovsdb error: /etc/openvswitch/ovnsb_db.db: >>>>>>>>>>>>>>> cannot identify file type >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # So I started the sb db the usual way using start_ovsdb >>>>>>>>>>>>>>> just to get the db file created, killed the sb pid, and >>>>>>>>>>>>>>> re-ran the >>>>>>>>>>>>>>> command, which gave the actual error where it complains about the >>>>>>>>>>>>>>> join-cluster command >>>>>>>>>>>>>>> that is being called internally: >>>>>>>>>>>>>>> /usr/share/openvswitch/scripts/ovn-ctl >>>>>>>>>>>>>>> --db-sb-addr=10.99.152.138 --db-sb-port=6642 >>>>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644" >>>>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.138:6644" >>>>>>>>>>>>>>> start_sb_ovsdb >>>>>>>>>>>>>>> ovsdb-tool: /etc/openvswitch/ovnsb_db.db: not a clustered >>>>>>>>>>>>>>> database >>>>>>>>>>>>>>> * Backing up database to /etc/openvswitch/ovnsb_db.db.backup1.15.0-70426956 >>>>>>>>>>>>>>> ovsdb-tool: 'join-cluster' command requires at least 4 >>>>>>>>>>>>>>> arguments >>>>>>>>>>>>>>> * Creating cluster database /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>> from existing one >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # Based on the above error I killed the sb db pid again and tried >>>>>>>>>>>>>>> to create a local cluster on the node, then re-ran the join >>>>>>>>>>>>>>> operation as per >>>>>>>>>>>>>>> the source code function: >>>>>>>>>>>>>>> ovsdb-tool join-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>> OVN_Southbound tcp:10.99.152.138:6644 tcp:10.99.152.148:6644 >>>>>>>>>>>>>>> which still complains: >>>>>>>>>>>>>>> ovsdb-tool: I/O error: /etc/openvswitch/ovnsb_db.db: create >>>>>>>>>>>>>>> failed (File exists) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> # Node 3: I did not try, as I am assuming the same failure as >>>>>>>>>>>>>>> node 2 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please let me know. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Mar 13, 2018 at 3:08 AM, Numan Siddique < >>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Aliasgar, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, Mar 13, 2018 at 7:11 AM, aginwala <[email protected] >>>>>>>>>>>>>>>> > wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Ben/Numan: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I am trying to set up a 3 node southbound db cluster using >>>>>>>>>>>>>>>>> raft10 <https://patchwork.ozlabs.org/patch/854298/> which is in >>>>>>>>>>>>>>>>> review. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> # Node 1 create-cluster >>>>>>>>>>>>>>>>> ovsdb-tool create-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>>>> /root/ovs-reviews/ovn/ovn-sb.ovsschema tcp:10.99.152.148:6642 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> A different port is used for RAFT. So you have to choose >>>>>>>>>>>>>>>> another port like 6644 for example.
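With a dedicated RAFT port, the manual ovsdb-tool steps from the original mail would become something like this (sketch; 6642 stays the client port, while 6644 carries only cluster traffic):

# Node 1: create the cluster, listening for RAFT on 6644
ovsdb-tool create-cluster /etc/openvswitch/ovnsb_db.db \
    /root/ovs-reviews/ovn/ovn-sb.ovsschema tcp:10.99.152.148:6644
# Node 2: join via its own 6644 address plus the existing member's
ovsdb-tool join-cluster /etc/openvswitch/ovnsb_db.db OVN_Southbound \
    tcp:10.99.152.138:6644 tcp:10.99.152.148:6644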
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> # Node 2 >>>>>>>>>>>>>>>>> ovsdb-tool join-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>>>> OVN_Southbound tcp:10.99.152.138:6642 tcp:10.99.152.148:6642 --cid 5dfcb678-bb1d-4377-b02d-a380edec2982 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> # Node 3 >>>>>>>>>>>>>>>>> ovsdb-tool join-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>>>> OVN_Southbound tcp:10.99.152.101:6642 tcp:10.99.152.138:6642 tcp:10.99.152.148:6642 --cid >>>>>>>>>>>>>>>>> 5dfcb678-bb1d-4377-b02d-a380edec2982 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> # ovn remote is set to all 3 nodes >>>>>>>>>>>>>>>>> external_ids:ovn-remote="tcp:10.99.152.148:6642,tcp:10.99.152.138:6642,tcp:10.99.152.101:6642" >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> # Starting the sb db on node 1 using the command below: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ovsdb-server --detach --monitor -vconsole:off -vraft >>>>>>>>>>>>>>>>> -vjsonrpc --log-file=/var/log/openvswitch/ovsdb-server-sb.log >>>>>>>>>>>>>>>>> --pidfile=/var/run/openvswitch/ovnsb_db.pid >>>>>>>>>>>>>>>>> --remote=db:OVN_Southbound,SB_Global,connections >>>>>>>>>>>>>>>>> --unixctl=ovnsb_db.ctl >>>>>>>>>>>>>>>>> --private-key=db:OVN_Southbound,SSL,private_key >>>>>>>>>>>>>>>>> --certificate=db:OVN_Southbound,SSL,certificate >>>>>>>>>>>>>>>>> --ca-cert=db:OVN_Southbound,SSL,ca_cert >>>>>>>>>>>>>>>>> --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols >>>>>>>>>>>>>>>>> --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers >>>>>>>>>>>>>>>>> --remote=punix:/var/run/openvswitch/ovnsb_db.sock >>>>>>>>>>>>>>>>> /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> # check-cluster is returning nothing >>>>>>>>>>>>>>>>> ovsdb-tool check-cluster /etc/openvswitch/ovnsb_db.db >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> # ovsdb-server-sb.log below shows the leader is elected >>>>>>>>>>>>>>>>> with only one server, and there are rbac related debug logs >>>>>>>>>>>>>>>>> with rpc replies >>>>>>>>>>>>>>>>> and empty params, with no errors >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 2018-03-13T01:12:02Z|00002|raft|DBG|server 63d1 added to >>>>>>>>>>>>>>>>> configuration >>>>>>>>>>>>>>>>> 2018-03-13T01:12:02Z|00003|raft|INFO|term 6: starting >>>>>>>>>>>>>>>>> election >>>>>>>>>>>>>>>>> 2018-03-13T01:12:02Z|00004|raft|INFO|term 6: elected >>>>>>>>>>>>>>>>> leader by 1+ of 1 servers >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Now starting the ovsdb-server on the other cluster nodes fails >>>>>>>>>>>>>>>>> saying >>>>>>>>>>>>>>>>> ovsdb-server: ovsdb error: /etc/openvswitch/ovnsb_db.db: >>>>>>>>>>>>>>>>> cannot identify file type >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Also noticed that man ovsdb-tool is missing the cluster >>>>>>>>>>>>>>>>> details. Might want to address it in the same patch or a >>>>>>>>>>>>>>>>> different one. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Please advise on what is missing here for running >>>>>>>>>>>>>>>>> ovn-sbctl show, as this command hangs.
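When a client hangs like this, it can also help to ask the running ovsdb-server for its own view of the cluster. Assuming the raft series exposes a cluster/status unixctl command (treat the command name as an assumption against this patch version), a sketch using the --unixctl socket from the command above:

# prints this server's role (leader/follower/candidate), current term,
# and the servers it knows about; adjust the socket path to your rundir
ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl cluster/status OVN_Southbound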
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I think you can use the ovn-ctl command >>>>>>>>>>>>>>>> "start_cluster_sb_ovsdb" for your testing (atleast for now) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> For your setup, I think you can start the cluster as >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> # Node 1 >>>>>>>>>>>>>>>> ovn-ctl --db-sb-addr=10.99.152.148 --db-sb-port=6642 >>>>>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.148:6644" >>>>>>>>>>>>>>>> start_cluster_sb_ovsdb >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> # Node 2 >>>>>>>>>>>>>>>> ovn-ctl --db-sb-addr=10.99.152.138 --db-sb-port=6642 >>>>>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.138:6644" >>>>>>>>>>>>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644" >>>>>>>>>>>>>>>> start_cluster_sb_ovsdb >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> # Node 3 >>>>>>>>>>>>>>>> ovn-ctl --db-sb-addr=10.99.152.101 --db-sb-port=6642 >>>>>>>>>>>>>>>> --db-sb-create-insecure-remote=yes >>>>>>>>>>>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.101:6644" >>>>>>>>>>>>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644" >>>>>>>>>>>>>>>> start_cluster_sb_ovsdb >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Let me know how it goes. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>>>> Numan >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>> discuss mailing list >>>>>>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> discuss mailing list >>>>>>> [email protected] >>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >
