That's great, Numan.

On Thu, Mar 15, 2018 at 2:57 AM, aginwala <[email protected]> wrote:

> Hi Numan:
>
> I tried on new nodes (kernel : 4.4.0-104-generic , Ubuntu 16.04)with fresh
> installation and it worked super fine for both sb and nb dbs. Seems like
> some kernel issue on the previous nodes when I re-installed raft patch as I
> was running different ovs version on those nodes before.
>
> For 2 HVs, I now set
> ovn-remote="tcp:10.169.125.152:6642, tcp:10.169.125.131:6642, tcp:10.148.181.162:6642"
> and started controller and it works super fine.
>
> Did some failover testing by rebooting/killing the leader (10.169.125.152)
> and bringing it back up and it works as expected. Nothing weird noted so
> far.
>
> # check-cluster gives below data one of the node(10.148.181.162) post
> leader failure
>
> ovsdb-tool check-cluster /etc/openvswitch/ovnsb_db.db
> ovsdb-tool: leader /etc/openvswitch/ovnsb_db.db for term 2 has log entries
> only up to index 18446744073709551615, but index 9 was committed in a
> previous term (e.g. by /etc/openvswitch/ovnsb_db.db)
>
> For check-cluster, are we planning to add more output showing which node
> is active(leader), etc in upcoming versions ?
>
> Thanks a ton for helping sort this out. I think the patch looks good to
> be merged post addressing of the comments by Justin along with the man page
> details for ovsdb-tool.
>
> I will do some more crash testing for the cluster along with the scale
> test and keep you posted if something unexpected is noted.
>
> Regards,
>
> On Tue, Mar 13, 2018 at 11:07 PM, Numan Siddique <[email protected]>
> wrote:
>
>> On Wed, Mar 14, 2018 at 7:51 AM, aginwala <[email protected]> wrote:
>>
>>> Sure.
>>>
>>> To add on , I also ran for nb db too using different port and Node2
>>> crashes with same error :
>>> # Node 2
>>> /usr/share/openvswitch/scripts/ovn-ctl --db-nb-addr=10.99.152.138
>>> --db-nb-port=6641 --db-nb-cluster-remote-addr="tcp:10.99.152.148:6645"
>>> --db-nb-cluster-local-addr="tcp:10.99.152.138:6645" start_nb_ovsdb
>>> ovsdb-server: ovsdb error: /etc/openvswitch/ovnnb_db.db: cannot identify
>>> file type
>>>
>> Hi Aliasgar,
>>
>> It worked for me. Can you delete the old db files in /etc/openvswitch/
>> and try running the commands again ?
>>
>> Below are the commands I ran in my setup.
>>
>> Node 1
>> -------
>> sudo /usr/share/openvswitch/scripts/ovn-ctl --db-sb-addr=192.168.121.91
>> --db-sb-port=6642 --db-sb-create-insecure-remote=yes
>> --db-sb-cluster-local-addr=tcp:192.168.121.91:6644 start_sb_ovsdb
>>
>> Node 2
>> ---------
>> sudo /usr/share/openvswitch/scripts/ovn-ctl --db-sb-addr=192.168.121.87
>> --db-sb-port=6642 --db-sb-create-insecure-remote=yes
>> --db-sb-cluster-local-addr="tcp:192.168.121.87:6644"
>> --db-sb-cluster-remote-addr="tcp:192.168.121.91:6644" start_sb_ovsdb
>>
>> Node 3
>> ---------
>> sudo /usr/share/openvswitch/scripts/ovn-ctl --db-sb-addr=192.168.121.78
>> --db-sb-port=6642 --db-sb-create-insecure-remote=yes
>> --db-sb-cluster-local-addr="tcp:192.168.121.78:6644"
>> --db-sb-cluster-remote-addr="tcp:192.168.121.91:6644" start_sb_ovsdb
>>
>> Thanks
>> Numan
>>
>>> On Tue, Mar 13, 2018 at 9:40 AM, Numan Siddique <[email protected]>
>>> wrote:
>>>
>>>> On Tue, Mar 13, 2018 at 9:46 PM, aginwala <[email protected]> wrote:
>>>>
>>>>> Thanks Numan for the response.
>>>>>
>>>>> There is no command start_cluster_sb_ovsdb in the source code too. Is
>>>>> that in a separate commit somewhere? Hence, I used start_sb_ovsdb
>>>>> which I think would not be a right choice?
>>>>>
>>>> Sorry, I meant start_sb_ovsdb. Strange that it didn't work for you. Let
>>>> me try it out again and update this thread.
>>>>
>>>> Thanks
>>>> Numan
>>>>
>>>>> # Node1 came up as expected.
>>>>> ovn-ctl --db-sb-addr=10.99.152.148 --db-sb-port=6642
>>>>> --db-sb-create-insecure-remote=yes
>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.148:6644" start_sb_ovsdb.
>>>>>
>>>>> # verifying its a clustered db with ovsdb-tool db-local-address
>>>>> /etc/openvswitch/ovnsb_db.db
>>>>> tcp:10.99.152.148:6644
>>>>> # ovn-sbctl show works fine and chassis are being populated correctly.
>>>>>
>>>>> #Node 2 fails with error:
>>>>> /usr/share/openvswitch/scripts/ovn-ctl --db-sb-addr=10.99.152.138
>>>>> --db-sb-port=6642 --db-sb-create-insecure-remote=yes
>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644"
>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.138:6644" start_sb_ovsdb
>>>>> ovsdb-server: ovsdb error: /etc/openvswitch/ovnsb_db.db: cannot
>>>>> identify file type
>>>>>
>>>>> # So i did start the sb db the usual way using start_ovsdb to just get
>>>>> the db file created and killed the sb pid and re-ran the command which gave
>>>>> actual error where it complains for join-cluster command that is being
>>>>> called internally
>>>>> /usr/share/openvswitch/scripts/ovn-ctl --db-sb-addr=10.99.152.138
>>>>> --db-sb-port=6642 --db-sb-create-insecure-remote=yes
>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644"
>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.138:6644" start_sb_ovsdb
>>>>> ovsdb-tool: /etc/openvswitch/ovnsb_db.db: not a clustered database
>>>>> * Backing up database to /etc/openvswitch/ovnsb_db.db.backup1.15.0-70426956
>>>>> ovsdb-tool: 'join-cluster' command requires at least 4 arguments
>>>>> * Creating cluster database /etc/openvswitch/ovnsb_db.db from
>>>>> existing one
>>>>>
>>>>> # based on above error I killed the sb db pid again and try to create
>>>>> a local cluster on node then re-ran the join operation as per the source
>>>>> code function.
>>>>> ovsdb-tool join-cluster /etc/openvswitch/ovnsb_db.db OVN_Southbound
>>>>> tcp:10.99.152.138:6644 tcp:10.99.152.148:6644 which still complains
>>>>> ovsdb-tool: I/O error: /etc/openvswitch/ovnsb_db.db: create failed
>>>>> (File exists)
>>>>>
>>>>> # Node 3: I did not try as I am assuming the same failure as node 2
>>>>>
>>>>> Let me know may know further.
>>>>>
>>>>> On Tue, Mar 13, 2018 at 3:08 AM, Numan Siddique <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Aliasgar,
>>>>>>
>>>>>> On Tue, Mar 13, 2018 at 7:11 AM, aginwala <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Ben/Noman:
>>>>>>>
>>>>>>> I am trying to setup 3 node southbound db cluster using raft10
>>>>>>> <https://patchwork.ozlabs.org/patch/854298/> in review.
>>>>>>>
>>>>>>> # Node 1 create-cluster
>>>>>>> ovsdb-tool create-cluster /etc/openvswitch/ovnsb_db.db
>>>>>>> /root/ovs-reviews/ovn/ovn-sb.ovsschema tcp:10.99.152.148:6642
>>>>>>>
>>>>>> A different port is used for RAFT. So you have to choose another port
>>>>>> like 6644 for example.
>>>>>>
>>>>>>> # Node 2
>>>>>>> ovsdb-tool join-cluster /etc/openvswitch/ovnsb_db.db OVN_Southbound
>>>>>>> tcp:10.99.152.138:6642 tcp:10.99.152.148:6642 --cid
>>>>>>> 5dfcb678-bb1d-4377-b02d-a380edec2982
>>>>>>>
>>>>>>> #Node 3
>>>>>>> ovsdb-tool join-cluster /etc/openvswitch/ovnsb_db.db OVN_Southbound
>>>>>>> tcp:10.99.152.101:6642 tcp:10.99.152.138:6642 tcp:10.99.152.148:6642
>>>>>>> --cid 5dfcb678-bb1d-4377-b02d-a380edec2982
>>>>>>>
>>>>>>> # ovn remote is set to all 3 nodes
>>>>>>> external_ids:ovn-remote="tcp:10.99.152.148:6642, tcp:10.99.152.138:6642, tcp:10.99.152.101:6642"
>>>>>>>
>>>>>>> # Starting sb db on node 1 using below command on node 1:
>>>>>>>
>>>>>>> ovsdb-server --detach --monitor -vconsole:off -vraft -vjsonrpc
>>>>>>> --log-file=/var/log/openvswitch/ovsdb-server-sb.log
>>>>>>> --pidfile=/var/run/openvswitch/ovnsb_db.pid
>>>>>>> --remote=db:OVN_Southbound,SB_Global,connections
>>>>>>> --unixctl=ovnsb_db.ctl --private-key=db:OVN_Southbound,SSL,private_key
>>>>>>> --certificate=db:OVN_Southbound,SSL,certificate
>>>>>>> --ca-cert=db:OVN_Southbound,SSL,ca_cert
>>>>>>> --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols
>>>>>>> --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers
>>>>>>> --remote=punix:/var/run/openvswitch/ovnsb_db.sock
>>>>>>> /etc/openvswitch/ovnsb_db.db
>>>>>>>
>>>>>>> # check-cluster is returning nothing
>>>>>>> ovsdb-tool check-cluster /etc/openvswitch/ovnsb_db.db
>>>>>>>
>>>>>>> # ovsdb-server-sb.log below shows the leader is elected with only
>>>>>>> one server and there are rbac related debug logs with rpc replies and
>>>>>>> empty params with no errors
>>>>>>>
>>>>>>> 2018-03-13T01:12:02Z|00002|raft|DBG|server 63d1 added to configuration
>>>>>>> 2018-03-13T01:12:02Z|00003|raft|INFO|term 6: starting election
>>>>>>> 2018-03-13T01:12:02Z|00004|raft|INFO|term 6: elected leader by 1+ of 1 servers
>>>>>>>
>>>>>>> Now Starting the ovsdb-server on the other clusters fails saying
>>>>>>> ovsdb-server: ovsdb error: /etc/openvswitch/ovnsb_db.db: cannot
>>>>>>> identify file type
>>>>>>>
>>>>>>> Also noticed that man ovsdb-tool is missing cluster details. Might
>>>>>>> want to address it in the same patch or different.
>>>>>>>
>>>>>>> Please advise to what is missing here for running ovn-sbctl show as
>>>>>>> this command hangs.
>>>>>>>
>>>>>> I think you can use the ovn-ctl command "start_cluster_sb_ovsdb" for
>>>>>> your testing (atleast for now)
>>>>>>
>>>>>> For your setup, I think you can start the cluster as
>>>>>>
>>>>>> # Node 1
>>>>>> ovn-ctl --db-sb-addr=10.99.152.148 --db-sb-port=6642
>>>>>> --db-sb-create-insecure-remote=yes
>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.148:6644"
>>>>>> start_cluster_sb_ovsdb
>>>>>>
>>>>>> # Node 2
>>>>>> ovn-ctl --db-sb-addr=10.99.152.138 --db-sb-port=6642
>>>>>> --db-sb-create-insecure-remote=yes
>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.138:6644"
>>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644"
>>>>>> start_cluster_sb_ovsdb
>>>>>>
>>>>>> # Node 3
>>>>>> ovn-ctl --db-sb-addr=10.99.152.101 --db-sb-port=6642
>>>>>> --db-sb-create-insecure-remote=yes
>>>>>> --db-sb-cluster-local-addr="tcp:10.99.152.101:6644"
>>>>>> --db-sb-cluster-remote-addr="tcp:10.99.152.148:6644"
>>>>>> start_cluster_sb_ovsdb
>>>>>>
>>>>>> Let me know how it goes.
>>>>>>
>>>>>> Thanks
>>>>>> Numan
>>>>>>
>>>>>>> _______________________________________________
>>>>>>> discuss mailing list
>>>>>>> [email protected]
>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>>>>>>>
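
For anyone reproducing the setup from this thread, here is a minimal sketch of the per-node commands and the ovn-remote string. It only prints text and runs nothing against OVS/OVN. The flag names, the start_sb_ovsdb command, and ports 6642/6644 are copied from the commands quoted above, which come from a patch still in review, so they may change. The helper names ovn_remote and sb_start_cmd are hypothetical, and joining the remotes with commas and no spaces is an assumption on my part, not something confirmed in the thread.

```shell
#!/bin/sh
# Sketch only: prints commands and strings, executes nothing on OVS/OVN.

SB_PORT=6642     # southbound DB port used in the thread
RAFT_PORT=6644   # RAFT port from the thread; must differ from the DB port

# ovn_remote PORT IP...  (hypothetical helper)
# Prints "tcp:IP1:PORT,tcp:IP2:PORT,..." joined by commas without spaces
# (the no-space form is an assumption, see the note above).
ovn_remote() {
    port=$1; shift
    out=
    for ip in "$@"; do
        out="${out:+$out,}tcp:$ip:$port"
    done
    printf '%s\n' "$out"
}

# sb_start_cmd LOCAL_IP [FIRST_NODE_IP]  (hypothetical helper)
# Prints the ovn-ctl command for one node. With only LOCAL_IP it mirrors the
# "Node 1" command in the thread; with FIRST_NODE_IP it mirrors Nodes 2 and 3.
sb_start_cmd() {
    local_ip=$1
    first_ip=${2:-}
    cmd="ovn-ctl --db-sb-addr=$local_ip --db-sb-port=$SB_PORT"
    cmd="$cmd --db-sb-create-insecure-remote=yes"
    cmd="$cmd --db-sb-cluster-local-addr=tcp:$local_ip:$RAFT_PORT"
    if [ -n "$first_ip" ]; then
        cmd="$cmd --db-sb-cluster-remote-addr=tcp:$first_ip:$RAFT_PORT"
    fi
    printf '%s start_sb_ovsdb\n' "$cmd"
}

# Example with the addresses from the original report.
sb_start_cmd 10.99.152.148
sb_start_cmd 10.99.152.138 10.99.152.148
sb_start_cmd 10.99.152.101 10.99.152.148
ovn_remote "$SB_PORT" 10.99.152.148 10.99.152.138 10.99.152.101
```

Printing the commands first makes it easy to check that the RAFT port (6644) and the DB port (6642) are not mixed up, which was the problem in the first attempt quoted above.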
