Hi Numan,

On Wed, Aug 18, 2021, at 17:42, Numan Siddique wrote:
> On Wed, Aug 18, 2021 at 3:55 AM Krzysztof Klimonda
> <[email protected]> wrote:
> >
> > Hi,
> >
> > After reading the OVN upgrade documentation[1], my understanding is that
> > the order of upgrading components is pretty important to ensure
> > controlplane & dataplane stability. As I understand it, those are the
> > upgrade steps:
> >
> > 1. upgrade and restart ovn-controller on every chassis
> > 2. upgrade ovn-nb-db and ovn-sb-db and migrate database schema
> > 3. upgrade ovn-northd as the last component
>
> Even though this is the recommended procedure, I know that Openstack
> tripleo deployments and Openshift upgrade the ovn-northd and
> ovsdb-servers first.
>
> >
> > First, is the schema upgrade done by ovn-ctl somehow? It didn't upgrade
> > the schema for me and I had to run the "ovsdb-client migrate" command on
> > both northbound and southbound databases.
>
> I think ovn-ctl should take care of upgrading the database to the
> updated schema. Before restarting the ovsdb-servers, the ovn packages
> were upgraded to the desired schema files, right?
> If so, I think ovn-ctl should upgrade the database.

Yeah, those are kolla containers, and after restart we use a new image with
new ovn packages. This is how kolla starts the northbound db:

"/usr/share/ovn/scripts/ovn-ctl run_nb_ovsdb --db-nb-addr=172.16.0.213 --db-nb-cluster-local-addr=172.16.0.213 --db-nb-sock=/run/ovn/ovnnb_db.sock --db-nb-pid=/run/ovn/ovnnb_db.pid --db-nb-file=/var/lib/openvswitch/ovn-nb/ovnnb.db --ovn-nb-logfile=/var/log/kolla/openvswitch/ovn-nb-db.log"

I'll double check if I can figure out why the schema wasn't upgraded.

> >
> > Second, in large deployments (250+ ovn-controllers) restarting ovn
> > southbound cluster nodes leads to complete failure of the southbound
> > database in my environment - once all ovn-controllers (and
> > neutron-ovn-metadata-agents) start reconnecting to the cluster, the load
> > generated by them makes the cluster lose quorum, or even corrupts the
> > database on some nodes.
>
> If there are a lot of connections to ovsdb-servers, it would
> definitely slow down. Maybe you can restart ovn-controllers in
> a phased manner? Or pause all ovn-controllers and then unpause them
> in a few groups so that ovsdb-servers are not overloaded.
> I think in one of our production scale deployments we did something similar.

By pause do you mean "debug/pause"? Thanks, I'll check it out.

> >
> > I'm running OVN 21.06 with ovsdb-server 2.14.0 - should I be upgrading to
> > 2.15.x? I've also seen the new relay-based architecture introduced in
> > the 2.16.0 release but this seems to be a rather recent development and
> > I'm worried about stability (I've seen some reports about crashes and
> > high memory usage).
> >
> > When running scale tests for ovn with kubernetes with hundreds of nodes,
> > how are cluster upgrades handled?
>
> As I mentioned above, I think in the case of openshift, the master
> nodes are upgraded first and then the worker nodes are upgraded.
> I think during the master node upgrades, the worker nodes are paused.
> My kubernetes/openshift knowledge is limited though.

Thanks, any idea on upgrading ovsdb-server to 2.15.1 release? I see that
there is a new database format - would that give any performance boost to
northbound and southbound clusters? Or should I just start looking into
relay-based southbound deployment to scale my cluster to 200+ nodes?

Thanks
Krzysztof

>
> Thanks
> Numan
>
> >
> > Regards,
> > Krzysztof
> >
> > [1] https://docs.ovn.org/en/latest/intro/install/ovn-upgrades.html
> >
> > --
> > Krzysztof Klimonda
> > [email protected]
> > _______________________________________________
> > discuss mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> >
>

--
  Krzysztof Klimonda
  [email protected]
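
P.S. For reference, a rough sketch of the "pause everything, then unpause in a few groups" idea from the thread. This is only an illustration, not something tested against a real cluster: the ssh transport, the group size and the delay between groups are all assumptions, and in a kolla deployment ovn-appctl would most likely have to be run inside the ovn-controller container rather than directly on the host. It relies on the ovn-controller "debug/pause" and "debug/resume" appctl commands.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def batches(items: Sequence[str], size: int) -> List[List[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def appctl_over_ssh(host: str, command: str) -> None:
    """Hypothetical transport: send an appctl command to ovn-controller via ssh.

    Assumes ovn-appctl is reachable on the host itself; adjust for containers.
    """
    subprocess.run(
        ["ssh", host, "ovn-appctl", "-t", "ovn-controller", command],
        check=True,
    )


def phased_resume(
    hosts: Sequence[str],
    size: int,
    run: Callable[[str, str], None] = appctl_over_ssh,
    delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[str]]:
    """Pause every ovn-controller, then resume them one group at a time."""
    # Pause all controllers first so none of them reacts to the
    # southbound database restart at the same moment.
    for host in hosts:
        run(host, "debug/pause")

    # (The southbound cluster upgrade/restart would happen here.)

    groups = batches(hosts, size)
    for index, group in enumerate(groups):
        for host in group:
            run(host, "debug/resume")
        # Give ovsdb-server time to absorb this group before the next one.
        if index < len(groups) - 1:
            sleep(delay)
    return groups
```

Something like phased_resume(chassis_list, 25) would then bring 250 controllers back in ten groups. The right group size and delay depend on how quickly the southbound cluster settles after each group, which I have not measured.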

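P.P.S. On the schema question, a small sketch for checking whether a running database is behind the schema file shipped in the new packages. It compares the output of "ovsdb-client get-schema-version" with "ovsdb-tool schema-version". The schema file path in the usage note is an assumption about the package layout, and a plain version comparison is a simplification: it does not claim to mirror what ovn-ctl checks internally.

```python
import subprocess
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted schema version such as '5.32.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(db_version: str, schema_version: str) -> bool:
    """True when the running database is older than the packaged schema."""
    return parse_version(db_version) < parse_version(schema_version)


def read_versions(sock: str, db_name: str, schema_file: str) -> Tuple[str, str]:
    """Ask the running server and the schema file for their versions."""
    db = subprocess.run(
        ["ovsdb-client", "get-schema-version", f"unix:{sock}", db_name],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    schema = subprocess.run(
        ["ovsdb-tool", "schema-version", schema_file],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    return db, schema
```

For the northbound database started above this would be roughly needs_upgrade(*read_versions("/run/ovn/ovnnb_db.sock", "OVN_Northbound", "/usr/share/ovn/ovn-nb.ovsschema")), where the .ovsschema path is a guess and should be checked inside the image.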