Hi Numan,

On Wed, Aug 18, 2021, at 17:42, Numan Siddique wrote:
> On Wed, Aug 18, 2021 at 3:55 AM Krzysztof Klimonda
> <[email protected]> wrote:
> >
> > Hi,
> >
> > After reading OVN upgrade documentation[1], my understanding is that the 
> > order of upgrading components is pretty important to ensure controlplane & 
> > dataplane stability. As I understand it, these are the upgrade steps:
> 
> >
> > 1. upgrade and restart ovn-controller on every chassis
> > 2. upgrade ovn-nb-db and ovn-sb-db and migrate database schema
> > 3. upgrade ovn-northd as the last component
> 
> Even though this is the recommended procedure, I know that OpenStack
> TripleO deployments and OpenShift upgrade ovn-northd and the
> ovsdb-servers first.
> 
> 
> >
> > First, is the schema upgrade done by ovn-ctl somehow? It didn't upgrade 
> > the schema for me and I had to run the "ovsdb-client migrate" command on 
> > both the northbound and southbound databases.
> 
> I think ovn-ctl should take care of upgrading the database to the
> updated schema.  Before restarting the ovsdb-servers, the ovn packages
> were upgraded to versions carrying the desired schema files, right?
> If so, I think ovn-ctl should upgrade the database.

Yeah, those are kolla containers, and after a restart we use a new image with 
the new ovn packages. This is how kolla starts the northbound db: 
"/usr/share/ovn/scripts/ovn-ctl run_nb_ovsdb --db-nb-addr=172.16.0.213 
--db-nb-cluster-local-addr=172.16.0.213  --db-nb-sock=/run/ovn/ovnnb_db.sock 
--db-nb-pid=/run/ovn/ovnnb_db.pid 
--db-nb-file=/var/lib/openvswitch/ovn-nb/ovnnb.db 
--ovn-nb-logfile=/var/log/kolla/openvswitch/ovn-nb-db.log" - I'll double check 
if I can figure out why the schema wasn't upgraded.
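
One way to check whether the running database still needs a schema upgrade is the "needs-conversion" command of ovsdb-client, which prints "yes" or "no" after comparing the server's schema with a schema file. The wrapper below is only a minimal sketch: the function name and the injectable `run` parameter are made up for illustration, and it assumes ovsdb-client is executed somewhere the database socket is reachable (for kolla, presumably inside the container).

```python
import subprocess


def needs_conversion(server, schema_file, run=subprocess.run):
    """Return True if the database at `server` differs from `schema_file`.

    `server` is an OVSDB connection method such as "unix:/path/to/db.sock".
    `run` is injectable so the function can be exercised without OVS installed
    (an illustrative convenience, not part of any OVN tooling).
    """
    result = run(
        ["ovsdb-client", "needs-conversion", server, schema_file],
        capture_output=True,
        text=True,
        check=True,
    )
    # ovsdb-client prints "yes" when the schemas differ and "no" otherwise.
    return result.stdout.strip() == "yes"
```

For the northbound database started as above, a call might look like needs_conversion("unix:/run/ovn/ovnnb_db.sock", "/usr/share/ovn/ovn-nb.ovsschema"). The schema file path is an assumption about the package layout and may differ in a given image.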

> 
> >
> > Second, in large deployments (250+ ovn-controllers) restarting ovn 
> > southbound cluster nodes leads to complete failure of the southbound 
> > database in my environment - once all ovn-controllers (and 
> > neutron-ovn-metadata-agents) start reconnecting to the cluster, the load 
> > they generate makes the cluster lose quorum, or even corrupts the database 
> > on some nodes.
> 
> If there are a lot of connections to ovsdb-servers, it would
> definitely slow down.  Maybe you can restart ovn-controllers in
> a phased manner?  Or pause all ovn-controllers and then unpause them
> in a few groups so that ovsdb-servers are not overloaded.
> I think in one of our production scale deployments we did something similar.

By pause do you mean "debug/pause"? Thanks, I'll check it out.
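
The "pause everything, then unpause in a few groups" idea could be sketched roughly as follows. This is not a tested procedure: it assumes every chassis is reachable over SSH and that "ovn-appctl -t ovn-controller debug/pause" and "debug/resume" can be run directly on the host (with kolla the command would probably have to be wrapped in a container exec). The function names, group size and delay are hypothetical.

```python
import subprocess
import time


def batches(items, size):
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def appctl(host, command, run=subprocess.run):
    """Send an ovn-controller appctl command to one chassis (assumes SSH access)."""
    run(["ssh", host, "ovn-appctl", "-t", "ovn-controller", command], check=True)


def pause_all(hosts, run=subprocess.run):
    """Pause ovn-controller on every chassis before restarting the SB cluster."""
    for host in hosts:
        appctl(host, "debug/pause", run=run)


def resume_in_groups(hosts, group_size, delay_seconds,
                     run=subprocess.run, sleep=time.sleep):
    """Resume ovn-controllers group by group, waiting between groups."""
    groups = list(batches(hosts, group_size))
    for index, group in enumerate(groups):
        for host in group:
            appctl(host, "debug/resume", run=run)
        # Give the southbound cluster time to settle before the next group.
        if index < len(groups) - 1:
            sleep(delay_seconds)
```

A fixed delay is the simplest choice here; waiting until the southbound cluster reports a healthy state before resuming the next group would probably be more robust.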

> 
> 
> > I'm running OVN 21.06 with ovsdb-server 2.14.0 - should I be upgrading to 
> > 2.15.x? I've also seen the new relay-based architecture introduced in the 
> > 2.16.0 release, but this seems to be a rather recent development and I'm 
> > worried about stability (I've seen some reports of crashes and high memory 
> > usage).
> >
> > When running scale tests for ovn with kubernetes with hundreds of nodes, 
> > how are cluster upgrades handled?
> 
> As I mentioned above, I think in the case of openshift,  the master
> nodes are upgraded first and then the worker nodes are upgraded.
> I think during the master node upgrades, the worker nodes are paused.
> My kubernetes/openshift knowledge is limited though.

Thanks, any idea on upgrading ovsdb-server to the 2.15.1 release? I see that there 
is a new database format - would that give any performance boost to northbound 
and southbound clusters? Or should I just start looking into relay-based 
southbound deployment to scale my cluster to 200+ nodes?

Thanks
Krzysztof

> 
> Thanks
> Numan
> 
> >
> > Regards,
> > Krzysztof
> >
> > [1] https://docs.ovn.org/en/latest/intro/install/ovn-upgrades.html
> >
> > --
> >   Krzysztof Klimonda
> >   [email protected]
> > _______________________________________________
> > discuss mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> >
> 


-- 
  Krzysztof Klimonda
  [email protected]