On Thu, Feb 28, 2019 at 09:15:15AM -0800, Han Zhou wrote:
> In scalability tests with ovn-scale-test, the SB ovsdb-server load is not a
> problem, at least with 1k HVs. However, if we restart the ovsdb-server,
> then depending on the number of HVs and the scale of the logical objects,
> e.g. the number of logical ports, the SB ovsdb-server becomes an obvious
> bottleneck.
> 
> In our test with 1k HVs and 20k logical ports (200 lports * 100 lswitches
> connected by a single logical router), restarting the SB ovsdb-server
> resulted in 100% CPU usage on ovsdb-server for more than 1 hour, while all
> HVs (and northd) reconnected and resynced the large amount of data at the
> same time.
> 
> A similar problem would happen in a failover scenario. With an
> active-active cluster, the problem can be alleviated slightly, because
> only 1/3 of the HVs (assuming a 3-node cluster) need to resync data from
> the new servers, but it is still a serious problem.
> 
> For a detailed discussion of the problem and solutions, see:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2018-October/047591.html
> 
> The patches implement the proposal from that discussion. They introduce a
> new method, monitor_cond_since, that enables a client to request only the
> changes that happened after a specific transaction point, so that data
> already cached in the client is not re-transferred. Scalability tests show
> a dramatic improvement: all HVs finish syncing as soon as they reconnect,
> since there is no new data to transfer.
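
For anyone who wants to poke at the new RPC by hand, below is a rough,
unofficial sketch (not code from this series) that issues a
monitor_cond_since request to the SB ovsdb-server over its UNIX socket.
The socket path, the monitored table/column, and the all-zeros
last-txn-id are illustrative assumptions, not values the series mandates:

    #!/usr/bin/env python3
    # Rough sketch: send one monitor_cond_since request and print the
    # "found" flag and latest transaction id from the reply.
    import json
    import socket

    SOCK_PATH = "/var/run/ovn/ovnsb_db.sock"  # assumption: default SB socket
    # An all-zeros last-txn-id tells the server we have nothing cached, so
    # it replies with found=false and a full download; a real client would
    # pass the txn id it remembered from its previous session instead.
    LAST_TXN_ID = "00000000-0000-0000-0000-000000000000"

    request = {
        "id": 0,
        "method": "monitor_cond_since",
        # params: [<db-name>, <monitor-id>, <monitor requests>, <last-txn-id>]
        "params": [
            "OVN_Southbound",
            None,
            {"SB_Global": [{"columns": ["nb_cfg"]}]},  # illustrative table
            LAST_TXN_ID,
        ],
    }

    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(SOCK_PATH)
    sock.sendall(json.dumps(request).encode())

    # Accumulate bytes until a complete JSON reply parses.  The "result" is
    # [<found>, <last-txn-id>, <updates>]; when <found> is true, <updates>
    # contains only the changes after LAST_TXN_ID.
    buf = b""
    decoder = json.JSONDecoder()
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("server closed the connection")
        buf += chunk
        try:
            reply, _ = decoder.raw_decode(buf.decode())
            break
        except ValueError:
            continue
    print("found:", reply["result"][0], "last-txn-id:", reply["result"][1])

The interesting case is the reconnect: a client that stores the
last-txn-id from its previous session and sends it back should get
found=true and only the incremental changes, which is what avoids the
full resync described above.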

Thanks a lot.  I applied this to master.

I want to encourage you to send another patch adding a NEWS item.