On 11/19/21 12:02, Dumitru Ceara wrote: > Hi Ilya, > > On 11/19/21 3:22 AM, Ilya Maximets wrote: >> While adding new rows ovsdb-idl re-parses all the other rows that >> references this new one. For example, current ovn-kubernetes creates >> load balancers and adds the same load balancer to all logical switches >> and logical routers. So, then a new load balancer is added, rows for >> all logical switches and routers re-parsed. >> >> During initial database connection (or re-connection with >> monitor/monitor_cond or moniotr_cond_since with outdated last > > Typo: moniotr. > >> transaction id) the client downloads the whole content of a database. >> In case of OVN, there might be already thousands of load balancers >> configured. ovsdb-idl will process rows in that initial monitor reply >> one-by-one. Therefore, for each load balancer row, it will re-parse >> all rows for switches and routers. >> >> Assuming that we have 120 Logical Switches and 30K load balancers. >> Processing of the initial monitor reply will take 120 (switch rows) * >> 30K (load balancer references in a switch row) * 30K (load balancer >> rows) = 10^11 operations, which may take hours. > > Well, as mentioned in other discussions, the application should use Load > Balancer groups in that case but if it can not, or for the Southbound > database case where the contents are 120 Datapath_Binding records each > referring to 30K load balancer UUIDs, this patch is very beneficial. > >> >> Re-parsing doesn't change any internal structures of the IDL. It >> destroys and re-creates exactly same arcs between rows. The only >> thing that changes is the application-facing array of pointers. >> >> Since internal structures remains intact, suggested solution is to >> postpone the re-parsing of back references until all the monitor >> updates processed. This way we can re-parse each row only once. > > Cool! > >> >> Tested in a sandbox with 120 LSs, 120 LRs and 3K LBs, where each >> load balancer added to each LS and LR, by re-statring ovn-northd and >> measuring the time spent in ovsdb_idl_run(). >> >> Before the change: >> >> OVN_Southbound: ovsdb_idl_run took: 924 ms >> OVN_Northbound: ovsdb_idl_run took: 825118 ms --> 13.75 minutes! >> >> After: >> >> OVN_Southbound: ovsdb_idl_run took: 692 ms >> OVN_Northbound: ovsdb_idl_run took: 1698 ms >> > > I confirm; I also see signficant improvement when running against NB/SB > databases taken from an OpenShift large scale cluster. > >> Signed-off-by: Ilya Maximets <[email protected]> >> --- > > The code looks good to me, thanks! > > Acked-by: Dumitru Ceara <[email protected]> >
Thanks! I adjusted the commit message and applied. Best regards, Ilya Maximets. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
