On Mon, Sep 12, 2022 at 9:57 AM Numan Siddique <[email protected]> wrote: > > On Fri, Sep 9, 2022 at 2:25 AM Han Zhou <[email protected]> wrote: > > > > On Thu, Sep 8, 2022 at 8:46 AM Ilya Maximets <[email protected]> wrote: > > > > > > northd doesn't process changes incrementally, so it makes sense to > > > accumulate more database updates and process them in bulk, so we can > > > cover everything in a single recompute. > > > > > > ovsdb-server has a mechanism to start accumulating changes if the > > > client doesn't receive them fast enough, but it relies on the receive > > > buffer size, which is a few hundreds of KB on a typical system. > > > Unfortunately, that is enough to queue up several hundreds of small > > > updates, and it takes northd a lot of time to process them if poll > > > intervals are large, receiving at most 50 messages on each iteration > > > (half of which are updates for a _Server database). > > > > > > Calling ovsdb_idl_run() as long as something changes. This allows to > > > quickly process large bursts of database updates. For example, it > > > takes only 30-40 seconds for 'ovn-nbctl --wait=hv sync' to finish on > > > a 500-node cluster after the startup phase of the density-heavy > > > ovn-heater test, instead of 6-8 minutes without this change. > > > > > > 500 ms seems like a reasonable hard limit to avoid spinning for too > > > long if the database is changed constantly at a fast pace. > > > > > > Very long polling is also logged at INFO level to notify users. > > > Not using WARN or higher because it may happen under normal conditions, > > > e.g. on the initial connection to a large database or another type > > > of a single large update. Other notable polling attempts are logged > > > at debug level. > > > > > > Signed-off-by: Ilya Maximets <[email protected]> > > > --- > > > northd/ovn-northd.c | 38 ++++++++++++++++++++++++++++++++++---- > > > 1 file changed, 34 insertions(+), 4 deletions(-) > > > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > > index bd35802ed..96f17f15f 100644 > > > --- a/northd/ovn-northd.c > > > +++ b/northd/ovn-northd.c > > > @@ -685,6 +685,36 @@ get_probe_interval(const char *db, const struct > > nbrec_nb_global *nb) > > > return interval; > > > } > > > > > > +static struct ovsdb_idl_txn * > > > +run_idl_loop(struct ovsdb_idl_loop *idl_loop, const char *name) > > > +{ > > > + unsigned long long duration, start = time_msec(); > > > + unsigned int seqno = UINT_MAX; > > > + struct ovsdb_idl_txn *txn; > > > + int n = 0; > > > + > > > + /* Accumulate database changes as long as there are some, > > > + * but no longer than half a second. */ > > > + while (seqno != ovsdb_idl_get_seqno(idl_loop->idl) > > > + && time_msec() - start < 500) { > > > + seqno = ovsdb_idl_get_seqno(idl_loop->idl); > > > + ovsdb_idl_run(idl_loop->idl); > > > + n++; > > > + } > > > + > > > + txn = ovsdb_idl_loop_run(idl_loop); > > > + > > > + duration = time_msec() - start; > > > + /* ovsdb_idl_run() is called at least 2 times. Once directly and > > > + * once in the ovsdb_idl_loop_run(). n > 2 means that we received > > > + * data on at least 2 subsequent calls. */ > > > + if (n > 2 || duration > 100) { > > > + VLOG(duration > 500 ? VLL_INFO : VLL_DBG, > > > + "%s IDL run: %d iterations in %lld ms", name, n + 1, > > duration); > > > + } > > > + return txn; > > > +} > > > + > > > int > > > main(int argc, char *argv[]) > > > { > > > @@ -821,8 +851,8 @@ main(int argc, char *argv[]) > > > ovsdb_idl_set_lock(ovnsb_idl_loop.idl, "ovn_northd"); > > > } > > > > > > - struct ovsdb_idl_txn *ovnnb_txn = > > > - ovsdb_idl_loop_run(&ovnnb_idl_loop); > > > + struct ovsdb_idl_txn *ovnnb_txn = > > run_idl_loop(&ovnnb_idl_loop, > > > + > > "OVN_Northbound"); > > > unsigned int new_ovnnb_cond_seqno = > > > > > ovsdb_idl_get_condition_seqno(ovnnb_idl_loop.idl); > > > if (new_ovnnb_cond_seqno != ovnnb_cond_seqno) { > > > @@ -833,8 +863,8 @@ main(int argc, char *argv[]) > > > ovnnb_cond_seqno = new_ovnnb_cond_seqno; > > > } > > > > > > - struct ovsdb_idl_txn *ovnsb_txn = > > > - ovsdb_idl_loop_run(&ovnsb_idl_loop); > > > + struct ovsdb_idl_txn *ovnsb_txn = > > run_idl_loop(&ovnsb_idl_loop, > > > + > > "OVN_Southbound"); > > > unsigned int new_ovnsb_cond_seqno = > > > > > ovsdb_idl_get_condition_seqno(ovnsb_idl_loop.idl); > > > if (new_ovnsb_cond_seqno != ovnsb_cond_seqno) { > > > -- > > > 2.34.3 > > > > > > > Thanks Ilya for the great improvement! Applied to main. > > It is not a bug fix but it seems a good candidate for backporting to 22.09 > > given that the change is small and we haven't released yet. @Mark Michelson > > <[email protected]> @Numan Siddique <[email protected]> what do you think? > > +1 from me. > > Thanks > Numan > Thanks Numan. I backported it to 22.09.
Han > > > > Han > > _______________________________________________ > > dev mailing list > > [email protected] > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
