On Fri, Jan 5, 2024 at 9:40 AM Ilya Maximets <[email protected]> wrote: > > On 1/1/24 07:37, Mike Pattrick wrote: > > On Sun, Dec 17, 2023 at 9:03 PM Ilya Maximets <[email protected]> wrote: > >> > >> In case the difference between 'old' and 'new' rows is readily > >> available, it can be used to construct added/removed datums > >> instead. Diffs are typically much smaller than the column > >> itself. This change more than doubles the performance of a > >> transaction replay. > >> > >> For example, with this change applied, initial read of OVSDB > >> file containing 136K small transactions for large OVN port > >> groups and address sets on my laptop takes 11 seconds vs 24 > >> seconds without. > >> > >> Signed-off-by: Ilya Maximets <[email protected]> > >> --- > >> lib/ovsdb-data.c | 26 ++++++++++++++++++++++++++ > >> lib/ovsdb-data.h | 1 + > >> ovsdb/transaction.c | 15 ++++++++++++--- > >> 3 files changed, 39 insertions(+), 3 deletions(-) > >> > >> diff --git a/lib/ovsdb-data.c b/lib/ovsdb-data.c > >> index f18f74298..f0d7bee23 100644 > >> --- a/lib/ovsdb-data.c > >> +++ b/lib/ovsdb-data.c > >> @@ -2238,6 +2238,8 @@ ovsdb_symbol_table_insert(struct ovsdb_symbol_table > >> *symtab, > >> /* APIs for Generating and apply diffs. */ > >> > >> /* Find what needs to be added to and removed from 'old' to construct > >> 'new'. > >> + * If the optional 'diff' is porvided, it can be used to speed up > >> processing, > > > > Provided > > > >> + * in case it is smaller than the original 'old' and 'new'. > >> * > >> * The 'added' and 'removed' datums are always safe; the orders of keys > >> are > >> * maintained since they are added in order. */ > >> @@ -2246,6 +2248,7 @@ ovsdb_datum_added_removed(struct ovsdb_datum *added, > >> struct ovsdb_datum *removed, > >> const struct ovsdb_datum *old, > >> const struct ovsdb_datum *new, > >> + const struct ovsdb_datum *diff, > >> const struct ovsdb_type *type) > >> { > >> size_t oi, ni; > >> @@ -2258,6 +2261,29 @@ ovsdb_datum_added_removed(struct ovsdb_datum *added, > >> return; > >> } > >> > >> + /* Use diff, if provided, unless it's comparable in size. */ > >> + if (diff && diff->n * 2 < old->n + new->n) { > > > > I guess the motivation for the size check is that the old algorithm is > > O(n) and the diff algorithm is O(n lg n)? If so it may be worthwhile > > to note that in the comment, it wasn't initially clear to me. Or am I > > misunderstanding this? > > That's correct. If the size is comparable, the linear comparison > should be faster. I can update the comment like this: > > /* Use diff, if provided, unless it's comparable in size. With a large > * diff, the O(n log n) binary search of each element may be slower than > * a simple O(n) comparison between old and new. */ > > What do you think?
I think that's a good comment, thoroughly clarifies the conditional. -M > > Best regards, Ilya Maximets. > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
