On 1/1/24 07:37, Mike Pattrick wrote:
> On Sun, Dec 17, 2023 at 9:03 PM Ilya Maximets <[email protected]> wrote:
>>
>> In case the difference between 'old' and 'new' rows is readily
>> available, it can be used to construct added/removed datums
>> instead. Diffs are typically much smaller than the column
>> itself. This change more than doubles the performance of a
>> transaction replay.
>>
>> For example, with this change applied, initial read of OVSDB
>> file containing 136K small transactions for large OVN port
>> groups and address sets on my laptop takes 11 seconds vs 24
>> seconds without.
>>
>> Signed-off-by: Ilya Maximets <[email protected]>
>> ---
>> lib/ovsdb-data.c | 26 ++++++++++++++++++++++++++
>> lib/ovsdb-data.h | 1 +
>> ovsdb/transaction.c | 15 ++++++++++++---
>> 3 files changed, 39 insertions(+), 3 deletions(-)
>>
>> diff --git a/lib/ovsdb-data.c b/lib/ovsdb-data.c
>> index f18f74298..f0d7bee23 100644
>> --- a/lib/ovsdb-data.c
>> +++ b/lib/ovsdb-data.c
>> @@ -2238,6 +2238,8 @@ ovsdb_symbol_table_insert(struct ovsdb_symbol_table
>> *symtab,
>> /* APIs for Generating and apply diffs. */
>>
>> /* Find what needs to be added to and removed from 'old' to construct 'new'.
>> + * If the optional 'diff' is porvided, it can be used to speed up
>> processing,
>
> Provided
>
>> + * in case it is smaller than the original 'old' and 'new'.
>> *
>> * The 'added' and 'removed' datums are always safe; the orders of keys are
>> * maintained since they are added in order. */
>> @@ -2246,6 +2248,7 @@ ovsdb_datum_added_removed(struct ovsdb_datum *added,
>> struct ovsdb_datum *removed,
>> const struct ovsdb_datum *old,
>> const struct ovsdb_datum *new,
>> + const struct ovsdb_datum *diff,
>> const struct ovsdb_type *type)
>> {
>> size_t oi, ni;
>> @@ -2258,6 +2261,29 @@ ovsdb_datum_added_removed(struct ovsdb_datum *added,
>> return;
>> }
>>
>> + /* Use diff, if provided, unless it's comparable in size. */
>> + if (diff && diff->n * 2 < old->n + new->n) {
>
> I guess the motivation for the size check is that the old algorithm is
> O(n) and the diff algorithm is O(n lg n)? If so it may be worthwhile
> to note that in the comment, it wasn't initially clear to me. Or am I
> misunderstanding this?
That's correct. If the size is comparable, the linear comparison
should be faster. I can update the comment like this:
/* Use diff, if provided, unless it's comparable in size. With a large
* diff, the O(n log n) binary search of each element may be slower than
* a simple O(n) comparison between old and new. */
What do you think?
Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev