On 1/1/24 07:37, Mike Pattrick wrote:
> On Sun, Dec 17, 2023 at 9:03 PM Ilya Maximets <[email protected]> wrote:
>>
>> In case the difference between 'old' and 'new' rows is readily
>> available, it can be used to construct added/removed datums
>> instead.  Diffs are typically much smaller than the column
>> itself.  This change more than doubles the performance of a
>> transaction replay.
>>
>> For example, with this change applied, initial read of OVSDB
>> file containing 136K small transactions for large OVN port
>> groups and address sets on my laptop takes 11 seconds vs 24
>> seconds without.
>>
>> Signed-off-by: Ilya Maximets <[email protected]>
>> ---
>>  lib/ovsdb-data.c    | 26 ++++++++++++++++++++++++++
>>  lib/ovsdb-data.h    |  1 +
>>  ovsdb/transaction.c | 15 ++++++++++++---
>>  3 files changed, 39 insertions(+), 3 deletions(-)
>>
>> diff --git a/lib/ovsdb-data.c b/lib/ovsdb-data.c
>> index f18f74298..f0d7bee23 100644
>> --- a/lib/ovsdb-data.c
>> +++ b/lib/ovsdb-data.c
>> @@ -2238,6 +2238,8 @@ ovsdb_symbol_table_insert(struct ovsdb_symbol_table 
>> *symtab,
>>  /* APIs for Generating and apply diffs.  */
>>
>>  /* Find what needs to be added to and removed from 'old' to construct 'new'.
>> + * If the optional 'diff' is porvided, it can be used to speed up 
>> processing,
> 
> Provided
> 
>> + * in case it is smaller than the original 'old' and 'new'.
>>   *
>>   * The 'added' and 'removed' datums are always safe; the orders of keys are
>>   * maintained since they are added in order.   */
>> @@ -2246,6 +2248,7 @@ ovsdb_datum_added_removed(struct ovsdb_datum *added,
>>                            struct ovsdb_datum *removed,
>>                            const struct ovsdb_datum *old,
>>                            const struct ovsdb_datum *new,
>> +                          const struct ovsdb_datum *diff,
>>                            const struct ovsdb_type *type)
>>  {
>>      size_t oi, ni;
>> @@ -2258,6 +2261,29 @@ ovsdb_datum_added_removed(struct ovsdb_datum *added,
>>          return;
>>      }
>>
>> +    /* Use diff, if provided, unless it's comparable in size. */
>> +    if (diff && diff->n * 2 < old->n + new->n) {
> 
> I guess the motivation for the size check is that the old algorithm is
> O(n) and the diff algorithm is O(n lg n)? If so it may be worthwhile
> to note that in the comment, it wasn't initially clear to me. Or am I
> misunderstanding this?

That's correct.  If the size is comparable, the linear comparison
should be faster.  I can update the comment like this:

    /* Use diff, if provided, unless it's comparable in size.  With a large
     * diff, the O(n log n) binary search of each element may be slower than
     * a simple O(n) comparison between old and new. */

What do you think?

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to