On 9/21/21 12:33, Mark Gray wrote:
> On 17/09/2021 18:17, Ilya Maximets wrote:
>> Current algorithm for ovsdb_datum_subtract looks like this:
>>
>>   for-each atom in a:
>>       if atom in b:
>>           swap(atom, <last atom in 'a'>)
>>           destroy(atom)
>>   quicksort(a)
>>
>> Complexity:
>>
>>   Na * log2(Nb)  +  (Na - Nb) * log2(Na - Nb)
>>     Search          Comparisons for quicksort
>>
>> It's not optimal, especially because Nb << Na in a vast majority of
>> cases.
>>
>> Reversing the search phase to look up atoms from 'b' in 'a', and
>> closing gaps from deleted elements in 'a' by plain memory copy to
>> avoid quicksort.
>>
>> Resulted complexity:
>>
>>   Nb * log2(Na)  +    (Na - Nb)
>>     Search          Memory copies
>>
>> Subtraction is heavily used while executing database transactions.
>> For example, to remove one port from a logical switch in OVN.
>> Complexity of such operation if original logical switch had 100 ports
>> goes down from
>>
>>  100 * log2(1)   = 100 comparisons for search and
>>   99 * log2(99)  = 656 comparisons for quicksort
>>                    ------------------------------
>>                    756 comparisons in total
>> to only
>>
>>    1 * log2(100) = 7 comparisons for search
>>    + memory copy of 99 * sizeof (union ovsdb_atom) bytes.
>>
>> We could use memmove to close the gaps after removing atoms, but
>> it will lead to 2 memory copies inside the call, while we can perform
>> only one to the temporary 'result' and swap pointers.
>>
>> Performance in cases, where sizes of 'a' and 'b' are comparable,
>> should not change.  Cases with Nb >> Na should not happen in practice.
> 
> It seems like we are optimizing for the more common case in which Na >>
> Nb. Not sure if there are use cases in which the other case could bubble
> up as a bottleneck. Could we try to assess which case we are in by
> checking the size of the sets and then select the algorithm based on
> that assessment? i.e. have two algorithms?

In general, case where Nb >> Na means that user wants to delete big
number of elements from a small set.  This doesn't make much practical
sense, so I'm not sure if we need to optimize this scenario.
We can, sure, integrate switches for more optimal implementations
and add these more optimal implementations for particular cases if
we'll need them in the future.

> 
>>
>> All in all, this change allows ovsdb-server to perform several times
>> more transactions, that removes elements from sets, per second.
>>
>> Signed-off-by: Ilya Maximets <[email protected]>
>> ---
>>  lib/ovsdb-data.c | 52 +++++++++++++++++++++++++++++++++++++-----------
>>  1 file changed, 40 insertions(+), 12 deletions(-)
>>
>> diff --git a/lib/ovsdb-data.c b/lib/ovsdb-data.c
>> index 11bf95fed..b6129d6ba 100644
>> --- a/lib/ovsdb-data.c
>> +++ b/lib/ovsdb-data.c
>> @@ -2019,26 +2019,54 @@ ovsdb_datum_subtract(struct ovsdb_datum *a, const 
>> struct ovsdb_type *a_type,
>>                       const struct ovsdb_datum *b,
>>                       const struct ovsdb_type *b_type)
>>  {
>> -    bool changed = false;
>> -    size_t i;
>> +    size_t i, ai, bi, n_idx;
>> +    size_t *idx;
>>  
>>      ovs_assert(a_type->key.type == b_type->key.type);
>>      ovs_assert(a_type->value.type == b_type->value.type
>>                 || b_type->value.type == OVSDB_TYPE_VOID);
>>  
>> -    /* XXX The big-O of this could easily be improved. */
>> -    for (i = 0; i < a->n; ) {
>> -        unsigned int idx = ovsdb_datum_find(a, i, b, b_type);
>> -        if (idx != UINT_MAX) {
>> -            changed = true;
>> -            ovsdb_datum_remove_unsafe(a, i, a_type);
>> -        } else {
>> -            i++;
>> +    idx = xmalloc(b->n * sizeof *idx);
>> +    n_idx = 0;
>> +    for (bi = 0; bi < b->n; bi++) {
>> +        ai = ovsdb_datum_find(b, bi, a, b_type);
>> +        if (ai == UINT_MAX || (n_idx && ai <= idx[n_idx - 1])) {
> 
> If b and a are always sorted, will we ever hit the second clause (n_idx
> && ai <= idx[n_idx - 1])? For example, if the following elements are
> equivalent
> 
> b0 -> a0
> b1 -> a1
> ..
> ..
> 
> bi is always > bi-1 therefore ai is always greater than ai-1. If there
> is still a chance that we will hit it, could you add a more detailed
> comment?

Yes, you're right.  If both datums are correct, we don't need that
check.  Will remove.

> 
>> +            /* No such atom in 'a' or it's already marked for removal. */
>> +            continue;
>>          }
>> +        idx[n_idx++] = ai;
>>      }
>> -    if (changed) {
>> -        ovsdb_datum_sort_assert(a, a_type->key.type);
>> +    if (!n_idx) {
>> +        free(idx);
>> +        return;
>>      }
>> +
>> +    struct ovsdb_datum result;
>> +
>> +    ovsdb_datum_init_empty(&result);
>> +    ovsdb_datum_reallocate(&result, a_type, a->n - n_idx);
>> +
>> +    for (i = 0; i < n_idx; i++) {
> 
> Why don't you destroy the atoms in-place in the loop above (bi = 0; bi <
> b->n; bi++). Is it because you have to ovsdb_datum_find() each time? If
> so, maybe add a comment.

Yes, we're not removing right away because ovsdb_datum_find() will
use them.  Technically, we can change the signature of the search
function and only search starting from the previously found value,
but this unlikely to give a significant performance improvement
to justify yet another API change all over the codebase.
So I decided to just split the loop instead.  I'll add a comment.

> 
>> +        ai = idx[i];
>> +
>> +        /* Destroying atom. */
>> +        ovsdb_atom_destroy(&a->keys[ai], a_type->key.type);
>> +        if (a_type->value.type != OVSDB_TYPE_VOID) {
>> +            ovsdb_atom_destroy(&a->values[ai], a_type->value.type);
>> +        }
>> +
>> +        /* Copy non-removed atoms from 'a' to result. */
>> +        unsigned int start_idx = (i > 0) ? idx[i - 1] + 1 : 0;
> 
> It won't save many cycles but could you initialize 'start_idx' to 0
> outside the loop and set it to 'idx[i] + 1' after the statement below?

I'll move the declaration of 'start_idx' out of the loop, but I
prefer keeping the condition as is.  For me it looks more readable
this way.  If you have value update after the usage, it's a bit
misleading, and makes you look around to check what was the previous
value.

> 
>> +        ovsdb_datum_push_unsafe(&result, a, start_idx, ai - start_idx, 
>> a_type);
>> +    }
>> +    /* Copying remaining atoms. */
>> +    ovsdb_datum_push_unsafe(&result, a, idx[n_idx - 1] + 1,
>> +                            a->n - (idx[n_idx - 1] + 1), a_type);
> 
> You could also use 'start_idx' here ^ if you take my suggestion above
> which would make the code a bit easier to read IMO.

I'll use 'start_idx' here.  Good point.

> 
>> +    a->n = 0;
>> +
>> +    ovsdb_datum_swap(&result, a);
>> +    ovsdb_datum_destroy(&result, a_type);
>> +    free(idx);
>>  }
>>  
>>  struct ovsdb_symbol_table *
>>
> 
> I also did some performance testing on this and it seems to show an
> improvement! Good job!
> 

Thanks!

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to