On 4/6/23 10:21, Dumitru Ceara wrote:
> On 3/27/23 21:43, Ilya Maximets wrote:
>> Change sets in OVSDB monitor are storing all the changes that happened
>> between a particular transaction ID and now.  Initial change set
>> basically contains all the data.
>>
>> On each monitor request a new initial change set is created by creating
>> an empty change set and adding all the database rows.  Then it is
>> converted into JSON reply and immediately untracked and destroyed.
>>
>> This is causing significant performance issues if many clients are
>> requesting new monitors at the same time.  For example, that is
>> happening after database schema conversion, because conversion triggers
>> cancellation of all monitors.  After cancellation, every client sends
>> a new monitor request.  The server then creates a new initial change
>> set, sends a reply, destroys initial change set and repeats that for
>> each client.  On a system with 200 MB database and 500 clients,
>> cluster of 3 servers spends 20 minutes replying to all the clients
>> (200 MB x 500 = 100 GB):
>>
>>   timeval|WARN|Unreasonably long 1201525ms poll interval
>>
>> Of course, all the clients are already disconnected due to inactivity
>> at this point.  When they are re-connecting back, server accepts new
>> connections one at a time, so inactivity probes will not be triggered
>> anymore, but it still takes another 20 minutes to handle all the
>> incoming connections.
>>
>> Let's keep the initial change set around for as long as the monitor
>> itself exists.  This will allow us to not construct a new change set
>> on each new monitor request and even utilize the JSON cache in some
>> cases.  All that at a relatively small maintenance cost, since we'll
>> need to commit changes to one extra change set on every transaction.
>> Measured memory usage increase due to keeping around a shallow copy
>> of a database is about 10%.  Measured CPU usage difference during
>> normal operation is negligible.
> 
> I'm sure I'm missing something but I'm seeing this when triggering a
> schema conversion in an OVN sandbox (clustered NB):
> 
> parse_txn()
> -> update_schema()
>    -> ovsdb_replace()
>       -> ovsdb_monitor_prereplace_db()
>          // Here we iterate and delete each jsonrpc_monitor
>          // associated with the db.
>          // Deleting the last one will trigger the monitor to be
>          // deleted too:
>          -> ovsdb_jsonrpc_monitor_destroy()
>             -> ovsdb_monitor_remove_jsonrpc_monitor()
>                -> ovsdb_monitor_destroy()
> 
> At this point the initial change set gets destroyed too.  And, AFAICT,
> this is before the clients reconnected.  This confuses me, I guess; how
> come your change works? :)

This is a correct series of events.  The database schema changed, so
the structure of the change set has to be modified.  There is no
simple way to do that (we'll need an equivalent of ovsdb_convert() to
do that), so the change set is destroyed.

When the very first connection is established for the converted database,
the new initial change set will be created from scratch.  But it will
stay and will be re-used for each subsequent connection afterwards.
That's the goal of this patch.

Does that make sense?

Best regards, Ilya Maximets.

> 
>>
>> With this change it takes only 30 seconds to send out all the monitor
>> replies in the example above.  So, it's a 40x performance improvement.
>> On a more reasonable setup with 250 nodes, the process takes up to
>> 8-10 seconds instead of 4-5 minutes.
>>
>> Conditional monitoring will benefit from this change as well, however
>> results might be less impressive due to lack of JSON cache.
>>
>> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
>> ---
>>  ovsdb/monitor.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/ovsdb/monitor.c b/ovsdb/monitor.c
>> index 191befcae..3cdd03b20 100644
>> --- a/ovsdb/monitor.c
>> +++ b/ovsdb/monitor.c
>> @@ -609,7 +609,10 @@ ovsdb_monitor_untrack_change_set(struct ovsdb_monitor 
>> *dbmon,
>>      ovs_assert(mcs);
>>      if (--mcs->n_refs == 0) {
>>          if (mcs == dbmon->init_change_set) {
>> -            dbmon->init_change_set = NULL;
>> +            /* The initial change set should exist as long as the
>> +             * monitor itself. */
>> +            mcs->n_refs++;
>> +            return;
>>          } else if (mcs == dbmon->new_change_set) {
>>              dbmon->new_change_set = NULL;
>>          }
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to