On 11/2/21 17:34, Dumitru Ceara wrote:
> On 10/14/21 4:58 PM, Ilya Maximets wrote:
>> If user frequently changes a lot of rows in a database, transaction
>> history could grow way larger than the database itself.  This wastes
>> a lot of memory and also makes monitor_cond_since slower than
>> usual monotor_cond if the transaction id is old enough, because
>> re-construction of the changes from a history is slower than just
>> creation of initial database snapshot.  This is also the case if
>> user deleted a lot of data, so transaction history still holds all of
>> it while the database itself doesn't.
>>
>> In case of current lb-per-service model in ovn-kubernetes, each
>> load-balancer is added to every logical switch/router.  Such a
>> transaction touches more than a half of a OVN_Northbound database.
> 
> Hi Ilya
> 
> This is also being fixed in ovn-kubernetes and load balancer groups will
> be used instead; this will avoid generating such inefficient
> transactions.  Nevertheless, limiting transaction history size makes
> sense in general.
> 
> I have a tiny comment below; with it addressed:
> 
> Acked-by: Dumitru Ceara <[email protected]>
> 
> Thanks,
> Dumitru
> 
>> And each of these transactions is added to the transaction history.
>> Since transaction history depth is 100, in worst case scenario,
>> it will hold 100 copies of a database increasing memory consumption
>> dramatically.  In tests with 3000 LBs and 120 LSs, memory goes up
>> to 3 GB, while holding at 30 MB if transaction history disabled in
>> the code.
>>
>> Fixing that by keeping count of the number of ovsdb_atom's in the
>> database and not allowing the total number of atoms in transaction
>> history to grow larger than this value.  Counting atoms is fairly
>> cheap because we don't need to iterate over them, so it doesn't have
>> significant performance impact.  It would be ideal to measure the
>> size of individual atoms, but that will hit the performance.
>> Counting cells instead of atoms is not sufficient, because OVN
>> users are adding hundreds or thousands of atoms to a single cell,
>> so they are largely different in size.
>>
>> Signed-off-by: Ilya Maximets <[email protected]>
>> ---
>>  ovsdb/ovsdb.c       |  5 ++++
>>  ovsdb/ovsdb.h       |  3 +++
>>  ovsdb/transaction.c | 56 +++++++++++++++++++++++++++++++++++++++++++--
>>  3 files changed, 62 insertions(+), 2 deletions(-)
>>
>> diff --git a/ovsdb/ovsdb.c b/ovsdb/ovsdb.c
>> index 126d16a2f..e6d866182 100644
>> --- a/ovsdb/ovsdb.c
>> +++ b/ovsdb/ovsdb.c
>> @@ -422,6 +422,8 @@ ovsdb_create(struct ovsdb_schema *schema, struct 
>> ovsdb_storage *storage)
>>      ovs_list_init(&db->triggers);
>>      db->run_triggers_now = db->run_triggers = false;
>>  
>> +    db->n_atoms = 0;
>> +
>>      db->is_relay = false;
>>      ovs_list_init(&db->txn_forward_new);
>>      hmap_init(&db->txn_forward_sent);
>> @@ -518,6 +520,9 @@ ovsdb_get_memory_usage(const struct ovsdb *db, struct 
>> simap *usage)
>>      }
>>  
>>      simap_increase(usage, "cells", cells);
>> +    simap_increase(usage, "atoms", db->n_atoms);
>> +    simap_increase(usage, "txn-history", db->n_txn_history);
>> +    simap_increase(usage, "txn-history-atoms", db->n_txn_history_atoms);
>>  
>>      if (db->storage) {
>>          ovsdb_storage_get_memory_usage(db->storage, usage);
>> diff --git a/ovsdb/ovsdb.h b/ovsdb/ovsdb.h
>> index 4a7bd0f0e..ec2d235ec 100644
>> --- a/ovsdb/ovsdb.h
>> +++ b/ovsdb/ovsdb.h
>> @@ -90,8 +90,11 @@ struct ovsdb {
>>      /* History trasanctions for incremental monitor transfer. */
>>      bool need_txn_history;     /* Need to maintain history of transactions. 
>> */
>>      unsigned int n_txn_history; /* Current number of history transactions. 
>> */
>> +    unsigned int n_txn_history_atoms; /* Total number of atoms in history. 
>> */
>>      struct ovs_list txn_history; /* Contains "struct 
>> ovsdb_txn_history_node. */
>>  
>> +    size_t n_atoms;  /* Total number of ovsdb atoms in the database. */
>> +
>>      /* Relay mode. */
>>      bool is_relay;  /* True, if database is in relay mode. */
>>      /* List that holds transactions waiting to be forwarded to the server. 
>> */
>> diff --git a/ovsdb/transaction.c b/ovsdb/transaction.c
>> index dcccc61c0..a7ca09b58 100644
>> --- a/ovsdb/transaction.c
>> +++ b/ovsdb/transaction.c
>> @@ -41,6 +41,9 @@ struct ovsdb_txn {
>>      struct ovs_list txn_tables; /* Contains "struct ovsdb_txn_table"s. */
>>      struct ds comment;
>>      struct uuid txnid; /* For clustered mode only. It is the eid. */
>> +    size_t n_atoms;    /* Number of atoms in all transaction rows. */
>> +    ssize_t n_atoms_diff;  /* Difference between number of added and
>> +                            * removed atoms. */
>>  };
>>  
>>  /* A table modified by a transaction. */
>> @@ -842,6 +845,37 @@ check_index_uniqueness(struct ovsdb_txn *txn OVS_UNUSED,
>>      return NULL;
>>  }
>>  
>> +static struct ovsdb_error * OVS_WARN_UNUSED_RESULT
>> +count_atoms(struct ovsdb_txn *txn, struct ovsdb_txn_row *txn_row)
>> +{
>> +    struct ovsdb_table *table = txn_row->table;
>> +    ssize_t n_atoms_old = 0, n_atoms_new = 0;
>> +    struct shash_node *node;
>> +
>> +    SHASH_FOR_EACH (node, &table->schema->columns) {
>> +        const struct ovsdb_column *column = node->data;
>> +        const struct ovsdb_type *type = &column->type;
>> +        unsigned int idx = column->index;
>> +
>> +        if (txn_row->old) {
>> +            n_atoms_old += txn_row->old->fields[idx].n;
>> +            if (type->value.type != OVSDB_TYPE_VOID) {
>> +                n_atoms_old += txn_row->old->fields[idx].n;
>> +            }
>> +        }
>> +        if (txn_row->new) {
>> +            n_atoms_new += txn_row->new->fields[idx].n;
>> +            if (type->value.type != OVSDB_TYPE_VOID) {
>> +                n_atoms_new += txn_row->new->fields[idx].n;
>> +            }
>> +        }
>> +    }
>> +
>> +    txn->n_atoms += n_atoms_old + n_atoms_new;
>> +    txn->n_atoms_diff += n_atoms_new - n_atoms_old;
>> +    return NULL;
>> +}
>> +
>>  static struct ovsdb_error * OVS_WARN_UNUSED_RESULT
>>  update_version(struct ovsdb_txn *txn OVS_UNUSED, struct ovsdb_txn_row 
>> *txn_row)
>>  {
>> @@ -910,6 +944,12 @@ ovsdb_txn_precommit(struct ovsdb_txn *txn)
>>          return error;
>>      }
>>  
>> +    /* Count atoms. */
>> +    error = for_each_txn_row(txn, count_atoms);
>> +    if (error) {
>> +        return error;
> 
> This should never happen, maybe it's better to change this to something
> similar to what we do for update_version:
> 
> return OVSDB_WRAP_BUG("can't happen", error);
> 

Makes sense.  Thanks!

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to