On 7/2/21 12:48 PM, Mark Gray wrote:
> On 12/06/2021 03:00, Ilya Maximets wrote:
>> It might be important for clients to know that relay lost connection
>> with the relay remote, so they could re-connect to other relay.
> 
> Yeah this makes sense. I guess there are some deployment scenarios we
> should think about. For example:
> 
> * Are there any special considerations for upgrades? The relays seem to
> be quite ephemeral and could be stopped/started without much risk?

It should be easy to just stop them and start new ones, e.g.
scale up and down according to the current load.  So, upgrades
should not be a problem.  They also able to update database
schema on the fly, if needed.

> * What about raft leader elections?

Relays are not using 'leader-only', so they can connect to any
raft member.  Hence, elections should not be a problem.  If one of
the raft members will fell out of the cluster, relay will just
re-connect to a different server.  We may also extend relays in
the future to support leader-only connection, if needed.

> * I suppose there is a risk that inconsistencies could build up between
> relays in case of error?

If data inconsistency will be detected by relay, it will flag it,
re-connect, and download a fresh copy of a database.  If servers
are sending different consistent data to different relays, well..
we can't really do anything about this.  Something must be seriously
wrong with main servers.

> * Everything will be eventually consistent but I guess that is OK for OVN.

Yep.  All relays are getting the same updates, but may be in a
slightly different time.  The same is true for raft members, so
it should be fine for OVN.

> 
> I don't know a whole lot about this so I am just prompting you to see
> have you considered these?

Sure, I thought about this.  And these are good questions to ask.

> 
>>
>> Signed-off-by: Ilya Maximets <[email protected]>
>> ---
>>  ovsdb/_server.xml    | 17 +++++++++--------
>>  ovsdb/ovsdb-server.c |  3 ++-
>>  ovsdb/relay.c        | 34 ++++++++++++++++++++++++++++++++++
>>  ovsdb/relay.h        |  4 ++++
>>  4 files changed, 49 insertions(+), 9 deletions(-)
>>
>> diff --git a/ovsdb/_server.xml b/ovsdb/_server.xml
>> index 414be6715..b4606b25b 100644
>> --- a/ovsdb/_server.xml
>> +++ b/ovsdb/_server.xml
>> @@ -70,6 +70,15 @@
>>        case of a relay database - until it connects to the relay source.
>>      </column>
>>  
>> +    <column name="connected">
>> +      True if the database is connected to its storage.  A standalone 
>> database
>> +      is always connected.  A clustered database is connected if the server 
>> is
>> +      in contact with a majority of its cluster.  A relay database is 
>> connected
>> +      if the server is in contact with the relay source, i.e. is connected 
>> to
>> +      the server it syncs from.  An unconnected database cannot be modified 
>> and
>> +      its data might be unavailable or stale.
>> +    </column>
>> +
>>      <group title="Clustered Databases">
>>        <p>
>>          These columns are most interesting and in some cases only relevant 
>> for
>> @@ -77,14 +86,6 @@
>>          column is <code>clustered</code>.
>>        </p>
>>  
>> -      <column name="connected">
>> -        True if the database is connected to its storage.  A standalone or
>> -        active-backup database is always connected.  A clustered database is
>> -        connected if the server is in contact with a majority of its 
>> cluster.
>> -        An unconnected database cannot be modified and its data might be
>> -        unavailable or stale.
>> -      </column>
>> -
>>        <column name="leader">
>>          True if the database is the leader in its cluster.  For a 
>> standalone or
>>          active-backup database, this is always true.  Always false for 
>> relay.
>> diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
>> index 77b1fbe40..cdd6cf7fd 100644
>> --- a/ovsdb/ovsdb-server.c
>> +++ b/ovsdb/ovsdb-server.c
>> @@ -1190,7 +1190,8 @@ update_database_status(struct ovsdb_row *row, struct 
>> db *db)
>>      ovsdb_util_write_string_column(row, "model",
>>          db->db->is_relay ? "relay" : 
>> ovsdb_storage_get_model(db->db->storage));
>>      ovsdb_util_write_bool_column(row, "connected",
>> -                                 
>> ovsdb_storage_is_connected(db->db->storage));
>> +        db->db->is_relay ? ovsdb_relay_is_connected(db->db)
>> +                         : ovsdb_storage_is_connected(db->db->storage));
>>      ovsdb_util_write_bool_column(row, "leader",
>>          db->db->is_relay ? false : 
>> ovsdb_storage_is_leader(db->db->storage));
>>      ovsdb_util_write_uuid_column(row, "cid",
>> diff --git a/ovsdb/relay.c b/ovsdb/relay.c
>> index ef689c649..4a8f5c206 100644
>> --- a/ovsdb/relay.c
>> +++ b/ovsdb/relay.c
>> @@ -31,6 +31,7 @@
>>  #include "ovsdb-error.h"
>>  #include "row.h"
>>  #include "table.h"
>> +#include "timeval.h"
>>  #include "transaction.h"
>>  #include "transaction-forward.h"
>>  #include "util.h"
>> @@ -47,8 +48,36 @@ struct relay_ctx {
>>      struct ovsdb_schema *new_schema;
>>      schema_change_callback schema_change_cb;
>>      void *schema_change_aux;
>> +
>> +    long long int last_connected;
>>  };
>>  
>> +#define RELAY_MAX_RECONNECTION_MS 30000
>> +
>> +/* Reports if the database connected to the relay source and functional, 
>> i.e.
> 
> s/database connected/database is connected/

OK.

> 
>> + * it actively monitors the source and is able to forward transactions.  */
>> +bool
>> +ovsdb_relay_is_connected(struct ovsdb *db)
>> +{
>> +    struct relay_ctx *ctx = shash_find_data(&relay_dbs, db->name);
>> +
>> +    if (!ctx || !ovsdb_cs_is_alive(ctx->cs)) {
>> +        return false;
>> +    }
>> +
>> +    if (ovsdb_cs_may_send_transaction(ctx->cs)) {
>> +        return true;
>> +    }
>> +
>> +    /* Trying to avoid connection state flapping by delaying report for
>> +     * upper layer and giving ovsdb-cs some time to reconnect. */
>> +    if (time_msec() - ctx->last_connected < RELAY_MAX_RECONNECTION_MS) {
>> +        return true;
>> +    }
>> +
>> +    return false;
>> +}
>> +
>>  static struct json *
>>  ovsdb_relay_compose_monitor_request(const struct json *schema_json, void 
>> *ctx_)
>>  {
>> @@ -118,6 +147,7 @@ ovsdb_relay_add_db(struct ovsdb *db, const char *remote,
>>      ctx->schema_change_aux = schema_change_aux;
>>      ctx->db = db;
>>      ctx->cs = ovsdb_cs_create(db->name, 3, &relay_cs_ops, ctx);
>> +    ctx->last_connected = 0;
>>      shash_add(&relay_dbs, db->name, ctx);
>>      ovsdb_cs_set_leader_only(ctx->cs, false);
>>      ovsdb_cs_set_remote(ctx->cs, remote, true);
>> @@ -302,6 +332,10 @@ ovsdb_relay_run(void)
>>          ovsdb_txn_forward_run(ctx->db, ctx->cs);
>>          ovsdb_cs_run(ctx->cs, &events);
>>  
>> +        if (ovsdb_cs_may_send_transaction(ctx->cs)) {
>> +            ctx->last_connected = time_msec();
>> +        }
>> +>          struct ovsdb_cs_event *event;
>>          LIST_FOR_EACH_POP (event, list_node, &events) {
>>              if (!ctx->db) {
>> diff --git a/ovsdb/relay.h b/ovsdb/relay.h
>> index 68586e9db..390ea70c8 100644
>> --- a/ovsdb/relay.h
>> +++ b/ovsdb/relay.h
>> @@ -17,6 +17,8 @@
>>  #ifndef OVSDB_RELAY_H
>>  #define OVSDB_RELAY_H 1
>>  
>> +#include <stdbool.h>
>> +
>>  struct json;
>>  struct ovsdb;
>>  struct ovsdb_schema;
>> @@ -31,4 +33,6 @@ void ovsdb_relay_del_db(struct ovsdb *);
>>  void ovsdb_relay_run(void);
>>  void ovsdb_relay_wait(void);
>>  
>> +bool ovsdb_relay_is_connected(struct ovsdb *);
>> +
>>  #endif /* OVSDB_RELAY_H */
>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to