On 12/06/2021 03:00, Ilya Maximets wrote: > It might be important for clients to know that relay lost connection > with the relay remote, so they could re-connect to other relay.
Yeah this makes sense. I guess there are some deployment scenarios we should think about. For example: * Are there any special considerations for upgrades? The relays seem to be quite ephemeral and could be stopped/started without much risk? * What about raft leader elections? * I suppose there is a risk that inconsistencies could build up between relays in case of error? * Everything will be eventually consistent but I guess that is OK for OVN. I don't know a whole lot about this so I am just prompting you to see have you considered these? > > Signed-off-by: Ilya Maximets <[email protected]> > --- > ovsdb/_server.xml | 17 +++++++++-------- > ovsdb/ovsdb-server.c | 3 ++- > ovsdb/relay.c | 34 ++++++++++++++++++++++++++++++++++ > ovsdb/relay.h | 4 ++++ > 4 files changed, 49 insertions(+), 9 deletions(-) > > diff --git a/ovsdb/_server.xml b/ovsdb/_server.xml > index 414be6715..b4606b25b 100644 > --- a/ovsdb/_server.xml > +++ b/ovsdb/_server.xml > @@ -70,6 +70,15 @@ > case of a relay database - until it connects to the relay source. > </column> > > + <column name="connected"> > + True if the database is connected to its storage. A standalone > database > + is always connected. A clustered database is connected if the server > is > + in contact with a majority of its cluster. A relay database is > connected > + if the server is in contact with the relay source, i.e. is connected to > + the server it syncs from. An unconnected database cannot be modified > and > + its data might be unavailable or stale. > + </column> > + > <group title="Clustered Databases"> > <p> > These columns are most interesting and in some cases only relevant > for > @@ -77,14 +86,6 @@ > column is <code>clustered</code>. > </p> > > - <column name="connected"> > - True if the database is connected to its storage. A standalone or > - active-backup database is always connected. A clustered database is > - connected if the server is in contact with a majority of its cluster. > - An unconnected database cannot be modified and its data might be > - unavailable or stale. > - </column> > - > <column name="leader"> > True if the database is the leader in its cluster. For a standalone > or > active-backup database, this is always true. Always false for relay. > diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c > index 77b1fbe40..cdd6cf7fd 100644 > --- a/ovsdb/ovsdb-server.c > +++ b/ovsdb/ovsdb-server.c > @@ -1190,7 +1190,8 @@ update_database_status(struct ovsdb_row *row, struct db > *db) > ovsdb_util_write_string_column(row, "model", > db->db->is_relay ? "relay" : > ovsdb_storage_get_model(db->db->storage)); > ovsdb_util_write_bool_column(row, "connected", > - > ovsdb_storage_is_connected(db->db->storage)); > + db->db->is_relay ? ovsdb_relay_is_connected(db->db) > + : ovsdb_storage_is_connected(db->db->storage)); > ovsdb_util_write_bool_column(row, "leader", > db->db->is_relay ? false : ovsdb_storage_is_leader(db->db->storage)); > ovsdb_util_write_uuid_column(row, "cid", > diff --git a/ovsdb/relay.c b/ovsdb/relay.c > index ef689c649..4a8f5c206 100644 > --- a/ovsdb/relay.c > +++ b/ovsdb/relay.c > @@ -31,6 +31,7 @@ > #include "ovsdb-error.h" > #include "row.h" > #include "table.h" > +#include "timeval.h" > #include "transaction.h" > #include "transaction-forward.h" > #include "util.h" > @@ -47,8 +48,36 @@ struct relay_ctx { > struct ovsdb_schema *new_schema; > schema_change_callback schema_change_cb; > void *schema_change_aux; > + > + long long int last_connected; > }; > > +#define RELAY_MAX_RECONNECTION_MS 30000 > + > +/* Reports if the database connected to the relay source and functional, i.e. s/database connected/database is connected/ > + * it actively monitors the source and is able to forward transactions. */ > +bool > +ovsdb_relay_is_connected(struct ovsdb *db) > +{ > + struct relay_ctx *ctx = shash_find_data(&relay_dbs, db->name); > + > + if (!ctx || !ovsdb_cs_is_alive(ctx->cs)) { > + return false; > + } > + > + if (ovsdb_cs_may_send_transaction(ctx->cs)) { > + return true; > + } > + > + /* Trying to avoid connection state flapping by delaying report for > + * upper layer and giving ovsdb-cs some time to reconnect. */ > + if (time_msec() - ctx->last_connected < RELAY_MAX_RECONNECTION_MS) { > + return true; > + } > + > + return false; > +} > + > static struct json * > ovsdb_relay_compose_monitor_request(const struct json *schema_json, void > *ctx_) > { > @@ -118,6 +147,7 @@ ovsdb_relay_add_db(struct ovsdb *db, const char *remote, > ctx->schema_change_aux = schema_change_aux; > ctx->db = db; > ctx->cs = ovsdb_cs_create(db->name, 3, &relay_cs_ops, ctx); > + ctx->last_connected = 0; > shash_add(&relay_dbs, db->name, ctx); > ovsdb_cs_set_leader_only(ctx->cs, false); > ovsdb_cs_set_remote(ctx->cs, remote, true); > @@ -302,6 +332,10 @@ ovsdb_relay_run(void) > ovsdb_txn_forward_run(ctx->db, ctx->cs); > ovsdb_cs_run(ctx->cs, &events); > > + if (ovsdb_cs_may_send_transaction(ctx->cs)) { > + ctx->last_connected = time_msec(); > + } > +> struct ovsdb_cs_event *event; > LIST_FOR_EACH_POP (event, list_node, &events) { > if (!ctx->db) { > diff --git a/ovsdb/relay.h b/ovsdb/relay.h > index 68586e9db..390ea70c8 100644 > --- a/ovsdb/relay.h > +++ b/ovsdb/relay.h > @@ -17,6 +17,8 @@ > #ifndef OVSDB_RELAY_H > #define OVSDB_RELAY_H 1 > > +#include <stdbool.h> > + > struct json; > struct ovsdb; > struct ovsdb_schema; > @@ -31,4 +33,6 @@ void ovsdb_relay_del_db(struct ovsdb *); > void ovsdb_relay_run(void); > void ovsdb_relay_wait(void); > > +bool ovsdb_relay_is_connected(struct ovsdb *); > + > #endif /* OVSDB_RELAY_H */ > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
