On 12/06/2021 03:00, Ilya Maximets wrote:
> Main documentation for the service model and tutorial with the use case
> and configuration examples.
>
> Signed-off-by: Ilya Maximets <[email protected]>
> ---
>  Documentation/automake.mk            |   1 +
>  Documentation/ref/ovsdb.7.rst        |  62 ++++++++++++--
>  Documentation/topics/index.rst       |   1 +
>  Documentation/topics/ovsdb-relay.rst | 124 +++++++++++++++++++++++++++
>  NEWS                                 |   3 +
>  ovsdb/ovsdb-server.1.in              |  27 +++---
>  6 files changed, 200 insertions(+), 18 deletions(-)
>  create mode 100644 Documentation/topics/ovsdb-relay.rst
>
> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
> index bc30f94c5..213d9c867 100644
> --- a/Documentation/automake.mk
> +++ b/Documentation/automake.mk
> @@ -52,6 +52,7 @@ DOC_SOURCE = \
>      Documentation/topics/networking-namespaces.rst \
>      Documentation/topics/openflow.rst \
>      Documentation/topics/ovs-extensions.rst \
> +    Documentation/topics/ovsdb-relay.rst \
>      Documentation/topics/ovsdb-replication.rst \
>      Documentation/topics/porting.rst \
>      Documentation/topics/record-replay.rst \
> diff --git a/Documentation/ref/ovsdb.7.rst b/Documentation/ref/ovsdb.7.rst
> index e4f1bf766..a5b8a9c33 100644
> --- a/Documentation/ref/ovsdb.7.rst
> +++ b/Documentation/ref/ovsdb.7.rst
> @@ -121,13 +121,14 @@ schema checksum from a schema or database file, respectively.
>  Service Models
>  ==============
>
> -OVSDB supports three service models for databases: **standalone**,
> -**active-backup**, and **clustered**. The service models provide different
> -compromises among consistency, availability, and partition tolerance. They
> -also differ in the number of servers required and in terms of performance. The
> -standalone and active-backup database service models share one on-disk format,
> -and clustered databases use a different format, but the OVSDB programs work
> -with both formats. ``ovsdb(5)`` documents these file formats.
> +OVSDB supports four service models for databases: **standalone**,
> +**active-backup**, **relay** and **clustered**. The service models provide
> +different compromises among consistency, availability, and partition tolerance.
> +They also differ in the number of servers required and in terms of performance.
> +The standalone and active-backup database service models share one on-disk
> +format, and clustered databases use a different format, but the OVSDB programs
> +work with both formats. ``ovsdb(5)`` documents these file formats. Relay
> +databases has no on-disk storage.
s/has/have

>
>  RFC 7047, which specifies the OVSDB protocol, does not mandate or specify
>  any particular service model.
> @@ -406,6 +407,50 @@ following consequences:
>    that the client previously read. The OVSDB client library in Open vSwitch
>    uses this feature to avoid servers with stale data.
>
> +Relay Service Model
> +-------------------
> +
> +A **relay** database is a way to scale out read-mostly access to the
> +existing database working in any service model including relay.
> +
> +Relay database creates and maintains an OVSDB connection with other OVSDB

s/other/another

> +server. It uses this connection to maintain in-memory copy of the remote

s/maintain/maintain an/

> +database (a.k.a. the ``relay source``) keeping the copy up-to-date as the
> +database content changes on relay source in the real time.

s/on/on the/

> +
> +The purpose of relay server is to scale out the number of database clients.
> +Read-only transactions and monitor requests are fully handled by the relay
> +server itself. For the transactions that requests database modifications,
> +relay works as a proxy between the client and the relay source, i.e. it
> +forwards transactions and replies between them.
> +
> +Compared to a clustered and active-backup models, relay service model provides
> +read and write access to the database similarly to a clustered database (and
> +even more scalable), but with generally insignificant performance overhead of
> +an active-backup model. At the same time it doesn't increase availability that
> +needs to be covered by the service model of the relay source.
> +
> +Relay database has no on-disk storage and therefore cannot be converted to
> +any other service model.
> +
> +If there is already a database started in any service model, to start a relay
> +database server use ``ovsdb-server relay:<DB_NAME>:<relay source>``, where
> +``<DB_NAME>`` is the database name as specified in the schema of the database
> +that existing server runs, and ``<relay source>`` is an OVSDB connection method
> +(see `Connection Methods`_ below) that connects to the existing database
> +server. ``<relay source>`` could contain a comma-separated list of connection
> +methods, e.g. to connect to any server of the clustered database.
> +Multiple relay servers could be started for the same relay source.
> +
> +Since the way how relay handles read and write transactions is very similar

s/the way how relay handles/the way relays handle/

> +to the clustered model where "cluster" means "set or relay servers connected

Do you mean "set of" here?

> +to the same relay source", "follower" means "relay server" and the "leader"
> +means "relay source", same consistency consequences as for the clustered
> +model applies to relay as well (See `Understanding Cluster Consistency`_
> +above).
> +
> +Open vSwitch 2.16 introduced support for relay service model.
> +
>  Database Replication
>  ====================
>
> @@ -414,7 +459,8 @@ Replication, in this context, means to make, and keep up-to-date, a read-only
>  copy of the contents of a database (the ``replica``). One use of replication
>  is to keep an up-to-date backup of a database. A replica used solely for
>  backup would not need to support clients of its own. A set of replicas that do
> -serve clients could be used to scale out read access to the primary database.
> +serve clients could be used to scale out read access to the primary database,
> +however `Relay Service Model`_ is more suitable for that purpose.
>
>  A database replica is set up in the same way as a backup server in an
>  active-backup pair, with the difference that the replica is never promoted to
> diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst
> index 0036567eb..d8ccbd757 100644
> --- a/Documentation/topics/index.rst
> +++ b/Documentation/topics/index.rst
> @@ -44,6 +44,7 @@ OVS
>     openflow
>     bonding
>     networking-namespaces
> +   ovsdb-relay
>     ovsdb-replication
>     dpdk/index
>     windows
> diff --git a/Documentation/topics/ovsdb-relay.rst b/Documentation/topics/ovsdb-relay.rst
> new file mode 100644
> index 000000000..40d294c55
> --- /dev/null
> +++ b/Documentation/topics/ovsdb-relay.rst
> @@ -0,0 +1,124 @@
> +..
> +      Copyright 2021, Red Hat, Inc.
> +
> +      Licensed under the Apache License, Version 2.0 (the "License"); you may
> +      not use this file except in compliance with the License. You may obtain
> +      a copy of the License at
> +
> +          http://www.apache.org/licenses/LICENSE-2.0
> +
> +      Unless required by applicable law or agreed to in writing, software
> +      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
> +      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
> +      License for the specific language governing permissions and limitations
> +      under the License.
> +
> +      Convention for heading levels in Open vSwitch documentation:
> +
> +      =======  Heading 0 (reserved for the title in a document)
> +      -------  Heading 1
> +      ~~~~~~~  Heading 2
> +      +++++++  Heading 3
> +      '''''''  Heading 4
> +
> +      Avoid deeper levels because they do not render well.
> +
> +===============================
> +Scaling OVSDB Access With Relay
> +===============================
> +
> +Open vSwitch 2.16 introduced support for OVSDB Relay mode with the goal to
> +increase database scalability for a big deployments. Mainly, OVN (Open Virtual
> +Network) Southbound Database deployments. This document describes the main
> +concept and provides the configuration examples.
> +
> +What is OVSDB Relay?
> +--------------------
> +
> +Relay is a database service model in which one ``ovsdb-server`` (``relay``)
> +connects to another standalone or clustered database server
> +(``relay source``) and maintains in-memory copy of its data, receiving
> +all the updates via this OVSDB connection. Relay server handles all the
> +read-only requests (monitors and transactions) on its own and forwards all the
> +transactions that requires database modifications to the relay source.

s/that requires/that require/

> +
> +Why is this needed?
> +-------------------
> +
> +Some OVN deployment could have hundreds or even thousands nodes, on each of

s/nodes,/of nodes. On/

> +these nodes there is an ovn-controller, which is connected to the
> +OVN_Southbound database that is served by a standalone or clustered OVSDB.
> +Standalone database is handled by a single ovsdb-server process and clustered
> +could consist of 3 to 5 ovsdb-server processes. For the clustered database,
> +higher number of servers may significantly increase transaction latency due
> +to necessity for these servers to reach consensus. So, in the end limited
> +number of ovsdb-server processes serves ever growing number of clients and this
> +leads to performance issues.
> +
> +Read-only access could be scaled up with OVSDB replication on top of
> +active-backup service model, but ovn-controller is a read-mostly client, not
> +a read-only, i.e. it needs to execute write transactions from time to time.
> +Here relay service model comes into play.
> +
> +2-Tier Deployment
> +-----------------
> +
> +Solution for the scaling issue could look like a 2-tier deployment, where
> +a set of relay servers is connected to the main database cluster
> +(OVN_Southbound) and clients (ovn-conrtoller) connected to these relay
> +servers::
> +
> +                                      172.16.0.1
> +    +--------------------+   +----+ ovsdb-relay-1 +--+---+ client-1
> +    |                    |   |                       |
> +    |     Clustered      |   |                       +---+ client-2
> +    |      Database      |   |                        ...
> +    |                    |   |                       +---+ client-N
> +    |      10.0.0.2      |   |
> +    |   ovsdb-server-2   |   |        172.16.0.2
> +    |   +        +       |   +----+ ovsdb-relay-2 +--+---+ client-N+1
> +    |   |        |       |   |                       |
> +    |   |        +       +---+                       +---+ client-N+2
> +    |   |    10.0.0.1    |   |                        ...
> +    |   | ovsdb-server-1 |   |                       +---+ client-2N
> +    |   |        +       |   |
> +    |   |        |       |   |
> +    |   +        +       |   +      ...  ...  ...  ...  ...
> +    |   ovsdb-server-3   |   |
> +    |      10.0.0.3      |   |                       +---+ client-KN-1
> +    |                    |   |        172.16.0.K     |
> +    +--------------------+   +----+ ovsdb-relay-K +--+---+ client-KN
> +
> +In practice, the picture might look a bit more complex, because all relay
> +servers might connect to any member of a main cluster and clients might
> +connect to any relay server of their choice.
> +
> +Assuming that servers of a main cluster started like this::
> +
> +  $ ovsdb-server --remote=ptcp:10.0.0.1:6642 ovn-sb-1.db
> +
> +The same for other two servers. In this case relay servers could be
> +started like this::
> +
> +  $ REMOTES=tcp:10.0.0.1:6642,tcp:10.0.0.2:6642,tcp:10.0.0.3:6642
> +  $ ovsdb-server --remote=ptcp:172.16.0.1:6642 relay:OVN_Southbound:$REMOTES
> +  $ ...
> +  $ ovsdb-server --remote=ptcp:172.16.0.K:6642 relay:OVN_Southbound:$REMOTES
> +
> +Every relay server could connect to any of the cluster members of their choice,
> +fairness of load distribution is achieved by shuffling remotes.

I guess this assumes a large number of remotes? What I mean here is there is
no mechanism actively shuffling - it is dependent on a large number of clients
connecting to randomly selected nodes?

As relays are meant to be ephemeral, what would happen if we brought one down
for some reason? I presume that all connections would then migrate to the next
client in their list? In this case, it is probably likely that they all have
the same list which would cause them all to propagate to the same relay?

> +
> +For the actual clients, they could be configured to connect to any of the
> +relay servers.
> +For ovn-controllers the configuration could look like this::
> +
> +  $ REMOTES=tcp:172.16.0.1:6642,...,tcp:172.16.0.K:6642
> +  $ ovs-vsctl set Open_vSwitch . external-ids:ovn-remote=$REMOTES
> +
> +Setup like this allows the system to serve ``K * N`` clients while having only
> +``K`` actual connections on the main clustered database keeping it in a
> +stable state.
> +
> +It's also possible to create multi-tier deployments by connecting one set
> +of relay servers to another (smaller) set of relay servers, or even create
> +tree-like structures by the cost of increased latency for write transactions,
> +because they will be forwarded multiple times.
> diff --git a/NEWS b/NEWS
> index ebba17b22..391b0abba 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -1,6 +1,9 @@
>  Post-v2.15.0
>  ---------------------
>     - OVSDB:
> +     * Introduced new database service model - "relay". Targeted to scale out
> +       read-mostly access (ovn-controller) to existing databases.
> +       For more information: ovsdb(7) and Documentation/topics/ovsdb-relay.rst
>       * New command line options --record/--replay for ovsdb-server and
>         ovsdb-client to record and replay all the incoming transactions,
>         monitors, etc. More datails in Documentation/topics/record-replay.rst.
> diff --git a/ovsdb/ovsdb-server.1.in b/ovsdb/ovsdb-server.1.in
> index fdd52e8f6..dac0f02cb 100644
> --- a/ovsdb/ovsdb-server.1.in
> +++ b/ovsdb/ovsdb-server.1.in
> @@ -10,6 +10,7 @@ ovsdb\-server \- Open vSwitch database server
>  .SH SYNOPSIS
>  \fBovsdb\-server\fR
>  [\fIdatabase\fR]\&...
> +[\fIrelay:schema_name:remote\fR]\&...
>  [\fB\-\-remote=\fIremote\fR]\&...
>  [\fB\-\-run=\fIcommand\fR]
>  .so lib/daemon-syn.man
> @@ -35,12 +36,15 @@ For an introduction to OVSDB and its implementation in Open vSwitch,
>  see \fBovsdb\fR(7).
>  .PP
>  Each OVSDB file may be specified on the command line as \fIdatabase\fR.
> -If none is specified, the default is \fB@DBDIR@/conf.db\fR. The database
> -files must already have been created and initialized using, for
> -example, \fBovsdb\-tool\fR's \fBcreate\fR, \fBcreate\-cluster\fR, or
> -\fBjoin\-cluster\fR command.
> +Relay databases may be specified on the command line as
> +\fIrelay:schema_name:remote\fR. For a detailed description of relay database
> +argument, see \fBovsdb\fR(7).
> +If none of database files or relay databases is specified, the default is
> +\fB@DBDIR@/conf.db\fR. The database files must already have been created and
> +initialized using, for example, \fBovsdb\-tool\fR's \fBcreate\fR,
> +\fBcreate\-cluster\fR, or \fBjoin\-cluster\fR command.
>  .PP
> -This OVSDB implementation supports standalone, active-backup, and
> +This OVSDB implementation supports standalone, active-backup, relay and
>  clustered database service models, as well as database replication.
>  See the Service Models section of \fBovsdb\fR(7) for more information.
>  .PP
> @@ -50,7 +54,9 @@ successfully join a cluster (if the database file is freshly created
>  with \fBovsdb\-tool join\-cluster\fR) or connect to a cluster that it
>  has already joined. Use \fBovsdb\-client wait\fR (see
>  \fBovsdb\-client\fR(1)) to wait until the server has successfully
> -joined and connected to a cluster.
> +joined and connected to a cluster. The same is true for relay databases.
> +Same commands could be used to wait for a relay database to connect to
> +the relay source (remote).
>  .PP
>  In addition to user-specified databases, \fBovsdb\-server\fR version
>  2.9 and later also always hosts a built-in database named
> @@ -243,10 +249,11 @@ not list remotes added indirectly because they were read from the
>  database by configuring a
>  \fBdb:\fIdb\fB,\fItable\fB,\fIcolumn\fR remote.
>  .
> -.IP "\fBovsdb\-server/add\-db \fIdatabase\fR"
> -Adds the \fIdatabase\fR to the running \fBovsdb\-server\fR. The database
> -file must already have been created and initialized using, for example,
> -\fBovsdb\-tool create\fR.
> +.IP "\fBovsdb\-server/add\-db \fIdatabase\fR
> +Adds the \fIdatabase\fR to the running \fBovsdb\-server\fR. \fIdatabase\fR
> +could be a database file or a relay description in the following format:
> +\fIrelay:schema_name:remote\fR. The database file must already have been
> +created and initialized using, for example, \fBovsdb\-tool create\fR.
>  .
>  .IP "\fBovsdb\-server/remove\-db \fIdatabase\fR"
>  Removes \fIdatabase\fR from the running \fBovsdb\-server\fR. \fIdatabase\fR

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

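P.S. To make the load-distribution question above (the comment on "shuffling
remotes") a bit more concrete, here is a rough Python sketch. It is only a toy
model, not taken from the patch: it assumes that each client either walks the
configured remote list in order, or shuffles its own private copy of the list
and walks that. The helper names (pick_relay, simulate) are made up for the
illustration, and the relay addresses just follow the 172.16.0.x pattern used
in the tutorial.

```python
import random
from collections import Counter


def pick_relay(order, down):
    """Return the first relay in `order` that is not in the `down` set."""
    for relay in order:
        if relay not in down:
            return relay
    raise RuntimeError("no relay available")


def simulate(num_clients, relays, shuffle, seed=0):
    """Count clients per relay before and after the first relay fails.

    Assumption (hypothetical, not from the patch): with shuffle=True every
    client shuffles its own copy of the remote list; with shuffle=False
    every client walks the list in the configured order.
    """
    rng = random.Random(seed)
    orders = []
    for _ in range(num_clients):
        order = list(relays)
        if shuffle:
            rng.shuffle(order)
        orders.append(order)

    # Initial placement: nothing is down yet.
    before = Counter(pick_relay(order, set()) for order in orders)
    # Take the first configured relay down and let every client re-pick.
    failed = relays[0]
    after = Counter(pick_relay(order, {failed}) for order in orders)
    return before, after


if __name__ == "__main__":
    relays = ["tcp:172.16.0.%d:6642" % i for i in range(1, 6)]
    for shuffle in (False, True):
        before, after = simulate(1000, relays, shuffle)
        print("shuffle=%s" % shuffle)
        print("  before:", dict(before))
        print("  after :", dict(after))
```

In this model, without per-client shuffling all clients sit on the first relay
and all move to the second one when it goes down, which is the pile-up I was
worried about. With per-client shuffling the clients of the failed relay
spread over the survivors. Whether the real client library behaves like the
shuffled case is exactly what I would like the documentation to state.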