On Thu, May 2, 2024 at 5:52 PM Ilya Maximets <[email protected]> wrote:
> On 4/26/24 18:54, Ihar Hrachyshka wrote:
> > Remove the notion of cluster/leave --force since it was never
> > implemented. Instead of these instructions, document how a broken
> > cluster can be re-initialized with the old database contents.
> >
> > Signed-off-by: Ihar Hrachyshka <[email protected]>
> >
> > ---
> >
> > v1: initial version.
> > v2: remove --force mentioned in ovsdb-server(1).
> > v3: multiple language and markup changes suggested by Ilya.
>
> Thanks, Ihar! This version looks good to me in general.
> I have a couple of minor nits below. If you agree, I can
> fold those in while applying the change.
>

Feel free to. And thanks for your patience.

>
> Let me know what you think.
>
> Best regards, Ilya Maximets.
>
> >
> > ---
> >  Documentation/ref/ovsdb.7.rst | 44 ++++++++++++++++++++++++++++-------
> >  ovsdb/ovsdb-server.1.in       |  3 +--
> >  2 files changed, 37 insertions(+), 10 deletions(-)
> >
> > diff --git a/Documentation/ref/ovsdb.7.rst b/Documentation/ref/ovsdb.7.rst
> > index 46ed13e61..5766e64b9 100644
> > --- a/Documentation/ref/ovsdb.7.rst
> > +++ b/Documentation/ref/ovsdb.7.rst
> > @@ -315,16 +315,11 @@ The above methods for adding and removing servers only work for healthy
> >  clusters, that is, for clusters with no more failures than their maximum
> >  tolerance. For example, in a 3-server cluster, the failure of 2 servers
> >  prevents servers joining or leaving the cluster (as well as database access).
> > +
> >  To prevent data loss or inconsistency, the preferred solution to this problem
> >  is to bring up enough of the failed servers to make the cluster healthy again,
> > -then if necessary remove any remaining failed servers and add new ones. If
> > -this cannot be done, though, use ``ovs-appctl`` to invoke ``cluster/leave
> > ---force`` on a running server. This command forces the server to which it is
> > -directed to leave its cluster and form a new single-node cluster that contains
> > -only itself. The data in the new cluster may be inconsistent with the former
> > -cluster: transactions not yet replicated to the server will be lost, and
> > -transactions not yet applied to the cluster may be committed. Afterward, any
> > -servers in its former cluster will regard the server to have failed.
> > +then if necessary remove any remaining failed servers and add new ones. If this
>
> Nit: 2 spaces between sentences.
>
> > +is not an option, see the next section for `Manual cluster recovery`_.
> >
> >  Once a server leaves a cluster, it may never rejoin it. Instead, create a new
> >  server and join it to the cluster.
> > @@ -362,6 +357,39 @@ Clustered OVSDB does not support the OVSDB "ephemeral columns" feature.
> >  ones when they work with schemas for clustered databases. Future versions of
> >  OVSDB might add support for this feature.
> >
> > +Manual cluster recovery
> > +~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +.. important::
>
> Nit: An empty line here would be nice to be consistent at least
> within this document.
>
> > +   The procedure below will result in ``cid`` and ``sid`` change. A *new*
>
> Nit: 2 spaces between sentences.
>
> > +   cluster will be initialized.
> > +
> > +To recover a clustered database after a failure:
> > +
> > +1. Stop *all* old cluster ``ovsdb-server`` instances before proceeding.
> > +
> > +2. Pick one of the old members which will serve as a bootstrap member of the
> > +   to-be-recovered cluster.
> > +
> > +3. Convert its database file to the standalone format using ``ovsdb-tool
> > +   cluster-to-standalone``.
> > +
> > +4. Backup the standalone database file.
> > +
> > +5. Create a new single-node cluster with ``ovsdb-tool create-cluster``
> > +   using the previously saved standalone database file, then start
> > +   ``ovsdb-server``.
> > +
> > +Once the single-node cluster is up and running and serves the restored data,
> > +new members should be created and join the new cluster, as usual (``ovsdb-tool
> > +join-cluster``).
>
> I'm having hard time reading 'new members should be created and join' as
> my brain wants to relate 'should be' to both 'created' and 'join' and
> 'should be join' is not a correct construct.
>
> How about: "new members should be created and added to the cluster, as usual,
> with ``ovsdb-tool join-cluster``." ?
>

Though it doesn't confuse me, I am not a native speaker, and I find your
version at least as good as mine, so feel free to change.

>
> Also, should it be a step 6 ?
>

It won't hurt to fold it into the list.

> > +
> > +.. note::
> > +
> > +   The data in the new cluster may be inconsistent with the former cluster:
> > +   transactions not yet replicated to the server chosen in step 2 will be lost,
> > +   and transactions not yet applied to the cluster may be committed.
> > +
> >  Upgrading from version 2.14 and earlier to 2.15 and later
> >  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > diff --git a/ovsdb/ovsdb-server.1.in b/ovsdb/ovsdb-server.1.in
> > index 9fabf2d67..23b8e6e9c 100644
> > --- a/ovsdb/ovsdb-server.1.in
> > +++ b/ovsdb/ovsdb-server.1.in
> > @@ -461,8 +461,7 @@ This does not result in a three server cluster that lacks quorum.
> >  .
> >  .IP "\fBcluster/kick \fIdb server\fR"
> >  Start graceful removal of \fIserver\fR from \fIdb\fR's cluster, like
> > -\fBcluster/leave\fR (without \fB\-\-force\fR) except that it can
> > -remove any server, not just this one.
> > +\fBcluster/leave\fR, except that it can remove any server, not just this one.
> >  .IP
> >  \fIserver\fR may be a server ID, as printed by \fBcluster/sid\fR, or
> >  the server's local network address as passed to \fBovsdb-tool\fR's
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
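
For reference, steps 3 to 5 of the recovery procedure in the quoted patch, plus the follow-up join, could be sketched roughly as below. This is only an illustration and not part of the patch: the helper names, file paths, schema name and addresses are all hypothetical, and the script only builds the ``ovsdb-tool`` command lines rather than running them. Stopping the old servers (step 1), choosing the bootstrap member (step 2) and starting ``ovsdb-server`` afterwards are left to the operator.

```python
import shlex


def recovery_commands(old_clustered_db, standalone_db, backup_db,
                      new_clustered_db, local_addr):
    """Build (but do not run) the commands for steps 3-5.

    All arguments are hypothetical, operator-chosen paths and addresses.
    """
    return [
        # Step 3: convert the chosen member's clustered file to standalone.
        ["ovsdb-tool", "cluster-to-standalone", standalone_db,
         old_clustered_db],
        # Step 4: keep a backup copy of the standalone file.
        ["cp", standalone_db, backup_db],
        # Step 5: seed a new single-node cluster from the standalone file.
        # ovsdb-server still has to be started on the new file afterwards.
        ["ovsdb-tool", "create-cluster", new_clustered_db, standalone_db,
         local_addr],
    ]


def join_command(member_db, schema_name, member_addr, bootstrap_addr):
    """Build the command that prepares a new member to join the cluster."""
    return ["ovsdb-tool", "join-cluster", member_db, schema_name,
            member_addr, bootstrap_addr]


if __name__ == "__main__":
    # Example values only; adjust to the actual deployment.
    commands = recovery_commands(
        "/var/lib/ovn/old_sb.db",
        "/var/lib/ovn/sb_standalone.db",
        "/var/backups/sb_standalone.db.bak",
        "/var/lib/ovn/new_sb.db",
        "tcp:10.0.0.1:6644",
    )
    commands.append(join_command(
        "/var/lib/ovn/member2_sb.db", "OVN_Southbound",
        "tcp:10.0.0.2:6644", "tcp:10.0.0.1:6644"))
    for argv in commands:
        print(shlex.join(argv))
```

Printing the commands instead of executing them keeps the sketch safe to try; each line can be reviewed before it is run by hand against real database files.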
