On 7/7/23 07:14, Han Zhou wrote:
>
>
> On Mon, Jul 3, 2023 at 9:46 PM Ilya Maximets <[email protected]
> <mailto:[email protected]>> wrote:
>>
>> On 7/1/23 04:43, Han Zhou wrote:
>> >
>> >
>> > On Fri, Jun 30, 2023 at 4:30 PM Ilya Maximets <[email protected]
>> > <mailto:[email protected]> <mailto:[email protected]
>> > <mailto:[email protected]>>> wrote:
>> >>
>> >> On 6/25/23 19:35, Han Zhou wrote:
>> >> > When a server becomes unstable due to overloading or intermittent
>> >> > partitioning, it may miss some heartbeats and then starts election with
>> >> > a new term, which would disrupt the otherwise healthy cluster formed by
>> >> > the rest of the healthy nodes. Such situation may exist for a long time
>> >> > until the "flapping" server is shutdown or recovered completely, which
>> >> > can severely impact the availability of the cluster. The pre-vote
>> >> > mechanism introduced in the raft paper section 9.6 can prevent such
>> >> > problems. This patch implements the pre-vote mechanism, in a way that is
>> >> > backward compatible and safe for upgrades.
>> >>
>> >> Thanks, Han! This is definitely interesting. I don't think that this
>> >> should be a solution for overloading though, overloading should be fixed
>> >> by, well, reducing the load or increasing the election timer. But some
>> >> intermittent network issues is indeed a valid case.
>> >
>> > Sorry that I wasn't clear about "overloading". I was referring to the case
>> > when the machine is getting overloaded, not necessarily because of the
>> > ovsdb-server process but because of other applications/VMs sharing the
>> > same physical resource, which is likely what we have hit in the production.
>> > You may argue it is an improper resource allocation, but from OVSDB point
>> > of view, it is not an excuse for the whole RAFT cluster down. The pre-vote
>> > can avoid such disaster, regardless of the factors of the problematic
>> > server that we can't control.
>>
>> OK. Makes sense.
>>
>> >
>> >>
>> >> I looked through the code and I don't see any significant problems right
>> >> away, but I would like to experiment with it a bit more next week, since
>> >> the voting logic is fairly complex in general.
>> >>
>> > Appreciated!
>> >
>> >> It's nice to see you found a use case for the stop-raft-rpc failure
>> >> test. :)
>> >>
>> >> See some comments inline.
>> >>
>> >> Best regards, Ilya Maximets.
>> >>
>> >> >
>> >> > Signed-off-by: Han Zhou <[email protected] <mailto:[email protected]>
>> >> > <mailto:[email protected] <mailto:[email protected]>>>
>> >> > ---
>> >> > ovsdb/raft-rpc.c | 19 ++++++++-
>> >> > ovsdb/raft-rpc.h | 3 ++
>> >> > ovsdb/raft.c | 88 ++++++++++++++++++++++++++++++------------
>> >> > tests/ovsdb-cluster.at <http://ovsdb-cluster.at>
>> >> > <http://ovsdb-cluster.at <http://ovsdb-cluster.at>> | 42
>> >> > ++++++++++++++++++++
>> >> > 4 files changed, 127 insertions(+), 25 deletions(-)
>> >> >
>> >> > diff --git a/ovsdb/raft-rpc.c b/ovsdb/raft-rpc.c
>> >> > index dd14d81091fc..f750ba897046 100644
>> >> > --- a/ovsdb/raft-rpc.c
>> >> > +++ b/ovsdb/raft-rpc.c
>> >> > @@ -283,6 +283,10 @@ raft_vote_request_to_jsonrpc(const struct
>> >> > raft_vote_request *rq,
>> >> > json_object_put(args, "leadership_transfer",
>> >> > json_boolean_create(true));
>> >> > }
>> >> > + if (rq->is_prevote) {
>> >> > + json_object_put(args, "is_prevote",
>> >> > + json_boolean_create(true));
>> >> > + }
>>
>> Hmm. I don't think this is fully backward compatible. If the other
>> server is the old one, the ovsdb_parser_finish() in the
>> raft_rpc_from_jsonrpc() will report an error, because not all the members
>> of the json message were consumed. And hence the pre-vote will never
>> succeed in a cluster where upgraded servers do not have a quorum.
>>
>> Let's say we have a cluster of 5. The leader + 2 old servers + 2 new.
>> Leader goes down. 2 new servers don't have a quorum and old ones do
>> not reply to pre-vote requests. I suppose, one of the old ones will
>> have to initiate an election then, and the new ones will reply to the
>> actual vote. So, one of the old servers can become a leader, but new
>> servers can not. This is probably not an issue since the cluster can
>> still elect a leader. What do you think?
>>
>> Maybe a clarification on that will be useful in the commit message.
>>
>
> Thanks for pointing this out. I just checked the logs during an upgrade test,
> and it indeed had the error in the old cluster node:
>
> 2023-07-07T04:46:24.466Z|00070|raft|INFO|unix#6: syntax
> "{"cluster":"a5cc9100-9d30-466a-b97c-55e72300b510","from":"7aaec107-2fe2-431e-a7f8-136f17ef9600","is_prevote":true,"last_log_index":9,"last_log_term":1,"term":1,"to":"b8ea2911-fcc3-4048-b140-f609918b7e67"}":
> syntax error: Parsing raft vote_request RPC failed: Member 'is_prevote' is
> present but not allowed here.
>
> I didn't consider this regarding backward compatibility. I was mainly trying
> to avoid crashing the old nodes by unknown message types, and didn't know
> ovsdb rpc doesn't allow unused fields. But fortunately the node can continue
> the work after this error, and just as you mentioned, the old node then
> initiated the vote and the new one accepted it. So, maybe it is not a real
> issue except the unpleasant error log. I will update the commit message, and
> probably include this in NEWS as well.
>
>>
>> >> > }
>> >> >
>> >> > static void
>> >> > @@ -294,6 +298,8 @@ raft_vote_request_from_jsonrpc(struct ovsdb_parser
>> >> > *p,
>> >> > rq->last_log_term = raft_parse_required_uint64(p, "last_log_term");
>> >> > rq->leadership_transfer
>> >> > = raft_parse_optional_boolean(p, "leadership_transfer") == 1;
>> >> > + rq->is_prevote
>> >> > + = raft_parse_optional_boolean(p, "is_prevote") == 1;
>> >> > }
>> >> >
>> >> > static void
>> >> > @@ -305,6 +311,9 @@ raft_format_vote_request(const struct
>> >> > raft_vote_request *rq, struct ds *s)
>> >> > if (rq->leadership_transfer) {
>> >> > ds_put_cstr(s, " leadership_transfer=true");
>> >> > }
>> >> > + if (rq->is_prevote) {
>> >> > + ds_put_cstr(s, " is_prevote=true");
>> >> > + }
>> >> > }
>> >> >
>> >> > /* raft_vote_reply. */
>> >> > @@ -326,6 +335,9 @@ raft_vote_reply_to_jsonrpc(const struct
>> >> > raft_vote_reply *rpy,
>> >> > {
>> >> > raft_put_uint64(args, "term", rpy->term);
>> >> > json_object_put_format(args, "vote", UUID_FMT,
>> >> > UUID_ARGS(&rpy->vote));
>> >> > + if (rpy->is_prevote) {
>> >> > + json_object_put(args, "is_prevote", json_boolean_create(true));
>> >> > + }
>> >> > }
>> >> >
>> >> > static void
>> >> > @@ -334,6 +346,7 @@ raft_vote_reply_from_jsonrpc(struct ovsdb_parser *p,
>> >> > {
>> >> > rpy->term = raft_parse_required_uint64(p, "term");
>> >> > rpy->vote = raft_parse_required_uuid(p, "vote");
>> >> > + rpy->is_prevote = raft_parse_optional_boolean(p, "is_prevote") ==
>> >> > 1;
>> >> > }
>> >> >
>> >> > static void
>> >> > @@ -341,6 +354,9 @@ raft_format_vote_reply(const struct raft_vote_reply
>> >> > *rpy, struct ds *s)
>> >> > {
>> >> > ds_put_format(s, " term=%"PRIu64, rpy->term);
>> >> > ds_put_format(s, " vote="SID_FMT, SID_ARGS(&rpy->vote));
>> >> > + if (rpy->is_prevote) {
>> >> > + ds_put_cstr(s, " is_prevote=true");
>> >> > + }
>> >> > }
>> >> >
>> >> > /* raft_add_server_request */
>> >> > @@ -1008,7 +1024,8 @@ raft_rpc_get_vote(const union raft_rpc *rpc)
>> >> > return NULL;
>> >> >
>> >> > case RAFT_RPC_VOTE_REPLY:
>> >> > - return &raft_vote_reply_cast(rpc)->vote;
>> >> > + return raft_vote_reply_cast(rpc)->is_prevote ? NULL :
>> >> > + &raft_vote_reply_cast(rpc)->vote;
>> >>
>> >> It might be better to create a pointer variable here and access it
>> >> instead of casting twice.
>> >>
>> > That was my first attempt, but I can't define a local variable under the
>> > 'case' statement unless I add a pair of '{}' for the whole block, which
>> > looks ugly :(
>> > I will try to find if there are similar patterns in the code base.
>>
>> We have a few places with braces around the case body, it should be
>> fine. You may also define a pointer at the top of the function
>> and assign it in the case. RAFT_RPC_VOTE_REPLY is the main use-case
>> for this function, so having a pointer for it declared at the top
>> should be fine as well.
>>
>
> Ack
>
>> >
>> >> >
>> >> > default:
>> >> > OVS_NOT_REACHED();
>> >> > diff --git a/ovsdb/raft-rpc.h b/ovsdb/raft-rpc.h
>> >> > index 221f24d00128..10f30618e3e0 100644
>> >> > --- a/ovsdb/raft-rpc.h
>> >> > +++ b/ovsdb/raft-rpc.h
>> >> > @@ -125,12 +125,15 @@ struct raft_vote_request {
>> >> > uint64_t last_log_index; /* Index of candidate's last log entry. */
>> >> > uint64_t last_log_term; /* Term of candidate's last log entry. */
>> >> > bool leadership_transfer; /* True to override minimum election
>> >> > timeout. */
>> >> > + bool is_prevote; /* True: pre-vote; False: real vote
>> >> > (default). */
>> >> > };
>> >> >
>> >> > struct raft_vote_reply {
>> >> > struct raft_rpc_common common;
>> >> > uint64_t term; /* Current term, for candidate to update
>> >> > itself. */
>> >> > struct uuid vote; /* Server ID of vote. */
>> >> > + bool is_prevote; /* Copy the is_prevote from the request,
>> >> > primarily
>> >> > + for validation. */
>> >>
>> >> Each comment line should start with a '*'.
>> >>
>> >> "Copy of the ..." ?
>> >>
>> >
>> > Ack.
>> >
>> >> > };
>> >> >
>> >> > struct raft_add_server_request {
>> >> > diff --git a/ovsdb/raft.c b/ovsdb/raft.c
>> >> > index b2361b1737a2..4d7a3f112cad 100644
>> >> > --- a/ovsdb/raft.c
>> >> > +++ b/ovsdb/raft.c
>> >> > @@ -305,6 +305,11 @@ struct raft {
>> >> >
>> >> > /* Candidates only. Reinitialized at start of election. */
>> >> > int n_votes; /* Number of votes for me. */
>> >> > + bool prevote_passed; /* Indicates if it passed pre-vote
>> >> > phase.
>> >> > + Pre-vote mechanism is introduced in
>> >> > raft
>> >> > + paper section 9.6. We implement it
>> >> > as a
>> >> > + sub-state of candidate to minize
>> >> > the change
>> >> > + and keep backward compatibility. */
>> >>
>> >> Each comment line should start with a '*'.
>> >>
>> >> And s/minize/minimize/.
>> >>
>> > Ack.
>> >
>> >> >
>> >> > /* Followers and candidates only. */
>> >> > bool candidate_retrying; /* The earlier election timed-out and
>> >> > we are
>> >> > @@ -372,7 +377,8 @@ static void raft_become_follower(struct raft *);
>> >> > static void raft_reset_election_timer(struct raft *);
>> >> > static void raft_reset_ping_timer(struct raft *);
>> >> > static void raft_send_heartbeats(struct raft *);
>> >> > -static void raft_start_election(struct raft *, bool
>> >> > leadership_transfer);
>> >> > +static void raft_start_election(struct raft *, bool is_prevote,
>> >> > + bool leadership_transfer);
>> >> > static bool raft_truncate(struct raft *, uint64_t new_end);
>> >> > static void raft_get_servers_from_log(struct raft *, enum vlog_level);
>> >> > static void raft_get_election_timer_from_log(struct raft *);
>> >> > @@ -1069,7 +1075,8 @@ raft_open(struct ovsdb_log *log, struct raft
>> >> > **raftp)
>> >> > /* If there's only one server, start an election right away so
>> >> > that the
>> >> > * cluster bootstraps quickly. */
>> >> > if (hmap_count(&raft->servers) == 1) {
>> >> > - raft_start_election(raft, false);
>> >> > + /* No pre-vote needed since we are the only one. */
>> >> > + raft_start_election(raft, false, false);
>> >> > }
>> >> > } else {
>> >> > raft->join_timeout = time_msec() + 1000;
>> >> > @@ -1360,7 +1367,7 @@ void
>> >> > raft_take_leadership(struct raft *raft)
>> >> > {
>> >> > if (raft->role != RAFT_LEADER) {
>> >> > - raft_start_election(raft, true);
>> >> > + raft_start_election(raft, false, true);
>> >> > }
>> >> > }
>> >> >
>> >> > @@ -1766,12 +1773,12 @@ raft_set_term(struct raft *raft, uint64_t term,
>> >> > const struct uuid *vote)
>> >> > return true;
>> >> > }
>> >> >
>> >> > -static void
>> >> > +static bool
>> >> > raft_accept_vote(struct raft *raft, struct raft_server *s,
>> >> > const struct uuid *vote)
>> >> > {
>> >> > if (uuid_equals(&s->vote, vote)) {
>> >> > - return;
>> >> > + return false;
>> >> > }
>> >> > if (!uuid_is_zero(&s->vote)) {
>> >> > static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
>> >> > @@ -1785,13 +1792,18 @@ raft_accept_vote(struct raft *raft, struct
>> >> > raft_server *s,
>> >> > s->vote = *vote;
>> >> > if (uuid_equals(vote, &raft->sid)
>> >> > && ++raft->n_votes > hmap_count(&raft->servers) / 2) {
>> >> > - raft_become_leader(raft);
>> >> > + return true;
>> >> > }
>> >> > + return false;
>> >> > }
>> >> >
>> >> > static void
>> >> > -raft_start_election(struct raft *raft, bool leadership_transfer)
>> >> > +raft_start_election(struct raft *raft, bool is_prevote,
>> >> > + bool leadership_transfer)
>> >> > {
>> >> > + /* Leadership transfer doesn't use prevote. */
>> >> > + ovs_assert(!is_prevote || !leadership_transfer);
>> >> > +
>> >> > if (raft->leaving) {
>> >> > return;
>> >> > }
>> >> > @@ -1801,7 +1813,7 @@ raft_start_election(struct raft *raft, bool
>> >> > leadership_transfer)
>> >> > return;
>> >> > }
>> >> >
>> >> > - if (!raft_set_term(raft, raft->term + 1, &raft->sid)) {
>> >> > + if (!is_prevote && !raft_set_term(raft, raft->term + 1,
>> >> > &raft->sid)) {
>> >> > return;
>> >> > }
>> >> >
>> >> > @@ -1809,26 +1821,33 @@ raft_start_election(struct raft *raft, bool
>> >> > leadership_transfer)
>> >> >
>> >> > raft->leader_sid = UUID_ZERO;
>> >> > raft->role = RAFT_CANDIDATE;
>> >> > - /* If there was no leader elected since last election, we know we
>> >> > are
>> >> > - * retrying now. */
>> >> > - raft->candidate_retrying = !raft->had_leader;
>> >> > - raft->had_leader = false;
>> >> > + raft->prevote_passed = !is_prevote;
>> >> > +
>> >> > + if (is_prevote || leadership_transfer) {
>> >> > + /* If there was no leader elected since last election, we know
>> >> > we are
>> >> > + * retrying now. */
>> >> > + raft->candidate_retrying = !raft->had_leader;
>> >> > + raft->had_leader = false;
>> >> > +
>> >> > + raft->election_start = time_msec();
>> >> > + raft->election_won = 0;
>> >> > + }
>> >> >
>> >> > raft->n_votes = 0;
>> >> >
>> >> > - raft->election_start = time_msec();
>> >> > - raft->election_won = 0;
>> >> > raft->leadership_transfer = leadership_transfer;
>> >> >
>> >> > static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> >> > if (!VLOG_DROP_INFO(&rl)) {
>> >> > long long int now = time_msec();
>> >> > + char *comment = is_prevote ? "prevote" : "vote";
>> >>
>> >> Should user-visible logs say "pre-vote" instead of "prevote" ?
>> >
>> > I am ok either way. I was thinking about the situation when searching the
>> > logs with \<vote\> in vim, but it shouldn't matter much :) I can change it
>> > if pre-vote looks better to users.
>>
>> It was just a bit confusing that all the comments use 'pre-vote',
>> but the log entry is using 'prevote'.
>>
>> An alternative might be to use "starting election" for an actual
>> vote and "starting preliminary election" for a pre-vote.
>>
>> We may also use 'preliminary': true/false, instead of "is_prevote".
>> Might be more user-friendly.
>>
>> 'prevote_passed' may be turned into 'preliminary_vote_passed'.
>> It's on the edge of being too long of a name, not sure what would
>> be a better one though.
>>
>> What do you think?
>
> I'd prefer "pre-vote" since it is the terminology used in the RAFT paper. I
> will update the log to "pre-vote" in v2.
>
>>
>> >>
>> >> > if (now >= raft->election_timeout) {
>> >> > VLOG_INFO("term %"PRIu64": %lld ms timeout expired, "
>> >> > - "starting election",
>> >> > - raft->term, now - raft->election_base);
>> >> > + "starting election (%s)",
>> >> > + raft->term, now - raft->election_base, comment);
>> >> > } else {
>> >> > - VLOG_INFO("term %"PRIu64": starting election", raft->term);
>> >> > + VLOG_INFO("term %"PRIu64": starting election (%s)",
>> >> > + raft->term, comment);
>> >> > }
>> >> > }
>> >> > raft_reset_election_timer(raft);
>> >> > @@ -1853,6 +1872,7 @@ raft_start_election(struct raft *raft, bool
>> >> > leadership_transfer)
>> >> > ? raft->entries[raft->log_end - raft->log_start -
>> >> > 1].term
>> >> > : raft->snap.term),
>> >> > .leadership_transfer = leadership_transfer,
>> >> > + .is_prevote = is_prevote,
>> >> > },
>> >> > };
>> >> > if (failure_test != FT_DONT_SEND_VOTE_REQUEST) {
>> >> > @@ -1861,7 +1881,13 @@ raft_start_election(struct raft *raft, bool
>> >> > leadership_transfer)
>> >> > }
>> >> >
>> >> > /* Vote for ourselves. */
>> >> > - raft_accept_vote(raft, me, &raft->sid);
>> >> > + if (raft_accept_vote(raft, me, &raft->sid)) {
>> >> > + /* We just started vote, so it shouldn't be accepted yet
>> >> > unless this is
>> >> > + * a one-node cluster. In such case we don't do pre-vote, and
>> >> > become
>> >> > + * leader immediately. */
>> >> > + ovs_assert(!is_prevote);
>> >> > + raft_become_leader(raft);
>> >> > + }
>> >> > }
>> >> >
>> >> > static void
>> >> > @@ -2041,10 +2067,10 @@ raft_run(struct raft *raft)
>> >> > raft_reset_election_timer(raft);
>> >> > } else {
>> >> > raft_become_follower(raft);
>> >> > - raft_start_election(raft, false);
>> >> > + raft_start_election(raft, true, false);
>> >> > }
>> >> > } else {
>> >> > - raft_start_election(raft, false);
>> >> > + raft_start_election(raft, true, false);
>> >> > }
>> >> >
>> >> > }
>> >> > @@ -3673,6 +3699,10 @@ raft_handle_vote_request__(struct raft *raft,
>> >> > return false;
>> >> > }
>> >> >
>> >> > + if (rq->is_prevote) {
>> >> > + return true;
>> >> > + }
>> >> > +
>> >> > /* Record a vote for the peer. */
>> >> > if (!raft_set_term(raft, raft->term, &rq->common.sid)) {
>> >> > return false;
>> >> > @@ -3685,7 +3715,7 @@ raft_handle_vote_request__(struct raft *raft,
>> >> >
>> >> > static void
>> >> > raft_send_vote_reply(struct raft *raft, const struct uuid *dst,
>> >> > - const struct uuid *vote)
>> >> > + const struct uuid *vote, bool is_prevote)
>> >> > {
>> >> > union raft_rpc rpy = {
>> >> > .vote_reply = {
>> >> > @@ -3695,6 +3725,7 @@ raft_send_vote_reply(struct raft *raft, const
>> >> > struct uuid *dst,
>> >> > },
>> >> > .term = raft->term,
>> >> > .vote = *vote,
>> >> > + .is_prevote = is_prevote,
>> >> > },
>> >> > };
>> >> > raft_send(raft, &rpy);
>> >> > @@ -3705,7 +3736,9 @@ raft_handle_vote_request(struct raft *raft,
>> >> > const struct raft_vote_request *rq)
>> >> > {
>> >> > if (raft_handle_vote_request__(raft, rq)) {
>> >> > - raft_send_vote_reply(raft, &rq->common.sid, &raft->vote);
>> >> > + raft_send_vote_reply(raft, &rq->common.sid,
>> >> > + rq->is_prevote ? &rq->common.sid :
>> >> > &raft->vote,
>> >> > + rq->is_prevote);
>> >> > }
>> >> > }
>> >> >
>> >> > @@ -3723,7 +3756,14 @@ raft_handle_vote_reply(struct raft *raft,
>> >> >
>> >> > struct raft_server *s = raft_find_peer(raft, &rpy->common.sid);
>> >> > if (s) {
>> >> > - raft_accept_vote(raft, s, &rpy->vote);
>> >> > + if (raft_accept_vote(raft, s, &rpy->vote)) {
>> >>
>> >> IIUC, the raft_accept_vote() here can't succeed for pre-vote because
>> >> everyone votes for themselves on pre-vote. Is that correct?
>> >>
>> > pre-vote happens the same way as the real vote, except that pre-vote
>> > doesn't increase the term.
>> > This function is for the candidate to handle a vote-reply from a peer for
>> > a vote/pre-vote request sent by itself.
>> > So, raft_accept_vote should succeed in normal cases, for both pre-vote and
>> > vote.
>> > If it is pre-vote and it is successful, it starts the real vote
>> > immediately.
>> >
>> >> What happens if we received a real vote during a pre-vote phase for
>> >> some reason? Will this logic fail? Is this scenario possible?
>> >
>> > The logic of handling the vote/pre-vote reply message only depends on the
>> > status of the server itself, regardless of the message's is_prevote field.
>> > So we don't really need to care if it is a pre-vote reply or vote reply.
>> > The is_prevote field in the reply was added only for validation purpose,
>> > not useful for the RPC itself, as mentioned in the comments.
>>
>> Oh, thanks for explanation, I misread the raft_handle_vote_request()
>> the first time. I'll take a closer look.
>>
> No problem. Thanks for your review. Do you have other comments? If not, I
> will send v2 with all the comments addressed.
I don't have any further comments at the moment,
feel free to send a new version.
Best regards, Ilya Maximets.
>
> Regards,
> Han
>
>> >
>> > Thanks,
>> > Han
>> >>
>> >> > + if (raft->prevote_passed) {
>> >> > + raft_become_leader(raft);
>> >> > + } else {
>> >> > + /* Start the real election. */
>> >> > + raft_start_election(raft, false, false);
>> >> > + }
>> >> > + }
>> >> > }
>> >> > }
>> >> >
>> >> > @@ -4357,7 +4397,7 @@ raft_handle_become_leader(struct raft *raft,
>> >> > VLOG_INFO("received leadership transfer from %s in term
>> >> > %"PRIu64,
>> >> > raft_get_nickname(raft, &rq->common.sid, buf, sizeof
>> >> > buf),
>> >> > rq->term);
>> >> > - raft_start_election(raft, true);
>> >> > + raft_start_election(raft, false, true);
>> >> > }
>> >> > }
>> >> >
>> >> > diff --git a/tests/ovsdb-cluster.at <http://ovsdb-cluster.at>
>> >> > <http://ovsdb-cluster.at <http://ovsdb-cluster.at>>
>> >> > b/tests/ovsdb-cluster.at <http://ovsdb-cluster.at>
>> >> > <http://ovsdb-cluster.at <http://ovsdb-cluster.at>>
>> >> > index 9fbf5dc897f2..8a81136999e0 100644
>> >> > --- a/tests/ovsdb-cluster.at <http://ovsdb-cluster.at>
>> >> > <http://ovsdb-cluster.at <http://ovsdb-cluster.at>>
>> >> > +++ b/tests/ovsdb-cluster.at <http://ovsdb-cluster.at>
>> >> > <http://ovsdb-cluster.at <http://ovsdb-cluster.at>>
>> >> > @@ -715,6 +715,48 @@ done
>> >> >
>> >> > AT_CLEANUP
>> >> >
>> >> > +
>> >> > +AT_SETUP([OVSDB cluster - disruptive server])
>> >> > +AT_KEYWORDS([ovsdb server negative unix cluster disruptive])
>> >> > +
>> >> > +n=3
>> >> > +schema_name=`ovsdb-tool schema-name $abs_srcdir/idltest.ovsschema`
>> >> > +ordinal_schema > schema
>> >> > +AT_CHECK([ovsdb-tool '-vPATTERN:console:%c|%p|%m' create-cluster s1.db
>> >> > $abs_srcdir/idltest.ovsschema unix:s1.raft], [0], [], [stderr])
>> >> > +cid=`ovsdb-tool db-cid s1.db`
>> >> > +schema_name=`ovsdb-tool schema-name $abs_srcdir/idltest.ovsschema`
>> >> > +for i in `seq 2 $n`; do
>> >> > + AT_CHECK([ovsdb-tool join-cluster s$i.db $schema_name
>> >> > unix:s$i.raft unix:s1.raft])
>> >> > +done
>> >> > +
>> >> > +on_exit 'kill `cat *.pid`'
>> >> > +for i in `seq $n`; do
>> >> > + AT_CHECK([ovsdb-server -v -vconsole:off -vsyslog:off --detach
>> >> > --no-chdir --log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i
>> >> > --remote=punix:s$i.ovsdb s$i.db])
>> >> > +done
>> >> > +for i in `seq $n`; do
>> >> > + AT_CHECK([ovsdb_client_wait unix:s$i.ovsdb $schema_name connected])
>> >> > +done
>> >> > +
>> >> > +# An unstable follower shouldn't disrupt the healthy cluster -
>> >> > shouldn't
>> >> > +# trigger term change.
>> >> > +AT_CHECK([ovs-appctl -t "`pwd`"/s2 cluster/failure-test
>> >> > stop-raft-rpc], [0], [ignore])
>> >> > +OVS_WAIT_UNTIL([ovs-appctl -t "`pwd`"/s2 cluster/status $schema_name |
>> >> > grep "Role: candidate"])
>> >> > +AT_CHECK([ovs-appctl -t "`pwd`"/s2 cluster/failure-test clear], [0],
>> >> > [ignore])
>> >> > +
>> >> > +# Should step back to follower.
>> >> > +OVS_WAIT_UNTIL([ovs-appctl -t "`pwd`"/s2 cluster/status $schema_name |
>> >> > grep "Role: follower"])
>> >> > +
>> >> > +# No term change.
>> >> > +for i in `seq $n`; do
>> >> > + AT_CHECK([ovs-appctl -t "`pwd`"/s$i cluster/status $schema_name |
>> >> > grep "Term: 1"], [0], [ignore])
>> >> > +done
>> >> > +
>> >> > +for i in `seq $n`; do
>> >> > + OVS_APP_EXIT_AND_WAIT_BY_TARGET([`pwd`/s$i], [s$i.pid])
>> >> > +done
>> >> > +
>> >> > +AT_CLEANUP
>> >> > +
>> >> >
>> >> > AT_BANNER([OVSDB - cluster tests])
>> >> >
>> >>
>>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev