If a server that doesn't have an up-to-date log attempts a pre-vote,
it is possible that it will be granted.  This can happen when the
request is not considered disruptive (sufficient time has passed
since the last message seen), because the current code doesn't check
the log length if the receiver has already voted for any other server
in the current term.

This is not good, as the pre-vote is supposed to determine whether the
requester can win an election in the next term, and it cannot if its
log is not up-to-date.

In general, the current vote has no meaning for the next term.  At the
beginning of the next term the vote will be set to zero in any case,
so the only thing we should be checking is the log being up-to-date.

The test for a disruptive server with an outdated log reproduces the
issue, so it was extended to make sure the outdated server never wins
the pre-vote.

Fixes: 85634fd58004 ("ovsdb: raft: Support pre-vote mechanism to deal with disruptive server.")
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2026-January/428993.html
Reported-by: Han Zhou <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
---
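The decision described in the commit message can be sketched as a small
standalone model.  This is only an illustration under simplifying
assumptions, not the code in ovsdb/raft.c: the struct and function names
below (model_receiver, model_request, model_grant_vote) are hypothetical,
server IDs are plain integers instead of UUIDs, and the function returns
whether the vote is granted, while the real helper has its own return
convention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the receiver state and the vote
 * request.  These are not the OVS definitions. */
struct model_receiver {
    int voted_for;              /* 0 means "no vote in the current term". */
    uint64_t last_log_term;     /* Term of the last log entry. */
    uint64_t last_log_index;    /* Index of the last log entry. */
};

struct model_request {
    int candidate;              /* ID of the requesting server. */
    bool is_prevote;            /* True for a pre-vote request. */
    uint64_t last_log_term;
    uint64_t last_log_index;
};

/* Raft's "at least as up-to-date" comparison: the later last term wins,
 * and with equal terms the longer log wins. */
static bool
model_log_is_up_to_date(const struct model_receiver *rx,
                        const struct model_request *rq)
{
    if (rq->last_log_term != rx->last_log_term) {
        return rq->last_log_term > rx->last_log_term;
    }
    return rq->last_log_index >= rx->last_log_index;
}

/* Returns true if the (pre-)vote would be granted in this model. */
bool
model_grant_vote(const struct model_receiver *rx,
                 const struct model_request *rq)
{
    if (!rq->is_prevote && rx->voted_for != 0) {
        /* Real vote and a vote was already cast in this term: only the
         * same candidate gets it again. */
        return rx->voted_for == rq->candidate;
    }
    /* Pre-vote, or no vote cast yet: only the log comparison matters. */
    return model_log_is_up_to_date(rx, rq);
}
```

In this model a receiver that already voted for server 1 still rejects a
pre-vote from server 3 when server 3's log is shorter, which is the case
the extended test is meant to cover.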
 ovsdb/raft.c           | 27 ++++++++++++++++++---------
 tests/ovsdb-cluster.at |  4 ++++
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index d549a3fb5..f754b49a5 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -3909,15 +3909,24 @@ raft_handle_vote_request__(struct raft *raft,
 {
     /* Figure 3.1: "If votedFor is null or candidateId, and candidate's vote is
      * at least as up-to-date as receiver's log, grant vote (sections 3.4,
-     * 3.6)." */
-    if (uuid_equals(&raft->vote, &rq->common.sid)) {
-        /* Already voted for this candidate in this term.  Resend vote. */
-        return true;
-    } else if (!uuid_is_zero(&raft->vote)) {
-        /* Already voted for different candidate in this term.  Send a reply
-         * saying what candidate we did vote for.  This isn't a necessary part
-         * of the Raft protocol but it can make debugging easier. */
-        return true;
+     * 3.6)."
+     *
+     * Note: The vote from the current term is not meaningful for the pre-vote,
+     * since the pre-vote is supposed to determine whether the vote for the
+     * *next* term can succeed.  The votedFor will be null at the beginning of
+     * the next term.  Hence the only requirement for granting a pre-vote is
+     * the candidate's log being at least as up-to-date as receiver's. */
+    if (!rq->is_prevote) {
+        if (uuid_equals(&raft->vote, &rq->common.sid)) {
+            /* Already voted for this candidate in this term.  Resend vote. */
+            return true;
+        } else if (!uuid_is_zero(&raft->vote)) {
+            /* Already voted for different candidate in this term.  Send a
+             * reply saying what candidate we did vote for.  This isn't a
+             * necessary part of the Raft protocol but it can make debugging
+             * easier. */
+            return true;
+        }
     }
 
     /* Section 3.6.1: "The RequestVote RPC implements this restriction: the RPC
diff --git a/tests/ovsdb-cluster.at b/tests/ovsdb-cluster.at
index 7e3eef8d4..47f66b2ca 100644
--- a/tests/ovsdb-cluster.at
+++ b/tests/ovsdb-cluster.at
@@ -1106,6 +1106,10 @@ for i in $(seq $n); do
              [0], [lmxyz])
 done
 
+# Check that s3 never won a pre-vote.
+AT_CHECK([grep 'send notification, method="vote_request"' s3.log], [0], [stdout])
+AT_CHECK([test $(grep -c vote_request stdout) -eq $(grep -c '"is_prevote":true' stdout)])
+
 for i in $(seq $n); do
     OVS_APP_EXIT_AND_WAIT_BY_TARGET([$(pwd)/s$i], [s$i.pid])
 done
-- 
2.52.0