Commit 3c2d6274bcee ("raft: Transfer leadership before creating
snapshots.") made raft leaders transfer leadership before
snapshotting. However, the leader-to-be may itself be in the process
of snapshotting when it should take over. To avoid delays in that case
too, explicitly allow snapshots only on followers. Cluster members
now have to wait until the current election is settled before
snapshotting.

Given the following logs taken from an OVN_Southbound 3-server cluster
during a scale test:

S1 (old leader):
2021-12-10T19:07:51.226Z|00823|raft|INFO|Transferring leadership to write a snapshot.
2021-12-10T19:08:03.830Z|00824|ovsdb|INFO|OVN_Southbound: Database compaction took 12601ms
2021-12-10T19:08:03.833Z|00825|timeval|WARN|Unreasonably long 12604ms poll interval (10632ms user, 1924ms system)
2021-12-10T19:08:03.940Z|00838|raft|INFO|server 8b8d is leader for term 43

S2 (follower):
2021-12-10T19:08:00.870Z|00481|raft|INFO|server 8b8d is leader for term 43

S3 (new leader):
2021-12-10T19:07:51.242Z|01083|raft|INFO|received leadership transfer from f5c9 in term 42
2021-12-10T19:07:51.244Z|01084|raft|INFO|term 43: starting election
2021-12-10T19:08:00.805Z|01085|ovsdb|INFO|OVN_Southbound: Database compaction took 9559ms
2021-12-10T19:08:00.869Z|01100|raft|INFO|term 43: elected leader by 2+ of 3 servers

We see that the leader-to-be (S3) receives the leadership transfer,
starts the election, and immediately afterwards starts a snapshot that
takes ~9.5 seconds. During this time S2 votes for S3, electing it as
cluster leader, but S3 does not effectively become leader until it
finishes snapshotting. This leaves the cluster without a leader for
up to ~9.5 seconds.

With this change, S3 delays compaction and snapshotting until the
election is finished.

Signed-off-by: Dumitru Ceara <[email protected]>
---
ovsdb/raft.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
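
For reviewers, the effect of the one-line change can be sketched
outside the tree. This is only an illustration: struct raft_sketch and
the two helper functions are hypothetical stand-ins that model just the
conditions visible in the hunk below, and RAFT_CANDIDATE is assumed to
be the third value of the role enum. The real raft_may_snapshot() may
check more than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed role enum; only RAFT_LEADER and RAFT_FOLLOWER appear in the
 * patch itself. */
enum raft_role { RAFT_FOLLOWER, RAFT_CANDIDATE, RAFT_LEADER };

/* Hypothetical, reduced stand-in for 'struct raft'. */
struct raft_sketch {
    bool leaving;
    bool left;
    bool failed;
    enum raft_role role;
    uint64_t last_applied;
    uint64_t log_start;
};

/* Old predicate: any non-leader, including a candidate that is in the
 * middle of an election, may snapshot. */
static bool
may_snapshot_old(const struct raft_sketch *raft)
{
    return (!raft->leaving
            && !raft->left
            && !raft->failed
            && raft->role != RAFT_LEADER
            && raft->last_applied >= raft->log_start);
}

/* New predicate: only followers may snapshot, so a candidate has to
 * wait until the election is settled. */
static bool
may_snapshot_new(const struct raft_sketch *raft)
{
    return (!raft->leaving
            && !raft->left
            && !raft->failed
            && raft->role == RAFT_FOLLOWER
            && raft->last_applied >= raft->log_start);
}
```

The two predicates differ only for a candidate, which is the state S3
is in between "starting election" and "elected leader" in the logs
above.
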
diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index ce40c5bc075c..6ffcb21db1e2 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -4226,7 +4226,7 @@ raft_may_snapshot(const struct raft *raft)
             && !raft->leaving
             && !raft->left
             && !raft->failed
-            && raft->role != RAFT_LEADER
+            && raft->role == RAFT_FOLLOWER
             && raft->last_applied >= raft->log_start);
 }
--
2.27.0