On 4/2/25 4:47 PM, Mike Pattrick wrote: > On Thu, Mar 27, 2025 at 6:11 PM Ilya Maximets <i.maxim...@ovn.org> wrote: >> >> When the cluster falls apart, there is no leader and so there are no >> heartbeats that need to be sent or received on time. In this case >> the RAFT_TIMER_THRESHOLD is too small, as the process will only wake >> up for a full election timer. When this happens, cooperative >> multitasking code will emit warnings and traces thinking there is a >> timer overrun. >> >> Fix that by setting the multitasking interval according to the full >> election timer, if there is no known leader. >> >> Fixes: d4a15647b917 ("ovsdb: raft: Enable cooperative multitasking.") >> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org> >> --- >> >> Cc: Frode Nordahl <fnord...@ubuntu.com> >> >> ovsdb/raft.c | 11 ++++++++--- >> 1 file changed, 8 insertions(+), 3 deletions(-) >> >> diff --git a/ovsdb/raft.c b/ovsdb/raft.c >> index 9c3c351b5..a9da28780 100644 >> --- a/ovsdb/raft.c >> +++ b/ovsdb/raft.c >> @@ -2170,9 +2170,14 @@ raft_run(struct raft *raft) >> raft_reset_ping_timer(raft); >> } >> >> - uint64_t interval = raft->joining >> - ? RAFT_JOIN_TIMEOUT_MS >> - : RAFT_TIMER_THRESHOLD(raft->election_timer); >> + uint64_t interval = RAFT_TIMER_THRESHOLD(raft->election_timer); > > Why not set interval in an else clause?
I had it written this way at first, but then changed for a couple reasons: 1. The RAFT_TIMER_THRESHOLD is the most common case that we'll use the vast majority of the time. 2. We'll need to have a negative check for the leader_sid in the else branch and the comment in that branch become a little more complicated in order to explain what's going on. 2 may be not a big deal, but 1 seems a good enough reason. Does that make sense? Best regards, Ilya Maximets. > > -M > >> + >> + if (raft->joining) { >> + interval = RAFT_JOIN_TIMEOUT_MS; >> + } else if (uuid_is_zero(&raft->leader_sid)) { >> + /* There are no heartbeats to handle when there is no leader. */ >> + interval = raft->election_timer; >> + } >> cooperative_multitasking_set( >> &raft_run_cb, (void *) raft, time_msec(), >> interval + interval / 10, "raft_run"); >> -- >> 2.47.0 >> >> _______________________________________________ >> dev mailing list >> d...@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev