TaiJuWu commented on code in PR #20318:
URL: https://github.com/apache/kafka/pull/20318#discussion_r2462531728
##########
raft/src/main/java/org/apache/kafka/raft/LeaderState.java:
##########
@@ -188,6 +188,29 @@ public void resetBeginQuorumEpochTimer(long currentTimeMs) {
beginQuorumEpochTimer.reset(beginQuorumEpochTimeoutMs);
}
+ /**
+ * Determines the set of replicas that should receive a {@code BeginQuorumEpoch} request
+ * based on the elapsed time since their last fetch.
+ * <p>
+ * For each remote voter (excluding the local node), if the time since the last
+ * fetch exceeds the configured {@code beginQuorumEpochTimeoutMs}, the replica
+ * is considered to need a new quorum epoch request.
+ *
+ * @param currentTimeMs the current system time in milliseconds
+ * @return an unmodifiable set of {@link ReplicaKey} objects representing replicas
+ * that need to receive a {@code BeginQuorumEpoch} request
+ */
+ public Set<ReplicaKey> needToSendBeginQuorumRequests(long currentTimeMs) {
+ return voterStates.values()
+ .stream()
+ .filter(
+ state -> state.replicaKey.id() != localVoterNode.voterKey().id() &&
+ currentTimeMs - state.lastFetchTimestamp >= beginQuorumEpochTimeoutMs
Review Comment:
I also think the algorithm works, because if the condition is not met, it
simply falls back to the current behavior (BeginQuorumEpoch acting like a heartbeat).
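To make the selection rule in the hunk above easier to reason about, here is a minimal, standalone sketch of the same filter. `VoterState` and `staleVoters` are hypothetical stand-ins, not the real `LeaderState` types; only the condition (skip the local node, select voters whose last fetch is at least the timeout old) is taken from the diff.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class BeginQuorumSketch {
    // Hypothetical, simplified stand-in for the per-voter state kept by the leader.
    static final class VoterState {
        final int id;
        final long lastFetchTimestamp;

        VoterState(int id, long lastFetchTimestamp) {
            this.id = id;
            this.lastFetchTimestamp = lastFetchTimestamp;
        }
    }

    // Returns the ids of remote voters whose last fetch is at least timeoutMs old.
    static Set<Integer> staleVoters(List<VoterState> voters, int localId,
                                    long currentTimeMs, long timeoutMs) {
        return voters.stream()
            .filter(v -> v.id != localId
                && currentTimeMs - v.lastFetchTimestamp >= timeoutMs)
            .map(v -> v.id)
            .collect(Collectors.toUnmodifiableSet());
    }

    public static void main(String[] args) {
        List<VoterState> voters = List.of(
            new VoterState(1, 0L),       // local node, always skipped
            new VoterState(2, 9_500L),   // fetched 500 ms ago, not selected
            new VoterState(3, 7_000L));  // fetched 3000 ms ago, selected
        System.out.println(staleVoters(voters, 1, 10_000L, 1_000L));
    }
}
```

If no voter satisfies the condition, the returned set is empty and nothing extra is selected by this rule.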
##########
raft/src/main/java/org/apache/kafka/raft/LeaderState.java:
##########
@@ -188,6 +188,29 @@ public void resetBeginQuorumEpochTimer(long currentTimeMs) {
beginQuorumEpochTimer.reset(beginQuorumEpochTimeoutMs);
}
+ /**
+ * Determines the set of replicas that should receive a {@code BeginQuorumEpoch} request
+ * based on the elapsed time since their last fetch.
+ * <p>
+ * For each remote voter (excluding the local node), if the time since the last
+ * fetch exceeds the configured {@code beginQuorumEpochTimeoutMs}, the replica
+ * is considered to need a new quorum epoch request.
+ *
+ * @param currentTimeMs the current system time in milliseconds
+ * @return an unmodifiable set of {@link ReplicaKey} objects representing replicas
+ * that need to receive a {@code BeginQuorumEpoch} request
+ */
+ public Set<ReplicaKey> needToSendBeginQuorumRequests(long currentTimeMs) {
+ return voterStates.values()
+ .stream()
+ .filter(
+ state -> state.replicaKey.id() != localVoterNode.voterKey().id() &&
+ currentTimeMs - state.lastFetchTimestamp >= beginQuorumEpochTimeoutMs
Review Comment:
Hmm, but there is an issue: the value of beginQuorumEpochTimeoutMs
can be set by users.
If the value is too small, I think the condition will not be met.
By the way, is there any reason we need to hardcode the `max fetch wait time`?
In my mind, we could make it configurable and require users to set a value larger than
`QUORUM_FETCH_TIMEOUT_MS_CONFIG`.
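
As a rough illustration of the "make it configurable and enforce a lower bound" idea, a validation step could look like the sketch below. All names here are hypothetical, the check is plain Java rather than Kafka's config validation machinery, and which setting would actually be validated against the fetch timeout is an assumption left open by the comment.

```java
final class FetchWaitConfigSketch {
    // Hypothetical check: accepts the configured value only if it is strictly
    // larger than the fetch timeout, otherwise fails fast at startup.
    static int validateLargerThanFetchTimeout(int configuredMs, int fetchTimeoutMs) {
        if (configuredMs <= fetchTimeoutMs) {
            throw new IllegalArgumentException(
                "configured value " + configuredMs
                    + " ms must be larger than the fetch timeout "
                    + fetchTimeoutMs + " ms");
        }
        return configuredMs;
    }

    public static void main(String[] args) {
        // Accepted: 3000 ms is larger than the 2000 ms fetch timeout.
        System.out.println(validateLargerThanFetchTimeout(3_000, 2_000));
        try {
            // Rejected: equal to the fetch timeout, so not strictly larger.
            validateLargerThanFetchTimeout(2_000, 2_000);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing at configuration time would surface a too-small value immediately instead of leaving the runtime condition silently unmet.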