Shilun Fan created RATIS-2291:
---------------------------------
Summary: Fix failing
TestInstallSnapshotNotificationWithGrpc#testAddNewFollowersNoSnapshot
Key: RATIS-2291
URL: https://issues.apache.org/jira/browse/RATIS-2291
Project: Ratis
Issue Type: Bug
Components: test
Reporter: Shilun Fan
Assignee: Shilun Fan
During the investigation of RATIS-2251, we encountered a persistent unit test
failure in
TestInstallSnapshotNotificationWithGrpc#testAddNewFollowersNoSnapshot.
Initially, we suspected this was caused by the JUnit version upgrade, but
further analysis confirms that the test also fails under JUnit 4. Detailed
discussions and debugging steps can be found in the comments of PR #1227.
While debugging the unit test
{{TestInstallSnapshotNotificationWithGrpc#testAddNewFollowersNoSnapshot}},
I found that the error persists even after applying the fix from RATIS-2045.
RATIS-2045 fixed the issue where SnapshotInstallationHandler didn't notify
followers to install snapshots when the snapshot index was -1 and the leader's
firstAvailableLogIndex was 0 (PR
[#1053|https://github.com/apache/ratis/pull/1053]).
That PR changes whether followers pull snapshots from the leader.
From the logs, we can observe that the newly added followers {{s1}} and {{s2}}
have both synchronized snapshots from the leader {{s0}}. As a result, the
snapshot index for followers {{s1}} and {{s2}} becomes {{16}} (16 because we
manually created messages twice), instead of {{-1}}. Therefore, the test's
current check condition, which expects {{-1}}, no longer holds.
{code:java}
follower s1:
2025-05-10 17:23:10,134 [grpc-default-executor-2] INFO
impl.SnapshotInstallationHandler
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(262)) -
s1@group-F83BA0BDB609: Received notification to install snapshot at index 0
2025-05-10 17:23:10,137 [grpc-default-executor-2] INFO
impl.SnapshotInstallationHandler
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(297)) -
s1@group-F83BA0BDB609: notifyInstallSnapshot: nextIndex is 0 but the leader's
first available index is 0.
......
2025-05-10 17:23:11,151 [grpc-default-executor-0] INFO
impl.SnapshotInstallationHandler
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(365)) -
s1@group-F83BA0BDB609: InstallSnapshot notification result: SNAPSHOT_INSTALLED,
at index: (t:1, i:16)
follower s2:
2025-05-10 17:23:11,214 [grpc-default-executor-2] INFO
impl.SnapshotInstallationHandler
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(262)) -
s2@group-F83BA0BDB609: Received notification to install snapshot at index 0
2025-05-10 17:23:11,214 [grpc-default-executor-2] INFO
impl.SnapshotInstallationHandler
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(297)) -
s2@group-F83BA0BDB609: notifyInstallSnapshot: nextIndex is 0 but the leader's
first available index is 0.
......
2025-05-10 17:23:12,217 [grpc-default-executor-0] INFO
impl.SnapshotInstallationHandler
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(365)) -
s2@group-F83BA0BDB609: InstallSnapshot notification result: SNAPSHOT_INSTALLED,
at index: (t:1, i:16) {code}
Logs before applying RATIS-2045:
{code:java}
2025-05-10 17:42:54,878 [grpc-default-executor-0] INFO
impl.SnapshotInstallationHandler
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(221)) -
s1@group-46FD094EFC86: Received notification to install snapshot at index 0
2025-05-10 17:42:54,878 [grpc-default-executor-0] INFO
impl.SnapshotInstallationHandler
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(230)) -
s1@group-46FD094EFC86: InstallSnapshot notification result: ALREADY_INSTALLED,
current snapshot index: -1
.....
2025-05-10 17:42:54,880 [grpc-default-executor-2] INFO
impl.SnapshotInstallationHandler
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(221)) -
s2@group-46FD094EFC86: Received notification to install snapshot at index 0
2025-05-10 17:42:54,880 [grpc-default-executor-2] INFO
impl.SnapshotInstallationHandler
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(230)) -
s2@group-46FD094EFC86: InstallSnapshot notification result: ALREADY_INSTALLED,
current snapshot index: -1 {code}
So, if we agree that the behavior introduced by #1053 is reasonable, we should
update the test's check condition: the expected follower snapshot index should
match the leader's value rather than {{-1}}.
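A minimal sketch of the adjusted check, assuming the test can obtain the snapshot index of the leader and of each follower. The class and method names ({{SnapshotIndexCheck}}, {{followersMatchLeader}}) are hypothetical and not part of Ratis; the follower values are taken from the logs above, and the leader's snapshot index is assumed to be 16 as well.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; this is not Ratis code.
final class SnapshotIndexCheck {

  /** Returns true if every follower reports the expected snapshot index. */
  public static boolean followersMatchLeader(long expectedIndex,
      Map<String, Long> followerSnapshotIndices) {
    for (Map.Entry<String, Long> e : followerSnapshotIndices.entrySet()) {
      if (e.getValue().longValue() != expectedIndex) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Follower snapshot indices as reported in the logs with RATIS-2045 applied.
    Map<String, Long> followers = new LinkedHashMap<>();
    followers.put("s1", 16L);
    followers.put("s2", 16L);

    // Assumption: the leader s0 holds its snapshot at index 16.
    long leaderSnapshotIndex = 16L;

    // Old expectation: followers stay at -1 (no snapshot installed).
    System.out.println("expect -1:     " + followersMatchLeader(-1L, followers));
    // Adjusted expectation: followers match the leader's snapshot index.
    System.out.println("expect leader: " + followersMatchLeader(leaderSnapshotIndex, followers));
  }
}
```

With the values above, the old expectation of {{-1}} is not met, while the comparison against the leader's index is.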