Shilun Fan created RATIS-2291:
---------------------------------

             Summary: Fix failing 
TestInstallSnapshotNotificationWithGrpc#testAddNewFollowersNoSnapshot
                 Key: RATIS-2291
                 URL: https://issues.apache.org/jira/browse/RATIS-2291
             Project: Ratis
          Issue Type: Bug
          Components: test
            Reporter: Shilun Fan
            Assignee: Shilun Fan


During the investigation of RATIS-2251, we encountered a persistent unit test 
failure in 
TestInstallSnapshotNotificationWithGrpc#testAddNewFollowersNoSnapshot. 
Initially, we suspected this was caused by the JUnit version upgrade, but 
further analysis confirms that the test also fails under JUnit 4. Detailed 
discussions and debugging steps can be found in the comments of PR #1227.
 
When addressing the issue with the unit test : 
{{TestInstallSnapshotNotificationWithGrpc#testAddNewFollowersNoSnapshot}}

I found that the error persists even after applying the fix from RATIS-2045.

RATIS-2045 fixed the issue where SnapshotInstallationHandler didn't notify 
followers to install snapshots when the snapshot index was -1 and the leader's 
firstAvailableLogIndex was 0 (PR 
[#1053|https://github.com/apache/ratis/pull/1053]).

This PR changes the behavior of whether followers pull snapshots from the 
leader.

>From the logs, we can observe that the newly added followers {{s1}} and {{s2}} 
>have both synchronized snapshots from the leader {{{}s0{}}}. As a result, the 
>snapshot index for followers {{s1}} and {{s2}} becomes {{16}} (16 because we 
>manually created messages twice), instead of {{{}-1{}}}. Therefore, the 
>current check condition is problematic.
 
{code:java}
follower s1:
2025-05-10 17:23:10,134 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(262)) - 
s1@group-F83BA0BDB609: Received notification to install snapshot at index 0
2025-05-10 17:23:10,137 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(297)) - 
s1@group-F83BA0BDB609: notifyInstallSnapshot: nextIndex is 0 but the leader's 
first available index is 0.
......
2025-05-10 17:23:11,151 [grpc-default-executor-0] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(365)) - 
s1@group-F83BA0BDB609: InstallSnapshot notification result: SNAPSHOT_INSTALLED, 
at index: (t:1, i:16)

follower s2:
2025-05-10 17:23:11,214 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(262)) - 
s2@group-F83BA0BDB609: Received notification to install snapshot at index 0
2025-05-10 17:23:11,214 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(297)) - 
s2@group-F83BA0BDB609: notifyInstallSnapshot: nextIndex is 0 but the leader's 
first available index is 0.
......
2025-05-10 17:23:12,217 [grpc-default-executor-0] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(365)) - 
s2@group-F83BA0BDB609: InstallSnapshot notification result: SNAPSHOT_INSTALLED, 
at index: (t:1, i:16) {code}
Logs before applying RATIS-2045:
{code:java}
2025-05-10 17:42:54,878 [grpc-default-executor-0] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(221)) - 
s1@group-46FD094EFC86: Received notification to install snapshot at index 0
2025-05-10 17:42:54,878 [grpc-default-executor-0] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(230)) - 
s1@group-46FD094EFC86: InstallSnapshot notification result: ALREADY_INSTALLED, 
current snapshot index: -1
.....
2025-05-10 17:42:54,880 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(221)) - 
s2@group-46FD094EFC86: Received notification to install snapshot at index 0
2025-05-10 17:42:54,880 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(230)) - 
s2@group-46FD094EFC86: InstallSnapshot notification result: ALREADY_INSTALLED, 
current snapshot index: -1 {code}
So, if we believe that #1053 is reasonable, we should modify the check 
condition by adjusting the expected value to match the leader's value.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to