slfan1989 opened a new pull request, #1257:
URL: https://github.com/apache/ratis/pull/1257

   ## What changes were proposed in this pull request?
   
   During the investigation of 
[RATIS-2251](https://issues.apache.org/jira/browse/RATIS-2251), we encountered 
a persistent unit test failure in 
TestInstallSnapshotNotificationWithGrpc#testAddNewFollowersNoSnapshot. 
Initially, we suspected this was caused by the JUnit version upgrade, but 
further analysis confirms that the test also fails under JUnit 4. Detailed 
discussions and debugging steps can be found in the comments of PR #1227.
   
   When addressing the issue with the unit test : 
TestInstallSnapshotNotificationWithGrpc#testAddNewFollowersNoSnapshot
   
   I found that the error persists even after applying the fix from 
[RATIS-2045](https://issues.apache.org/jira/browse/RATIS-2045).
   
   [RATIS-2045](https://issues.apache.org/jira/browse/RATIS-2045) fixed the 
issue where SnapshotInstallationHandler didn't notify followers to install 
snapshots when the snapshot index was -1 and the leader's 
firstAvailableLogIndex was 0 (PR 
[#1053](https://github.com/apache/ratis/pull/1053)).
   
   This PR changes the behavior of whether followers pull snapshots from the 
leader.
   
   From the logs, we can observe that the newly added followers `s1` and `s2` 
have both synchronized snapshots from the leader `s0`. As a result, the 
snapshot index for followers `s1` and `s2` becomes 16 (16 because we manually 
created messages twice), instead of -1. Therefore, the current check condition 
is problematic.
   
   ```
   follower s1:
   2025-05-10 17:23:10,134 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(262)) - 
s1@group-F83BA0BDB609: Received notification to install snapshot at index 0
   2025-05-10 17:23:10,137 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(297)) - 
s1@group-F83BA0BDB609: notifyInstallSnapshot: nextIndex is 0 but the leader's 
first available index is 0.
   ......
   2025-05-10 17:23:11,151 [grpc-default-executor-0] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(365)) - 
s1@group-F83BA0BDB609: InstallSnapshot notification result: SNAPSHOT_INSTALLED, 
at index: (t:1, i:16)
   
   follower s2:
   2025-05-10 17:23:11,214 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(262)) - 
s2@group-F83BA0BDB609: Received notification to install snapshot at index 0
   2025-05-10 17:23:11,214 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(297)) - 
s2@group-F83BA0BDB609: notifyInstallSnapshot: nextIndex is 0 but the leader's 
first available index is 0.
   ......
   2025-05-10 17:23:12,217 [grpc-default-executor-0] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(365)) - 
s2@group-F83BA0BDB609: InstallSnapshot notification result: SNAPSHOT_INSTALLED, 
at index: (t:1, i:16) 
   ```
   
   Logs before applying 
[RATIS-2045](https://issues.apache.org/jira/browse/RATIS-2045):
   
   ```
   2025-05-10 17:42:54,878 [grpc-default-executor-0] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(221)) - 
s1@group-46FD094EFC86: Received notification to install snapshot at index 0
   2025-05-10 17:42:54,878 [grpc-default-executor-0] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(230)) - 
s1@group-46FD094EFC86: InstallSnapshot notification result: ALREADY_INSTALLED, 
current snapshot index: -1
   .....
   2025-05-10 17:42:54,880 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(221)) - 
s2@group-46FD094EFC86: Received notification to install snapshot at index 0
   2025-05-10 17:42:54,880 [grpc-default-executor-2] INFO  
impl.SnapshotInstallationHandler 
(SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(230)) - 
s2@group-46FD094EFC86: InstallSnapshot notification result: ALREADY_INSTALLED, 
current snapshot index: -1 
   ```
   
   So, if we believe that #1053 is reasonable, we should modify the check 
condition by adjusting the expected value to match the leader's value.
   
   
   ## What is the link to the Apache JIRA
   
   JIRA: RATIS-2291. Fix failing 
TestInstallSnapshotNotificationWithGrpc#testAddNewFollowersNoSnapshot.
   
   ## How was this patch tested?
   
   CI: https://github.com/slfan1989/ratis/actions/runs/14949784656


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to