slfan1989 commented on PR #1227: URL: https://github.com/apache/ratis/pull/1227#issuecomment-2868728386
@szetszwo @adoroszlai When addressing the issue with the unit test : `TestInstallSnapshotNotificationWithGrpc#testAddNewFollowersNoSnapshot` I found that the error persists even after applying the fix from RATIS-2045. RATIS-2045 fixed the issue where SnapshotInstallationHandler didn't notify followers to install snapshots when the snapshot index was -1 and the leader's firstAvailableLogIndex was 0 (PR #1053). This PR changes the behavior of whether followers pull snapshots from the leader. From the logs, we can observe that the newly added followers `s1` and `s2` have both synchronized snapshots from the leader `s0`. As a result, the snapshot index for followers `s1` and `s2` becomes `16` (16 because we manually created messages twice), instead of `-1`. Therefore, the current check condition is problematic. ``` follower s1: 2025-05-10 17:23:10,134 [grpc-default-executor-2] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(262)) - s1@group-F83BA0BDB609: Received notification to install snapshot at index 0 2025-05-10 17:23:10,137 [grpc-default-executor-2] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(297)) - s1@group-F83BA0BDB609: notifyInstallSnapshot: nextIndex is 0 but the leader's first available index is 0. ...... 2025-05-10 17:23:11,151 [grpc-default-executor-0] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(365)) - s1@group-F83BA0BDB609: InstallSnapshot notification result: SNAPSHOT_INSTALLED, at index: (t:1, i:16) follower s2: 2025-05-10 17:23:11,214 [grpc-default-executor-2] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(262)) - s2@group-F83BA0BDB609: Received notification to install snapshot at index 0 2025-05-10 17:23:11,214 [grpc-default-executor-2] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(297)) - s2@group-F83BA0BDB609: notifyInstallSnapshot: nextIndex is 0 but the leader's first available index is 0. ...... 2025-05-10 17:23:12,217 [grpc-default-executor-0] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(365)) - s2@group-F83BA0BDB609: InstallSnapshot notification result: SNAPSHOT_INSTALLED, at index: (t:1, i:16) ``` Logs before applying RATIS-2045: ``` 2025-05-10 17:42:54,878 [grpc-default-executor-0] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(221)) - s1@group-46FD094EFC86: Received notification to install snapshot at index 0 2025-05-10 17:42:54,878 [grpc-default-executor-0] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(230)) - s1@group-46FD094EFC86: InstallSnapshot notification result: ALREADY_INSTALLED, current snapshot index: -1 ..... 2025-05-10 17:42:54,880 [grpc-default-executor-2] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(221)) - s2@group-46FD094EFC86: Received notification to install snapshot at index 0 2025-05-10 17:42:54,880 [grpc-default-executor-2] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:notifyStateMachineToInstallSnapshot(230)) - s2@group-46FD094EFC86: InstallSnapshot notification result: ALREADY_INSTALLED, current snapshot index: -1 ``` So, if we believe that #1053 is reasonable, we should modify the check condition by adjusting the expected value to match the leader's value.  Unit Test Result:  -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
