[
https://issues.apache.org/jira/browse/RATIS-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tsz-wo Sze resolved RATIS-1577.
-------------------------------
Fix Version/s: 2.3.0
Assignee: Attila Doroszlai
Resolution: Fixed
The pull request is now merged. Thanks, [~adoroszlai]!
> Install snapshot failure
> -------------------------
>
> Key: RATIS-1577
> URL: https://issues.apache.org/jira/browse/RATIS-1577
> Project: Ratis
> Issue Type: Bug
> Components: server
> Affects Versions: 2.3.0
> Reporter: Attila Doroszlai
> Assignee: Attila Doroszlai
> Priority: Blocker
> Fix For: 2.3.0
>
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> Found this while testing Ozone HA with Ratis built from current {{master}}:
> {code}
> scm2_1 | 2022-05-08 09:36:23,915 [grpc-default-executor-0] INFO
> impl.SnapshotInstallationHandler:
> dd69b21b-bf75-4092-bd68-19f526eb2628@group-9D638F7E0D95: receive
> installSnapshot:
> 517d4b01-482b-4891-b0de-80441d4eb637->dd69b21b-bf75-4092-bd68-19f526eb2628#0-t2,notify:(t:1,
> i:0)
> scm2_1 | 2022-05-08 09:36:23,923 [grpc-default-executor-0] INFO
> server.RaftServer$Division:
> dd69b21b-bf75-4092-bd68-19f526eb2628@group-9D638F7E0D95: change Leader from
> null to 517d4b01-482b-4891-b0de-80441d4eb637 at term 2 for installSnapshot,
> leader elected after 2809ms
> scm2_1 | 2022-05-08 09:36:23,927 [grpc-default-executor-0] INFO
> impl.SnapshotInstallationHandler:
> dd69b21b-bf75-4092-bd68-19f526eb2628@group-9D638F7E0D95: Received
> notification to install snapshot at index 0
> scm2_1 | 2022-05-08 09:36:23,958 [grpc-default-executor-0] INFO
> impl.SnapshotInstallationHandler:
> dd69b21b-bf75-4092-bd68-19f526eb2628@group-9D638F7E0D95:
> notifyInstallSnapshot: nextIndex is 0 but the leader's first available index
> is 0.
> scm2_1 | 2022-05-08 09:36:23,960 [grpc-default-executor-0] INFO
> ha.SCMStateMachine: Received install snapshot notification from SCM leader:
> scm1:9894 with term index: (t:1, i:0)
> scm2_1 | 2022-05-08 09:36:23,961 [pool-16-thread-1] INFO
> ha.SCMHAManagerImpl: Downloading checkpoint from leader SCM scm1 and
> reloading state from the checkpoint.
> scm2_1 | 2022-05-08 09:36:23,965 [grpc-default-executor-0] ERROR
> impl.SnapshotInstallationHandler:
> dd69b21b-bf75-4092-bd68-19f526eb2628@group-9D638F7E0D95: installSnapshot
> failed
> scm2_1 | java.lang.IllegalStateException:
> inProgressInstallSnapshotRequest: 0 is not eligible, firstAvailableLogIndex: 0
> scm2_1 | at
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
> scm2_1 | at
> org.apache.ratis.server.impl.SnapshotInstallationHandler.notifyStateMachineToInstallSnapshot(SnapshotInstallationHandler.java:287)
> scm2_1 | at
> org.apache.ratis.server.impl.SnapshotInstallationHandler.installSnapshotImpl(SnapshotInstallationHandler.java:115)
> scm2_1 | at
> org.apache.ratis.server.impl.SnapshotInstallationHandler.installSnapshot(SnapshotInstallationHandler.java:84)
> scm2_1 | at
> org.apache.ratis.server.impl.RaftServerImpl.installSnapshot(RaftServerImpl.java:1427)
> {code}
> The log above is from docker-compose cluster, but it is easily reproducible
> using MiniCluster-based integration test, too:
> {code}
> mvn clean test -DfailIfNoTests=false -Dtest=TestStorageContainerManagerHA
> {code}
--
This message was sent by Atlassian Jira
(v8.20.7#820007)