[ 
https://issues.apache.org/jira/browse/RATIS-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17953734#comment-17953734
 ] 

Alexey Goncharuk commented on RATIS-2306:
-----------------------------------------

I see that Ratis internally manages a configuration snapshot. My initial idea 
was that whenever a new server is initialized, this snapshot should be written 
synchronously, and the server should start only once the configuration is 
persisted. However, on a new server start, the configuration change event is 
sent to the state machine as if the configuration change had been submitted 
through the Raft log. Should this initial configuration change event still be 
sent to the state machine after a restart, even after some entries have been 
committed to the log?
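
To make the "persist before start" idea concrete, here is a minimal sketch in 
plain Java. It is not Ratis code: the class `SyncConfigurationWriter`, its 
methods, and the temp-file naming are assumptions made up for illustration, and 
the real snapshot format is not modeled. It only shows the general pattern of 
writing to a temporary file, forcing it to disk, and renaming it into place 
before the caller proceeds.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for illustration; not part of Ratis. */
final class SyncConfigurationWriter {
  private SyncConfigurationWriter() {}

  /** Writes data to target and returns only after it has been forced to disk. */
  static void persist(Path target, byte[] data) {
    // Write to a sibling temp file first so a crash never leaves a partial target.
    Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
    try {
      try (FileChannel ch = FileChannel.open(tmp,
          StandardOpenOption.CREATE,
          StandardOpenOption.WRITE,
          StandardOpenOption.TRUNCATE_EXISTING)) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        while (buf.hasRemaining()) {
          ch.write(buf);
        }
        // Block until content and metadata reach the storage device.
        ch.force(true);
      }
      // Rename into place; whether an existing target is replaced is
      // platform-specific for ATOMIC_MOVE.
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads back what persist() wrote. */
  static byte[] load(Path source) {
    try {
      return Files.readAllBytes(source);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With something along these lines, a server would call `persist(...)` during 
initialization and continue its startup only after the call returns.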

> Initial raft group is not recovered if server is stopped immediately after 
> start
> --------------------------------------------------------------------------------
>
>                 Key: RATIS-2306
>                 URL: https://issues.apache.org/jira/browse/RATIS-2306
>             Project: Ratis
>          Issue Type: Bug
>    Affects Versions: 3.1.3
>            Reporter: Alexey Goncharuk
>            Priority: Major
>
> Following up on a discussion on the user list 
> (https://lists.apache.org/thread/znjzgzvt488w0cf68jcc12nylydfz5vf).
> As discussed on the list, the `RECOVER` startup option should restore the 
> initial Raft group passed in the builder. However, it seems that the 
> persistence of the initial group is asynchronous, and if the server is stopped 
> quickly enough, the bootstrapped group is never recovered. Below is a test 
> that can be run in the Ratis project:
> {code:java}
>   @Test
>   public void testGroupRecoveryOnRestart()
>       throws IOException
>   {
>     File tempDir = Files.createTempDirectory(getClass().getSimpleName()).toFile();
>     RaftPeerId singlePeerId = RaftPeerId.valueOf("s0");
>     RaftGroupId groupId = RaftGroupId.valueOf(UUID.randomUUID());
>     RaftProperties properties = new RaftProperties();
>     RaftConfigKeys.Rpc.setType(properties, RpcType.valueOf("netty"));
>     RaftServerConfigKeys.setStorageDir(properties, Collections.singletonList(tempDir));
>     {
>       RaftPeer singlePeer = RaftPeer
>               .newBuilder()
>               .setId(singlePeerId)
>               .setAddress(NetUtils.localhostWithFreePort())
>               .setAdminAddress(NetUtils.localhostWithFreePort())
>               .setClientAddress(NetUtils.localhostWithFreePort())
>               .setDataStreamAddress(NetUtils.localhostWithFreePort())
>               .build();
>       try (RaftServer server = RaftServer.newBuilder()
>               .setServerId(singlePeerId)
>               .setGroup(RaftGroup.valueOf(groupId, singlePeer))
>               .setOption(RaftStorage.StartupOption.FORMAT)
>               .setStateMachine(new SimpleStateMachine4Testing())
>               .setProperties(properties)
>               .build()) {
>         server.start();
>         System.out.println("Started with group: " + server.getDivision(groupId).getInfo());
>       }
>     }
>     {
>       // Restart with RECOVER option
>       try (RaftServer server = RaftServer.newBuilder()
>               .setServerId(singlePeerId)
>               .setGroup(RaftGroup.valueOf(groupId))
>               .setOption(RaftStorage.StartupOption.RECOVER)
>               .setStateMachine(new SimpleStateMachine4Testing())
>               .setProperties(properties)
>               .build()) {
>         server.start();
>         RaftGroup group = Iterables.getOnlyElement(server.getGroups());
>         Assertions.assertEquals(groupId, group.getGroupId());
>         Assertions.assertEquals(1, group.getPeers().size());
>         Assertions.assertEquals(singlePeerId, Iterables.getOnlyElement(group.getPeers()).getId());
>       }
>     }
>   }
> {code}
> Adding a small sleep after the first server start, or waiting for a state 
> machine configuration change event, makes the test pass.
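
The "wait for the configuration change event" workaround mentioned in the 
description could be sketched roughly as follows. This is an illustration 
under assumptions, not the reporter's code: `ConfigurationChangeLatch` and 
`onConfigurationChanged` are hypothetical names, and the wiring into a state 
machine's configuration change callback is left out.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical test helper for illustration; not part of Ratis. */
final class ConfigurationChangeLatch {
  private final CountDownLatch latch = new CountDownLatch(1);

  /** To be invoked from the state machine when it observes a configuration change. */
  void onConfigurationChanged() {
    latch.countDown();
  }

  /** Returns true if the event was observed before the timeout elapsed. */
  boolean await(long timeout, TimeUnit unit) {
    try {
      return latch.await(timeout, unit);
    } catch (InterruptedException e) {
      // Restore the interrupt flag and report that the event was not observed.
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

In the test above, the first server block would call `await(...)` after 
`server.start()` and before the try-with-resources block closes the server.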



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
