Hello,

We have been working on a setup with multiple RaftGroups using Ratis, and I have a few questions about the best way to initialize a RaftGroup, especially in the environment we deploy to. I am new to Ratis and Raft, so please correct me if any of this is wrong!

From my understanding, we can start a RaftServer with a single RaftGroup when all the peers are known up front. We have this deployed and it works. If we want to change the peers in a group, we are supposed to use the admin client to send a setConfiguration request that adds or removes peers.

In our setup, machines are added on a rolling basis, which makes it difficult to have all the peers available when the RaftGroup is started.

The first thing we tried was to start every peer's RaftServer with the empty RaftGroup. Once all the peers were up, each peer created the new RaftGroup through the GroupManagementApi, passing the full peer list (obtained by a DNS lookup for all peers). However, every time I tried an operation on this RaftGroup I received the following error:

    group-C0E08770FC33 is not in [RUNNING]: current state is STARTING

This confused me because each server reported that it was running. I don't know if that is enough detail to tell what is wrong with this approach, but let me know and I can supply more.

The second thing I tried was to start one peer in the RaftGroup with only itself as a member, while the other two peers stayed in the empty group. I then sent SetConfiguration requests to add the remaining machines one at a time, but I could not get past the first one. The errors I was getting mentioned:

    [peer1,peer2]|listeners:[] due to NOPROGRESS

I did not debug this approach further.

So right now we wait for all the peers to be deployed, and then send a command to each peer to start a RaftServer with the RaftGroup and all the peers. This was successful and matches the docs on how to do it.

My main question: is there a recommended way for us to start RaftGroups? For example, our deployment system takes down a machine and brings up a new one, one machine at a time. Should each of those replacements be a setConfiguration request on the RaftGroup? We will have multiple RaftGroups in this setup, so understanding how to make these changes would be very helpful.

Should we always start a RaftGroup with 3 peers as well?

Sorry for the long email, but I hope someone can assist us!

Thanks,
Sahith
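
P.S. To make the one-by-one approach concrete, here is a rough sketch of the sequence of peer lists I was attempting. It is plain Java and makes no Ratis calls; the class and method names are made up for illustration, and "peer3" is a placeholder I added (only peer1 and peer2 appear in the error above). Each entry after the first is meant to stand for the full peer list sent in one SetConfiguration request.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: this models the order of membership changes,
// it does not call any Ratis API.
final class MembershipPlan {

    // Entry 0 is the initial single-peer group. Every later entry is the
    // complete peer list of one SetConfiguration request, growing the
    // group by exactly one peer per step.
    static List<List<String>> steps(List<String> allPeers) {
        List<List<String>> plan = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String peer : allPeers) {
            current.add(peer);
            plan.add(new ArrayList<>(current)); // snapshot of this step's membership
        }
        return plan;
    }

    public static void main(String[] args) {
        // Placeholder peer names; real peers would also carry addresses.
        for (List<String> conf : steps(List.of("peer1", "peer2", "peer3"))) {
            System.out.println(conf);
        }
    }
}
```

In this sketch the step I could not get past is the second entry, the change from a group containing only peer1 to one containing peer1 and peer2.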
