Hello,

We have been working on a setup with multiple RaftGroups and we are using
Ratis. I had a few questions on the best ways to initialize a RaftGroup
especially in the deployed environment I am in. Please correct me if I am
wrong in any of this as I am new to Ratis and Raft! From my understanding,
we can start a RaftServer with a single RaftGroup (with all the peers
known). We have a solution like this deployed, and it works! If we want to
change the peers in a group, we are supposed to use the admin client to send
a SetConfiguration request that adds or removes peers.

In our setup, machines are added on a rolling basis, which makes it
difficult to have all the peers available when the RaftGroup starts. One
thing we tried was to start every peer with the empty RaftGroup and then,
once all the peers were up, create a new RaftGroup containing all the peers
through the GroupManagementApi (finding the peers with a DNS lookup).
However, every time I tried to do an operation on this RaftGroup I
received the following error:

group-C0E08770FC33 is not in [RUNNING]: current state is STARTING


which confused me, because each server said it was in state RUNNING. I
don't know if that is enough detail to tell what is wrong with this
approach, but let me know and I can supply more. The exact setup was: all
peers start a RaftServer with the empty group, then each peer adds the new
group, containing all the peers, through the GroupManagementApi.

I then tried starting one peer in the RaftGroup and sending
SetConfiguration requests to add the other machines to that RaftGroup one
by one. I got errors mentioning:
[peer1,peer2]|listeners:[] due to NOPROGRESS

I didn't debug this approach further, but the steps were: one peer was
started in the RaftGroup with only itself as a peer, and the other two were
in the empty group. I then sent one SetConfiguration request per new peer,
but I could not get past the first one.
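
To make the one-by-one sequence above concrete, here is a minimal plain-Java
sketch of the membership list each SetConfiguration request would carry,
together with the standard Raft majority size for that list. It does not
call Ratis; the class IncrementalMembership and its methods are hypothetical
names used only for illustration, and it is not a diagnosis of the
NOPROGRESS error.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Ratis: models the peer lists that
// successive SetConfiguration requests would carry.
final class IncrementalMembership {
    // Returns the full peer list after each step, starting from the first peer alone.
    static List<List<String>> addOneByOne(List<String> allPeers) {
        List<List<String>> steps = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String peer : allPeers) {
            current.add(peer);
            steps.add(new ArrayList<>(current)); // copy, so later steps do not mutate it
        }
        return steps;
    }

    // Number of acknowledgements a Raft majority needs for a group of this size.
    static int majority(int groupSize) {
        return groupSize / 2 + 1;
    }

    public static void main(String[] args) {
        List<String> peers = Arrays.asList("peer1", "peer2", "peer3");
        for (List<String> conf : addOneByOne(peers)) {
            System.out.println(conf + " needs " + majority(conf.size()) + " acks");
        }
    }
}
```

The second list, [peer1, peer2], is the step that failed for me. A
two-member group needs both members to acknowledge, so the newly added peer
has to be reachable and participating for that change to complete.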

So right now, we wait for all the peers to deploy, and then send a command
to each peer to start a RaftServer with the RaftGroup and all the peers.
This was successful and matches the docs on how to do it.

My main question: is there a recommended way for us to start RaftGroups?
For example, our deployment system takes down a machine and brings up a
new one, one machine at a time. Should each of those replacements be a
SetConfiguration request on the RaftGroup? We will have multiple RaftGroups
set up here, so understanding how to make these changes will be beneficial
to us. Should we also always start a RaftGroup with 3 peers?
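
For concreteness, the rolling replacement I am asking about could be
modeled as a sequence of membership lists, one per machine swap, where each
list would be the argument of one SetConfiguration request. This is only a
plain-Java sketch of the question; RollingReplacement and plan are
hypothetical names, the peer names are made up, and nothing here calls
Ratis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Ratis: lists the group membership after
// each single-machine swap in a rolling deployment.
final class RollingReplacement {
    // oldPeers.get(i) is replaced by newPeers.get(i), one index at a time.
    static List<List<String>> plan(List<String> oldPeers, List<String> newPeers) {
        if (oldPeers.size() != newPeers.size()) {
            throw new IllegalArgumentException("peer lists must have the same size");
        }
        List<String> current = new ArrayList<>(oldPeers);
        List<List<String>> steps = new ArrayList<>();
        for (int i = 0; i < newPeers.size(); i++) {
            current.set(i, newPeers.get(i));     // swap exactly one member
            steps.add(new ArrayList<>(current)); // membership after this swap
        }
        return steps;
    }

    public static void main(String[] args) {
        List<List<String>> steps = plan(
            Arrays.asList("old1", "old2", "old3"),
            Arrays.asList("new1", "new2", "new3"));
        for (List<String> conf : steps) {
            System.out.println(conf);
        }
    }
}
```

Each step changes exactly one member, so two of the three peers from the
previous membership are still present at every step.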

Sorry for the long email, but I hope someone can assist us!

Thanks,

Sahith
