Filed https://issues.apache.org/jira/browse/RATIS-1612 .

Tsz-Wo

On Mon, Jul 4, 2022 at 11:45 AM Tsz Wo Sze wrote:

> Hi Riguz,
>
> Oops, you are right that there is no way to start a server as a Listener
> since the RaftServer.Builder takes a group as a parameter and treats all the
> peers in the group as voting members. This is a missing feature. We
> should fix it. Thanks a lot for checking!
>
> Tsz-Wo
>
>
> On Sun, Jul 3, 2022 at 7:42 PM Riguz Lee wrote:
>
>> Hi Tsz Wo,
>>
>> Thanks for your explanation, it has now become clear that we may use the
>> following steps to scale up:
>>
>> 1. Change the cluster (N1-N5) configuration and add listeners (N6-N11) by
>> admin API. Since the configuration itself is also a raft log entry, this
>> operation only needs to be executed once on any of N1-N5.
>>
>> 2. Start new nodes as listeners
>>
>> 3. Wait until the new nodes catch up with the original cluster. We can check
>> the commit info by getGroupManagementApi.getCommitInfos to make sure all
>> nodes have the same committed log index.
>>
>> 4. Change the configuration to switch N6-N11 to peers. Again, this only
>> needs to be executed on any of N1-N5.
>>
>> After that, the cluster should be able to elect a new leader. The step to
>> create new nodes as listeners is optional, so a simplified flow would be:
>>
>> 1. Do nothing with the original cluster N1-N5
>>
>> 2. Start new nodes (N6-N11) using the new configuration
>>
>> 3. Change configuration of previous nodes to add new peers
>>
>> I've tried this approach and it seems to work. But I could not find a way
>> to start a new node as a listener. Since listener support was introduced
>> in 2.3.0, is it still a work in progress?
>>
>> Thanks,
>>
>> Riguz
>>
>>
>> Original Email
>>
>> Sender: "Tsz Wo Sze"
>> Sent Time: 2022/7/2 1:31
>> To: "user"
>> Subject: Re: How to correctly scale the raft cluster?
>>
>> > * It's possible to update the old configuration first by using
>> > client.admin().setConfiguration(), let's say set N=11 first, then start new
>> > nodes. However, since 5 < 11/2, the cluster won't be able to elect a leader
>> > until at least 1 new node joins.
>>
>> Yes, you are right. Also, even if one node has joined, the group has to
>> wait for it to catch up with the previous log entries in order to obtain a
>> majority for committing new entries.
>>
>> > * Or maybe we should limit the count when scaling? From N=5 -> N=7 ->
>> > N=9 -> N=11.
>>
>> We may start the 6 new nodes as listeners first. Listeners receive log
>> entries but they are not voting members and they won't be counted for
>> majority. When the listeners catch up, we may change them to normal nodes
>> so that they become voting members.
>>
>> Tsz-Wo
>>
>> On Fri, Jul 1, 2022 at 10:23 AM Tsz Wo Sze wrote:
>>
>>> Hi Riguz,
>>>
>>> > Start 6 new nodes with new configuration N=11, while keeping the
>>> > previous nodes running
>>>
>>> This step probably won't work as expected since it will create a new
>>> group rather than adding nodes to the original group. We must use the
>>> setConfiguration API to change the configuration (add/remove nodes); see
>>> https://github.com/apache/ratis/blob/bd83e7d7fd41540c8bda6bd92a52ac99ccec2076/ratis-client/src/main/java/org/apache/ratis/client/api/AdminApi.java#L35
>>>
>>> Hope it helps. Thanks a lot for trying Ratis!
>>>
>>> Tsz-Wo
>>>
>>> On Fri, Jul 1, 2022 at 12:30 AM Riguz Lee wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm testing scaling up/down the raft cluster, but ratis is not working
>>>> as expected in the new cluster. My steps are:
>>>>
>>>> * Initialize a cluster with 5 nodes, the size and peers of the cluster
>>>> are configured in a configuration file, let's say N=5. The cluster works
>>>> perfectly, raft logs are synchronized across the cluster.
>>>>
>>>> * Start 6 new nodes with new configuration N=11, while keeping the
>>>> previous nodes running
>>>>
>>>> * Recreate the previous nodes with N=11 one by one
>>>>
>>>> According to the raft paper, raft should be able to handle configuration
>>>> changes by design, but after the above steps, what I've found is that:
>>>>
>>>> - New nodes are not able to join the cluster
>>>>
>>>> - Old nodes still have a size of 5 (by
>>>> *client.getGroupManagementApi(peerId).info(groupId)*)
>>>>
>>>> So how should I scale the cluster correctly? A few thoughts of mine:
>>>>
>>>> * Definitely the old cluster should not be stopped while starting new
>>>> nodes, otherwise new nodes might be able to elect a new leader (e.g. N=11 with
>>>> 6 new nodes) and raft logs in old nodes will be overridden.
>>>>
>>>> * It's possible to update the old configuration first by using
>>>> client.admin().setConfiguration(), let's say set N=11 first, then start new
>>>> nodes. However, since 5 < 11/2, the cluster won't be able to elect a leader
>>>> until at least 1 new node joins.
>>>>
>>>> * Or maybe we should limit the count when scaling? From N=5 -> N=7 ->
>>>> N=9 -> N=11.
>>>>
>>>> Thanks,
>>>>
>>>> Riguz Lee
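
The majority arithmetic discussed in this thread can be sketched in plain Java. This is only an illustration of the counting, not Ratis code: the names QuorumSketch, majority and hasQuorum are made up for this example, and it assumes a simple majority over voting members with listeners excluded from the count, as described above. It says nothing about how Ratis actually stages or catches up new peers.

```java
// Hypothetical helper for illustration only; not part of the Ratis API.
final class QuorumSketch {
    private QuorumSketch() {}

    /** Smallest number of voting members that forms a majority. */
    public static int majority(int votingMembers) {
        return votingMembers / 2 + 1;
    }

    /** True if the reachable voting members alone form a majority. */
    public static boolean hasQuorum(int votingMembers, int reachableVoters) {
        return reachableVoters >= majority(votingMembers);
    }

    public static void main(String[] args) {
        // Original cluster of 5 voting members.
        System.out.println("N=5 majority: " + majority(5));

        // Configuration set to N=11 before any new node has joined:
        // only the 5 old nodes are reachable.
        System.out.println("N=11, 5 reachable: " + hasQuorum(11, 5));

        // One new node has joined and caught up.
        System.out.println("N=11, 6 reachable: " + hasQuorum(11, 6));

        // 6 new nodes added as listeners: the voting set stays at 5.
        System.out.println("5 voters + 6 listeners, 3 voters reachable: "
                + hasQuorum(5, 3));

        // Stepwise growth 5 -> 7 -> 9 -> 11: do the existing members
        // alone still form a majority of each new configuration?
        int[] sizes = {5, 7, 9, 11};
        for (int i = 1; i < sizes.length; i++) {
            System.out.println(sizes[i - 1] + " -> " + sizes[i] + ": "
                    + hasQuorum(sizes[i], sizes[i - 1]));
        }
    }
}
```

Under these assumptions the sketch shows why jumping straight from 5 to 11 voting members stalls until a new node joins, while adding the new nodes as listeners, or growing two nodes at a time, leaves the existing members with a majority throughout.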
