Thanks for your explanation. Taking a closer look, I can see that after the new server installs the snapshot, it will call the state machine's `reinitialize` method and recover the state.
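The catch-up flow discussed in this thread (leader snapshots periodically, purges old log entries, and a newly joined server installs the snapshot before replaying the remaining entries) can be sketched as a small simulation. All names below are illustrative, not Ratis APIs:

```python
# Illustrative simulation of catch-up: the leader snapshots and purges old
# entries; a new follower first installs the snapshot (reinitializing its
# state machine), then applies the remaining log entries.
# These classes are made up for this sketch; they are not Ratis classes.

class Leader:
    def __init__(self):
        self.log = []            # entries since the last snapshot
        self.snapshot = None     # (last_included_index, state)
        self.state = 0
        self.last_index = -1

    def append(self, delta):
        self.last_index += 1
        self.log.append((self.last_index, delta))
        self.state += delta

    def take_snapshot(self):
        # Snapshot the current state, then purge the covered log entries.
        self.snapshot = (self.last_index, self.state)
        self.log = []

class Follower:
    def __init__(self):
        self.state = 0
        self.applied_index = -1

    def install_snapshot(self, snapshot):
        # Analogous to reinitialize(): recover state from the snapshot.
        self.applied_index, self.state = snapshot

    def apply(self, entry):
        index, delta = entry
        self.state += delta
        self.applied_index = index

def catch_up(leader, follower):
    # Leader sends the snapshot first (if any), then the remaining entries.
    if leader.snapshot is not None:
        follower.install_snapshot(leader.snapshot)
    for entry in leader.log:
        follower.apply(entry)

leader = Leader()
for delta in (1, 2, 3):
    leader.append(delta)
leader.take_snapshot()           # entries 0..2 purged, state 6 kept
for delta in (4, 5):
    leader.append(delta)

follower = Follower()
catch_up(leader, follower)
print(follower.state)            # 15: snapshot state (6) plus entries 4 and 5
```

The point of the simulation is that the follower never replays the purged entries one by one; it jumps straight to the snapshot state and only applies what the leader still retains.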
Best,
tison.

On Mon, Jul 4, 2022 at 1:51 AM, Tsz Wo Sze <[email protected]> wrote:

> Hi tison,
>
> > When a new node joins an existing cluster in this case, how exactly does
> > it apply logs? Will it apply raft logs from the beginning, one by one?
>
> The leader takes a snapshot from time to time and then purges the old log
> entries. When a new server joins the group, the leader will first send a
> snapshot, if there is any, and then send the remaining log entries.
>
> Tsz-Wo
>
>
> On Sun, Jul 3, 2022 at 9:26 AM tison <[email protected]> wrote:
>
>> Hi Tsz-Wo,
>>
>> I have one question that may be a bit off topic:
>>
>> When a new node joins an existing cluster in this case, how exactly does
>> it apply logs? Will it apply raft logs from the beginning, one by one? In
>> this case TiKV sends a snapshot of the state machine from the leader to
>> the new node to speed up the catch-up process, and I'd like to know how
>> Ratis implements catch-up.
>>
>> Best,
>> tison.
>>
>>
>> On Sat, Jul 2, 2022 at 1:31 AM, Tsz Wo Sze <[email protected]> wrote:
>>
>>> > * It's possible to update the old configuration first by using
>>> > client.admin().setConfiguration(), let's say set N=11 first, then
>>> > start the new nodes. However, since 5 < 11/2, the cluster won't be
>>> > able to elect a leader until at least 1 new node joins.
>>>
>>> Yes, you are right. Also, even if one node has joined, the group has to
>>> wait for it to catch up with the previous log entries in order to obtain
>>> a majority for committing new entries.
>>>
>>> > * Or maybe we should limit the count when scaling? From N=5 -> N=7 ->
>>> > N=9 -> N=11.
>>>
>>> We may start the 6 new nodes as listeners first. Listeners receive log
>>> entries but they are not voting members and they won't be counted
>>> towards the majority. When the listeners catch up, we may change them to
>>> normal nodes so that they become voting members.
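Both of Tsz-Wo's points above come down to quorum arithmetic: only voting members count toward a majority, so listeners can catch up without affecting leader election. A minimal sketch of that arithmetic (illustrative, not Ratis code):

```python
# Quorum arithmetic behind the listener suggestion. A leader needs votes
# from a strict majority of voting members; listeners are not counted.

def majority(num_voting):
    return num_voting // 2 + 1

def can_elect_leader(voting_up, num_voting):
    return voting_up >= majority(num_voting)

# Jump straight to N=11 voting members while only the 5 old nodes are up:
print(can_elect_leader(5, 11))   # False: 5 < 6

# Start the 6 new nodes as listeners instead: the voting set stays N=5,
# so the 5 old nodes still form a quorum while the listeners catch up.
print(can_elect_leader(5, 5))    # True: 5 >= 3
```

Once the listeners have caught up, promoting them to voting members is a configuration change that does not risk losing quorum, because they are already current with the log.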
>>>
>>> Tsz-Wo
>>>
>>>
>>> On Fri, Jul 1, 2022 at 10:23 AM Tsz Wo Sze <[email protected]> wrote:
>>>
>>>> Hi Riguz,
>>>>
>>>> > Start 6 new nodes with new configuration N=11, while keeping the
>>>> > previous nodes running
>>>>
>>>> This step probably won't work as expected since it will create a new
>>>> group rather than adding nodes to the original group. We must use the
>>>> setConfiguration API to change the configuration (add/remove nodes);
>>>> see
>>>> https://github.com/apache/ratis/blob/bd83e7d7fd41540c8bda6bd92a52ac99ccec2076/ratis-client/src/main/java/org/apache/ratis/client/api/AdminApi.java#L35
>>>>
>>>> Hope it helps. Thanks a lot for trying Ratis!
>>>>
>>>> Tsz-Wo
>>>>
>>>> On Fri, Jul 1, 2022 at 12:30 AM Riguz Lee <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm testing scaling the raft cluster up and down, but Ratis is not
>>>>> working as expected in the new cluster. My steps are:
>>>>>
>>>>> * Initialize a cluster with 5 nodes; the size and peers of the cluster
>>>>> are configured in a configuration file, let's say N=5. The cluster
>>>>> works perfectly, and raft logs are synchronized across the cluster.
>>>>>
>>>>> * Start 6 new nodes with a new configuration N=11, while keeping the
>>>>> previous nodes running.
>>>>>
>>>>> * Recreate the previous nodes with N=11 one by one.
>>>>>
>>>>> According to the raft paper, raft should be able to handle
>>>>> configuration changes by design, but after the above steps, what I've
>>>>> found is that:
>>>>>
>>>>> - New nodes are not able to join the cluster
>>>>>
>>>>> - Old nodes still report a size of 5 (by
>>>>> *client.getGroupManagementApi(peerId).info(groupId)*)
>>>>>
>>>>> So how should I scale the cluster correctly? A few thoughts of mine:
>>>>>
>>>>> * Definitely the old cluster should not be stopped while starting new
>>>>> nodes; otherwise the new nodes might be able to elect a new leader
>>>>> (e.g. N=11 with 6 new nodes) and the raft logs in the old nodes would
>>>>> be overridden.
>>>>>
>>>>> * It's possible to update the old configuration first by using
>>>>> client.admin().setConfiguration(), let's say set N=11 first, then
>>>>> start the new nodes. However, since 5 < 11/2, the cluster won't be
>>>>> able to elect a leader until at least 1 new node joins.
>>>>>
>>>>> * Or maybe we should limit the count when scaling? From N=5 -> N=7 ->
>>>>> N=9 -> N=11.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Riguz Lee
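Riguz's stepwise idea (N=5 -> N=7 -> N=9 -> N=11) can be checked with simple majority arithmetic. The sketch below (illustrative, not Ratis code) shows which target sizes leave the 5 original nodes with a quorum before any newly added node has caught up:

```python
# Check whether the 5 original voters still form a quorum at each target
# size, before any newly added node has caught up and can vote usefully.

def majority(n):
    return n // 2 + 1

old_up = 5
for n in (7, 9, 11):
    ok = old_up >= majority(n)
    print(f"N={n}: majority={majority(n)}, old nodes alone form quorum: {ok}")
# N=7  -> majority 4, True
# N=9  -> majority 5, True
# N=11 -> majority 6, False (why jumping straight to N=11 stalls)
```

So incremental steps of at most +2 keep the old members able to commit the configuration change on their own, which matches Riguz's intuition; the listener approach suggested earlier in the thread avoids the issue entirely by keeping the voting set unchanged until the new nodes are current.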
