Filed https://issues.apache.org/jira/browse/RATIS-1612 .
Tsz-Wo

On Mon, Jul 4, 2022 at 11:45 AM Tsz Wo Sze <[email protected]> wrote:

> Hi Riguz,
>
> Oops, you are right that there is no way to start a server as a Listener
> since the RaftServer.Builder takes a group as a parameter and treats all the
> peers in the group as voting members.  This is a missing feature.  We
> should fix it.  Thanks a lot for checking!
>
> Tsz-Wo
>
>
> On Sun, Jul 3, 2022 at 7:42 PM Riguz Lee <[email protected]> wrote:
>
>> Hi Tsz Wo,
>>
>>
>> Thanks for your explanation. It is now clear that we can use the
>> following steps to scale up:
>>
>>
>> 1. Change the cluster (N1-N5) configuration to add listeners (N6-N11)
>> through the admin API. Since the configuration itself is also a Raft log
>> entry, this operation only needs to be executed once, on any one of N1-N5.
>>
>> 2. Start the new nodes as listeners.
>>
>> 3. Wait until the new nodes catch up with the original cluster. We can
>> check the commit info with getGroupManagementApi.getCommitInfos to make
>> sure all nodes have the same committed log index.
>>
>> 4. Change the configuration to switch N6-N11 to peers. Again, this only
>> needs to be executed on any one of N1-N5.
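
The catch-up check in step 3 can be sketched in plain Java as follows. This is only an illustration, not Ratis code: the class name, the method name, and the `commitIndexByPeer` map are hypothetical, and the map is assumed to have been filled from whatever the commit-info call mentioned above returns.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Ratis: decides whether listeners have caught up.
final class CatchUpSketch {
    /**
     * Returns true if every listener reports a commit index that is at least
     * the highest commit index reported by any voter.
     * A listener with no reported index counts as not caught up.
     */
    static boolean listenersCaughtUp(Map<String, Long> commitIndexByPeer,
                                     Collection<String> voters,
                                     Collection<String> listeners) {
        // Highest commit index among voters; -1 if no voter reported one.
        long target = voters.stream()
            .mapToLong(id -> commitIndexByPeer.getOrDefault(id, -1L))
            .max()
            .orElse(-1L);
        return listeners.stream().allMatch(id -> {
            Long index = commitIndexByPeer.get(id);
            return index != null && index >= target;
        });
    }

    public static void main(String[] args) {
        // Made-up commit indexes, for illustration only.
        Map<String, Long> indexes = Map.of("N1", 100L, "N2", 100L, "N6", 97L, "N7", 100L);
        System.out.println(listenersCaughtUp(indexes, List.of("N1", "N2"), List.of("N6", "N7")));
        System.out.println(listenersCaughtUp(indexes, List.of("N1", "N2"), List.of("N7")));
    }
}
```

On a cluster that is still serving writes the commit index keeps moving, so a check like this is only a snapshot and may need to be repeated before step 4.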
>>
>>
>> After that, the cluster should be able to elect a new leader. The step to
>> create new nodes as listeners is optional, so a simplified flow would be:
>>
>>
>> 1. Do nothing with the original cluster N1-N5
>>
>> 2. Start the new nodes (N6-N11) using the new configuration
>>
>> 3. Change the configuration of the previous nodes to add the new peers
>>
>> I've tried this approach and it seems to work. But I could not find a way
>> to start a new node as a listener. Since listener support was introduced
>> in 2.3.0, is it still a work in progress?
>>
>>
>> Thanks,
>>
>> Riguz
>>
>>
>>
>> Original Email
>>
>> Sender:"Tsz Wo Sze"< [email protected] >;
>>
>> Sent Time:2022/7/2 1:31
>>
>> To:"user"< [email protected] >;
>>
>> Subject:Re: How to correctly scale the raft cluster?
>>
>> > * It's possible to update the old configuration first by using
>> > client.admin().setConfiguration(), let's say set N=11 first, then start
>> > new nodes. However, since 5 < 11/2, the cluster won't be able to elect a
>> > leader until at least 1 new node joins.
>>
>> Yes, you are right.  Also, even if one node has joined, the group has to
>> wait for it to catch up with the previous log entries in order to obtain a
>> majority for committing new entries.
>>
>> > * Or maybe we should limit the count when scaling? From N=5 -> N=7 ->
>> > N=9 -> N=11.
>>
>> We may start the 6 new nodes as listeners first.  Listeners receive log
>> entries but they are not voting members and they won't be counted for
>> majority.  When the listeners catch up, we may change them to normal nodes
>> so that they become voting members.
>> Tsz-Wo
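
The majority arithmetic behind this exchange can be sketched in a few lines of plain Java. This is only an illustration of the counting argument, not Ratis code; the class and method names are made up, and it assumes the usual Raft rule that a majority is more than half of the voting members.

```java
// Illustrative only, not part of Ratis: majority counting for the scenario in this thread.
final class QuorumSketch {
    /** Smallest number of voting members that forms a majority of `voters`. */
    static int majority(int voters) {
        return voters / 2 + 1;
    }

    /** True if `reachable` caught-up voting members are enough to elect a leader or commit. */
    static boolean hasMajority(int reachable, int voters) {
        return reachable >= majority(voters);
    }

    public static void main(String[] args) {
        // Configuration already changed to 11 voters, but only the 5 old nodes are running.
        System.out.println(hasMajority(5, 11)); // false, 6 are needed
        // One new voting node has joined and caught up.
        System.out.println(hasMajority(6, 11)); // true
        // The 6 new nodes are listeners, so there are still only 5 voters.
        System.out.println(hasMajority(5, 5));  // true
    }
}
```

With listeners the voter count stays at 5 while the new nodes replicate the log, which is why the old cluster keeps its majority during the catch-up phase.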
>>
>> On Fri, Jul 1, 2022 at 10:23 AM Tsz Wo Sze <[email protected]> wrote:
>>
>>> Hi Riguz,
>>>
>>> > Start 6 new nodes with new configuration N=11, while keeping the
>>> > previous nodes running
>>>
>>> This step probably won't work as expected since it will create a new
>>> group rather than add nodes to the original group.  We must use the
>>> setConfiguration API to change the configuration (add/remove nodes); see
>>> https://github.com/apache/ratis/blob/bd83e7d7fd41540c8bda6bd92a52ac99ccec2076/ratis-client/src/main/java/org/apache/ratis/client/api/AdminApi.java#L35
>>>
>>> Hope it helps.  Thanks a lot for trying Ratis!
>>> Tsz-Wo
>>>
>>> On Fri, Jul 1, 2022 at 12:30 AM Riguz Lee <[email protected]> wrote:
>>>
>>>>
>>>>
>>>> Hi,
>>>>
>>>>
>>>> I'm testing scaling the Raft cluster up and down, but Ratis is not
>>>> working as expected in the new cluster. My steps are:
>>>>
>>>>
>>>> * Initialize a cluster with 5 nodes. The size and peers of the cluster
>>>> are configured in a configuration file, let's say N=5. The cluster works
>>>> perfectly, and Raft logs are synchronized across the cluster.
>>>>
>>>> * Start 6 new nodes with new configuration N=11, while keeping the
>>>> previous nodes running
>>>>
>>>> * Recreate the previous nodes with N=11 one by one
>>>>
>>>>
>>>> According to the Raft paper, Raft should be able to handle configuration
>>>> changes by design, but after the above steps, what I've found is that:
>>>>
>>>>
>>>> - New nodes are not able to join the cluster
>>>>
>>>> - Old nodes still report a size of 5 (by
>>>> *client.getGroupManagementApi(peerId).info(groupId)*)
>>>>
>>>>
>>>> So how should I scale the cluster correctly? A few thoughts of mine:
>>>>
>>>>
>>>> * The old cluster should definitely not be stopped while starting the new
>>>> nodes, otherwise the new nodes might be able to elect a new leader (e.g.
>>>> N=11 with 6 new nodes) and the Raft logs on the old nodes would be
>>>> overwritten.
>>>>
>>>> * It's possible to update the old configuration first by using
>>>> client.admin().setConfiguration(), let's say set N=11 first, then start
>>>> new nodes. However, since 5 < 11/2, the cluster won't be able to elect a
>>>> leader until at least 1 new node joins.
>>>>
>>>> * Or maybe we should limit the count when scaling? From N=5 -> N=7 ->
>>>> N=9 -> N=11.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Riguz Lee
>>>>
>>>>
>>>>
>>>>
