Hello William,
<< please find my inline answers >>
Thanks
Mahesh Reddy.
From: William Song <[email protected]>
Date: Monday, December 11, 2023 at 6:35 PM
To: [email protected] <[email protected]>
Subject: {EXTERNAL} Re: Assistance Needed with Node Initialization and Role
Changes in Ratis Cluster
Hi Mahesh,
Thanks for exploring Ratis! Unfortunately the arithmetic example does not
support role changes currently ;(
<< With Tsz's help, I created append.java (attached) and was able to update the
Ratis cluster. >>
For membership change examples, please refer to example/membership in latest
3.0 version. You can find the relevant code here
https://github.com/apache/ratis/blob/53831534c69309688ce379006363e645bf42b654/ratis-examples/src/main/java/org/apache/ratis/examples/membership/server/RaftCluster.java#L68
When changing from (n0, n1, n2) to (n0, n1, n3), you can follow these steps:
1. Start n3 with an EMPTY group configuration.
2. Send a setConfiguration([n0, n1, n3]) request to the existing group.
3. Once step 2 has succeeded, you can safely shut down n2 and clean up its
data.
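As a rough illustration of the steps above, the sketch below derives the target peer list and prints the step 2 command without running anything. It assumes the custom `append` subcommand that appears elsewhere in this thread, where `--peers` is the new peer list and `--oldPeers` is the current one. The variable names CURRENT, REMOVE, ADD, TARGET and STEP2 are made up for this sketch. How to start the arithmetic server with an empty group is not shown in this thread, so step 1 appears only as a comment.

```shell
#!/bin/sh
# Dry run only: builds the peer lists for replacing n2 with n3 and prints
# the step 2 command instead of executing it.
BIN=ratis-examples/src/main/bin
CURRENT=n0:127.0.0.1:6000,n1:127.0.0.1:6001,n2:127.0.0.1:6002
REMOVE=n2                 # id of the peer that leaves the group
ADD=n3:127.0.0.1:6003     # peer that joins the group

# Drop the entry whose id is $REMOVE, then append $ADD.
TARGET=$(printf '%s\n' "$CURRENT" | tr ',' '\n' | grep -v "^${REMOVE}:" | paste -sd, -)
TARGET="${TARGET},${ADD}"

# Step 1 (not shown): start n3 with an empty group configuration.
# Step 2: ask the existing group to switch to the target configuration.
STEP2="${BIN}/client.sh arithmetic append --peers ${TARGET} --oldPeers ${CURRENT}"
echo "$STEP2"
# Step 3 (only after step 2 succeeds): stop n2 and remove /tmp/ratis/n2.
```

Printing the command first makes it easy to confirm which list is passed as the new configuration and which as the current one before touching the running group.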
<<
In fact, I tried this approach, but it didn't work as expected.
I started three nodes (n0, n1, n2) with the following commands:
PEERS=n0:127.0.0.1:6000,n1:127.0.0.1:6001,n2:127.0.0.1:6002
ID=n0; ${BIN}/server.sh arithmetic server --id ${ID} --storage /tmp/ratis/${ID}
--peers ${PEERS}
ID=n1; ${BIN}/server.sh arithmetic server --id ${ID} --storage /tmp/ratis/${ID}
--peers ${PEERS}
ID=n2; ${BIN}/server.sh arithmetic server --id ${ID} --storage /tmp/ratis/${ID}
--peers ${PEERS}
I started the n3 node with the following command:
ID=n3; ${BIN}/server.sh arithmetic server --id ${ID} --storage /tmp/ratis/${ID}
--peers n3:127.0.0.1:6003
I then ran setConfiguration as follows, but it failed (logs are in the attached
append-n3.txt file):
PEERS=n0:127.0.0.1:6000,n1:127.0.0.1:6001,n2:127.0.0.1:6002
OLDPEERS=n0:127.0.0.1:6000,n1:127.0.0.1:6001,n2:127.0.0.1:6002,n3:127.0.0.1:6003
${BIN}/client.sh arithmetic append --peers ${PEERS} --oldPeers ${OLDPEERS}
>>
In the arithmetic example, n3 is started with the intended new group (n0, n1,
n3), but step 2 is not executed. Without it, the original group (n0, n1, n2)
remains unaware of the new configuration and continues to operate as if the
group still consists of (n0, n1, n2). This leads to n3 being rejected and shut
down.
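The rejection can be pictured with a toy model. This is not Ratis code. It assumes a simplified rule in which an existing peer grants a vote only if the candidate appears in that peer's own view of the group. The names GROUP_VIEW, CANDIDATE, VOTERS and GRANTED are made up for this sketch.

```shell
#!/bin/sh
# Toy model, not Ratis code: each existing peer grants a vote only if the
# candidate's id appears in that peer's own view of the group.
GROUP_VIEW="n0 n1 n2"   # configuration known to n0 and n1 (step 2 never ran)
CANDIDATE=n3
VOTERS="n0 n1"          # peers that n3 asks for votes

GRANTED=0
for voter in $VOTERS; do
  case " $GROUP_VIEW " in
    *" $CANDIDATE "*) GRANTED=$((GRANTED + 1)) ;;
    *) echo "$voter rejects $CANDIDATE: not in ($GROUP_VIEW)" ;;
  esac
done
echo "votes granted to $CANDIDATE: $GRANTED"
```

Under this simplified rule n3 collects no votes until the existing group has accepted a configuration that contains it.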
Best Regards,
William
> On December 12, 2023, at 04:54, Mahesh Garlapati <[email protected]>
> wrote:
>
> Hello team,
>
> I initiated three nodes, namely n0, n1, and n2, using the following commands:
>
> ID=n0; ${BIN}/server.sh arithmetic server --id ${ID} --storage
> /tmp/ratis/${ID} --peers ${PEERS}
> ID=n1; ${BIN}/server.sh arithmetic server --id ${ID} --storage
> /tmp/ratis/${ID} --peers ${PEERS}
> ID=n2; ${BIN}/server.sh arithmetic server --id ${ID} --storage
> /tmp/ratis/${ID} --peers ${PEERS}
>
> Based on the logs, it appears that n2 is the leader:
>
> INFO StateMachine:166 - LEADER:n2-1: a = 10 = 10
>
> However, when I attempted to stop the n2 node (leader) and start a new node,
> n3, with peers n0, n1, and n3, the n3 server failed to start. Attached are
> the log files for review.
>
> I am seeking assistance in understanding the reason for this error preventing
> the successful start of the n3 node.
>
> In another scenario:
>
> I started three nodes, n0, n1, and n2, using similar commands, and logs
> indicated that n1 was the leader:
>
> INFO StateMachine:166 - LEADER:n1-1: a = 10 = 10
>
> Upon stopping the non-leader node, n2, I attempted to start a new node, n3,
> with peers n0, n1, and n3. The n3 server started but generated errors such as:
>
> 2023-12-11 12:43:13 INFO RaftServer$Division:377 - n3@group-6F7570313233:
> changes role from CANDIDATE to FOLLOWER at term 1 for REJECTED
>
> These errors were resolved by running the append command using the admin
> client.
>
> I appreciate your assistance in resolving these issues.
>
> Thanks,
> Mahesh Reddy
>
> <log.txt>
e156510@D0YQ455XC4 ~ % BIN=ratis-examples/src/main/bin
e156510@D0YQ455XC4 ~ %
e156510@D0YQ455XC4 ~ %
PEERS=n0:127.0.0.1:6000,n1:127.0.0.1:6001,n2:127.0.0.1:6002
OLDPEERS=n0:127.0.0.1:6000,n1:127.0.0.1:6001,n2:127.0.0.1:6002,n3:127.0.0.1:6003
e156510@D0YQ455XC4 ~ % ${BIN}/client.sh arithmetic append --peers ${PEERS}
--oldPeers ${OLDPEERS}
zsh: no such file or directory: ratis-examples/src/main/bin/client.sh
e156510@D0YQ455XC4 ~ %
e156510@D0YQ455XC4 ~ % pwd
/Users/e156510
e156510@D0YQ455XC4 ~ % cd /Users/e156510/IdeaProjects/raft/ratis/
e156510@D0YQ455XC4 ratis % claer
zsh: command not found: claer
e156510@D0YQ455XC4 ratis %
e156510@D0YQ455XC4 ratis % clear
e156510@D0YQ455XC4 ratis %
e156510@D0YQ455XC4 ratis %
e156510@D0YQ455XC4 ratis %
PEERS=n0:127.0.0.1:6000,n1:127.0.0.1:6001,n2:127.0.0.1:6002
OLDPEERS=n0:127.0.0.1:6000,n1:127.0.0.1:6001,n2:127.0.0.1:6002,n3:127.0.0.1:6003
e156510@D0YQ455XC4 ratis %
e156510@D0YQ455XC4 ratis % ${BIN}/client.sh arithmetic append --peers
${PEERS} --oldPeers ${OLDPEERS}
Found
/Users/e156510/IdeaProjects/raft/ratis/ratis-examples/target/ratis-examples-3.0.0-SNAPSHOT.jar
Old Peers : [n0|127.0.0.1:6000, n1|127.0.0.1:6001, n2|127.0.0.1:6002,
n3|127.0.0.1:6003]
New Peers :[n0|127.0.0.1:6000, n1|127.0.0.1:6001, n2|127.0.0.1:6002]
2023-12-11 19:00:49 INFO MetricRegistries:64 - Loaded MetricRegistries class
org.apache.ratis.metrics.impl.MetricRegistriesImpl
2023-12-11 19:00:49 DEBUG RaftClient:367 - client-D4962196F46F: suggested new
leader: n1. Failed
SetConfigurationRequest:client-D4962196F46F->n0@group-6F7570313233, cid=1,
seq=null, RW, null, COMPARE_AND_SET, servers:[n0|127.0.0.1:6000,
n1|127.0.0.1:6001, n2|127.0.0.1:6002], listeners:[] with {}
org.apache.ratis.protocol.exceptions.NotLeaderException: Server
n0@group-6F7570313233 is not the leader n1|127.0.0.1:6001
at
org.apache.ratis.client.impl.ClientProtoUtils.toRaftClientReply(ClientProtoUtils.java:431)
at
org.apache.ratis.grpc.client.GrpcClientRpc.sendRequest(GrpcClientRpc.java:102)
at
org.apache.ratis.client.impl.BlockingImpl.sendRequest(BlockingImpl.java:139)
at
org.apache.ratis.client.impl.BlockingImpl.sendRequestWithRetry(BlockingImpl.java:104)
at
org.apache.ratis.client.impl.AdminImpl.setConfiguration(AdminImpl.java:46)
at
org.apache.ratis.examples.arithmetic.cli.Append.operation(Append.java:36)
at org.apache.ratis.examples.arithmetic.cli.Client.run(Client.java:34)
at org.apache.ratis.examples.common.Runner.main(Runner.java:62)
2023-12-11 19:00:49 DEBUG RaftClient:386 - client-D4962196F46F: oldLeader=n0,
curLeader=n0, newLeader=n1
2023-12-11 19:00:49 DEBUG RaftClient:392 - group-6F7570313233
client-D4962196F46F: client change Leader from n0 to n1
ex=org.apache.ratis.protocol.exceptions.NotLeaderException
2023-12-11 19:00:49 DEBUG RaftClient:367 - client-D4962196F46F: suggested new
leader: null. Failed
SetConfigurationRequest:client-D4962196F46F->n1@group-6F7570313233, cid=1,
seq=null, RW, null, COMPARE_AND_SET, servers:[n0|127.0.0.1:6000,
n1|127.0.0.1:6001, n2|127.0.0.1:6002], listeners:[] with {}
org.apache.ratis.protocol.exceptions.SetConfigurationException: Failed to set
configuration: current configuration 0: peers:[n0|127.0.0.1:6000,
n1|127.0.0.1:6001, n2|127.0.0.1:6002]|listeners:[], old=null is different than
the request SetConfigurationRequest:client-D4962196F46F->n1@group-6F7570313233,
cid=1, seq=null, RW, null, COMPARE_AND_SET, servers:[n0|127.0.0.1:6000,
n1|127.0.0.1:6001, n2|127.0.0.1:6002], listeners:[]
at
org.apache.ratis.server.impl.RaftServerImpl.setConfigurationAsync(RaftServerImpl.java:1302)
at
org.apache.ratis.server.impl.RaftServerProxy.lambda$null$24(RaftServerProxy.java:621)
at org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:117)
at
org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitServerRequestAsync$10(RaftServerImpl.java:877)
at
java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
2023-12-11 19:00:49 DEBUG RaftClient:386 - client-D4962196F46F: oldLeader=n1,
curLeader=n1, newLeader=n0
2023-12-11 19:00:49 DEBUG RaftClient:392 - group-6F7570313233
client-D4962196F46F: client change Leader from n1 to n0
ex=org.apache.ratis.protocol.exceptions.SetConfigurationException
Failed to add to the
clusterorg.apache.ratis.protocol.exceptions.SetConfigurationException: Failed
to set configuration: current configuration 0: peers:[n0|127.0.0.1:6000,
n1|127.0.0.1:6001, n2|127.0.0.1:6002]|listeners:[], old=null is different than
the request SetConfigurationRequest:client-D4962196F46F->n1@group-6F7570313233,
cid=1, seq=null, RW, null, COMPARE_AND_SET, servers:[n0|127.0.0.1:6000,
n1|127.0.0.1:6001, n2|127.0.0.1:6002], listeners:[]
e156510@D0YQ455XC4 ratis %
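One possible reading of the output above: the request was sent in COMPARE_AND_SET mode, and the peer list passed as --oldPeers (which includes n3) does not match the configuration the leader reports (n0, n1, n2). The sketch below only mimics that comparison in plain shell. It assumes the request is rejected whenever the two lists differ, regardless of order. CURRENT_CONF, normalize and RESULT are names made up for this sketch.

```shell
#!/bin/sh
# Values copied from the failed run above.
CURRENT_CONF=n0:127.0.0.1:6000,n1:127.0.0.1:6001,n2:127.0.0.1:6002  # reported by the leader
OLDPEERS=n0:127.0.0.1:6000,n1:127.0.0.1:6001,n2:127.0.0.1:6002,n3:127.0.0.1:6003  # passed as --oldPeers

# Sort the comma-separated entries so the comparison ignores ordering.
normalize() { printf '%s\n' "$1" | tr ',' '\n' | sort | paste -sd, -; }

if [ "$(normalize "$OLDPEERS")" = "$(normalize "$CURRENT_CONF")" ]; then
  RESULT=match
else
  RESULT=mismatch
fi
echo "$RESULT"
```

If this reading is right, the two values may simply have been swapped: the three-node list (n0, n1, n2) would go to --oldPeers and the list that includes n3 would go to --peers.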