[ 
https://issues.apache.org/jira/browse/RATIS-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784582#comment-17784582
 ] 

Tsz-wo Sze commented on RATIS-1625:
-----------------------------------

bq. Start 2 new nodes with 5x configuration

bq. The original design was to call setConf first and then start the nodes. ...

Another way is to start the 2 new nodes with an empty group (with the correct 
group id and an empty peer list). The new nodes will then wait for the Leader, 
which will contact them once it has received the setConf request from a client.
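
A minimal sketch of the empty-group startup, assuming the gRPC transport and the Ratis 2.x builder API. The class name EmptyGroupNode, the helper names, and the use of BaseStateMachine as a stand-in state machine are illustrative assumptions and not part of this issue; the group id must be the one the existing cluster already uses.

```java
import java.io.File;
import java.io.IOException;
import java.util.Collections;

import org.apache.ratis.conf.RaftProperties;
import org.apache.ratis.grpc.GrpcConfigKeys;
import org.apache.ratis.protocol.RaftGroup;
import org.apache.ratis.protocol.RaftGroupId;
import org.apache.ratis.protocol.RaftPeerId;
import org.apache.ratis.server.RaftServer;
import org.apache.ratis.server.RaftServerConfigKeys;
import org.apache.ratis.statemachine.impl.BaseStateMachine;

// Hypothetical helper class, for illustration only.
public final class EmptyGroupNode {
  private EmptyGroupNode() {}

  /** Builds a group with the cluster's group id and an empty peer list. */
  static RaftGroup emptyGroup(RaftGroupId groupId) {
    // Varargs overload called with zero peers.
    return RaftGroup.valueOf(groupId);
  }

  /** Starts a new node that knows the group id but none of its peers. */
  static RaftServer startNewNode(String peerId, int port, File storageDir,
      RaftGroupId groupId) throws IOException {
    RaftProperties properties = new RaftProperties();
    GrpcConfigKeys.Server.setPort(properties, port);
    RaftServerConfigKeys.setStorageDir(properties,
        Collections.singletonList(storageDir));

    RaftServer server = RaftServer.newBuilder()
        .setServerId(RaftPeerId.valueOf(peerId))
        .setGroup(emptyGroup(groupId))
        // Placeholder (assumption): replace with the application's state machine.
        .setStateMachine(new BaseStateMachine())
        .setProperties(properties)
        .build();
    server.start();
    return server;
  }
}
```

With both new nodes started this way, a client would then send the setConf request listing all five peers through client.admin().setConfiguration, the call named in the issue title.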

> client.admin().setConfiguration fails due to ReconfigurationTimeoutException
> ----------------------------------------------------------------------------
>
>                 Key: RATIS-1625
>                 URL: https://issues.apache.org/jira/browse/RATIS-1625
>             Project: Ratis
>          Issue Type: Bug
>            Reporter: Riguz Lee
>            Priority: Major
>
> As has been discussed in 
> [https://lists.apache.org/thread/tt1j3jkogh71k2hvq5gtltwmphxfy736]
> , the problem is that:
>  * New nodes will be stopped by the leader because they are not in the old 
> configuration
>  * setConfiguration won't succeed because it cannot communicate with the new 
> nodes, since they have been shut down.
> Steps to reproduce:
>  * Start a cluster with 3x nodes
>  * Start 2 new nodes with 5x configuration
>  * Call the admin API to change the configuration on the old nodes
> Logs when calling the admin API:
> {noformat}
> org.apache.ratis.protocol.exceptions.ReconfigurationTimeoutException: 
> 10.19.26.23-6002@group-0242AC120002-CotionStagingState: Fail to set 
> configuration 
> [10.19.26.23-6004|rpc:10.19.26.23:6004|admin:|client:|dataStreamity:0, 
> 10.19.26.23-6003|rpc:10.19.26.23:6003|admin:|client:|dataStream:|priority:0, 
> 10.19.26.23-6002|rpc:10.3:6002|admin:|client:|dataStream:|priority:0, 
> 10.19.26.23-6001|rpc:10.19.26.23:6001|admin:|client:|dataStrearity:0, 
> 10.19.26.23-6005|rpc:10.19.26.23:6005|admin:|client:|dataStream:|priority:0] 
> due to NOPROGRESS
>     at 
> org.apache.ratis.server.impl.LeaderStateImpl$ConfigurationStagingState.fail(LeaderStateImpl.java:[ratis-server-2.3.0.jar!/:2.3.0]
>     at 
> org.apache.ratis.server.impl.LeaderStateImpl.checkStaging(LeaderStateImpl.java:704)
>  ~[ratis-serve.jar!/:2.3.0]
>     at 
> org.apache.ratis.server.impl.LeaderStateImpl.access$500(LeaderStateImpl.java:95)
>  ~[ratis-server-2r!/:2.3.0]
>     at 
> org.apache.ratis.server.impl.LeaderStateImpl$EventProcessor.run(LeaderStateImpl.java:636)
>  ~[ratis-2.3.0.jar!/:2.3.0]{noformat}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
