[ https://issues.apache.org/jira/browse/RATIS-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18021590#comment-18021590 ]
Tsz-wo Sze edited comment on RATIS-2261 at 9/20/25 7:34 PM:
------------------------------------------------------------

[~jianghuazhu], thanks for your help on debugging!

bq. 1. ... After starting s3 and s4 subsequently, ...

This is a bug in the test. It starts with 1 server s0 and then adds two new servers s3 and s4. The change is a majority-add, which should be disallowed; see RATIS-1930. We could simply start with 3 servers in the beginning. Then adding 2 new servers is fine.

bq. 2. When the client sends the setConfiguration() command, it cannot establish a connection with s3 or s4. ...

It seems that the servers were starting and RPC ports were not yet ready. The client would retry so it should not be a problem.


was (Author: szetszwo):
[~jianghuazhu], thanks for your help on debugging!

bq. 1. ... After starting s3 and s4 subsequently, ...

This is a bug in the test. It starts with 1 server s0 and then adds two new servers s3 and s4. The change is a majority-add which should be disallowed; see RATIS-1930.

bq. 2. When the client sends the setConfiguration() command, it cannot establish a connection with s3 or s4. ...

It seems that the servers were starting and RPC ports were not yet ready. The client would retry so it should not be a problem.

> Intermittent failure in TestRaftSnapshotWithGrpc.testInstallSnapshotDuringBootstrap
> -----------------------------------------------------------------------------------
>
>                 Key: RATIS-2261
>                 URL: https://issues.apache.org/jira/browse/RATIS-2261
>             Project: Ratis
>          Issue Type: Bug
>          Components: gRPC, test
>            Reporter: Attila Doroszlai
>            Priority: Major
>         Attachments: org.apache.ratis.grpc.TestRaftSnapshotWithGrpc-output.txt, testInstallSnapshotDuringBootstrap.log
>
>
> {code}
> Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 114.816 s <<< FAILURE! - in org.apache.ratis.grpc.TestRaftSnapshotWithGrpc
> org.apache.ratis.grpc.TestRaftSnapshotWithGrpc.testInstallSnapshotDuringBootstrap  Time elapsed: 101.468 s  <<< ERROR!
> java.util.concurrent.TimeoutException: testInstallSnapshotDuringBootstrap() timed out after 100 seconds
>     at java.util.ArrayList.forEach(ArrayList.java:1259)
>     at java.util.ArrayList.forEach(ArrayList.java:1259)
>     Suppressed: java.io.InterruptedIOException: retry policy=RetryForeverWithSleep(sleepTime = 100ms)
>         at org.apache.ratis.client.impl.BlockingImpl.sendRequestWithRetry(BlockingImpl.java:138)
>         at org.apache.ratis.client.impl.AdminImpl.setConfiguration(AdminImpl.java:46)
>         at org.apache.ratis.client.api.AdminApi.setConfiguration(AdminApi.java:51)
>         at org.apache.ratis.client.api.AdminApi.setConfiguration(AdminApi.java:45)
>         at org.apache.ratis.server.impl.MiniRaftCluster.setConfiguration(MiniRaftCluster.java:836)
>         at org.apache.ratis.statemachine.RaftSnapshotBaseTest.lambda$testInstallSnapshotDuringBootstrap$6(RaftSnapshotBaseTest.java:309)
>         at org.apache.ratis.server.impl.RaftServerTestUtil.runWithMinorityPeers(RaftServerTestUtil.java:231)
>         at org.apache.ratis.statemachine.RaftSnapshotBaseTest.testInstallSnapshotDuringBootstrap(RaftSnapshotBaseTest.java:308)
> {code}
> Failed in 2/100 runs:
> https://github.com/adoroszlai/ratis/actions/runs/13901407901



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
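
For reference, the majority-add condition discussed in the comment can be illustrated with a small standalone sketch. It is not the Ratis implementation: the class name, the method name, and the exact threshold (newly added peers forming a strict majority of the new configuration) are assumptions made for illustration, and the actual rule introduced by RATIS-1930 may differ in its details. The peer ids s1 and s2 are also hypothetical, since the comment only says the test could start with 3 servers.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; not taken from the Ratis code base.
class MajorityAddSketch {
  /**
   * Returns true if the peers that are absent from the current configuration
   * would form a strict majority of the proposed configuration.
   * (Assumed definition of "majority-add" for this example.)
   */
  static boolean isMajorityAdd(Set<String> current, Set<String> proposed) {
    final long added = proposed.stream().filter(id -> !current.contains(id)).count();
    return added > proposed.size() / 2;
  }

  private static Set<String> peers(String... ids) {
    return new HashSet<>(Arrays.asList(ids));
  }

  public static void main(String[] args) {
    // The test as written: 1 server (s0), then add s3 and s4.
    // 2 of the 3 peers in the new configuration are new.
    System.out.println(isMajorityAdd(peers("s0"), peers("s0", "s3", "s4")));
    // prints "true"

    // The suggested fix: start with 3 servers, then add s3 and s4.
    // 2 of the 5 peers in the new configuration are new.
    System.out.println(isMajorityAdd(peers("s0", "s1", "s2"),
        peers("s0", "s1", "s2", "s3", "s4")));
    // prints "false"
  }
}
```

Under this assumed definition, going from 1 to 3 servers is a majority-add, while going from 3 to 5 servers is not, which matches the reasoning in the comment.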