Hi William and others,

I understand more about the failures/errors. All of them were test problems; see below:

- [*ERROR*] * TestLeaderInstallSnapshot.testInstallSnapshotLeaderSwitch:53->InstallSnapshotFromLeaderTests.testInstallSnapshotLeaderSwitch:94->InstallSnapshotFromLeaderTests.testInstallSnapshotDuringLeaderSwitch:170 Unexpected exception type thrown, expected: <org.apache.ratis.protocol.exceptions.RaftRetryFailureException> but was: <org.apache.ratis.protocol.exceptions.ReconfigurationTimeoutException>*

This is a ParameterizedTest with separateHeartbeat = false. This case is okay since the exception has changed, as shown in the message above: RaftRetryFailureException vs ReconfigurationTimeoutException.

- [*ERROR*] * TestLeaderInstallSnapshot.testInstallSnapshotLeaderSwitch:53->InstallSnapshotFromLeaderTests.testInstallSnapshotLeaderSwitch:94->InstallSnapshotFromLeaderTests.testInstallSnapshotDuringLeaderSwitch:141 » IllegalState*

This is the same ParameterizedTest with separateHeartbeat = true. The IllegalStateException was for "No leader yet". The test blocked the peers so they could not vote for each other. They kept getting 0 responses, as shown below. This case is a test problem.

2024-06-17 00:13:26,950 [s2@group-F9E5FA170112-LeaderElection6] INFO impl.LeaderElection (LeaderElection.java:logAndReturn(89)) - s2@group-F9E5FA170112-LeaderElection6: PRE_VOTE TIMEOUT received 0 response(s) and 0 exception(s):
2024-06-17 00:13:26,978 [s1@group-F9E5FA170112-LeaderElection7] INFO impl.LeaderElection (LeaderElection.java:logAndReturn(89)) - s1@group-F9E5FA170112-LeaderElection7: PRE_VOTE TIMEOUT received 0 response(s) and 0 exception(s):

- [*ERROR*] * TestLeaderInstallSnapshot.testSeparateSnapshotInstallPath(Boolean)[1] » Timeout*
- [*ERROR*] * TestLeaderInstallSnapshot.testSeparateSnapshotInstallPath(Boolean)[2] » Timeout*

The test started with only 1 server. Both test cases pass after changing to 3 servers.

- [*ERROR*] * TestRetryCacheWithNettyRpc.testRetryOnNewLeader » Timeout testRetryOnNewLeader...*

The timeout was because setConf kept failing.
The problem was that the setConf tried to remove 2 peers and add 2 peers at the same time. When changing the setConf to either (1) remove 1 peer and add 1 peer, or (2) remove 2 peers and add 0 peers, the test can pass.

Tsz-Wo

On Sat, Jun 15, 2024 at 8:54 AM Tsz Wo Sze <[email protected]> wrote:

> Hi William,
>
> I have been running tests for the past few days. Unfortunately, I got
> similar failures in rc1 as we found in rc0. I will dig deeper to see why
> these tests are failing.
>
> [*ERROR*] *Failures: *
>
> [*ERROR*] *
> TestLeaderInstallSnapshot.testInstallSnapshotLeaderSwitch:53->InstallSnapshotFromLeaderTests.testInstallSnapshotLeaderSwitch:94->InstallSnapshotFromLeaderTests.testInstallSnapshotDuringLeaderSwitch:170
> Unexpected exception type thrown, expected:
> <org.apache.ratis.protocol.exceptions.RaftRetryFailureException> but was:
> <org.apache.ratis.protocol.exceptions.ReconfigurationTimeoutException>*
>
> [*ERROR*] *Errors: *
>
> [*ERROR*] *
> TestLeaderInstallSnapshot.testInstallSnapshotLeaderSwitch:53->InstallSnapshotFromLeaderTests.testInstallSnapshotLeaderSwitch:94->InstallSnapshotFromLeaderTests.testInstallSnapshotDuringLeaderSwitch:141
> » IllegalState*
>
> [*ERROR*] *
> TestLeaderInstallSnapshot.testSeparateSnapshotInstallPath(Boolean)[1] »
> Timeout*
>
> [*ERROR*] *
> TestLeaderInstallSnapshot.testSeparateSnapshotInstallPath(Boolean)[2] »
> Timeout*
>
> [*ERROR*] * TestRetryCacheWithNettyRpc.testRetryOnNewLeader » Timeout
> testRetryOnNewLeader...*
>
> [*INFO*]
>
> [*ERROR*] *Tests run: 328, Failures: 1, Errors: 4, Skipped: 0*
>
> Tsz-Wo
>
>
> On Wed, Jun 12, 2024 at 2:48 PM William Song <[email protected]> wrote:
>
>> Hi Community,
>>
>> I’m calling a vote For Apache Ratis Release 3.1.0 rc1.
>>
>> The git tag to be vote upon:
>> https://github.com/apache/ratis/tree/ratis-3.1.0-rc1
>>
>> The git commit hash:
>> 9ed4e3eca792d96aafa4f43ba5dfe1b9650a522c
>>
>> The source and binary tarballs can be found at:
>> https://dist.apache.org/repos/dist/dev/ratis/3.1.0/rc1
>>
>> Fingerprint of the GPG key release artifacts are signed with:
>> DCE2 C33D 41C6 2578 969D BAFE 37D6 ECF8 4E78 BC92
>>
>> My public key to verify signatures can be found in:
>> https://dist.apache.org/repos/dist/dev/ratis/KEYS
>>
>> Maven artifacts are staged at:
>> https://repository.apache.org/content/repositories/orgapacheratis-1146
>>
>> This vote will remain open for at least 72 hours.
>> Please vote on releasing this ratis-3.1.0-rc1. Thanks in advance.
>>
>> [ ] +1 approve
>> [ ] 0 no opinion
>> [ ] -1 disapprove (and reason why)
>>
>> Starting with my +1(binding)
>> - Verified checksums, signatures and git hash.
>> - Checked LICENSE and NOTICE.
>> - Compared the files in src tarball with the files at the given git tag.
>> - Built from source.
>> - Ran regular Ratis CI. [1]
>>
>> [1] https://github.com/apache/ratis/actions/runs/9477559508
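
A note on the setConf case in testRetryOnNewLeader discussed at the top of this message: the sketch below is plain set arithmetic, not Ratis code, and it does not claim to be the confirmed root cause. It assumes a 3-peer starting group with hypothetical peer ids s0..s4, and the class and method names (ConfChangeSketch, newConf, oldPeersSuffice) are made up for illustration. It only checks, for each of the three setConf variants mentioned, whether the peers kept from the old group could form a majority of the new group on their own.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustration only: hypothetical names, no Ratis APIs involved.
public class ConfChangeSketch {
  /** Smallest number of peers that forms a majority of a group of the given size. */
  static int majority(int size) {
    return size / 2 + 1;
  }

  /** Peers of the new configuration: the old peers minus the removed ones, plus the added ones. */
  static Set<String> newConf(List<String> old, List<String> removed, List<String> added) {
    Set<String> conf = new LinkedHashSet<>(old);
    conf.removeAll(removed);
    conf.addAll(added);
    return conf;
  }

  /** Number of peers in the new configuration that were already in the old one. */
  static int retained(List<String> old, Set<String> conf) {
    int n = 0;
    for (String peer : conf) {
      if (old.contains(peer)) {
        n++;
      }
    }
    return n;
  }

  /** True if the retained old peers alone are a majority of the new configuration. */
  static boolean oldPeersSuffice(List<String> old, Set<String> conf) {
    return retained(old, conf) >= majority(conf.size());
  }

  static void report(String label, List<String> old, List<String> removed, List<String> added) {
    Set<String> conf = newConf(old, removed, added);
    System.out.println(label + ": new conf " + conf
        + ", majority " + majority(conf.size())
        + ", retained old peers " + retained(old, conf)
        + ", old peers suffice: " + oldPeersSuffice(old, conf));
  }

  public static void main(String[] args) {
    List<String> old = List.of("s0", "s1", "s2"); // assumed starting group
    report("remove 2, add 2", old, List.of("s1", "s2"), List.of("s3", "s4"));
    report("remove 1, add 1", old, List.of("s2"), List.of("s3"));
    report("remove 2, add 0", old, List.of("s1", "s2"), List.of());
  }
}
```

Under these assumptions, "remove 2, add 2" is the only variant in which the new group cannot reach a majority without at least one newly added peer taking part, which lines up with it being the only variant where setConf kept failing. Whether that is the actual cause in the test would still need to be confirmed against the Ratis logs.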
