adoroszlai opened a new pull request #3214: URL: https://github.com/apache/ozone/pull/3214
## What changes were proposed in this pull request?

`testOMRestart` verifies that a follower OM catches up with the leader OM after the follower is restarted. It is flaky because it asserts that, right after the restart, the follower is still lagging behind the leader.

Passing case:

```
2022-03-18 22:21:54,045 [Listener at 127.0.0.1/60277] INFO ratis.OzoneManagerRatisServer (OzoneManagerRatisServer.java:start(554)) - Starting OzoneManagerRatisServer omNode-3 at port 60274
...
2022-03-18 22:21:54,341 [Listener at localhost/60277] INFO om.TestOzoneManagerHAWithData (TestOzoneManagerHAWithData.java:testOMRestart(477)) - ZZZ leader snapshot: 543
2022-03-18 22:21:54,341 [Listener at localhost/60277] INFO om.TestOzoneManagerHAWithData (TestOzoneManagerHAWithData.java:testOMRestart(482)) - ZZZ follower last applied after restart: 43
...
2022-03-18 22:21:54,477 [grpc-default-executor-2] INFO server.RaftServer$Division (ServerState.java:setLeader(285)) - omNode-3@group-523986131536: change Leader from null to omNode-1 at term 1 for appendEntries, leader elected after 428ms
...
2022-03-18 22:21:54,578 [omNode-3@group-523986131536-StateMachineUpdater] INFO impl.StateMachineUpdater (StateMachineUpdater.java:lambda$new$0(89)) - omNode-3@group-523986131536-StateMachineUpdater: snapshotIndex: updateIncreasingly 43 -> 542
```

Failing case:

```
2022-03-18 20:52:24,821 [Listener at 127.0.0.1/58092] INFO ratis.OzoneManagerRatisServer (OzoneManagerRatisServer.java:start(554)) - Starting OzoneManagerRatisServer omNode-3 at port 58089
...
2022-03-18 20:52:25,232 [grpc-default-executor-4] INFO server.RaftServer$Division (ServerState.java:setLeader(285)) - omNode-3@group-523986131536: change Leader from null to omNode-1 at term 1 for appendEntries, leader elected after 408ms
...
2022-03-18 20:52:25,376 [omNode-3@group-523986131536-StateMachineUpdater] INFO impl.StateMachineUpdater (StateMachineUpdater.java:lambda$new$0(89)) - omNode-3@group-523986131536-StateMachineUpdater: snapshotIndex: updateIncreasingly 43 -> 544
...
2022-03-18 20:52:25,497 [Listener at localhost/58092] INFO om.TestOzoneManagerHAWithData (TestOzoneManagerHAWithData.java:testOMRestart(477)) - ZZZ leader snapshot: 543
2022-03-18 20:52:25,498 [Listener at localhost/58092] INFO om.TestOzoneManagerHAWithData (TestOzoneManagerHAWithData.java:testOMRestart(482)) - ZZZ follower last applied after restart: 544
```

In both cases the follower caught up after the restart. Depending on timing, this happened either after the assertion (passing case) or before it (failing case). (Lines with `ZZZ` are temporary log messages added just before the assertion.)

This PR simply removes the flaky assertion, which is not essential for the test.

https://issues.apache.org/jira/browse/HDDS-6469

## How was this patch tested?

Repeated 100 times:
https://github.com/adoroszlai/hadoop-ozone/runs/5607104552

Regular CI:
https://github.com/adoroszlai/hadoop-ozone/runs/5607106443#step:4:8144
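
The race described in this pull request can be reproduced in isolation. The following is a minimal sketch, not the actual test code: `CatchUpSketch`, `waitFor`, and `followerCatchesUp` are hypothetical names, a plain thread stands in for the restarted follower, and only the indices 43 and 543 are borrowed from the logs. It is meant to show why a check for "follower is still behind" depends on thread timing, while polling for "follower has caught up" gives the same result either way.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; none of these names exist in Ozone.
final class CatchUpSketch {

  /** Polls the condition until it holds or the timeout expires. */
  static boolean waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  /**
   * Simulates a restarted follower that replays the log after an arbitrary
   * delay, then waits for it to reach the leader's index.
   */
  static boolean followerCatchesUp(long replayDelayMs) {
    final long leaderIndex = 543;                               // from the logs
    final AtomicLong followerLastApplied = new AtomicLong(43);  // from the logs

    Thread replay = new Thread(() -> {
      try {
        Thread.sleep(replayDelayMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      followerLastApplied.set(leaderIndex + 1);
    });
    replay.start();

    // Checking `followerLastApplied.get() < leaderIndex` at this point would
    // pass or fail depending on whether the replay thread has already run.
    // Polling for the end state does not depend on that ordering.
    boolean caughtUp =
        waitFor(() -> followerLastApplied.get() >= leaderIndex, 10, 5_000);

    try {
      replay.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return caughtUp;
  }

  public static void main(String[] args) {
    System.out.println("fast replay caught up: " + followerCatchesUp(0));
    System.out.println("slow replay caught up: " + followerCatchesUp(200));
  }
}
```

Both calls in `main` wait for the same end state, so the outcome does not depend on whether the simulated replay finishes before or after the check begins.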
