[GitHub] [ignite-3] ascherbakoff commented on a change in pull request #481: IGNITE-15705 WIP

GitBox Thu, 09 Dec 2021 07:42:29 -0800


ascherbakoff commented on a change in pull request #481:
URL: https://github.com/apache/ignite-3/pull/481#discussion_r765846260




##########
File path: 
modules/raft/src/main/java/org/apache/ignite/raft/jraft/core/NodeImpl.java
##########
@@ -1171,6 +1218,9 @@ private void electSelf() {
             }
             resetLeaderId(PeerId.emptyPeer(), new 
Status(RaftError.ERAFTTIMEDOUT,
                 "A follower's leader_id is reset to NULL as it begins to 
request_vote."));
+            if (!options.getRaftOptions().isStepDownWhenVoteTimedout()) {

Review comment:
       Why can't we have a single place for adjustElectionTimeout, for example 
before a preVote ?
   Also, I see no tests using setStepDownWhenVoteTimedout(false), so this logic 
is unclear.

##########
File path: 
modules/raft/src/main/java/org/apache/ignite/raft/jraft/core/NodeImpl.java
##########
@@ -606,7 +610,50 @@ private void handleElectionTimeout() {
             }
         }
     }
-
+    
+    /**
+     * Method that adjusts election timeout after several consecutive 
unsuccessful leader elections.
+     *
+     * Notes about general algorithm: the main idea is that in a stable 
cluster election timeout should be relatively small, but when cluster

Review comment:
       Raft doesn't know a bit about some "membership layer". The javadoc 
should be rewritten in terms of a raft node stability, like long GC pauses or 
OS freezes, preventing elections from a completion. Also, I would rewrite this 
to using strategy pattern, similar Ignite 2 
org.apache.ignite.spi.TimeoutStrategy. The current election tmieout should be 
the function of unsuccessful election rounds. Also it would be good to print 
election rounds in the log. This strategy must be configurable by NodeOptions. 
By default, noop strategy must be used.

##########
File path: modules/raft/src/main/java/org/apache/ignite/internal/raft/Loza.java
##########
@@ -67,7 +67,7 @@
 
     /** Network timeout. */
     // TODO: IGNITE-15705 Correct value should be investigated
-    private static final int NETWORK_TIMEOUT = 3000;
+    private static final int RPC_TIMEOUT = 3000;

Review comment:
       Let's rename TIMEOUT -> RETRY_TIMEOUT

##########
File path: 
modules/raft/src/main/java/org/apache/ignite/raft/jraft/option/NodeOptions.java
##########
@@ -49,6 +49,15 @@
     // Default: 1200 (1.2s)
     private int electionTimeoutMs = 1200; // follower to candidate timeout
 
+    // The upper bound of the election timeout adjusting. Must be more than 
timeout of a membership protocol to remove failed node from
+    // the cluster. In our case, we may assume that 11s could be enough as far 
as 11s is greater than suspicion timeout
+    // for the 1000 nodes cluster with ping interval equals to 500ms.
+    // See NodeIml#adjustElectionTimeout for more details about adjusting 
election timeouts.
+    public static final int ELECTION_TIMEOUT_MS_MAX = 11_000;
+    
+    // Max number of consecutive unsuccessful elections after which election 
timeout is adjusted.
+    public static final int MAX_ELECTION_ROUNDS_WITHOUT_ADJUSTING = 3;
+

Review comment:
       This looks really bad and must be refactored to strategy options.

##########
File path: 
modules/raft/src/integrationTest/java/org/apache/ignite/raft/jraft/core/ItNodeTest.java
##########
@@ -3330,6 +3330,134 @@ public void testBlockedElection() throws Exception {
         leader = cluster.getLeader();
         LOG.info("Elect new leader is {}, curTerm={}", leader.getLeaderId(), 
((NodeImpl) leader).getCurrentTerm());
     }
+    
+    @Test
+    public void testElectionTimeoutAutoAdjustWhenBlockedAllMesssages() throws 
Exception {
+        List<PeerId> peers = TestUtils.generatePeers(3);
+        cluster = new TestCluster("unittest", dataPath, peers,  testInfo);
+        
+        
+        for (PeerId peer : peers)
+            assertTrue(cluster.start(peer.getEndpoint()));
+        
+        cluster.waitLeader();
+        
+        Node leader = cluster.getLeader();
+        
+        int initElectionTimeout = leader.getOptions().getElectionTimeoutMs();
+        
+        LOG.warn("Current leader {}, electTimeout={}", 
leader.getNodeId().getPeerId(), leader.getOptions().getElectionTimeoutMs());
+        
+        List<Node> followers = cluster.getFollowers();
+    
+        for (Node follower : followers) {
+            NodeImpl follower0 = (NodeImpl) follower;
+    
+            assertEquals(initElectionTimeout, 
follower0.getOptions().getElectionTimeoutMs());
+        }
+        
+        for (Node follower : followers) {
+            NodeImpl follower0 = (NodeImpl) follower;
+            DefaultRaftClientService rpcService = (DefaultRaftClientService) 
follower0.getRpcClientService();
+            RpcClientEx rpcClientEx = (RpcClientEx) rpcService.getRpcClient();
+            rpcClientEx.blockMessages((msg, nodeId) -> true);
+        }
+        
+        LOG.warn("Stop leader {}, curTerm={}", leader.getNodeId().getPeerId(), 
((NodeImpl) leader).getCurrentTerm());
+
+        assertTrue(cluster.stop(leader.getNodeId().getPeerId().getEndpoint()));
+        
+        assertNull(cluster.getLeader());
+        
+        assertTrue(waitForCondition(() -> followers.stream().allMatch(f -> 
f.getOptions().getElectionTimeoutMs() > initElectionTimeout),
+                (long) NodeOptions.MAX_ELECTION_ROUNDS_WITHOUT_ADJUSTING
+                        // need to multiply to 2 because stepDown happens 
after voteTimer timeout
+                        * (initElectionTimeout + 
followers.get(0).getOptions().getRaftOptions().getMaxElectionDelayMs()) * 2));
+        
+        for (Node follower : followers) {
+            NodeImpl follower0 = (NodeImpl) follower;
+            DefaultRaftClientService rpcService = (DefaultRaftClientService) 
follower0.getRpcClientService();
+            RpcClientEx rpcClientEx = (RpcClientEx) rpcService.getRpcClient();
+            rpcClientEx.stopBlock();
+        }
+        
+        // elect new leader
+        cluster.waitLeader();
+        leader = cluster.getLeader();
+        
+        LOG.info("Elect new leader is {}, curTerm={}", leader.getLeaderId(), 
((NodeImpl) leader).getCurrentTerm());
+    
+        assertTrue(waitForCondition(() -> followers.stream().allMatch(f -> 
f.getOptions().getElectionTimeoutMs() == initElectionTimeout),
+                3_000));
+    }
+    
+    @Test
+    public void testElectionTimeoutAutoAdjustWhenBlockedRequestVoteMessages() 
throws Exception {

Review comment:
       This test must be rewritten to avoid copy&paste




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [ignite-3] ascherbakoff commented on a change in pull request #481: IGNITE-15705 WIP

Reply via email to