adoroszlai opened a new pull request #581:
URL: https://github.com/apache/ratis/pull/581


   ## What changes were proposed in this pull request?
   
   `RaftServerImpl` creates a new `FollowerState` for each vote request 
received.  The old `FollowerState` may trigger change to candidate role even 
after it was shutdown by the server.  Both `RaftServerImpl#requestVote` and the 
change to candidate by `FollowerState` requires lock on `server` 
(`RaftServerImpl` object).  The order of events which triggers the problem:
   
    * `RaftServerImpl` executing `requestVote` [with lock on 
itself](https://github.com/apache/ratis/blob/b57be797fe97c00abc0e3d7353d30e1b2961714d/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L1099)
    * `FollowerState` [checks conditions, then waits for lock on 
`server`](https://github.com/apache/ratis/blob/b57be797fe97c00abc0e3d7353d30e1b2961714d/ratis-server/src/main/java/org/apache/ratis/server/impl/FollowerState.java#L126-L130)
    * `RaftServerImpl` [shuts down old `FollowerState`, starts new 
one](https://github.com/apache/ratis/blob/b57be797fe97c00abc0e3d7353d30e1b2961714d/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L475-L477)
    * `RaftServerImpl` [sets last RPC time on new 
`FollowerState`](https://github.com/apache/ratis/blob/b57be797fe97c00abc0e3d7353d30e1b2961714d/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L1116-L1117)
    * `RaftServerImpl` [releases 
lock](https://github.com/apache/ratis/blob/b57be797fe97c00abc0e3d7353d30e1b2961714d/ratis-server/src/main/java/org/apache/ratis/server/impl/RaftServerImpl.java#L1127)
    * *old* `FollowerState` [enters block with lock on `server`, triggers 
change to 
candidate](https://github.com/apache/ratis/blob/b57be797fe97c00abc0e3d7353d30e1b2961714d/ratis-server/src/main/java/org/apache/ratis/server/impl/FollowerState.java#L130-L138)
   
   Let `FollowerState` check if it is still running after entering the 
`synchronized` block.
   
   https://issues.apache.org/jira/browse/RATIS-1490
   
   ## How was this patch tested?
   
   Tweaked `FollowerState`, `RaftServerImpl` and `LeaderElectionTests` with 
code injection.  It required too much test-specific code, even in 
`synchronized` blocks, so I think it should not be part of this patch.  Ideas 
for unit tests welcome.
   
   Regular CI:
   https://github.com/adoroszlai/incubator-ratis/runs/4845475132


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to