Hi,

I'm currently integrating Ratis as the consensus backbone of Apache IoTDB, and I have run into a strange situation that drives the system into a livelock:
My original configuration contains a single group (1) with a single member (1), which is necessarily the leader. I now want to add a new member (follower 2) to this group, and I implemented it as follows:

    Client.getGroupManagementApi(2).add(group(1));
    Client.admin().setConfiguration([1,2]);

I then observed the following sequence of events, which leads to the livelock:

1. addGroup succeeds, and follower 2's lifecycle is STARTING.
2. Leader 1 sends the latest snapshot to follower 2. This snapshot contains the **old conf [1]**.
3. Follower 2 installs the snapshot successfully, discovers that it is not included in the conf, and changes its lifecycle to CLOSE.
4. Leader 1 receives the installSnapshot reply, appends the new conf [1,2] to the log, and applies it.
5. Since follower 2 is closed, leader 1 steps down to follower with reason LOST_MAJORITY_HEARTBEATS, and the group can no longer serve requests.

Am I using the groupManagementApi or the adminApi incorrectly? How can I solve this problem?

William Song
Apache IoTDB
