To clarify: there are always ways to block the leader's inbound and outbound traffic manually or with a script.
Do you think Ratis could offer something that makes it easier for applications to write tests for the network partitioning case? Should Ratis offer such a mechanism?

-Rui

On Wed, Sep 9, 2020 at 11:22 AM Rui Wang <[email protected]> wrote:

> Hi community,
>
> The Ozone SCM HA [1] is happening. Ozone SCM HA utilizes Ratis to build
> its consensus on states. When working on it, one of the hard problems I
> found is split-brain, in which two leaders co-exist, so SCM HA needs to
> deal with stale commands from the old leader.
>
> One of the challenges is how to simulate network partitioning so we can
> write meaningful tests to verify the implementation of dealing with stale
> commands. This probably will require:
>
> 1. Have a config to make the old leader never turn into a candidate (e.g.
> increase the timeout of re-election)
> 2. Have a way to block the in/out communication of the leader, thereby
> creating a network partitioning case.
>
> Item 1 should work easily. Do you know how to tackle item 2?
>
>
> [1]: https://issues.apache.org/jira/browse/HDDS-2823
>
>
> -Rui
>
