To clarify: there are always ways to block the leader's inbound and outbound
traffic manually or with a script.

Do you think Ratis could offer something that makes it easier for
applications to write tests for the network-partitioning case? Should Ratis
offer such a mechanism?
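
For illustration, the kind of hook I have in mind could look roughly like the
sketch below. This is only a sketch under my own assumptions: PartitionInjector
and its methods are hypothetical names, not existing Ratis APIs, and peers are
identified by plain strings.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical test helper (not part of Ratis): tracks which peers are cut off. */
final class PartitionInjector {
  // Peers whose inbound and outbound messages should be dropped.
  private final Set<String> isolated = ConcurrentHashMap.newKeySet();

  /** Cut the given peer off from every other peer. */
  void isolate(String peerId) {
    isolated.add(peerId);
  }

  /** Reconnect a previously isolated peer. */
  void heal(String peerId) {
    isolated.remove(peerId);
  }

  /** Returns true if a message from sender to receiver may be delivered. */
  boolean canDeliver(String sender, String receiver) {
    return !isolated.contains(sender) && !isolated.contains(receiver);
  }
}
```

A test transport would call canDeliver before forwarding each RPC and drop the
message when it returns false, so a test could isolate the current leader,
wait for a new one to be elected, and then heal the partition.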


-Rui


On Wed, Sep 9, 2020 at 11:22 AM Rui Wang <[email protected]> wrote:

> Hi community,
>
> The Ozone SCM HA [1] work is under way. Ozone SCM HA uses Ratis to reach
> consensus on its state. While working on it, one of the hard problems I
> found is split-brain, in which two leaders coexist, so SCM HA needs to deal
> with stale commands from the old leader.
>
> One of the challenges is how to simulate a network partition so that we can
> write meaningful tests to verify how the implementation deals with stale
> commands. This will probably require:
>
> 1. A config that keeps the old leader from ever turning into a candidate
> (e.g. by increasing the re-election timeout).
> 2. A way to block the leader's inbound and outbound communication, thereby
> creating a network partition.
>
> Item 1 should be easy. Do you know how to tackle item 2?
>
>
> [1]: https://issues.apache.org/jira/browse/HDDS-2823
>
>
> -Rui
>
