Re: brain split causes both nodes to be LIVE in a replicated-failback-static setup

Justin Bertram Wed, 06 Mar 2024 08:49:05 -0800

Do you have any mitigation in place for split brain? Typically you'd use
ZooKeeper with a single primary/backup pair of brokers. Otherwise you'd
need 3 primary/backup pairs to establish a proper quorum.
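
As a rough illustration of why a single pair cannot form a quorum on its
own while three voters can, here is a minimal Python sketch. It models only
strict-majority voting across a two-way network partition. It is not how
Artemis implements its quorum vote, and the function names are hypothetical.

```python
def has_quorum(reachable_voters: int, total_voters: int) -> bool:
    """Strict majority: more than half of all voters must be reachable."""
    return reachable_voters > total_voters // 2


def partition_outcome(total_voters: int, side_a: int) -> tuple[bool, bool]:
    """Return whether each side of a two-way partition still has quorum."""
    side_b = total_voters - side_a
    return has_quorum(side_a, total_voters), has_quorum(side_b, total_voters)


# Two voters split 1/1: neither side holds a majority, so neither side can
# safely decide who should be live.
print(partition_outcome(2, 1))  # (False, False)

# Three voters split 2/1: exactly one side holds a majority.
print(partition_outcome(3, 2))  # (True, False)
```

With only two voters, a broker cannot tell a dead partner from a network
partition, which is the gap that ZooKeeper or additional pairs are meant to
close.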


To be clear, once split brain occurs, administrative intervention is
required to resolve the situation. The brokers by themselves can't
determine which broker has more up-to-date data, so they can't
automatically decide which broker should take over.


Justin

On Wed, Mar 6, 2024 at 8:11 AM Simon Valter <si...@valter.info> wrote:

> I'd like to hear your thoughts on this.
>
> My setup is as follows:
>
> I have a setup similar to the replicated-failback-static example
>
> I run the following version: apache-artemis-2.30.0
>
> JDK is Java 17
>
> It's on 2 nodes running Windows Server 2022 (I have 3 environments, and
> it happened across them all at different times. Currently I have kept 1
> environment in this state; sadly it's not in DEBUG)
>
> SSL transport is in use
>
> Nodes are placed in the same subnet on VMware infrastructure
>
> NTP/time is in sync on the nodes
>
> The ActiveMQ service has not been restarted for 84 days; after 2 days of
> uptime this happened:
>
> After a split brain, replication stopped and both brokers are LIVE. They
> can see each other and are connected again, but failback did not happen.
>
> I have tested and seen failback happen previously but this exact scenario
> seems to have caused some bad state?
>
> Logs and screenshots showcasing the issue have been attached.
>
