[ 
https://issues.apache.org/jira/browse/ARTEMIS-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416643#comment-15416643
 ] 

Miroslav Novak commented on ARTEMIS-473:
----------------------------------------

I've created a new JIRA issue for the description: ARTEMIS-679 - Activate most
up to date server from master-slave(live-backup) pair.

If split brain happens, there is not much Artemis can do about it. Still, it
can recover from some quite common cases. Basically, three situations can arise
when split brain happens (i.e. master and slave are active at the same time):

a) Clients do not lose their connection to the master and stay connected to it.
b) Clients lose their connection to the master and fail over to the backup.
c) Clients lose their connections to the master and the slave at the same time.
They will try to reconnect to either the master or the slave.

I believe that in situations a) and b) Artemis can recover once the network is
reconnected. At the moment the master and the slave notice that they are both
active at the same time, they check which of them has external (non-in-VM)
connections. The server without external client connections restarts, because
only the server with clients has the up-to-date journal.
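
The rule above can be sketched in a few lines of Java. This is only an
illustration of the proposal, not Artemis code: the class, enum and method
names are hypothetical, and the connection counts are assumed to have been
exchanged between the two servers already. The case where neither server has
external clients is not covered by the proposal, so the sketch simply reports
it as unresolved.

```java
// Hypothetical sketch of the proposed rule; none of these names are Artemis APIs.
final class SplitBrainDecision {

    enum Action { KEEP_RUNNING, RESTART, UNRESOLVED }

    /**
     * Decides what the local server should do once it notices that its peer
     * is active at the same time.
     *
     * @param localExternalClients number of external (non-in-VM) client
     *                             connections on this server
     * @param peerExternalClients  number of external (non-in-VM) client
     *                             connections on the other server
     */
    static Action decide(int localExternalClients, int peerExternalClients) {
        boolean localHasClients = localExternalClients > 0;
        boolean peerHasClients = peerExternalClients > 0;

        if (localHasClients && peerHasClients) {
            // Situation c): clients reconnected to both servers, so both
            // journals may have diverged and no automatic choice is made.
            return Action.UNRESOLVED;
        }
        if (localHasClients) {
            // Only this server has clients, so it holds the up-to-date journal.
            return Action.KEEP_RUNNING;
        }
        if (peerHasClients) {
            // Only the peer has clients, so this server restarts.
            return Action.RESTART;
        }
        // Assumption: neither server has external clients. The proposal does
        // not say what to do here, so the sketch leaves it unresolved.
        return Action.UNRESOLVED;
    }
}
```

For example, in situation b) the master would call decide(0, n) with n > 0
and restart, while the slave would call decide(n, 0) and keep running.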

Option c) is problematic: clients can connect to either the master or the
slave, so in this case there is nothing Artemis can do. wdyt?

> Resolve split brain data after split brains scenarios.
> ------------------------------------------------------
>
>                 Key: ARTEMIS-473
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-473
>             Project: ActiveMQ Artemis
>          Issue Type: New Feature
>          Components: Broker
>    Affects Versions: 1.2.0
>            Reporter: Miroslav Novak
>            Priority: Critical
>
> If a master-slave pair is configured using a replicated journal and there are
> no other servers in the cluster, then the slave will activate if the network
> between master and slave is broken. Depending on whether clients were
> disconnected from the master or not, there may or may not be a failover to
> the slave. The problem occurs at the moment the network between master and
> slave is restored: master and slave are active at the same time, which is the
> split brain syndrome. Currently there is no recovery mechanism to resolve this
> situation.
> Suggested improvement: If clients failed over to the slave, then the master
> will restart itself so that failback occurs (if configured). If clients did
> not fail over and stayed connected to the master, then the backup will restart
> itself.



