Bob Mitchell created ARTEMIS-2568:
-------------------------------------

             Summary: Race condition between failover processing and master 
restart can cause split brain
                 Key: ARTEMIS-2568
                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2568
             Project: ActiveMQ Artemis
          Issue Type: Bug
    Affects Versions: 2.10.1
            Reporter: Bob Mitchell


We have seen split brain in the following sequence of events when using 
replicating backups with failback:
 # Master fails or is shut down
 # Backup detects failure and starts to failover
 # Master is restarted before Backup becomes "live"
 # The master's check for a "duplicate" server finds none because the backup is not live yet
 # Master and backup both become live.

At the very least, we would like the window in which this can occur to be 
reduced, possibly by having the backup check again whether the master is 
available just before going live.  It might also be necessary to have the 
master check for a duplicate server as a last step before going live.
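
To make the sequence and the proposed extra check concrete, here is a minimal, 
single-threaded sketch. It is not Artemis code: the Node class, activate() and 
splitBrain() are hypothetical names made up for illustration, and the "peer is 
live" probe is modelled as a plain flag rather than a real network check.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

public class FinalLiveCheckSketch {

    /** Hypothetical model of a broker node; not an Artemis class. */
    static final class Node {
        final AtomicBoolean live = new AtomicBoolean(false);

        /**
         * Runs activation. The peer is always probed once up front. When
         * finalCheck is true it is probed again as the last step before
         * this node marks itself live.
         */
        boolean activate(BooleanSupplier peerIsLive, Runnable startup, boolean finalCheck) {
            if (peerIsLive.getAsBoolean()) {
                return false;              // peer already live: stand down
            }
            startup.run();                 // the window in which the peer can restart
            if (finalCheck && peerIsLive.getAsBoolean()) {
                return false;              // peer went live in the meantime: stand down
            }
            live.set(true);
            return true;
        }
    }

    /** Replays the reported sequence and returns true if both nodes end up live. */
    static boolean splitBrain(boolean finalCheck) {
        Node master = new Node();
        Node backup = new Node();
        // The backup starts failing over. While it is still starting up, the
        // master is restarted, sees no live backup, and goes live itself.
        backup.activate(
            master.live::get,
            () -> master.activate(backup.live::get, () -> { }, finalCheck),
            finalCheck);
        return master.live.get() && backup.live.get();
    }

    public static void main(String[] args) {
        System.out.println("without final check: split brain = " + splitBrain(false));
        System.out.println("with final check:    split brain = " + splitBrain(true));
    }
}
```

In this model the final check lets the backup notice the restarted master and 
stand down. The check and the transition to live are still two separate steps, 
so this only narrows the window; it does not close it.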



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
