[ https://issues.apache.org/jira/browse/ARTEMIS-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17072483#comment-17072483 ]
Francesco Nigro edited comment on ARTEMIS-2568 at 4/3/20, 6:31 AM:
-------------------------------------------------------------------
> Sounds a bit more alarming to me to be honest, but this is likely not the
> correct ticket to have that discussion on.
Better to create a new issue for this, I suppose. Anyway, if there is a
connectivity loss and you're not using a cluster of at least 3 nodes, the quorum
vote on the slave (while it is a backup) cannot work as expected and you risk
split brain, because no other node can legitimately confirm the failover.
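
To illustrate the point about cluster size, here is a minimal sketch of a generic strict-majority rule. It is not the actual Artemis quorum vote implementation; the function name `has_majority` and its parameters are hypothetical and only serve to show why a lone backup in a 2-node pair cannot get a failover confirmed after losing connectivity.

```python
def has_majority(cluster_size: int, reachable_voters: int) -> bool:
    """Return True if the reachable voters form a strict majority.

    cluster_size     -- total number of nodes that are allowed to vote
    reachable_voters -- nodes the caller can still reach, counting itself
    (Both names are assumptions made for this sketch.)
    """
    return reachable_voters > cluster_size // 2


# 2-node pair: after a connectivity loss the backup reaches only itself.
# It cannot tell "master is dead" from "link is down", and no one can confirm.
print(has_majority(cluster_size=2, reachable_voters=1))

# 3-node cluster: the backup plus one other node is enough to confirm.
print(has_majority(cluster_size=3, reachable_voters=2))

# 3-node cluster: a fully isolated backup must not fail over.
print(has_majority(cluster_size=3, reachable_voters=1))
```

Under this rule, a backup that activates anyway in the 2-node case does so without confirmation from any other node, which is where the split-brain risk comes from.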
was (Author: [email protected]):
> Sounds a bit more alarming to me to be honest, but this is likely not the
> correct ticket to have that discussion on.
Better to create a new issue for this, I suppose. Anyway, if there is a
connectivity loss and you're not using at least 3 lives, the quorum vote on the
slave while it is a backup won't work as expected and you risk split brain,
because no other node can legitimately confirm the failover.
> Race condition between failover processing and master restart can cause split
> brain
> -----------------------------------------------------------------------------------
>
> Key: ARTEMIS-2568
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2568
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Affects Versions: 2.10.1
> Reporter: Bob Mitchell
> Priority: Major
>
> We have seen split brain in the following sequence of events when using
> replicating backups with failback:
> # Master fails or is shut down
> # Backup detects failure and starts to failover
> # Master is restarted before Backup becomes "live"
> # Its check for a "duplicate" server finds nothing because backup is not live yet
> # Master and backup both become live.
> At the very least, we would like to see the window in which this can occur
> reduced, possibly by having the backup check again whether the master is
> available just before going live. It might also be necessary to have the
> master check for a duplicate server as a last step before going live.
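
The sequence in the description, and the proposed re-check just before going live, can be replayed with a small deterministic sketch. This is a toy model, not Artemis code: the `Node` class, the state names ("down", "activating", "live") and the `replay` function are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    state: str = "down"  # hypothetical states: "down", "activating", "live"


def replay(recheck_before_live: bool) -> list:
    """Replay the reported sequence and return the names of the live nodes."""
    master = Node("master", "live")
    backup = Node("backup", "down")

    # 1. Master fails or is shut down.
    master.state = "down"
    # 2. Backup detects the failure and starts to fail over.
    backup.state = "activating"
    # 3. Master is restarted before the backup becomes live.
    # 4. Its duplicate check only looks for a *live* peer, so it finds nothing.
    if backup.state != "live":
        master.state = "live"
    # 5. Backup finishes failover, optionally re-checking for the master first.
    if recheck_before_live and master.state != "down":
        backup.state = "down"  # abort the failover: the master is back
    else:
        backup.state = "live"

    return [n.name for n in (master, backup) if n.state == "live"]


print(replay(recheck_before_live=False))  # ['master', 'backup']
print(replay(recheck_before_live=True))   # ['master']
```

Note that in a real system the re-check only narrows the window, as the description says: the master could still restart between the backup's final check and the moment it goes live.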