[ https://issues.apache.org/jira/browse/ARTEMIS-2716?focusedWorklogId=606017&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-606017 ]
ASF GitHub Bot logged work on ARTEMIS-2716:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 03/Jun/21 15:49
            Start Date: 03/Jun/21 15:49
    Worklog Time Spent: 10m
      Work Description: franz1981 commented on pull request #3555:
URL: https://github.com/apache/activemq-artemis/pull/3555#issuecomment-853974348

I believe that an epoch/version marking both the journal (or replicated journal) and each live role acquisition (hence something that should live on ZK, indexed by the NodeID?) is key to preventing any journal misalignment. That is even more important than classic split-brain: as @michaelpearce-gain already mentioned, the distributed lock should already prevent split-brain from happening.

Happy that you have both raised a similar/same idea: this makes me more convinced that something like that is the way to go.

Re this PR: I really wanted to provide the above "enhancement" in a later phase, as a separate improvement, while here I'm "just" trying to abstract away the quorum algorithm + provide an RI that allows a single pair to work fine + address some dark corners of the state diagram in the "classic" native quorum replication.

The loops I've mentioned above are my fault, a result of trying to fill some of the gaps in the existing replication state transitions.

A brief summary here:
- *classic*: the master starts by searching for an existing live node with the same node ID to pair with and, if none is around, just starts straight away as live -> this is an easy cause of split brain under a network partition
- *now*: the primary's start involves a loop: it first searches for an existing live node with the same node ID to pair with and, if none is around, tries just once to acquire the live lock before starting as live; if the live lock acquisition fails, it goes back to searching for existing live nodes, and so on

I think that this loop isn't needed and that the sequence could be performed just once, i.e. search for other lives and, if none is found, try to acquire the live lock (for some time) and, if that fails, just exit.
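
To make the single-pass proposal concrete, here is a rough Java sketch. It is not code from the PR: the class, the `Outcome` enum, and the two `BooleanSupplier` probes are hypothetical stand-ins for "search for a live with the same node ID" and "try to acquire the live lock for a bounded time".

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of a one-shot primary start (not the PR's API). */
final class SinglePassStart {

    /** Possible results of a single start attempt. */
    enum Outcome { PAIRED_WITH_LIVE, STARTED_AS_LIVE, GAVE_UP }

    /**
     * Runs the start sequence exactly once, with no retry loop.
     *
     * @param liveWithSameNodeIdFound assumed probe: true if a live broker
     *                                with the same node ID was found
     * @param liveLockAcquired        assumed bounded attempt to take the
     *                                distributed live lock
     */
    static Outcome start(BooleanSupplier liveWithSameNodeIdFound,
                         BooleanSupplier liveLockAcquired) {
        // Step 1: search for an existing live to pair with.
        if (liveWithSameNodeIdFound.getAsBoolean()) {
            return Outcome.PAIRED_WITH_LIVE;
        }
        // Step 2: no live around, so try the live lock (bounded wait).
        if (liveLockAcquired.getAsBoolean()) {
            return Outcome.STARTED_AS_LIVE;
        }
        // Step 3: neither worked, so exit instead of looping back to step 1.
        return Outcome.GAVE_UP;
    }

    public static void main(String[] args) {
        // An isolated broker: no live found and no lock acquired.
        System.out.println(start(() -> false, () -> false)); // prints GAVE_UP
    }
}
```

Because each step runs at most once, the start terminates even when the broker can reach neither a live node nor the lock service.
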
A single pass would spare a completely isolated broker from retrying forever to acquire the lock and/or search for other lives. wdyt?

Another "hidden" loop:
- *classic*: during a master fail-back failure (see `SharedNothingBackupQuorum.BACKUP_ACTIVATION.FAILURE_REPLICATING` in the code) the broker is restarted as a pure backup (no longer able to fail back!), deleting its data folder too, including the node ID. That's dangerous because deleting the data folder causes the master to lose its identity (the NodeID is going to change or restart), and if the node is manually restarted it will search for a non-existing node ID and be able to start (SPLIT BRAIN served!).
- *new*: during a primary fail-back failure the broker is restarted as primary with its node ID preserved (so it loops, searching for existing lives to pair with or trying to acquire the live lock if none is found, as in the loop mentioned above).

The new approach is better because the node ID is preserved, so restarting a primary node while a backup is acting as live can never cause a split brain. Still, restarting as a primary while isolated can cause the start process to last forever, whereas a failback should be an all-or-nothing mechanism IMO: the primary should try to fail back to the existing live and just fail if it cannot, without "restarting". wdyt?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 606017)
    Time Spent: 12h  (was: 11h 50m)

> Implements pluggable Quorum Vote
> --------------------------------
>
>                 Key: ARTEMIS-2716
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2716
>             Project: ActiveMQ Artemis
>          Issue Type: New Feature
>            Reporter: Francesco Nigro
>            Assignee: Francesco Nigro
>            Priority: Major
>         Attachments: backup.png, primary.png
>
>          Time Spent: 12h
>  Remaining Estimate: 0h
>
> This task aims to deliver a new Quorum Vote mechanism for Artemis with the
> objectives:
> # to make it pluggable
> # to cleanly separate the election phase and the cluster member states
> # to simplify the most common setups in both amount of configuration and
> requirements (e.g. "witness" nodes could be implemented to support single
> master-slave pairs)
> Post-actions to help people adopt it, which need to be thought through upfront:
> # a clean upgrade path for current HA replication users
> # deprecate or integrate the current HA replication into the new version

--
This message was sent by Atlassian Jira
(v8.3.4#803005)