[jira] [Commented] (ARTEMIS-2716) Implements pluggable Quorum Vote
[ https://issues.apache.org/jira/browse/ARTEMIS-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17455302#comment-17455302 ]

ASF subversion and git services commented on ARTEMIS-2716:
-----------------------------------------------------------

Commit de7a1805a4918c18dd911083f4f44972adb4645a in activemq-artemis's branch refs/heads/main from gtully
[ https://gitbox.apache.org/repos/asf?p=activemq-artemis.git;h=de7a180 ]

ARTEMIS-2716 - fix up test regression in OpenWireProtocolManagerTest

> Implements pluggable Quorum Vote
> --------------------------------
>
>                 Key: ARTEMIS-2716
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2716
>             Project: ActiveMQ Artemis
>          Issue Type: New Feature
>            Reporter: Francesco Nigro
>            Assignee: Francesco Nigro
>            Priority: Major
>             Fix For: 2.18.0
>
>         Attachments: backup.png, primary.png
>
>          Time Spent: 28h 50m
>  Remaining Estimate: 0h
>
> This task aims to deliver a new Quorum Vote mechanism for Artemis with the
> objectives:
> # to make it pluggable
> # to cleanly separate the election phase and the cluster member states
> # to simplify the most common setups in both amount of configuration and
> requirements (e.g. "witness" nodes could be implemented to support single
> master-slave pairs)
> Post-actions that would help people adopt it, and that need to be thought
> through upfront:
> # a clean upgrade path for current HA replication users
> # deprecate or integrate the current HA replication into the new version

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
[jira] [Commented] (ARTEMIS-2716) Implements pluggable Quorum Vote
[ https://issues.apache.org/jira/browse/ARTEMIS-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17394249#comment-17394249 ]

ASF subversion and git services commented on ARTEMIS-2716:
-----------------------------------------------------------

Commit 536271485f1b19df9c1c71089fe1e0814a309e0e in activemq-artemis's branch refs/heads/main from Francesco Nigro
[ https://gitbox.apache.org/repos/asf?p=activemq-artemis.git;h=5362714 ]

ARTEMIS-2716 Pluggable Quorum Vote
[jira] [Commented] (ARTEMIS-2716) Implements pluggable Quorum Vote
[ https://issues.apache.org/jira/browse/ARTEMIS-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363673#comment-17363673 ]

Francesco Nigro commented on ARTEMIS-2716:
------------------------------------------

I've decided that a failing fail-back is going to fail fast (because it's supposed to be an admin operation), whereas a failure during a regular failover will keep the primary broker (acting as backup) alive and make it retry to sync with another (or the same) live broker.

This should cover the need for symmetric behaviour between a natural-born backup and a primary acting as a backup (paired with a backup acting as live that is configured with {{allow-failback == false}}).
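The rule in the comment above can be sketched as a small decision function. This is only an illustration under stated assumptions: the class, enum and method names are hypothetical and are not part of the Artemis code base, and the scripted list of sync outcomes stands in for real replication attempts.

```java
import java.util.Iterator;

// Hypothetical sketch: none of these names exist in Artemis.
final class SyncFailurePolicySketch {
    enum Action { STOP_BROKER, RETRY_AS_BACKUP }
    enum Result { IN_SYNC, STOPPED }

    // A failed fail-back is an admin operation, so it fails fast;
    // a failed regular failover keeps the broker alive as a backup and retries.
    static Action onSyncFailure(boolean failingBack) {
        return failingBack ? Action.STOP_BROKER : Action.RETRY_AS_BACKUP;
    }

    // Drives the policy over scripted sync outcomes (true = sync succeeded).
    static Result activate(boolean failingBack, Iterator<Boolean> syncAttempts) {
        while (syncAttempts.hasNext()) {
            if (syncAttempts.next()) {
                return Result.IN_SYNC;
            }
            if (onSyncFailure(failingBack) == Action.STOP_BROKER) {
                return Result.STOPPED;
            }
            // RETRY_AS_BACKUP: loop and try to sync with another (or the same) live
        }
        // The script ran out; a real broker would simply keep retrying.
        throw new IllegalStateException("no more scripted sync attempts");
    }
}
```

With the outcomes "fail, then succeed", a failing-back primary stops after the first failure, while a primary acting as backup during a regular failover retries and ends up in sync.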
[jira] [Commented] (ARTEMIS-2716) Implements pluggable Quorum Vote
[ https://issues.apache.org/jira/browse/ARTEMIS-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362898#comment-17362898 ]

Francesco Nigro commented on ARTEMIS-2716:
------------------------------------------

I'm going to:
# remove the initial loop on primary start: a primary start that checks for a live server isn't supposed to be a non-admin automatic restart, especially if the broker was stopped because of some failure. The state of the journal/machine needs inspection, and an admin would likely prefer an all-or-nothing (re)start operation, i.e. start the broker and await failback (or just become a backup)
# deprecate allow-failback: allow-failback == false turns a failing-back primary into a backup that can just error out on failover errors

NOTE: For the latter, in the classic replication a master acting as a backup just forgets its Node ID if any error happens during a failover and restarts as an empty backup. It's dangerous because, on broker restart, the broker gets a different NodeID and can always succeed in becoming live (!!).
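The hazard described in the NOTE above can be illustrated with a toy model. It assumes, purely for illustration, that the right to become live is arbitrated per Node ID; the class and method names are hypothetical and do not mirror the actual replication code.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical toy model: liveness is granted at most once per Node ID.
final class NodeIdHazardSketch {
    private final Set<String> nodeIdsWithLive = new HashSet<>();

    // Returns true only if no broker is live yet for this Node ID.
    boolean tryBecomeLive(String nodeId) {
        return nodeIdsWithLive.add(nodeId);
    }

    // Restart that keeps the persisted Node ID: the existing live is noticed.
    boolean restartKeepingNodeId(String persistedNodeId) {
        return tryBecomeLive(persistedNodeId);
    }

    // Restart after forgetting the Node ID: a fresh ID never collides,
    // so the broker always wins, even though its old pair is still live.
    boolean restartWithFreshNodeId() {
        return tryBecomeLive(UUID.randomUUID().toString());
    }
}
```

In this model a broker that restarts with its original Node ID is refused while its pair is live, whereas a broker that restarts with a new Node ID is always accepted, which is the situation the NOTE warns about.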
[jira] [Commented] (ARTEMIS-2716) Implements pluggable Quorum Vote
[ https://issues.apache.org/jira/browse/ARTEMIS-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332115#comment-17332115 ]

Francesco Nigro commented on ARTEMIS-2716:
------------------------------------------

I've just noticed an issue with the existing replication (so it partially affects the new one too) during the failover phase: a slave that is failing over will connect to itself(!) and then send periodic PINGs to itself. It looks like a bug, so I'm going to open a separate issue for it.
[jira] [Commented] (ARTEMIS-2716) Implements pluggable Quorum Vote
[ https://issues.apache.org/jira/browse/ARTEMIS-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17332071#comment-17332071 ]

Francesco Nigro commented on ARTEMIS-2716:
------------------------------------------

Part of the changes needed to make this work better than the existing replication is to change the default value of [https://activemq.apache.org/components/artemis/documentation/latest/connection-ttl.html], so that a backup detects an unresponsive live broker sooner and attempts to fail over.
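To see why a lower TTL speeds up failover, here is a deliberately simplified timing model. It is an assumption for illustration, not the broker's actual algorithm: a peer is declared dead at the first periodic check that runs at or after "last time heard + TTL". All names and the millisecond values used below are illustrative.

```java
// Hypothetical timing model; not taken from the Artemis sources.
final class DetectionDelaySketch {
    // Checks are assumed to run at multiples of checkPeriodMillis.
    // Returns the time of the first check at or after lastHeard + ttl.
    static long detectionTime(long lastHeardMillis, long ttlMillis, long checkPeriodMillis) {
        long deadline = lastHeardMillis + ttlMillis;
        long ticks = (deadline + checkPeriodMillis - 1) / checkPeriodMillis; // ceiling division
        return ticks * checkPeriodMillis;
    }
}
```

In this model, a live broker last heard at 1000 ms with a 60000 ms TTL and a 30000 ms check period is detected as dead at 90000 ms; with a 10000 ms TTL and a 5000 ms check period the same failure is detected at 15000 ms.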
[jira] [Commented] (ARTEMIS-2716) Implements pluggable Quorum Vote
[ https://issues.apache.org/jira/browse/ARTEMIS-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17331909#comment-17331909 ]

Francesco Nigro commented on ARTEMIS-2716:
------------------------------------------

These are drafts of the detailed state diagrams of both activation roles (primary/backup), which really are 1:1(ish) with the existing master-slave.

Primary:

!primary.png!

And backup (which includes the failback case too):

!backup.png!

These will change over time, but they give a broad idea of how the process has changed compared with the existing replication mechanism. It's similar, but it won't leave any of the states undefined or unhandled.
[jira] [Commented] (ARTEMIS-2716) Implements pluggable Quorum Vote
[ https://issues.apache.org/jira/browse/ARTEMIS-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17084722#comment-17084722 ]

Francesco Nigro commented on ARTEMIS-2716:
------------------------------------------

The previous objectives could be translated, in more detail, into:
# preserve compatibility at the configuration level (it's ok to add new things, while trying hard not to deprecate anything)
# it should cover the same cases as the original quorum vote
# it should cover some additional cases, i.e. 1 pair (+ 1 witness "node", not a full broker)

At a lower level:
# clients shouldn't change: this means it will reuse the same "gossip protocol", based on Topology propagation, that is used now
# the mechanism used to pair the backup with the live (which includes both the discovery process and the synchronization with it) shouldn't change

A proposal for the next steps to get there:
# abstract away the current quorum vote: this requires extra care because the logic is mixed together with the replication/clustering behaviour
# refactor it in order to separate the election phase and the cluster member states
# design a new generic API that should just cover the higher-level functionalities provided by the previous implementation
# choose/discuss a possible (consensus algorithm) provider that could be used to implement a first POC (maybe as a first step to get a reference implementation, RI)
# implement a RI version
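As a rough idea of what "a new generic API" limited to the election phase could look like, here is a minimal sketch. Everything in it is hypothetical: the interface, its method names and the in-memory "witness" are assumptions made for illustration and are not the API that was eventually designed for this issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a pluggable election API; not the real Artemis API.
final class PluggableQuorumSketch {

    // Plug-in point: decides who is live for a Node ID and knows nothing
    // about replication, discovery or cluster member states.
    interface LiveElection {
        boolean tryBecomeLive(String nodeId, String brokerName);
        void resign(String nodeId, String brokerName);
    }

    // Toy "witness" for a single pair: remembers the live broker per Node ID.
    static final class InMemoryWitness implements LiveElection {
        private final Map<String, String> liveByNodeId = new ConcurrentHashMap<>();

        @Override
        public boolean tryBecomeLive(String nodeId, String brokerName) {
            String holder = liveByNodeId.putIfAbsent(nodeId, brokerName);
            return holder == null || holder.equals(brokerName);
        }

        @Override
        public void resign(String nodeId, String brokerName) {
            // Only the current holder can give up the live role.
            liveByNodeId.remove(nodeId, brokerName);
        }
    }
}
```

In this sketch the primary wins the election first, the backup is refused while the primary holds the role, and the backup can take over once the primary resigns. A different provider (for example one backed by a consensus service) would implement the same interface without touching the replication code.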