[
https://issues.apache.org/jira/browse/QPID-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alan Conway resolved QPID-4360.
-------------------------------
Resolution: Fixed
Committed on trunk
------------------------------------------------------------------------
r1394706 | aconway | 2012-10-05 14:21:45 -0400 (Fri, 05 Oct 2012) | 10 lines
QPID-4360: Non-ready HA broker can be incorrectly promoted to primary
A joining broker now attempts to contact all known members of the cluster and
check their status. If any brokers are in a state other than "joining" the
broker will refuse to promote. This will allow rgmanager to continue to try
addresses until it finds a ready broker.
Note this requires ha-brokers-url to be a list of all known brokers, not a
virtual IP. ha-public-url can still be a VIP.
------------------------------------------------------------------------
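The promotion check described in the commit message can be sketched roughly as
follows. This is an illustrative Python sketch only, not the actual qpidd C++
code: the function name may_promote, the get_status callback, and the broker
addresses are hypothetical. It also assumes that an unreachable peer (status
None) does not block promotion, which the commit message does not state.

```python
from typing import Callable, Iterable, Optional

JOINING = "joining"  # state name taken from the commit message


def may_promote(
    self_url: str,
    broker_urls: Iterable[str],
    get_status: Callable[[str], Optional[str]],
) -> bool:
    """Return False if any reachable peer is in a state other than 'joining'.

    broker_urls stands in for the ha-brokers-url list of all known brokers.
    get_status is a hypothetical callback returning the peer's HA status,
    or None if the peer cannot be contacted (assumed not to block promotion).
    """
    for url in broker_urls:
        if url == self_url:
            continue  # do not query ourselves
        status = get_status(url)
        if status is not None and status != JOINING:
            # A peer is already past "joining" (for example ready or primary),
            # so this broker must refuse promotion.
            return False
    return True


# Hypothetical three-broker cluster: node2 is a ready backup.
cluster_status = {"node2:5672": "ready", "node3:5672": None}
brokers = ["node1:5672", "node2:5672", "node3:5672"]
refused = not may_promote("node1:5672", brokers, cluster_status.get)
```

With a ready backup present the joining broker refuses promotion, so the
resource manager can move on to the next address in its list.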
> Non-ready HA broker can be incorrectly promoted
> ------------------------------------------------
>
> Key: QPID-4360
> URL: https://issues.apache.org/jira/browse/QPID-4360
> Project: Qpid
> Issue Type: Bug
> Components: C++ Clustering
> Affects Versions: 0.18
> Reporter: Alan Conway
> Assignee: Alan Conway
>
> Description of problem:
> rgmanager can promote a non-ready backup HA broker to primary when other
> backup brokers are available in the ready state. This can result in loss of
> messages and broker configuration. Additionally, this can cause the
> previously ready backups to throw exceptions when connecting to the new
> primary:
> Sep 20 10:17:18 itcm12 qpidd[10871]: 2012-09-20 10:17:18 [HA] critical Backup
> queue Queue1: Replication failed: Invalid position move, preceeds messages
> Sep 20 10:17:18 itcm12 qpidd[10871]: 2012-09-20 10:17:18 [Protocol] error
> Unexpected exception: Invalid position move, preceeds messages
> Sep 20 10:17:18 itcm12 qpidd[10871]: 2012-09-20 10:17:18 [Broker] error
> Connection 10.3.100.12:43837-10.3.100.105:9006 closed by error: Invalid
> position move, preceeds messages(501)
> Version-Release number of selected component (if applicable):
> Qpid 0.18
> How reproducible:
> 100%
> Steps to Reproduce:
> 1. Start a primary and backup broker
> 2. Inject messages into the primary and ensure messages replicate to backup
> 3. Restart the primary broker and manually re-promote to primary
>
> Actual results:
> Restarted broker becomes primary
> Expected results:
> Restarted broker refuses to become primary since at least one ready backup
> was discovered within some timeout