[ https://issues.apache.org/jira/browse/QPID-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Conway resolved QPID-4360.
-------------------------------

    Resolution: Fixed

Committed on trunk:

------------------------------------------------------------------------
r1394706 | aconway | 2012-10-05 14:21:45 -0400 (Fri, 05 Oct 2012) | 10 lines

QPID-4360: Non-ready HA broker can be incorrectly promoted to primary

A joining broker now attempts to contact all known members of the cluster and
check their status. If any broker is in a state other than "joining", the
joining broker will refuse to promote. This allows rgmanager to keep trying
addresses until it finds a ready broker.

Note that this requires ha-brokers-url to be a list of all known brokers, not a
virtual IP (VIP).  ha-public-url can still be a VIP.

------------------------------------------------------------------------
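
The check described in the commit message can be sketched roughly as below.
This is an illustration only, not the code committed in r1394706: the names
Status, may_promote and query_status are hypothetical, the status values other
than "joining" and "ready" are assumed, and skipping brokers that cannot be
reached is an assumption the commit message does not confirm.

```python
from enum import Enum
from typing import Callable, Iterable, Optional


class Status(Enum):
    # Only "joining" and "ready" are named in the issue; ACTIVE is assumed.
    JOINING = "joining"
    READY = "ready"
    ACTIVE = "active"


def may_promote(
    self_url: str,
    brokers_url: Iterable[str],
    query_status: Callable[[str], Optional[Status]],
) -> bool:
    """Return True if a joining broker may accept promotion to primary.

    brokers_url must list every known broker (not a virtual IP), so that
    each member can be contacted individually.
    query_status is a hypothetical callback that returns a broker's status,
    or None if the broker could not be reached.
    """
    for url in brokers_url:
        if url == self_url:
            continue  # do not check ourselves
        status = query_status(url)
        if status is None:
            continue  # assumption: an unreachable broker does not block promotion
        if status is not Status.JOINING:
            # Another broker is further along than "joining": refuse, so the
            # resource manager can move on and try a different address.
            return False
    return True
```

For example, with a peer map of {"b": Status.READY}, calling
may_promote("a", ["a", "b"], peers.get) returns False, so broker "a" refuses
promotion and rgmanager can go on to try "b".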

                
>  Non-ready HA broker can be incorrectly promoted
> ------------------------------------------------
>
>                 Key: QPID-4360
>                 URL: https://issues.apache.org/jira/browse/QPID-4360
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Clustering
>    Affects Versions: 0.18
>            Reporter: Alan Conway
>            Assignee: Alan Conway
>
> Description of problem:
> rgmanager can promote a non-ready backup HA broker to primary when other
> backup brokers are available in the ready state.  This can result in loss of
> messages and broker configuration.  Additionally, this can cause the
> previously ready backups to throw exceptions when connecting to the new
> primary:
>
> Sep 20 10:17:18 itcm12 qpidd[10871]: 2012-09-20 10:17:18 [HA] critical Backup
> queue Queue1: Replication failed: Invalid position move, preceeds messages
> Sep 20 10:17:18 itcm12 qpidd[10871]: 2012-09-20 10:17:18 [Protocol] error
> Unexpected exception: Invalid position move, preceeds messages
> Sep 20 10:17:18 itcm12 qpidd[10871]: 2012-09-20 10:17:18 [Broker] error
> Connection 10.3.100.12:43837-10.3.100.105:9006 closed by error: Invalid
> position move, preceeds messages(501)
>
> Version-Release number of selected component (if applicable):
> Qpid 0.18
>
> How reproducible:
> 100%
>
> Steps to Reproduce:
> 1. Start a primary and backup broker
> 2. Inject messages into the primary and ensure messages replicate to backup
> 3. Restart the primary broker and manually re-promote to primary
>
> Actual results:
> Restarted broker becomes primary
>
> Expected results:
> Restarted broker refuses to become primary since at least one ready backup
> was discovered within some timeout
