[ 
https://issues.apache.org/jira/browse/QPID-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850901#action_12850901
 ] 

Alan Conway commented on QPID-2220:
-----------------------------------

As of r916475, the last survivor in a cluster automatically marks its store as 
clean, so the only way we can end up with no clean store is if N>1 members fail 
so close together that none of them receives a config-change showing it to be 
the last member.  For that case we still need a change counter from the store, 
as proposed in the issue description.

> Assisting manual recovery from a complete persistent cluster crash.
> -------------------------------------------------------------------
>
>                 Key: QPID-2220
>                 URL: https://issues.apache.org/jira/browse/QPID-2220
>             Project: Qpid
>          Issue Type: Improvement
>          Components: C++ Broker
>    Affects Versions: 0.5
>            Reporter: Alan Conway
>            Assignee: Alan Conway
>
> If every member of a persistent cluster crashes, manual intervention is 
> required to identify which store is most up to date so that it can be used 
> for recovery. We need to provide tools to assist in this identification.
> The cluster can save a config-change counter with each config change (cluster 
> membership change). In recovery, the broker with the highest config-change 
> counter has the best store. 
> However, if the last brokers in the cluster crash so close together that none 
> can record a config change, we need an additional decider.
> The store at http://qpidcomponents.org/download.html#persistence maintains a 
> global Persistence ID (PID), a 64-bit value that is incremented for each 
> enqueue and dequeue. If the cluster stores (config-change, PID) pairs, then in 
> recovery we can use (actual PID - config-change PID) as a tiebreaker.
> Proposed change to MessageStore API:
>   /** Returns a monotonically increasing value reflecting changes to the store.
>    * The value can wrap around to 0.
>    * Stores need not implement this function; they can simply return 0.
>    */
>   uint64_t getChangeCounter();
> The default implementation just returns 0, and the cluster must then fall back 
> to relying on config-change counts.
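
To make the proposed ordering concrete, here is a minimal sketch of how a recovery tool might compare stores. It is not the Qpid implementation: StoreRecord, changesSinceConfig, isBetter and pickBestStore are hypothetical names, and the sketch assumes each broker's (config-change, PID) pair and its current PID have already been read from its store.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-broker record gathered during manual recovery.
// None of these names come from the Qpid code base.
struct StoreRecord {
    std::string broker;     // label for the broker that owns the store
    uint64_t configChange;  // config-change counter last saved by the cluster
    uint64_t configPid;     // store change counter saved with that config change
    uint64_t actualPid;     // store change counter read at recovery time
};

// Number of store changes made after the last recorded config change.
// Unsigned subtraction is modulo 2^64, so the result is still correct
// if the counter wrapped around to 0 once in between.
inline uint64_t changesSinceConfig(const StoreRecord& r) {
    return r.actualPid - r.configPid;
}

// True if a should be preferred over b: a higher config-change counter
// wins, and the change count since that config change breaks ties.
inline bool isBetter(const StoreRecord& a, const StoreRecord& b) {
    if (a.configChange != b.configChange)
        return a.configChange > b.configChange;
    return changesSinceConfig(a) > changesSinceConfig(b);
}

// Returns the preferred store, or nullptr if the list is empty.
inline const StoreRecord* pickBestStore(const std::vector<StoreRecord>& stores) {
    const StoreRecord* best = nullptr;
    for (const auto& s : stores) {
        if (best == nullptr || isBetter(s, *best))
            best = &s;
    }
    return best;
}
```

With a store that uses the default getChangeCounter() and always returns 0, changesSinceConfig() is 0 for every record, so the comparison degrades to config-change counts alone, as the description says. In this sketch, stores that tie on both values are resolved in favour of the first one listed; a real tool would probably want to report such a tie to the operator instead.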
