[ 
https://issues.apache.org/jira/browse/QPID-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Conway updated QPID-2220:
------------------------------

    Description: 
If every member of a persistent cluster crashes then manual intervention is 
required to identify which store is most up-to-date, so it can be used to 
recover. We need to provide tools to assist in this identification.

The cluster can save a config-change counter with each config change (cluster 
membership change). In recovery, the broker with the highest config-change 
counter has the best store. 
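
As a rough illustration of the selection rule above, the following sketch picks
the store(s) with the highest saved config-change counter. The StoreInfo struct,
its fields and the bestByConfigChange function are hypothetical names invented
for this example; they are not part of the Qpid code base.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical summary of what one broker's store reports at recovery time.
struct StoreInfo {
    std::string broker;          // hypothetical broker identifier
    std::uint64_t configChange;  // last config-change counter saved in the store
};

// Returns the indices of all stores that share the highest config-change
// counter. More than one index means the counter alone cannot decide.
std::vector<std::size_t> bestByConfigChange(const std::vector<StoreInfo>& stores) {
    std::vector<std::size_t> best;
    std::uint64_t highest = 0;
    for (std::size_t i = 0; i < stores.size(); ++i) {
        if (best.empty() || stores[i].configChange > highest) {
            highest = stores[i].configChange;
            best.assign(1, i);   // new leader, discard earlier candidates
        } else if (stores[i].configChange == highest) {
            best.push_back(i);   // tie with the current leader
        }
    }
    return best;
}
```

If the returned vector holds a single index, that broker's store is the one to
recover from; if it holds several, the counter alone leaves a tie.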

However, if the last brokers in the cluster crash so close together that none 
can record a config change, we need an additional decider.
The store at http://qpidcomponents.org/download.html#persistence maintains a 
global Persistence ID (PID), a 64-bit value that is incremented for each 
enqueue and dequeue. If the cluster stores (config-change, PID) pairs, then in 
recovery we can use the difference between the store's actual PID and the PID 
recorded at the last config change as a tiebreaker.
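
A minimal sketch of the tiebreaker, assuming each store can report its saved
(config-change, PID) pair and its current PID. The RecoveryInfo struct and both
function names are hypothetical. Unsigned 64-bit subtraction is used so that the
difference stays correct even if the PID has wrapped around to 0 since the last
config change.

```cpp
#include <cstdint>

// Hypothetical per-store recovery data; the field names are assumptions.
struct RecoveryInfo {
    std::uint64_t configChange;       // config-change counter saved in the store
    std::uint64_t pidAtConfigChange;  // PID saved together with that counter
    std::uint64_t actualPid;          // PID found in the store at recovery
};

// Number of store changes since the last recorded config change.
// Unsigned arithmetic is modulo 2^64, so a wrapped PID still gives
// the right difference.
std::uint64_t changesSinceConfigChange(const RecoveryInfo& r) {
    return r.actualPid - r.pidAtConfigChange;
}

// True if store a looks more up to date than store b: the config-change
// counter decides first, the PID difference only breaks ties.
bool moreUpToDate(const RecoveryInfo& a, const RecoveryInfo& b) {
    if (a.configChange != b.configChange)
        return a.configChange > b.configChange;
    return changesSinceConfigChange(a) > changesSinceConfigChange(b);
}
```

With two stores at the same config-change counter, the one that recorded more
enqueues and dequeues after that config change is preferred.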

Proposed change to MessageStore API:
  /** Returns a monotonically increasing value reflecting changes to the store.
   * The value can wrap around to 0.
   * Stores need not implement this function; they can simply return 0.
   */
  uint64_t getChangeCounter();

The default implementation just returns 0, in which case the cluster must fall 
back to relying on config-change counts alone.
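
To make the default behaviour concrete, here is a sketch of how the proposed
function might look. The MessageStore class below is a cut-down stand-in
written for this example, not the real broker interface, and CountingStore is a
hypothetical store that counts every enqueue and dequeue.

```cpp
#include <cstdint>

// Cut-down stand-in for the broker's MessageStore interface (assumption:
// only the proposed function is shown here).
class MessageStore {
public:
    virtual ~MessageStore() {}
    // Default: the store does not track changes, so it reports 0.
    virtual std::uint64_t getChangeCounter() { return 0; }
};

// Hypothetical store that increments a counter on every enqueue and dequeue.
class CountingStore : public MessageStore {
    std::uint64_t changes = 0;
public:
    void enqueue() { ++changes; }  // unsigned, so it wraps around to 0
    void dequeue() { ++changes; }
    std::uint64_t getChangeCounter() override { return changes; }
};
```

A store that keeps the default always reports 0, so every PID difference is 0
and only the config-change counter can distinguish the stores.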

  was:
If every member of a persistent cluster crashes then manual intervention is 
required to identify which store is most up-to-date, so it can be used to 
recover. We need to provide tools to assist in this identification.

The cluster can save a config-change counter with each config change (cluster 
membership change). In recovery, the broker with the highest config-change 
counter has the best store. However if the last brokers in the cluster crash so 
close together that none can record a config-change we need an additional 
decider.

The store at http://qpidcomponents.org/download.html#persistence maintains a 
global Record Identifier (RID), a 64 bit value that is incremented for each 
enqueue and dequeue. If the cluster stores  (config-change,RID) pairs then in 
recovery we can use actual-RID - RID at config-change as a tiebreaker.

Proposed change to MessageStore API:
  /** Returns a monotonically increasing value reflecting the number of changes 
to the store.
  * The value can wrap-around to 0.
  * Stores need not implement this function, they can simply return 0.
  */
  uint64_t getChangeCounter();

The default implementation just returns 0  and the cluster must fall back to 
relying on config-change counts.


> Assisting manual recovery from a complete persistent cluster crash.
> -------------------------------------------------------------------
>
>                 Key: QPID-2220
>                 URL: https://issues.apache.org/jira/browse/QPID-2220
>             Project: Qpid
>          Issue Type: Improvement
>          Components: C++ Broker
>    Affects Versions: 0.5
>            Reporter: Alan Conway
>            Assignee: Alan Conway

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscr...@qpid.apache.org
