[ https://issues.apache.org/jira/browse/QPID-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850901#action_12850901 ]
Alan Conway commented on QPID-2220:
-----------------------------------

As of r916475, the last survivor in a cluster automatically marks its store as clean, so the only way we can end up with no clean store is if N>1 members fail so close together that none of them receives a config-change showing them to be the last member. For that we need a counter from the store as described above.

> Assisting manual recovery from a complete persistent cluster crash.
> -------------------------------------------------------------------
>
>                 Key: QPID-2220
>                 URL: https://issues.apache.org/jira/browse/QPID-2220
>             Project: Qpid
>          Issue Type: Improvement
>          Components: C++ Broker
>    Affects Versions: 0.5
>            Reporter: Alan Conway
>            Assignee: Alan Conway
>
> If every member of a persistent cluster crashes then manual intervention is
> required to identify which store is most up-to-date, so it can be used to
> recover. We need to provide tools to assist in this identification.
>
> The cluster can save a config-change counter with each config change (cluster
> membership change). In recovery, the broker with the highest config-change
> counter has the best store.
>
> However if the last brokers in the cluster crash so close together that none
> can record a config-change we need an additional decider.
>
> The store at http://qpidcomponents.org/download.html#persistence maintains a
> global Persistence ID, a 64 bit value that is incremented for each enqueue,
> dequeue. If the cluster stores (config-change,PID) pairs then in recovery we
> can use actual-PID - config-change PID as a tiebreaker.
>
> Proposed change to MessageStore API:
>
> /** Returns a monotonically increasing value reflecting changes to the
> store.
>  * The value can wrap-around to 0.
>  * Stores need not implement this function, they can simply return 0.
>  */
> uint64_t getChangeCounter();
>
> The default implementation just returns 0 and the cluster must fall back to
> relying on config-change counts.
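The recovery rule described in the issue (highest config-change counter wins, with "actual-PID - config-change PID" as the tiebreaker) could be sketched roughly as below. This is only an illustration under stated assumptions, not Qpid code: `StoreRecord`, `changesSinceConfig`, `isNewer` and `pickBestStore` are hypothetical names, and it assumes each broker's saved (config-change, PID) pair and its current store counter have already been collected by some recovery tool.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-broker recovery record; none of these names come from Qpid.
struct StoreRecord {
    std::string broker;          // label identifying the broker/store
    uint64_t configChange;       // last config-change counter the broker saved
    uint64_t pidAtConfigChange;  // store change counter saved with that config change
    uint64_t actualPid;          // store change counter read at recovery time
};

// Number of store changes made after the last recorded config change.
// Unsigned subtraction is modulo 2^64, so the result is still correct if the
// counter wrapped around to 0 once in between.
inline uint64_t changesSinceConfig(const StoreRecord& r) {
    return r.actualPid - r.pidAtConfigChange;
}

// True if a's store looks more up-to-date than b's.
inline bool isNewer(const StoreRecord& a, const StoreRecord& b) {
    if (a.configChange != b.configChange)
        return a.configChange > b.configChange;            // primary criterion
    return changesSinceConfig(a) > changesSinceConfig(b);  // tiebreaker
}

// Returns the record judged most up-to-date; the first one wins on a full tie.
inline const StoreRecord& pickBestStore(const std::vector<StoreRecord>& stores) {
    assert(!stores.empty());
    const StoreRecord* best = &stores.front();
    for (const StoreRecord& s : stores)
        if (isNewer(s, *best)) best = &s;
    return *best;
}
```

If a store does not implement the change counter and always returns 0, every tiebreaker value is 0 and the sketch effectively falls back to comparing config-change counters alone, which matches the fallback described in the issue. A real tool would probably also want to report a full tie rather than silently pick the first record.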
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscr...@qpid.apache.org