Actually, ZK is very good in this regard. The lifetime of a single leader is denoted by an epoch number. Each transaction is identified by that epoch plus a sequence number assigned by the leader. Since there is only one leader per epoch and all transactions are executed serially, the combination of epoch and sequence number uniquely identifies a transaction and provides a total ordering.
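To make the ordering concrete, here is a rough sketch in Python rather than anything lifted from the ZK source. The names `Zxid` and `latest_zxid` are made up for illustration. The 32-bit/32-bit packing reflects how ZooKeeper lays out its 64-bit zxid as far as I know, but treat it as an assumption of this sketch.

```python
from typing import Iterable, NamedTuple


class Zxid(NamedTuple):
    """Hypothetical model of a transaction id: (epoch, counter).

    Tuples compare field by field, so the epoch is compared first and
    the per-epoch counter only breaks ties within the same epoch.
    """

    epoch: int
    counter: int

    @classmethod
    def from_int(cls, raw: int) -> "Zxid":
        # Assumed layout: high 32 bits hold the epoch, low 32 bits the counter.
        return cls(epoch=raw >> 32, counter=raw & 0xFFFFFFFF)

    def to_int(self) -> int:
        # Inverse of from_int under the same assumed layout.
        return (self.epoch << 32) | self.counter


def latest_zxid(zxids: Iterable[Zxid]) -> Zxid:
    """Return the most recent transaction id among the ones given."""
    return max(zxids)
```

Any transaction from a later epoch sorts after every transaction from an earlier one, whatever the counters are, so `Zxid(2, 1) > Zxid(1, 500)` holds.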
As transactions are committed, members of the committing quorum record the latest epoch and transaction. When you restart a cluster, the members negotiate to determine who has the latest transaction and then start from there. As such, it is probably a good idea to back up more than just one log+snapshot so that you have a better chance of having a later copy.

On Mon, Jan 3, 2011 at 12:58 PM, Sergei Babovich <[email protected]> wrote:

> It is also understood about DR strategy. What is the mechanism for ZK to
> resolve conflicts in such case? Let's say we have a primitive backup
> strategy of shipping logs every hour. In theory it means (assuming the worst
> case) that on DR site all servers will have snapshots of the data made at
> different points in time. When I bring the DR cluster up what is a protocol
> of resolving inconsistencies? That was a reason of my question - it felt
> (maybe naively) that recovering by replicating from the single node data
> (snapshot+log) would be safer and more consistent approach - it is easier to
> make guarantees about result.
