Re: DR policies/HA setup in production - best practices

Ted Dunning Mon, 03 Jan 2011 13:44:03 -0800

Actually, ZK is very good in this regard.

The lifetime of a single leader is denoted by an epoch number.  Transactions
are identified by an epoch and a sequence number assigned by the leader.
Since there is only one leader per epoch and all transactions are executed
serially, this combination of epoch and sequence number uniquely identifies
a transaction and provides a total ordering.
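
To make the ordering concrete, here is a minimal Python sketch.  ZooKeeper
packs the epoch into the high 32 bits of a 64-bit transaction id (zxid) and
the per-epoch counter into the low 32 bits, so comparing zxids as integers is
the same as comparing (epoch, counter) pairs.  The names TxnId, pack and
unpack are mine for illustration and are not part of any ZooKeeper API.

```python
from typing import NamedTuple


class TxnId(NamedTuple):
    """Illustrative (epoch, counter) pair; not a ZooKeeper class."""
    epoch: int    # identifies the leader's term
    counter: int  # sequence number assigned by that leader


def pack(txn: TxnId) -> int:
    """Pack into a 64-bit id: epoch in the high 32 bits, counter in the low 32."""
    return (txn.epoch << 32) | txn.counter


def unpack(zxid: int) -> TxnId:
    """Split a 64-bit id back into its epoch and counter."""
    return TxnId(zxid >> 32, zxid & 0xFFFFFFFF)


a = TxnId(epoch=3, counter=120)
b = TxnId(epoch=4, counter=1)

# A later epoch always wins, no matter how small its counter is.
assert a < b
# Integer comparison of the packed ids gives the same order.
assert pack(a) < pack(b)
```

Because tuples compare field by field, the epoch is checked first and the
counter only breaks ties within the same epoch.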

As transactions are committed, members of the committing quorum record the
latest epoch and transaction.

When you restart a cluster, the members of the cluster negotiate to
determine who has the latest transaction and then start from there.  As
such, it is probably a good idea to back up more than just one log+snapshot
so that you have a better chance of having a later copy.
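
As a rough illustration of why several backups help, the sketch below picks
the copy whose last recorded (epoch, counter) pair is highest.  This is only
the comparison step, not the actual election protocol, which also involves
voting among a quorum.  The node names and transaction values are made up,
and latest_member is a hypothetical helper.

```python
from typing import Dict, Tuple


def latest_member(last_seen: Dict[str, Tuple[int, int]]) -> str:
    """Return the member whose last recorded (epoch, counter) is highest."""
    return max(last_seen, key=lambda name: last_seen[name])


# Hypothetical state restored from hourly backups taken at different times.
backups = {
    "dr-node-1": (4, 17),
    "dr-node-2": (4, 52),
    "dr-node-3": (3, 900),  # older epoch, so it loses despite the big counter
}

winner = latest_member(backups)
assert winner == "dr-node-2"
```

With only one backed-up copy there is nothing to compare against, so the
cluster can start no later than whatever that single copy contains.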

On Mon, Jan 3, 2011 at 12:58 PM, Sergei Babovich
<[email protected]> wrote:

> The DR strategy is also understood. What is the mechanism for ZK to
> resolve conflicts in such a case? Let's say we have a primitive backup
> strategy of shipping logs every hour. In theory this means (assuming the
> worst case) that on the DR site all servers will have snapshots of the data
> made at different points in time. When I bring the DR cluster up, what is
> the protocol for resolving inconsistencies? That was the reason for my
> question: it felt (maybe naively) that recovering by replicating from a
> single node's data (snapshot+log) would be a safer and more consistent
> approach, since it is easier to make guarantees about the result.
>
>
