On Feb 4, 2007, at 1:36 PM, Jan Wieck wrote:
> On 2/4/2007 10:53 AM, Theo Schlossnagle wrote:
>> As the clock must be incremented clusterwide, the need for it to
>> be in sync with the system clock (on any or all of the systems)
>> is obviated. In fact, the inability to guarantee that synchronicity
>> means it can be confusing -- one expects a time-based clock
>> to be accurate to the time. A counter-based clock has no such
>
> For the fourth time, the clock is in the mix to allow operation to
> continue during a network outage. All your arguments seem to assume
> 100% network uptime. There will be no clusterwide clock or clusterwide
> increment when you lose connection. How does your idea cope with that?

That's exactly what a quorum algorithm is for.

> Obviously the counters will immediately drift apart based on the
> transaction load of the nodes as soon as the network goes down. And
> in order to avoid this "clock" confusion and wrong expectation,
> you'd rather have a system with such a simple, non-clock based
> counter and accept that it starts behaving totally wonky when the
> cluster reconnects after a network outage? I'd rather confuse a few
> people than have a last-update-wins conflict resolution that
> basically rolls dice to determine "last".

If your cluster partitions and you have hours of independent action,
and upon merge you apply a conflict resolution algorithm that has the
enormous effect of undoing portions of the last several hours of work
on the nodes, you wouldn't call that "wonky"?
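
To make the merge scenario above concrete, here is a minimal sketch of
last-update-wins applied after a partition. It is only an illustration:
the function name, the row keys, and the idea of stamping each write
with a per-node counter are assumptions made for the example, not
anything from an actual implementation.

```python
# Hypothetical model: each node stamps its writes with a local counter.
def merge_last_update_wins(a, b):
    """Merge two {key: (stamp, value)} maps; the higher stamp wins per key."""
    merged = dict(a)
    for key, (stamp, value) in b.items():
        if key not in merged or stamp > merged[key][0]:
            merged[key] = (stamp, value)
    return merged


# During the partition the busy node's counter advances much faster.
busy = {"row1": (5000, "busy-node edit"), "row2": (5001, "busy-node edit")}
idle = {"row1": (12, "idle-node edit"), "row3": (13, "idle-node edit")}

merged = merge_last_update_wins(busy, idle)

# Keys the idle node wrote during the partition that did not survive.
discarded = [key for key in idle if merged[key] != idle[key]]
```

Whatever stamp is used to pick the winner, every conflicting key loses
one side's work at merge time; here the idle node's edit to "row1" is
dropped regardless of when it actually happened.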
For sane disconnected (or more generally, partitioned) operation in
multi-master environments, a quorum for the dataset must be
established. Now, one can consider the "database" to be the
dataset. So, on a network partition, the nodes in "the" quorum are
allowed to proceed with data modification while the others may only
read. However, there is no reason why the dataset _must_ be the
database, or why multiple datasets _must_ share the same quorum
algorithm. You could easily classify certain tables, schemas, or
partitions into a specific dataset and apply a suitable quorum
algorithm to that, and a different quorum algorithm to other disjoint
datasets.
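
A rough sketch of per-dataset quorum rules, under stated assumptions:
the Dataset class, the majority and primary_site rules, and the node
and dataset names are all hypothetical and chosen only to illustrate
the idea. Each dataset carries its own rule, and a node may modify a
dataset only if the set of nodes it can reach (itself included)
satisfies that dataset's rule; otherwise it is read-only for it.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

# A rule decides, from the reachable nodes and the dataset's members,
# whether this side of a partition may write.
QuorumRule = Callable[[FrozenSet[str], FrozenSet[str]], bool]


def majority(reachable, members):
    """Writable only when a strict majority of members is reachable."""
    return len(reachable & members) * 2 > len(members)


def primary_site(primary):
    """Build a rule: writable only on the side that holds `primary`."""
    def rule(reachable, members):
        return primary in reachable
    return rule


@dataclass(frozen=True)
class Dataset:
    name: str
    members: FrozenSet[str]
    rule: QuorumRule

    def writable(self, reachable):
        # `reachable` includes the local node itself.
        return self.rule(frozenset(reachable), self.members)


nodes = frozenset({"a", "b", "c", "d", "e"})
orders = Dataset("orders", nodes, majority)
eu_accounts = Dataset("eu_accounts", nodes, primary_site("d"))

# The network splits into two sides.
side_1 = {"a", "b", "c"}
side_2 = {"d", "e"}
```

With this split, side_1 keeps writing to "orders" while side_2 keeps
writing to "eu_accounts"; each side is read-only for the other
dataset, so no conflicting writes arise for either one.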
// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/