Jan Wieck wrote:
Then let me give you a little puzzle just for the fun of it.
A database containing customer contact information (among other things)
is a two node multimaster system. One is serving the customer web
portal, the other is used by the company staff including the call
center. At 13:45 the two servers lose connectivity to each other, yet
the internal staff can access the internal server while the web portal
is accessible from the outside. At 13:50 customer A updates their credit
card information through the web portal, while customer B does the same
through the call center. At 13:55 both customers change their mind to
use yet another credit card, now customer A phones the call center while
customer B does it via the internet.
Phew, a mind twister... one customer would already be enough to trigger
that sort of conflict...
At 14:00 the two servers reconnect and go through the conflict
resolution. How do you intend to solve both conflicts without using any
"clock", because that seems to be a stopword causing instant rejection
of whatever you propose. Needless to say, both customers will be
dissatisfied if you charge the "wrong" credit card during your next
Correct. But do these cases satisfy storing timestamps to each and every
transaction you do? That's what I doubt, not the usefulness of time
based conflict resolution for certain cases.
You can always add a time based conflict resolution, by adding a
timestamp column and decide upon that one. I'd guess that the overall
costs are lower that way.
But you've withdrawn that proposal already, so...
Which is a good discussion because one of the reasons why I stopped
looking into Postgres-R is the fact that is based on the idea to push
all the replication information through a system that generates a global
serialized message queue. That by itself isn't the problem, but the fact
that implementing a global serialized message queue has serious
throughput issues that are (among other details) linked to the speed of
Agreed. Nevertheless, there are use cases for such systems, because they
put less limitations to the application. One could even argue, that your
above example would be one ;-)
I am trying to start with a system, that doesn't rely on such a
mechanism for everything. I do intend to add an option later, that
allows to declare a UNIQUE NOT NULL constraint to be synchronous. What
that means is, that any INSERT, UPDATE, DELETE and SELECT FOR UPDATE
will require the node to currently be a member of the (quorum or
priority defined) majority of the cluster.
An advisory lock system,
based on a total order group communication, will grant the lock to the
unique key values on a first come, first serve base. Every node in the
cluster will keep those keys as "locked" until the asynchronous
replication stream reports the locking transaction as ended. If another
remote transaction in the meantime requires updating such key, the
incoming stream from that node will be on hold until the lock is
cleared. This is to protect agains node B replicating a transaction from
node A and a later update on node B arrives on C before C got the first
event from A. A node that got disconnected from the cluster must rebuild
the current advisory lock list upon reconnecting to the cluster.
Yeah, this is a convenient way to replicate sequences via a GCS.
I think that this will be a way to overcome Postgres-R's communication
bottleneck, as well as allowing limited update activity even during a
completely disconnected state of a node. Synchronous or group
communication messages are reduced to the cases, where the application
cannot be implemented in a conflict free way, like allocating a natural
primary key. There is absolutely no need to synchronize for example
creating a sales order.
Agreed, such cases can easily be optimized. But you have to be aware of
he limitations these optimizations cause. Postgres-R is much more
targeted at very general use cases.
An application can use global unique ID's for
the order number. And everything possibly referenced by an order (items,
customers, ...) is stored in a way that the references are never
updated. Deletes to those possibly referenced objects are implemented in
a two step process, where they are first marked obsolete, and later on
things that have been marked obsolete for X long are deleted. A REPLICA
TRIGGER on inserting an order will simply reset the obsolete flag of
referenced objects. If a node is disconnected longer than X, you have a
problem - hunt down the guy who defined X.
Yeah, that's another very nice optimization. Again, as long as you know
the limitations, that's all well and fine.
Merging certain ideas to come up with an async/sync hybrid? Seems to me
we have similar enough ideas to need conflict resolution, because we had
them simultaneously but communicate them asynchronously.
Huh? Sorry, I didn't get what you're trying to say here.
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not