Andres, nice job on the writeup.

I think one aspect you are missing is that there must be some way for the multi-masters to re-stabilize their data sets and quantify any data loss. You cannot do this without some replication intelligence in each row of each table, so that no matter how disastrous the hardware/internet failure in the cloud, the system can HEAL itself and keep going, no human beings involved.

I am laying down a standard design pattern of columns for each row:

MKEY - Primary key guaranteed unique across ALL nodes in the CLOUD, with NODE information IN THE KEY (A876543 vs B876543, or whatever), so keys can be minted whether the network link is UP or DOWN.
CSTP - creation time stamp, as a unix time stamp
USTP - last update time stamp, as a unix time stamp
UNODE - Node that updated this record

Many applications already need the above information, so we might as well standardize it so that external replication logic can self-heal the data.
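To make the idea concrete, here is a minimal sketch in Python (not SQL, and with names of my own choosing) of how external replication logic could use the four standard columns to reconcile two diverged copies of a row with no human involved:

```python
from dataclasses import dataclass

@dataclass
class Row:
    # The standard replication columns proposed above.
    mkey: str    # node-prefixed primary key, e.g. "A876543"
    cstp: int    # creation time stamp (unix time)
    ustp: int    # last update time stamp (unix time)
    unode: str   # node that last updated this record
    data: dict   # the application payload

def heal(local: Row, remote: Row) -> Row:
    """Reconcile two copies of the same row after a busted link.

    Last writer wins on USTP, with ties broken by UNODE, so every
    node converges on the same answer deterministically.
    """
    assert local.mkey == remote.mkey, "not the same logical row"
    if (local.ustp, local.unode) >= (remote.ustp, remote.unode):
        return local
    return remote
```

The losing version need not be thrown away; logging it is exactly how you quantify the data loss.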

PostgreSQL tables have optional 32-bit int OIDs; you may want to consider a replication version of that, a ROID (replication object ID), and then externalize the primary
key generation into a loadable UDF.
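A Python illustration of the logic such a UDF might implement (the real thing would be a C or PL/pgSQL function loaded into the server; the helper name and key width are mine): each node mints keys from its own counter under its own prefix, so no cross-node coordination is needed and keys stay unique even while a link is down.

```python
import itertools

def make_key_generator(node_id: str, start: int = 0):
    """Return a generator of cloud-unique keys of the form
    <node><counter>, e.g. A876543 vs B876543, so two nodes can
    never collide and neither ever has to ask the other.
    """
    counter = itertools.count(start)
    return lambda: f"{node_id}{next(counter):06d}"
```

Usage: `gen = make_key_generator("A", 876543)` then `gen()` yields "A876543", "A876544", and so on; a node "B" generator can never produce the same key.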

Of course, ALL the nodes must stay in contact with each other, not allowing significant drift in their clocks while operating. (NTP is a starting point.)

I just do not know of any other way to add self-healing without the above information, regardless of whether you hold up transactions for synchronous replication or let them pass through asynchronously, and regardless of whether you are getting your replication data from the WAL stream or through the client libraries.

Also, your replication model does not really discuss replication operations across a busted link; where is the intelligence for that in the operation diagram?

Every time you package replication into the core, someone has to tear into that pile to add some extra functionality, so definitely think about providing sensible hooks so that extra bit of customization can override the base behavior.

Cheers,

marco

On 9/22/2012 11:00 AM, Andres Freund wrote:
This time I really attached both...


