Re: [HACKERS] Some questions about mammoth replication

Alexey Klyukin Fri, 12 Oct 2007 02:43:54 -0700

Hannu Krosing wrote:

> > We have hooks in executor calling our own collecting functions, so we
> > don't need the trigger machinery to launch replication.
> 
> But where do you store the collected info - in your own replication_log
> table, or do reuse data in WAL you extract it on master befor
> replication to slave (or on slave after moving the WAL) ?


We don't use either a log table in database or WAL. The data to
replicate is stored in disk files, one per transaction. As Joshua said,
the WAL is used to ensure that only those transactions that are recorded
as committed in WAL are sent to slaves.

> 
> > > Do you make use of snapshot data, to make sure, what parts of WAL log
> > > are worth migrating to slaves , or do you just apply everything in WAL
> > > in separate transactions and abort if you find out that original
> > > transaction aborted ?
> > 
> > We check if a data transaction is recorded in WAL before sending
> > it to a slave. For an aborted transaction we just discard all data 
> > collected 
> > from that transaction.
> 
> Do you duplicate postgresql's MVCC code for that, or will this happen
> automatically via using MVCC itself for collected data ?

Every transaction command that changes data in a replicated relation is
stored on disk. PostgreSQL MVCC code is used on a slave in a natural way
when transaction commands are replayed there.

> 
> How do you handle really large inserts/updates/deletes, which change say 10M 
> rows in one transaction ?

We produce really large disk files ;). When a transaction commits - a
special queue lock is acquired and transaction is enqueued to a sending
queue. Since the locking mode for that lock is exclusive a commit of a
very large transaction would delay commits of other transactions until
the lock is held. We are working on minimizing the time of holding this
lock in the new version of Replicator.

 
> > > Do you extract / generate full sql DML queries from data in WAL logs, or
> > > do you apply the changes at some lower level ?
> > 
> > We replicate the binary data along with a command type. Only the data
> > necessary to replay the command on a slave are replicated.
> 
> Do you replay it as SQL insert/update/delete commands, or directly on
> heap/indexes ?

We replay the commands directly using heap/index functions on a slave.

Regards,

-- 
Alexey Klyukin                         http://www.commandprompt.com/
The PostgreSQL Company - Command Prompt, Inc.


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Re: [HACKERS] Some questions about mammoth replication

Reply via email to