Re: [HACKERS] Some questions about mammoth replication

Hannu Krosing Fri, 12 Oct 2007 03:51:22 -0700

On a fine day, Fri, 2007-10-12 at 12:39, Alexey Klyukin wrote:
> Hannu Krosing wrote:
> 
> > > We have hooks in executor calling our own collecting functions, so we
> > > don't need the trigger machinery to launch replication.
> > 
> > But where do you store the collected info - in your own replication_log
> > table, or do you reuse data in WAL that you extract on master before
> > replication to slave (or on slave after moving the WAL) ?
> 
> We don't use either a log table in database or WAL. The data to
> replicate is stored in disk files, one per transaction.


Clever :)

How well does it scale ? That is, at what transaction rate can your
replication keep up with the database ?

>  As Joshua said,
> the WAL is used to ensure that only those transactions that are recorded
> as committed in WAL are sent to slaves.

How do you enforce the correct commit order when applying the
transactions on the slave ?
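
To make the question concrete, here is a minimal sketch of one way it
could be done. It assumes each transaction file carries a commit sequence
number assigned on the master; that is my assumption, not something I know
Replicator does. OrderedApplier, receive and apply_fn are hypothetical
names used only for illustration.

```python
import heapq


class OrderedApplier:
    """Hypothetical slave-side helper: applies files in commit order."""

    def __init__(self, apply_fn, first_seq=1):
        self._apply = apply_fn      # callback that replays one file
        self._next = first_seq      # next commit sequence number expected
        self._pending = []          # min-heap of (seq, path) that arrived early

    def receive(self, seq, path):
        # Buffer the file, then apply every file that is now contiguous
        # with what has already been applied.
        heapq.heappush(self._pending, (seq, path))
        while self._pending and self._pending[0][0] == self._next:
            _, ready = heapq.heappop(self._pending)
            self._apply(ready)
            self._next += 1
```

With this shape a file that arrives early simply waits in the buffer
until all of its predecessors have been applied.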

> > 
> > > > Do you make use of snapshot data, to make sure, what parts of WAL log
> > > > are worth migrating to slaves , or do you just apply everything in WAL
> > > > in separate transactions and abort if you find out that original
> > > > transaction aborted ?
> > > 
> > > We check if a data transaction is recorded in WAL before sending
> > > it to a slave. For an aborted transaction we just discard all data
> > > collected from that transaction.
> > 
> > Do you duplicate postgresql's MVCC code for that, or will this happen
> > automatically via using MVCC itself for collected data ?
> 
> Every transaction command that changes data in a replicated relation is
> stored on disk. PostgreSQL MVCC code is used on a slave in a natural way
> when transaction commands are replayed there.

Do you replay several transaction files in the same transaction on
slave ?

Can you replay several transaction files in parallel ?

> > How do you handle really large inserts/updates/deletes, which change say
> > 10M rows in one transaction ?
> 
> We produce really large disk files ;). When a transaction commits, a
> special queue lock is acquired and the transaction is enqueued to a
> sending queue. Since the locking mode for that lock is exclusive, a
> commit of a very large transaction would delay commits of other
> transactions until the lock is released. We are working on minimizing
> the time of holding this lock in the new version of Replicator.

Why does it take longer to queue a large file ? Do you copy data from
one file to another ?
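
What I have in mind is roughly the following. This is only a sketch under
my own assumptions, not a description of how Replicator works: the
per-transaction file is fully written before commit, and only its path is
put on the queue while the exclusive lock is held. SendQueue,
enqueue_commit and write_transaction_file are hypothetical names.

```python
import os
import tempfile
import threading
from collections import deque


class SendQueue:
    """Hypothetical sending queue: holds file paths, not file contents."""

    def __init__(self):
        self._lock = threading.Lock()   # stands in for the exclusive queue lock
        self._paths = deque()

    def enqueue_commit(self, path):
        # Only a reference is appended while the lock is held, so the hold
        # time does not depend on how large the transaction file is.
        with self._lock:
            self._paths.append(path)

    def dequeue(self):
        # Returns the oldest committed file, or None if the queue is empty.
        with self._lock:
            return self._paths.popleft() if self._paths else None


def write_transaction_file(directory, xid, changes):
    # The bulk write happens before commit, outside the queue lock.
    path = os.path.join(directory, f"{xid}.txn")
    with open(path, "wb") as f:
        for change in changes:
            f.write(change)
    return path
```

If the queue works like this, a 10M-row transaction should hold the lock
no longer than a one-row transaction, which is why I am asking whether
data gets copied at enqueue time.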

> > > > Do you extract / generate full sql DML queries from data in WAL logs, or
> > > > do you apply the changes at some lower level ?
> > > 
> > > We replicate the binary data along with a command type. Only the data
> > > necessary to replay the command on a slave are replicated.
> > 
> > Do you replay it as SQL insert/update/delete commands, or directly on
> > heap/indexes ?
> 
> We replay the commands directly using heap/index functions on a slave.

Does that mean that the table structures will be exactly the same on
both master and slave ? That is, do you replicate a physical table image
(maybe not including transaction ids on master) ?

Or do you just use lower-level versions of INSERT/UPDATE/DELETE ?

---------------------
Hannu



