Re: [Sequoia] Re: [DOCS] [HACKERS] Replication documentation addition

Emmanuel Cecchet Wed, 22 Nov 2006 02:48:45 -0800

Bruce,

OK, I made it more open-ended:


    If queries are simply broadcast unmodified, functions like
    <function>random()</>, <function>CURRENT_TIMESTAMP</>, and
    sequences would have different values on different servers.
    This is because each server operates independently, and because
    SQL queries are broadcast (and not actual modified rows).  If
    this is unacceptable, either the middleware or the application
    must query such values from a single server and then use those
    values in write queries.  Also, care must be taken that all
    transactions either commit or abort on all servers, perhaps
    using two-phase commit (<xref linkend="sql-prepare-transaction"
    endterm="sql-prepare-transaction-title"> and <xref
    linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">.
    Pgpool is an example of this type of replication.

Actually, you don't necessarily have to query the value from thedatabase in the middleware case. The middleware can generate thesevalues (it is even better for timestamps so that you don't have to dealwith possible unsynchronized clocks between the different nodes).Usually 2-phase commit is an overkill for performance. Another approachis to optimistically commit transactions and if one node fails tocommit, it is just removed from the cluster and will need to beresynchronized later on. If all transactions are sent in the sameserializable order to all cluster nodes, they will always behave thesame (i.e. commit or abort) unless something really bad happens(partition is full, misconfiguration between nodes, ...).Note that if the middleware only broadcasts queries as they come withoutunderstanding anything about transactions, you will never getconsistency between your cluster nodes. From what I got from pgpool,either you use strict mode (queries are just sent 1 by 1) in which caseyou lose any form of concurrency and do not allow conflicting accessesto the same resource. If you don't use strict mode, you will getinconsistencies. Therefore the usage of pgpool seems restricted to veryspecific workloads.On the failure side, pgpool only works with a master and a slave. Theswitchover process seems to be manual so it is not clear what wouldhappen if the master fails to commit but the slave succeeds.


BTW, congratulations for your documentation effort.
Emmanuel

--
Emmanuel Cecchet
Chief Scientific Officer, Continuent

Blog: http://emanux.blogspot.com/
Open source: http://www.continuent.org
Corporate: http://www.continuent.com
Skype: emmanuel_cecchet
Cell: +33 687 342 685


_______________________________________________
Sequoia mailing list
[email protected]
https://forge.continuent.org/mailman/listinfo/sequoia

Re: [Sequoia] Re: [DOCS] [HACKERS] Replication documentation addition

Reply via email to