Bruce,

OK, I made it more open-ended:

    If queries are simply broadcast unmodified, functions like
    <function>random()</>, <function>CURRENT_TIMESTAMP</>, and
    sequences would have different values on different servers.
    This is because each server operates independently, and because
    SQL queries are broadcast (and not actual modified rows).  If
    this is unacceptable, either the middleware or the application
    must query such values from a single server and then use those
    values in write queries.  Also, care must be taken that all
    transactions either commit or abort on all servers, perhaps
    using two-phase commit (<xref linkend="sql-prepare-transaction"
    endterm="sql-prepare-transaction-title"> and <xref
    linkend="sql-commit-prepared" endterm="sql-commit-prepared-title">.
    Pgpool is an example of this type of replication.
Actually, you don't necessarily have to query the value from the database in the middleware case. The middleware can generate these values (it is even better for timestamps so that you don't have to deal with possible unsynchronized clocks between the different nodes). Usually 2-phase commit is an overkill for performance. Another approach is to optimistically commit transactions and if one node fails to commit, it is just removed from the cluster and will need to be resynchronized later on. If all transactions are sent in the same serializable order to all cluster nodes, they will always behave the same (i.e. commit or abort) unless something really bad happens (partition is full, misconfiguration between nodes, ...). Note that if the middleware only broadcasts queries as they come without understanding anything about transactions, you will never get consistency between your cluster nodes. From what I got from pgpool, either you use strict mode (queries are just sent 1 by 1) in which case you lose any form of concurrency and do not allow conflicting accesses to the same resource. If you don't use strict mode, you will get inconsistencies. Therefore the usage of pgpool seems restricted to very specific workloads. On the failure side, pgpool only works with a master and a slave. The switchover process seems to be manual so it is not clear what would happen if the master fails to commit but the slave succeeds.

BTW, congratulations for your documentation effort.
Emmanuel

--
Emmanuel Cecchet
Chief Scientific Officer, Continuent

Blog: http://emanux.blogspot.com/
Open source: http://www.continuent.org
Corporate: http://www.continuent.com
Skype: emmanuel_cecchet
Cell: +33 687 342 685


_______________________________________________
Sequoia mailing list
[email protected]
https://forge.continuent.org/mailman/listinfo/sequoia

Reply via email to