On 18/01/2008, James Henstridge <[EMAIL PROTECTED]> wrote: > So is there any recommendations for what a two-phase commit API should > look like?
I did a bit of investigation into a few databases, and came up with a proposal for an extension to the DB-API. I know that there are a few incomplete portions of the proposal, so I'd appreciate feedback. If you have knowledge of a database not covered here, please comment on whether the proposed API would be workable in that context. Re: the confusion between "prepared transactions" vs. "prepared statements" support, this probably won't conflict since the prepared statement extensions I saw used Cursor.prepare() rather than Connection.prepare(). If it is a problem though, the proposal could be modified to use prepare_transaction() or similar. James.
= DB-API Two-Phase Commit = Many databases have support for two-phase commit. Adapters for some of these databases expose this support, but often through mutually incompatible extensions to the DB-API standard. Standardising the API for two-phase commit would make it easier for applications and libraries to support two-phase commit with multiple databases. == Operations == The flow of a two-phase commit transaction looks something like this: 1. begin the transaction 2. do some work 3. prepare the transaction for commit 4. commit or rollback the prepared transaction (depending on successful preparation of the other transactions). For some databases, a transaction identifier must be provided. Some databases expect the identifier to be structured in two parts: a global transaction identifier and a branch qualifier. As it is possible for a failure to occur between preparing a transaction and committing it, there is often a way to discover those transactions and issue a commit or rollback. == Database Support == === PostgreSQL === PostgreSQL has supported two-phase commit since 8.1. To start a two-phase commit transaction, no special command need be issued other than the standard "BEGIN" statement. The transaction can be prepared with the "PREPARE TRANSACTION" statement as described here: http://www.postgresql.org/docs/current/static/sql-prepare-transaction.html A transaction identifier must be provided that is less than 200 bytes long. The transaction can then be committed or rolled back using the "COMMIT PREPARED" or "ROLLBACK PREPARED" statements. http://www.postgresql.org/docs/current/static/sql-commit-prepared.html http://www.postgresql.org/docs/current/static/sql-rollback-prepared.html Uncommitted prepared transactions can be recovered by listing the pg_prepared_xacts table: http://www.postgresql.org/docs/current/static/view-pg-prepared-xacts.html None of the PostgreSQL database adapters expose two-phase commit at this time. == MySQL == MySQL has supported two-phase commit since 5.0.3. http://dev.mysql.com/doc/refman/6.0/en/xa.html http://dev.mysql.com/doc/refman/6.0/en/xa-statements.html http://dev.mysql.com/doc/refman/6.0/en/xa-states.html To start a two-phase commit transaction, the "XA START" statement must be issued. This statement takes a transaction identifier as an argument, which consists of a global identifier with optional branch qualifier and format ID. The transaction can be prepared by issuing the "XA END" and "XA PREPARE" statements. Finally, the prepared transaction can be committed or rolled back with "XA COMMIT" or "XA ROLLBACK" respectively. Uncommitted prepared transactions can be listed using the "XA RECOVER" statement. The MySQLdb adapter does not appear to expose this functionality. === Oracle === I'm not familiar with Oracle, but the cx_Oracle adapter does expose two-phase commit. The interface is covered here: http://cx-oracle.sourceforge.net/html/module.html http://cx-oracle.sourceforge.net/html/connobj.html To use two-phase commit, the twophase=True must be passed to the cx_Oracle.connect() factory function. The documentation states that this option is not the default due to bugs in older versions of Oracle. A two-phase commit transaction can be started with the begin() method: Connection.begin(formatId, transactionId, branchId) The transaction is prepared with the Connection.prepare() method, and can then be committed or rolled back with the standard commit() or rollback() methods. ==== Questions ==== 1. What happens when you call prepare() on a transaction that was begun implicitly. 2. What happens if you call close() after calling prepare()? 3. Can you recover uncommitted prepared transactions? === Interbase / Firebird === The kinterbasdb adapter exposes two-phase commit. It does not appear to need any special setup for a two-phase commit transaction -- simply call Connection.prepare() prior to Connection.commit() or Connection.rollback(). There does not appear to be any API for recovering uncommitted prepared transactions, but perhaps they don't survive longer than the connection. == Summary == When looking at these databases, there appear to be a few details where they differ: 1. do they require the application to provide a transaction ID? 2. do they need to know in advance that a given transaction will use two-phase commit? 3. do they provide a recovery API? To cover difference (1), we have two choices: 1. Require the user to provide a transaction ID. Databases that don't require a transaction ID can ignore it. 2. Don't expose transaction IDs, and require adapters that need them to generate them (use UUIDs maybe?) 3. A combination of the two, making it optional to provide transaction IDs in the API. Due to difference (2), it is probably necessary to have a method to start a two-phase commit transaction. It should take a transaction ID (assuming that we are exposing transaction IDs). As for recovery, it is probably best to leave this out of scope initially. == Proposal == I think it'd be worth adopting an interface similar to cx_Oracle's and kinterbasdb's. That is: 1. Add a Connection.begin(...) method that explicitly starts a transaction. Some argument (possibly the transaction ID) causes the transaction to use two-phase commit. May raise NotSupportedError if two-phase commit is not supported. [As some adapters currently have a no-argument begin() to start a normal transaction, an argument will be necessary to begin in two-phase mode]. 2. Add a Connection.prepare() method that peforms the first stage of two-phase commit. May raise NotSupportedError if two-phase commit is not supported, or the transaction was not started in two-phase mode. 3. Calling commit() or rollback() on the connection after prepare() performs the second stage of the commit. 4. Calling commit() or rollback() on the connection prior to prepare() performs a one-phase commit or rollback. 5. Executing statements after prepare() but before commit() or rollback() results in an error (ProgrammingError?) 6. Closing a connection with a prepared but uncommitted transaction rolls back that transaction. 7. Some module-level attribute is present that indicates support for this API.
_______________________________________________ DB-SIG maillist - DB-SIG@python.org http://mail.python.org/mailman/listinfo/db-sig