On 23/01/2008, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> On 2008-01-23 02:18, James Henstridge wrote:
> >> [XID format used in XA]
> >> So, essentially, only the global transaction id and the branch id
> >> are relevant and both are represented in the data string.
> >
> > One interesting part of that is the "If OSI CCR naming is used, then
> > the XID's formatID element should be set to 0; if some other format is
> > used, then the formatID element should be greater than 0."
> >
> > I took a quick look at a few J2EE servers (which use XA), to see what
> > they do for transaction managers. Neither JBoss or Geronimo seem to
> > use formatID=0, but instead use magic numbers that I presume are
> > intended to determine if they created the transaction ID.
> >
> > That said, the selection of format identifiers seems a bit ad-hoc:
> > Geronimo uses 0x4765526f, which has a byte representation of "GeRo".
> >
> > It seems that you could do pretty much the same thing by getting TMs
> > to check the global ID itself ...
>
> So we do need to store the "formatID" as well ?
>
> >> BTW, there's a nice extension module that let's you hook Python
> >> between the TM and RM using XA:
> >>
> >> http://www.hare.demon.co.uk/pyxasw/
> >
> >
> >
> >>> I do see a use for the branch qualifier though. In a distributed
> >>> transaction, each resource should have a different transaction ID that
> >>> share a common global transaction ID but separate branch qualifiers.
> >>>
> >>> As transaction IDs are global within database clusters for some
> >>> backends (PostgreSQL, MySQL and probably others), the branch qualifier
> >>> is necessary if two databases from the cluster are used in the global
> >>> transaction.
> >>>
> >>> I think it is worth making the API such that it is easy to program to
> >>> best practices.
> >> The DB-API has always tried to not get in the way of how
> >> a particular backends needs its configuration data, so
> >> I think we can still have a single string using a database
> >> backend specific format. This could then include one or more
> >> of the above id parts.
> >>
> >> The implementation can then decode the string representation
> >> of the transaction id components into whatever format is
> >> needed by the backend.
> >
> > The two reasons I see for using an object to represent transactions
> > that contains a global part and branch part are:
> >
> > 1. round tripping a transaction ID from xa_recover() to
> > xa_commit()/xa_rollback().
> > 2. Reduced restrictions on the contents of the transaction ID.
> >
> > For (1), using a database adapter defined object means that it can
> > represent transactions that originated elsewhere, or expose more
> > information about those transactions.
> >
> > For (2), if a database is using specially formatted transaction IDs at
> > the Python level that get decoded into the various components, does
> > that mean that the application or transaction manager glue needs to
> > know how to format the IDs.
> >
> > In contrast, it is pretty easy for e.g. a Postgres adapter to
> > serialise/deserialise a multi-part ID (and this is what the JDBC
> > driver does).
>
> I have no objections against using an object for this anymore,
> but let's please use an already existing object such as a
> tuple instead of having each database module implement its own
> new type.
>
> Given that the formatID is used for some purpose as well (probably
> just as identification of the TM itself), I guess we'd have
> to use a 3-tuple (format id, global transaction id, branch id).
>
> Modules should only expect to find an object that behaves like
> a 3-sequence, they should accept whatever object is passed to
> them and return it for the recover method.
>
> This leaves the door open for extensions used by the TM for XID
> objects.

I've had a bit more time to think about this, and have two proposals
on how to handle transaction IDs. I think they offer equivalent
functionality, so the choice comes down to what we want the API to
look like.

Proposal 1:

 * Plain string IDs should work fine as transaction identifiers for
   applications built from scratch with that assumption: they would
   need to identify the global and branch parts in their own way.
 * A plain string can be stuffed inside an XA style transaction
   identifier, even if it isn't making use of all the different
   components.
 * Therefore, all methods accepting transaction IDs should accept
   strings.
 * As some transaction IDs in the database might not match this
   simple form, there are two options for the recover() method:
    1. return a special object that represents the transaction, which
       will be accepted by commit()/rollback(). How string-like must
       these objects be?
    2. omit such transaction IDs from the result.
 * For databases that support more structured transaction IDs (such
   as those used by XA), the 2PC methods may accept objects other
   than strings.

Proposal 2:

 * Many databases follow the XA specification, so it makes sense to
   use transaction identifiers structured in the same way.
 * For databases that do not use XA-style transaction IDs, it is
   usually possible to serialise such an ID into a form that it can
   work with.
 * Therefore, all methods accepting transaction IDs should accept
   3-sequences of the form (formatID, gtrid, bqual).
 * For databases using non-XA transaction IDs, it is possible that
   some transaction IDs might exist that do not match the serialised
   form. The recover() method has two options:
    1. return a special object representing the ID that will be
       accepted by commit()/rollback(). Such an object should act
       like a 3-sequence.
    2. omit such transaction IDs from the result.
 * For databases not using XA-style transactions, the 2PC methods may
   accept objects other than 3-sequences as transaction IDs.
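
To make Proposal 2 a little more concrete, here is a rough sketch of
what an adapter for a backend with plain string transaction IDs might
do. Everything in it is an assumption made for illustration: the names
serialise_xid(), parse_xid(), ForeignXid and recover(), the
"formatID_gtrid_bqual" wire format with base64-encoded parts, and the
use of text rather than bytes for gtrid and bqual. It is not taken
from any existing driver and uses current Python syntax.

```python
import base64


def _b64(text):
    # Base64 never emits "_", so "_" is safe as a field separator.
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def serialise_xid(xid):
    """Pack a (formatID, gtrid, bqual) 3-sequence into a single string."""
    format_id, gtrid, bqual = xid
    return "%d_%s_%s" % (format_id, _b64(gtrid), _b64(bqual))


def parse_xid(text):
    """Return a 3-tuple, or None if text is not in the serialised form."""
    fields = text.split("_")
    if len(fields) != 3:
        return None
    try:
        format_id = int(fields[0])
        gtrid = base64.b64decode(fields[1], validate=True).decode("utf-8")
        bqual = base64.b64decode(fields[2], validate=True).decode("utf-8")
    except ValueError:
        # Covers bad integers, bad base64 and bad UTF-8.
        return None
    return (format_id, gtrid, bqual)


class ForeignXid(tuple):
    """Hypothetical wrapper for an ID that was not created by this scheme.

    It still behaves like a 3-sequence (option 1 in the list above); the
    choice of (None, raw, "") for its fields is arbitrary.
    """

    def __new__(cls, raw):
        self = super().__new__(cls, (None, raw, ""))
        self.raw = raw
        return self


def recover(raw_ids, keep_foreign=True):
    """Turn backend ID strings into transaction IDs for the caller.

    keep_foreign=True corresponds to option 1 (return a special object),
    keep_foreign=False to option 2 (omit IDs that do not match).
    """
    result = []
    for raw in raw_ids:
        xid = parse_xid(raw)
        if xid is not None:
            result.append(xid)
        elif keep_foreign:
            result.append(ForeignXid(raw))
    return result
```

With something like this, commit()/rollback() would call
serialise_xid() on ordinary 3-sequences and pass the .raw attribute
through unchanged for ForeignXid objects, so IDs returned by recover()
can be handed straight back to the adapter.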

Both of these proposals seem to get rid of the main points of
contention. They:

 * remove the xid() constructor from the spec.
 * allow use of simple objects (strings or tuples) as transaction IDs.
 * provide an obvious way to expose database-specific transaction IDs.

James.