On Sat, 2007-04-21 at 10:52 -0400, Michael Bayer wrote: > On Apr 21, 2007, at 3:09 AM, Carsten Haese wrote: > > > Hi All, > > > > I have taken the time to write out my type mapping proposal in a > > slightly more structured form, with some hopefully enlightening > > examples > > of how this proposal might be useful. > > > > Please see http://www.uniqsys.com/~carsten/typemap.html > > > > Any comments are welcome, and I'll do my best to incorporate > > constructive criticism into future revisions of this proposal. > > > > heres some thoughts:
Thanks for taking the time to read the proposal. You are making some good points. If you don't want to read all my responses, skip to the summary at the bottom. > - using a class-level approach, i.e. SQLData, makes it inconvenient > to establish custom types that are independent of a particular DBAPI, > since the SQLData class itself like everything else in DBAPI only > exists within implementations. its impossible to define the class > until you've imported some DBAPI. SQLData's origins in JDBC dont > have this issue since SQLData is part of the abstract JDBC api and > classes can be built against it independently of any database driver > being available. It's not impossible. You could always mix-in the somedb.SQLData base class into a generic Python object after importing somedb. Also, since the point is for an application to define type mappings for a particular database, I don't see the limitation in making the SQLData class specific to the API module that the application will usually have imported already anyway. In my opinion, making the SQLData class specific to the API module is necessary since the ToDB method for translating a particular object to the database may differ from database to database. > - because SQLData's state is the data itself, SQLData is not really a > "type" at all, its a value object which includes its own database > translation function. That greatly limits what kinds of types > SQLData can be realistically used for, and in fact it can only be > used for datatypes that are explicitly aware that they are being > stored in a database - and only a specific DBAPI too. Isn't that the point of defining a bidirectional type mapping from/to the database? > For example, its impossible to use SQLData to directly represent > Decimal instances or datetime instances; neither of them subclass > <mydbapi>.SQLData. True, but on one hand, datetime instances and Decimal instances should be handled by the API's canonical mapping already, and on the other hand the somedb API could always choose to return datetimes as objects that derive from both datetime and somedb.SQLData. > If the answer is that we'd just use typemaps > for those, then what would we use SQLData for ? I can use a typemap > for my SpatialData objects just as easily, without my SpatialData > object being welded to a specific persistence scheme and specific DBAPI. Of course you could, but "welding" the object to a specific DB-API is what allows the object to be passed transparently as an input parameter into queries against that specific database. A corollary of the principle of least surprise is that it should always be possible to take the result of a select query and insert that object into the column that it was read from. Inheriting from SQLData is what allows this seamless select-insert round-trip. Having said all that, I'm not married to the idea of requiring the application-side objects to derive from a particular SQLData class. For the purpose of input binding, it would be enough, and more in line with the idea of duck-typing, if the object provided an agreed-upon method, e.g. "ToDB" that the DB-API can call to translate between application type and canonical database type. Essentially, the proposed input translation could change to if hasattr(in_param, "ToDB"): in_param = in_param.ToDB() or something like that. It may be beneficial to allow this call to pass more parameters in order to tell the object something about the context in which the conversion is occurring, including but not limited to the name of the API module, the active connection/cursor, and, if available, the descriptor of the database-side column type and name that the object is destined for. > Also because SQLData is not stateful with regards to its type, its > not possible for a single SQLData class to represent variants of a > particular type, such as strings that should be truncated to length > 50 versus strings that are truncted to length 100; youd have to use > more subclassing. As proposed, SQLData *is* stateful with regards to the type. By default, it's not stateful with regards to subtype, length, and precision, but this can, and should, be added. If the constructor is given the complete cursor.description entry that goes along with the value, it has everything it needs to remember this information. > - per-column typemaps: here is a common use case. I am receiving a > row which contains two BLOB coluimns. one BLOB is image data > representing a JPEG image, one BLOB is a pickled instance of a Python > class. I would like to register type converters so that the second > column is run through the pickle.loads() function but not the > first. If we are registering various type-handling callables at the > cursor level, it should be easy enough to add an optional integer > parameter which will bind that type converter to only a specific > column position in a result set. the use case is more obvious in the > bind parameter direction. Yes, I already suggested this (passing the column number to the outbound adapter) as a possible extension. However, the use case is convincing enough that we should probably allow for a more convenient per-column mapping that allows dispatching the conversion to a different adapter callable altogether, rather than having to define one adapter that returns one thing or another depending on which column it's converter. To handle this, the cursor could grow a coltypemap attribute, which is a mapping of typemaps, keyed on the column number or, maybe more conveniently, column name. In summary, I am open to making the following revisions: * The SQLData class would become optional or be eliminated. Inbound type conversions between Python objects and the database will be performed by a well-defined ToDB method that the object may implement regardless of its inheritance tree. If an inbound Python object doesn't define a ToDB method, it'll be mapped by the canonical mapping for the particular database. * The outbound conversion call will receive additional parameters, such as the cursor.description tuple, that will allow the adapter to make the resulting object stateful with respect to all of its database type properties. * Add an optional coltypemap attribute to the cursor for defining a column-specific typemap. Unless I'm missing something, these revisions should address all the points you have brought up. -Carsten _______________________________________________ DB-SIG maillist - DB-SIG@python.org http://mail.python.org/mailman/listinfo/db-sig