On Thu, 2002-04-04 at 13:07, Edmund Lian wrote: > > Ian wrote: > > >> 2. Support for transactions > > > >I didn't do this, but it would be nice... what kind of interface do you > >use for this? In a multi-threaded environment, I'd be unsure about how > >to lump together transactions... unless, I suppose, you have a > >connection object associated with each thread's accesses. > > Yes. This seems to be the only way to do it since the DB API does > transactions at the connection object level rather than say, the cursor > level. > > > >I started with the thinest of wrappers and have filled it out some as I > >feel inspired... I haven't felt inspired enough to do joins, but I'm > >getting closer. I'd want to do a lot of caching, but keeping the cache > >valid is a significant problem. The many-to-many joins, or other > >non-obvious joins, are a bit more difficult to properly abstract... but > >then those joins maybe should be left up to the implementor. > > I think the cacheing problem is insurmountable in a DBMS independent way. > With PostgreSQL, you could have the table notify you when it is updated, > but no such facility exists with say, Oracle unless you use a trigger > coupled to a stored procedure.
Yes, for the most part I've only approached a situation when my code is the only code accessing the database -- in fact, that it is the only process accessing the database. This is a significant limitation. If I was to do large imports from another tool, I'd add something to lock access and invalidate all caches until the external process was finished. This is, of course, rather inefficient. I suppose it would be sufficient to invalidate all caches when the external process is finished, without locking access... but I'll try to avoid it. This does make the normal schema less important, though. I still feel like the database becomes a good, portable storage format -- nearly as good as XML for archiving purposes, if not casual information interchange. I'm starting to realize that I'm mirroring MK in many ways :) If it hadn't been for the CSV database descriptions, maybe I'd never have come this direction... well, I prefer my style of instantiation as well, but again, it's probably a more aesthetic than structural distinction. > After struggling with the semantics of a join, I've come to the conclusion > that joins cannot be handled using the same kind of object as one > representing a single table. In any case, an object representing two or > more joined tables (really just a view) can't accept row modification or > deletion methods since current DBMS products don't allow you to > write-through views. Well, maybe Oracle EE does since it has materialized > views, but who can afford this anyway? > > >Have you looked at the query-generation wrapper I wrote? I'm not > >entirely sure how successful the idea was, but it could create queries > >from Python expressions. You had to use & and | instead of "and" and > >"or", since and/or can't be overloaded, but it was otherwise fairly > >similar to regular Python expressions -- unfortunately, & and | don't > >have the right order of precedence. > > Hmmm... never got the message containing the code, and SF doesn't store > attachments (your post is here: > http://www.geocrawler.com/mail/msg.php3?msg_id=8240549&list=3854) > > Could you email them to me directly so I can look at them? The code I wrote recently is at: http://www.colorstudy.net/software/webware/DBWrapper-0.0.tar.gz But the query generation is separate (I haven't actually used it, it was just an experiment). It's at: http://www.colorstudy.net/software/webware/SQLBuilder.py > >Is each row a distinct object, with accessors and all? How do you add > >extra methods to these objects? > > No, in the examples I gave, x = DBTable('table_name'), which just > instantiates an object to mediate between the table in question and my > code. Whether the .delete(), .update() methods operate on an individual row > or a bunch of rows depends on the specificity of the conditions named in > the .update() and .delete() methods. > > I did it this way because if you treat each row as a distinct object with > accessors and methods, you will devolve down to procedural code when you > need to modify a bunch of similar rows (e.g., update user_table set > group_name = 'apple' where group_name = 'orange') When you do this, you > lose the advantages of using declarative languages like SQL to begin with. How do you add extra methods to your results? For instance, I keep extra information associated with a row in files quite often -- for instance, image files which are inefficient to store in the database. However, I'd like to treat that extra information the same way I treat the in-DB information. All the wrappers that use dictionary-like access bugged me because they made me make that distinction explicitly in the API. I've forgone the advantages of grouped updates, but I don't think my code would use a grouped update in most cases -- or, if it does, it's only occasionally, and the groups are on the range of 2-5, so the overhead of repetitive updates is relatively low. It's just so incredibly convenient to be able to pass database objects around. > >> # Return just col_one and col_two of a certain subset: > >> x.columnList = ['col_one', 'col_two'] > >> x.select(where="id=1 and name='apple'") > > > >Is x like a connection, and columnList effects the next .select > >statement? That seems rather unfunctional -- why not just put it in the > >select method call? > > x is just an object representing one table. columnList does affect the next > .select statement right now. I did this because x, as an object, has state > information. It knows what columns it has, and what columns to retrieve. I > want to separate the "I know what columns to retrieve" from the "I know > which rows to retrieve" behavior to ease the syntax in making calls--I'm > trying to make it as painless as possible to interact with the tables. Maybe it's that I started programming in functional languages, or that I like the set-like analogies in an RDBMS, but I can't stand stateful APIs, especially with databases -- ADO is like this, as I remember, as well as DBAPI and others. I just don't understand why you'd want to do it that way -- perhaps if you have another object, like a selection/querying object, which holds this state separate from the connection. So, if you have a connection object conn, you might do: sel = conn.selection() ## sel.rows() will now be all rows sel.where = "id=1 and name='apple'" sel.columnList = ['col_one', 'col_two'] ## sel.rows() now would be equivalent to: ## sel.rows(where="id=1 and name='apple'", columnList=['col_one', 'col_two'] ## but you can still do... sel2 = conn.selection() sel2.columnList = ['col_one', 'col_two'] ## And then you can still access sel.rows() and get that result set. If you did happen to use the SQLBuilder code, you could potentially treat the where clause more dynamically, like: sel.where = column.id == 1 sel.where &= column.name == "apple" I find this is actually a fairly common task. Ian _______________________________________________ Webware-discuss mailing list [EMAIL PROTECTED] https://lists.sourceforge.net/lists/listinfo/webware-discuss
