At 05:27 PM 10/9/2007 -0700, Andi Vajda wrote:
Not throw out. Migrate to a new schema. Just like in a relational database.
If you change the low-level layout (format), core schema, or app schema (table layout) someone needs to migrate the data. It might appear easier in a relational schema, but not once you've carefully optimized it and duplicated stuff left and right to get the desired performance. Essentially, it becomes harder once the 1-1 correspondence between the programmer's view (kind/class) and the SQL table is broken.

Have a look at Hibernate, which is used by Cosmo: it uses an XML file that specifies the mapping between objects and database. The contents of this file are never known to the application, which simply uses its own object model.

Hibernate maps object retrieval and queries to SQL, and applications use either the collections defined by the mapping, or use "HQL", which is an SQL-like query language that queries in terms of the *object* schema, rather than the relational one. And it takes care of all the non-1-1-ness in the mapping.

Now, if you add new types to the application schema, of course you have to add to the XML file. But in principle you could generate the XML in a logical fashion from the new piece of application schema, so that even that step is not necessary when you are first adding to the application.
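As a rough illustration of generating the mapping from the application schema, here is a minimal Python sketch. It does not produce Hibernate XML. The `Note` class, the `SQL_TYPES` table, and both helper functions are hypothetical names made up for this example, and it assumes the application types are plain dataclasses with simple annotations.

```python
from dataclasses import dataclass
from typing import get_type_hints

# Hypothetical lookup from Python annotation types to SQL column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL", bool: "INTEGER"}


@dataclass
class Note:
    """A hypothetical application type; it knows nothing about storage."""
    uuid: str
    title: str
    priority: int


def derive_mapping(cls):
    """Derive a table name and column list from the class definition."""
    hints = get_type_hints(cls)  # preserves declaration order
    return {
        "table": cls.__name__.lower(),
        "columns": [(name, SQL_TYPES[tp]) for name, tp in hints.items()],
    }


def create_table_sql(mapping):
    """Render the derived mapping as a CREATE TABLE statement."""
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in mapping["columns"])
    return f"CREATE TABLE {mapping['table']} ({cols})"


print(create_table_sql(derive_mapping(Note)))
```

A derived mapping like this is only a default. A hand-tuned mapping could replace it later without the application type changing.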

Now, Hibernate is not available for Python (although I suppose you could make it so with JCC!) but it illustrates the point that it is possible to separate things in this fashion. I believe there is at least one Python ORM that claims to be inspired by or to work like Hibernate, though. I also seem to recall that SQLAlchemy for Python has a great deal of flexibility in mapping between different relational schemas, such that your code can deal with a logical schema rather than an actual one.

There is also the possibility of just rolling Yet Another Python ORM, perhaps based on EIM. But these things don't matter as much as layering the application in such a way that it does not *care* how things actually get stored. Chandler's domain model objects should not be subclasses of a storage type, for example. (i.e., they should not be repository.Items).
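To make the layering concrete, here is a small sketch of a domain object that is not a subclass of any storage type, with persistence handled by separate store classes. None of these names (`Task`, `MemoryTaskStore`, `SqliteTaskStore`) come from Chandler; they are hypothetical, and the SQLite table layout is just an assumption for the example.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Task:
    """Plain domain object: no storage base class, no storage imports."""
    uuid: str
    title: str


class MemoryTaskStore:
    """Keeps tasks in a dict; handy for tests and experiments."""

    def __init__(self):
        self._tasks = {}

    def save(self, task):
        self._tasks[task.uuid] = Task(task.uuid, task.title)

    def load(self, uuid):
        return self._tasks.get(uuid)


class SqliteTaskStore:
    """Keeps tasks in an SQLite table; same interface as the store above."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS task (uuid TEXT PRIMARY KEY, title TEXT)"
        )

    def save(self, task):
        self.conn.execute(
            "INSERT OR REPLACE INTO task (uuid, title) VALUES (?, ?)",
            (task.uuid, task.title),
        )

    def load(self, uuid):
        row = self.conn.execute(
            "SELECT uuid, title FROM task WHERE uuid = ?", (uuid,)
        ).fetchone()
        return Task(*row) if row else None


# Application code only sees save() and load(), so the back end can be swapped.
for store in (MemoryTaskStore(), SqliteTaskStore(sqlite3.connect(":memory:"))):
    store.save(Task("t1", "Write status report"))
    print(type(store).__name__, store.load("t1"))
```

Because `Task` never mentions either store, changing how tasks are stored only touches the store classes.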

That way, we will be able to experiment with different mappings and different back ends for optimum performance. For that matter, we could use more than one back end if we chose, such that email bodies might be stored in mbox files, while their headers get indexed in SQLite. (While all being dumpable and reloadable, of course.)
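A rough sketch of the mixed back-end idea, under some loud assumptions: it writes one plain text file per message rather than a real mbox file, it indexes only a subject header, and `MixedMailStore` and all of its methods are hypothetical names invented for the example. Message ids are assumed to be safe to use as file names.

```python
import os
import sqlite3
import tempfile


class MixedMailStore:
    """Bodies live in files; headers are indexed in SQLite."""

    def __init__(self, body_dir, conn):
        self.body_dir = body_dir
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS header "
            "(msg_id TEXT PRIMARY KEY, subject TEXT)"
        )

    def _body_path(self, msg_id):
        return os.path.join(self.body_dir, msg_id + ".txt")

    def _read_body(self, msg_id):
        with open(self._body_path(msg_id), encoding="utf-8") as f:
            return f.read()

    def save(self, msg_id, subject, body):
        # Body goes to the file back end, header to the SQLite index.
        with open(self._body_path(msg_id), "w", encoding="utf-8") as f:
            f.write(body)
        self.conn.execute(
            "INSERT OR REPLACE INTO header (msg_id, subject) VALUES (?, ?)",
            (msg_id, subject),
        )

    def find_by_subject(self, subject):
        # Query the index first, then fetch the matching bodies from files.
        rows = self.conn.execute(
            "SELECT msg_id FROM header WHERE subject = ? ORDER BY msg_id",
            (subject,),
        ).fetchall()
        return [(msg_id, self._read_body(msg_id)) for (msg_id,) in rows]

    def dump(self):
        # A back-end-neutral dump: plain tuples that any store could reload.
        rows = self.conn.execute(
            "SELECT msg_id, subject FROM header ORDER BY msg_id"
        ).fetchall()
        return [(m, s, self._read_body(m)) for m, s in rows]


store = MixedMailStore(tempfile.mkdtemp(), sqlite3.connect(":memory:"))
store.save("m1", "lunch", "Noon?")
store.save("m2", "status", "All green.")
print(store.find_by_subject("lunch"))
```

Reloading is then just a matter of feeding each dumped tuple back through `save()` on whatever store replaces this one.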

And, it is likely that for some period, we will still back-end to the repository -- we just would go through a mapping layer of some sort first. (And that would mean that we could do some physical schema tuning there, without needing to mess with the application layer.)

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "chandler-dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/chandler-dev
