Again, me with the pointed hat in the corner with "IMO" written on it. Large grains of salt with this response.

On 23 Jan 2006, at 14:45, Mike Wannamaker wrote:
> I totally agree with your points, and I believe for the most part the model/schema will already be defined by the application that you are working on. However, there are times when a component may want to store things using Castor in a database, and we won't know which database it's configured against until runtime.

Yes - but your application can always run a preflight check on startup to verify that the DDL is in the state it should be; that provides a good, solid mechanism for performing updates later.
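(Purely as an illustrative sketch - this isn't Castor functionality, and the table names are invented for the example - such a preflight check in plain JDBC might look roughly like this:)

    import java.sql.*;

    public class Preflight {
        // Tables this application's mapping expects to find (names are made up here).
        private static final String[] EXPECTED = { "customer", "invoice", "ddl_version" };

        public static void check(Connection conn) throws SQLException {
            DatabaseMetaData meta = conn.getMetaData();
            for (int i = 0; i < EXPECTED.length; i++) {
                // getTables() is standard JDBC metadata; some drivers are picky about
                // case, so a real check would normalise the names first.
                ResultSet rs = meta.getTables(null, null, EXPECTED[i], new String[] { "TABLE" });
                boolean found = rs.next();
                rs.close();
                if (!found) {
                    throw new SQLException("Missing table: " + EXPECTED[i]);
                }
            }
        }
    }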

> We would have to write our own code to create all the tables etc. for all the databases supported. Every component that wants to do it would have to do this.

Yes - but now imagine that you change the mapping; will you also expect Castor to maintain backwards compatibility? And what if the new mapping has a new foreign key constraint? Evolution of the application, and its database, is every bit as important as its first run; if you're going to tackle one problem, as an application developer, you have to tackle both.

Nothing saves you, ultimately - you either need to manage this process as a part of installation or first run, or manage it dynamically within the application.

The former is easily manageable with installation and upgrade scripts, and it gives you a way of shipping a product which can sense and perform the necessary upgrades. (I say "easily" only as a comparison against the other option.)

For example, one way of managing this would be via an ETL tool like Octopus:
http://www.enhydra.org/tech/octopus/index.html

That would allow you to describe existing data structures and perform the transformations without having to load each object in one form and write it out in another into a new table structure.

The fundamental problems are bigger than even Octopus solves - the problem of how to get an initial database is based on domain-specific knowledge, while the problem of upgrading databases over time is based on data transformation and upgrading DDL.

Ultimately, the scope on the fundamental problem is huge; and the only patches that applications like Castor and Hibernate can give are "create me a database that holds this stuff" - not necessarily quickly, not necessarily with the right kinds of structures, and not necessarily with the same decisions that anyone with a rudimentary understanding of the problem domain
would choose.

> So I was thinking more of a runtime thing where components may want to store things into the database that a customer has already configured.

1) Detect your DDL version by storing versioning information in a metadata table in your database.
2) Perform upgrades through, for example, Ant tasks and scripting, so that those upgrades are repeatable and testable and therefore safe to deploy to customers.
3) Update the DDL version information as changes to the table structures are made.
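(Again, only a hand-waved sketch of 1) and 3) in application code - the ddl_version table and its column are invented for the example, and in practice the actual DDL changes would live in versioned scripts run by Ant or an installer:)

    import java.sql.*;

    public class SchemaUpgrader {
        // The DDL version this build of the application expects; bump it with every schema change.
        private static final int EXPECTED_VERSION = 5;

        public static void upgradeIfNeeded(Connection conn) throws SQLException {
            int current = readVersion(conn);
            if (current > EXPECTED_VERSION) {
                throw new SQLException("Database schema is newer than this application");
            }
            while (current < EXPECTED_VERSION) {
                // Each step is a tested, repeatable script shipped with the product,
                // e.g. upgrade-4-to-5.sql, executed here by whatever means you prefer
                // (Ant's <sql> task, an installer, hand-rolled JDBC, ...).
                runUpgradeScript(conn, current, current + 1);
                current++;
                writeVersion(conn, current);
            }
        }

        private static int readVersion(Connection conn) throws SQLException {
            Statement stmt = conn.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT version FROM ddl_version");
            int version = rs.next() ? rs.getInt(1) : 0;
            rs.close();
            stmt.close();
            return version;
        }

        private static void writeVersion(Connection conn, int version) throws SQLException {
            PreparedStatement ps = conn.prepareStatement("UPDATE ddl_version SET version = ?");
            ps.setInt(1, version);
            ps.executeUpdate();
            ps.close();
        }

        private static void runUpgradeScript(Connection conn, int from, int to) throws SQLException {
            // Deliberately left out: read upgrade-<from>-to-<to>.sql and execute its statements.
        }
    }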

> If tables don't exist, create them. Also, people would have to realize that the tables created would not be as optimized as they could be.

Nobody solves this problem well - even Hibernate does this with an external SchemaExport tool that generates DDL for the initial creation, though its SchemaUpdate tool can do upgrading.
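(Roughly, as I recall the Hibernate 3 hbm2ddl API - check the current docs before relying on the exact signatures:)

    import org.hibernate.cfg.Configuration;
    import org.hibernate.tool.hbm2ddl.SchemaExport;
    import org.hibernate.tool.hbm2ddl.SchemaUpdate;

    public class HibernateDdl {
        public static void main(String[] args) {
            // Reads hibernate.cfg.xml and the mapped classes.
            Configuration cfg = new Configuration().configure();

            // Generate the initial DDL: print the script and run it against the database.
            new SchemaExport(cfg).create(true, true);

            // Or attempt an in-place upgrade of an existing schema.
            new SchemaUpdate(cfg).execute(true, true);
        }
    }

Either way, what you get is "a database that holds this stuff" - nothing more.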

And the cost of this functionality? Not small. Now add in the limitations: Not all JDBC drivers support the necessary metadata functionality to get at this data, and it would expose a raft of interoperability problems not just between drivers, but between versions of those drivers. All in an environment where, today, we struggle to get enough coverage of releases and development time from the community of users.

No part of it makes sense; it doesn't make sense for Hibernate, and they *have* the functionality. It doesn't make sense for us, either. Admittedly, I'm a hardliner - I'm one of those people who thinks that the advent of MySQL has led to the development of a lot of poorly written applications that rely on MySQL's amazing read performance to cover up for laziness or bad programming practice; that MySQL is the best and worst thing to ever happen to the developer community, resulting in a raft of applications which are not only tied to MySQL's dialect of SQL, but which can never have a hope of running on any other database due to the unique performance characteristics of the average non-ACID deploy of that server. (All users of PhpAds, raise your hand.)

It's not just something I feel that we can't solve easily - it's something I feel one shouldn't even bother trying to do. I'm all for a really good open-source ETL project that would let people who knew what they were doing provide a good way of installing and upgrading databases as a part of a deploy - I just don't feel that a mapping file is a good source of that information, or that the problem is anywhere near as platform agnostic as one might wish it to be.

Databases, when simplified to that level, never perform well. The streets are littered with the dead projects of those who have pretended the world was otherwise. :)

Even incorporating into a mapping file structural information to assist in DDL construction won't solve the migration problem - an ETL tool is a better home for this.


