Again, me with the pointed hat in the corner with "IMO" written on
it. Large grains of salt with this response.
On 23 Jan 2006, at 14:45, Mike Wannamaker wrote:
> I totally agree with your points, and I believe that for the most
> part the model/schema will already be defined by the application you
> are working on. However, there are times when a component may want
> to store things into a database using Castor, and we won't know
> which database that is ahead of time; it's configured at runtime.
Yes - but your application can always run a preflight check on startup
to verify that the DDL is in the state it should be in; that provides
a good, solid mechanism for performing updates later.
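For illustration, here's a minimal sketch of what such a preflight
check could look like. The schema_version table, the class name and
the expected-version constant are all assumptions of mine - nothing
here is something Castor provides:

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class SchemaPreflight {

        /** The DDL version this build of the application expects. */
        private static final int EXPECTED_VERSION = 3;

        /**
         * Fail fast at startup if the database is not at the expected
         * DDL version. The schema_version table is assumed to have
         * been created by the installer.
         */
        public static void check(Connection conn) throws SQLException {
            Statement stmt = conn.createStatement();
            try {
                ResultSet rs = stmt.executeQuery(
                    "SELECT version FROM schema_version");
                if (!rs.next()) {
                    throw new IllegalStateException(
                        "schema_version is empty - run the installer first");
                }
                int actual = rs.getInt(1);
                if (actual != EXPECTED_VERSION) {
                    throw new IllegalStateException(
                        "Database is at DDL version " + actual
                        + " but this release expects " + EXPECTED_VERSION);
                }
            } finally {
                stmt.close();
            }
        }
    }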
> We would have to write our own code to create all the tables, etc.,
> for all the databases supported. Every component that wants to do it
> would have to do this.
Yes - but now imagine that you change the mapping; will you also
expect Castor to maintain backwards compatibility? And what if the
new mapping has a new foreign key constraint? Evolution of the
application, and of its database, is every bit as important as its
first run; if you're going to tackle one problem as an application
developer, you have to tackle both.
Nothing saves you, ultimately - you either need to manage this
process as a part of installation or first run, or manage it
dynamically within the application.
The former is easily manageable with installation and upgrade
scripts, and provides you a way of shipping a product which can sense
and perform the necessary upgrades. (I say "easily" only as a
comparison against the other option.)
For example, one way of managing this would be via an ETL tool like
Octopus:
http://www.enhydra.org/tech/octopus/index.html
That would allow you to describe existing data structures and perform
the transformations without having to load each object in one form
and write it out in another into a new table structure.
The fundamental problems are bigger than even Octopus solves - the
problem of how to get an initial database is based on domain-specific
knowledge, while the problem of upgrading databases over time is
based on data transformation and upgrading DDL.
Ultimately, the scope of the fundamental problem is huge, and the
only patches that applications like Castor and Hibernate can give are
"create me a database that holds this stuff" - not necessarily
quickly, not necessarily with the right kinds of structures, and not
necessarily with the same decisions that anyone with a rudimentary
understanding of the problem domain would choose.
> So I was thinking more of a runtime thing, where components may want
> to store things into the database that a customer has already
> configured.
Even then, the same discipline applies:
1) Detect your DDL version by storing versioning information in a
metadata table in your database;
2) Perform upgrades through, for example, ANT tasks and scripting, so
that they run in a repeatable and testable manner that is then safe
to deploy to customers (sketched below);
3) Update the DDL version information as changes to the table
structures are made.
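Putting 2) and 3) together, here's one hedged sketch of a runtime
upgrader - the schema_version table and the upgrade-<n>.sql naming
convention are, again, my own assumptions, not anything Castor ships:

    import java.io.BufferedReader;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.sql.Connection;
    import java.sql.Statement;

    public class SchemaUpgrader {

        /**
         * Bring the database from its current DDL version up to the
         * target version by running upgrade-<n>.sql scripts found on
         * the classpath, bumping schema_version after each one.
         */
        public static void upgrade(Connection conn, int current, int target)
                throws Exception {
            for (int v = current + 1; v <= target; v++) {
                runScript(conn, "/ddl/upgrade-" + v + ".sql");
                Statement stmt = conn.createStatement();
                try {
                    stmt.executeUpdate(
                        "UPDATE schema_version SET version = " + v);
                } finally {
                    stmt.close();
                }
                // Commit per step if we manage transactions ourselves.
                if (!conn.getAutoCommit()) {
                    conn.commit();
                }
            }
        }

        /** Execute each semicolon-separated statement in the script. */
        private static void runScript(Connection conn, String resource)
                throws Exception {
            InputStream in =
                SchemaUpgrader.class.getResourceAsStream(resource);
            if (in == null) {
                throw new IllegalStateException(
                    "Missing upgrade script " + resource);
            }
            BufferedReader reader =
                new BufferedReader(new InputStreamReader(in));
            StringBuffer sql = new StringBuffer();
            String line;
            while ((line = reader.readLine()) != null) {
                sql.append(line).append('\n');
            }
            reader.close();
            Statement stmt = conn.createStatement();
            try {
                String[] statements = sql.toString().split(";");
                for (int i = 0; i < statements.length; i++) {
                    String s = statements[i].trim();
                    if (s.length() > 0) {
                        stmt.executeUpdate(s);
                    }
                }
            } finally {
                stmt.close();
            }
        }
    }

The same applies if you drive it from an installer instead - the
point is just that the version check and the upgrade scripts live
together and can be tested before they reach a customer.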
> If tables don't exist, create them. Also, people would have to
> realize that the tables created would not be optimized as well as
> they could be.
Nobody solves this problem well - even Hibernate does this with an
external SchemaExport tool that generates the DDL for the initial
install, though its SchemaUpdate tool can do upgrading.
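For reference, this is roughly what that looks like on the Hibernate
side - a sketch against the Hibernate 3 hbm2ddl tools, where the
configuration and mappings are placeholders:

    import org.hibernate.cfg.Configuration;
    import org.hibernate.tool.hbm2ddl.SchemaExport;
    import org.hibernate.tool.hbm2ddl.SchemaUpdate;

    public class HibernateDdl {
        public static void main(String[] args) {
            // Reads hibernate.cfg.xml and the mapping files.
            Configuration cfg = new Configuration().configure();

            // Initial generation: print and execute CREATE statements.
            new SchemaExport(cfg).create(true, true);

            // Later: attempt an in-place upgrade of an existing schema.
            new SchemaUpdate(cfg).execute(true, true);
        }
    }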
And the cost of this functionality? Not small. Now add in the
limitations: Not all JDBC drivers support the necessary metadata
functionality to get at this data, and it would expose a raft of
interoperability problems not just between drivers, but between
versions of those drivers. All in an environment where, today, we
struggle to get enough coverage of releases and development time from
the community of users.
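To make the driver dependence concrete: the kind of JDBC metadata
call such a feature would have to lean on looks something like the
following, and how much of it any given driver actually fills in
varies widely between drivers and driver versions:

    import java.sql.Connection;
    import java.sql.DatabaseMetaData;
    import java.sql.ResultSet;

    public class TableProbe {

        /** Print the tables the driver is willing to report. */
        public static void listTables(Connection conn) throws Exception {
            DatabaseMetaData meta = conn.getMetaData();
            ResultSet rs = meta.getTables(null, null, "%",
                                          new String[] { "TABLE" });
            try {
                while (rs.next()) {
                    System.out.println(rs.getString("TABLE_NAME"));
                }
            } finally {
                rs.close();
            }
        }
    }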
No part of it makes sense; it doesn't make sense for Hibernate, and
they *have* the functionality. It doesn't make sense for us,
either. Admittedly, I'm a hardliner - I'm one of those people who
thinks that the advent of MySQL has led to the development of a lot
of poorly written applications that rely on MySQL's amazing read
performance to cover up for laziness or bad programming practice;
that MySQL is the best and worst thing to ever happen to the
developer community, resulting in a raft of applications which are
not only tied to MySQL's dialect of SQL, but which can never have a
hope of running on any other database due to the unique performance
characteristics of the average non-ACID deploy of that server. (All
users of PhpAds, raise your hand.)
It's not just something I feel that we can't solve easily - it's
something I feel one shouldn't even bother trying to do. I'm all for
a really good open-source ETL project that would let people who knew
what they were doing provide a good way of installing and upgrading
databases as a part of a deploy - I just don't feel that a mapping
file is a good source of that information, or that the problem is
anywhere near as platform agnostic as one might wish it to be.
Databases, when simplified to that level, never perform well. The
streets are littered with the dead projects of those who have
pretended the world was otherwise. :)
Even incorporating into a mapping file structural information to
assist in DDL construction won't solve the migration problem - an ETL
tool is a better home for this.