I loath to enter into this discussion, but feel that I need to put in a bit of 
experience that I have observed when dealing with databases and migration 
issues.

There are a couple of truisms that I have come to accept, grudgingly:

1) Database schemas will change over time.
2) Database schema changes are a pain to deal with.  Migration issue are 
difficult.
3) Unlike changes in code, that have no "history" to them (other than an 
interface specification to external api), database changes, have a long time 
horizon.  Decisions that were rational that lead to a schema design at the 
time, but new use cases, etc.. result in re-factoring or enhancing cause you to 
continually deal with the implications of this "old" design.
4) It's impossible to satisfy all schema API revision constraints at the same 
time.
5) You need to have a formal way to manage migration of data in your framework.
6) Only support the newest schema, don't leave legacy artifacts around in 
schema, tends to increase duplication, increased confusion, and increased "code 
entropy".
7) Avoid schema as an API point in your system if at all possible... can really 
tie your hands from a development point of view.  Means other systems will need 
to be updated when merging...

So, Ofbiz doesn't really have any formal migration process. checkdb() adds 
missing stuff, but is incomplete in some changes (column type changes, index 
changes, pkey changes, drops, null/not null changes), but more importantly, the 
data migration framework is meant to be outside, and manual... leading to 
problems, inconstancies, and more importantly, difficulties.

The "rename" the old entity, create a migration service, then manually run it, 
is a "weak" migration framework.

I have had success when trying to keep a tighter lid on things:

1- automatically create .sql migration files representing the differences in 
schemas. (rev<n> - rev<n-1> differences).
2- automatically add insert/update/delete commands representing the differences 
in seed into these files.
3- manually edit these files, or generate new ones to represent migration of 
data, where the "simple" cases are not handled, such as moving data from one 
entity, to another, etc...
4- these sql migration commands become formally part of the codebase.  
Migration scripts automatically run on older schema versions (trick is the 
detect what version the schema actually is.... can't rely on version number, 
etc... too easy to be inaccurate, and run all "needed" migration commands).
5- Only supporting the "current" schema in the code base.  This means that as 
the schema evolves, code that isn't in the project will need to be upgraded 
when merging in these changes.

Obviously the above series of steps are meant to move all "legacy" databases 
through an upgrade process to the current revision, such that they should be 
indistinguishable from being created from a newly minted schema.

I have done this type of support in my past life, with another project, other 
than ofbiz and it helped tame the affects of schema changes.  We also have 
developed this approach at Emforium to manage the upgrade of our customer's 
ofbiz instances.  It has been VERY DIFFICULT to keep these instances consistent 
with one another, as the schema has changed over time.... I won't say that this 
is 100% nailed down, but we are on our way.

So, I'd be interested in discussing how we can add a more formal migration 
declaration into ofbiz, with forward and backward migrations formally put into 
place (like ruby on rails, etc...). 

We can talk about using our developed framework for this, but there are 
limitations; such as .sql files are not likely portable, and certainly can't be 
easily split up between datasources.

We've also generated a number of self tests, to test that a db at rev<n-1> can 
be upgraded to rev<n> with the seed to be identical, this is CRUCIAL to ensure, 
that the migration chain is unbroken...


Marc


Reply via email to