Marc Morin wrote: > I loath to enter into this discussion, but feel that I need to put in a bit > of experience that I have observed when dealing with databases and migration > issues. > > There are a couple of truisms that I have come to accept, grudgingly: > > 1) Database schemas will change over time. > 2) Database schema changes are a pain to deal with. Migration issue are > difficult. > 3) Unlike changes in code, that have no "history" to them (other than an > interface specification to external api), database changes, have a long time > horizon. Decisions that were rational that lead to a schema design at the > time, but new use cases, etc.. result in re-factoring or enhancing cause you > to continually deal with the implications of this "old" design. > 4) It's impossible to satisfy all schema API revision constraints at the same > time. > 5) You need to have a formal way to manage migration of data in your > framework. > 6) Only support the newest schema, don't leave legacy artifacts around in > schema, tends to increase duplication, increased confusion, and increased > "code entropy". > 7) Avoid schema as an API point in your system if at all possible... can > really tie your hands from a development point of view. Means other systems > will need to be updated when merging... > > So, Ofbiz doesn't really have any formal migration process. checkdb() adds > missing stuff, but is incomplete in some changes (column type changes, index > changes, pkey changes, drops, null/not null changes), but more importantly, > the data migration framework is meant to be outside, and manual... leading to > problems, inconstancies, and more importantly, difficulties. > > The "rename" the old entity, create a migration service, then manually run > it, is a "weak" migration framework. > > I have had success when trying to keep a tighter lid on things: > > 1- automatically create .sql migration files representing the differences in > schemas. (rev<n> - rev<n-1> differences). > 2- automatically add insert/update/delete commands representing the > differences in seed into these files. > 3- manually edit these files, or generate new ones to represent migration of > data, where the "simple" cases are not handled, such as moving data from one > entity, to another, etc... > 4- these sql migration commands become formally part of the codebase. > Migration scripts automatically run on older schema versions (trick is the > detect what version the schema actually is.... can't rely on version number, > etc... too easy to be inaccurate, and run all "needed" migration commands). > 5- Only supporting the "current" schema in the code base. This means that as > the schema evolves, code that isn't in the project will need to be upgraded > when merging in these changes. > > Obviously the above series of steps are meant to move all "legacy" databases > through an upgrade process to the current revision, such that they should be > indistinguishable from being created from a newly minted schema. > > I have done this type of support in my past life, with another project, other > than ofbiz and it helped tame the affects of schema changes. We also have > developed this approach at Emforium to manage the upgrade of our customer's > ofbiz instances. It has been VERY DIFFICULT to keep these instances > consistent with one another, as the schema has changed over time.... I won't > say that this is 100% nailed down, but we are on our way. > > So, I'd be interested in discussing how we can add a more formal migration > declaration into ofbiz, with forward and backward migrations formally put > into place (like ruby on rails, etc...). > > We can talk about using our developed framework for this, but there are > limitations; such as .sql files are not likely portable, and certainly can't > be easily split up between datasources. > > We've also generated a number of self tests, to test that a db at rev<n-1> > can be upgraded to rev<n> with the seed to be identical, this is CRUCIAL to > ensure, that the migration chain is unbroken...
I've got code started to handle this, from a few months ago. <UpgradeScript id="FooBarBaz" location="component://$name/path/file.ext"/> Then, at startup, a separate, special delegator is created very early, which uses a private datastore(derby, postgres, or whatever). Any newly imported UpgradeScripts are then run just this once. My code had support for script dependencies, ala debian, and could order things appropriately. The idea is based on another package I saw developed, which did database stuff the old fashion way, with a bunch of sql scripts individually developed for each database vendor. The package stored the current version somewhere, and it just ran all upgrade scripts in series from the last installed version to the new version. This would handle multi-year upgrades, but it required keeping those upgrade scripts around forever. However, the scripts were isolated, so the mainline code was kept clean.
