At 12:05 AM 2/3/2006 -0800, Morgen Sagen wrote:
Although I had always assumed we would have code within each parcel
that knew how to upgrade its data from the previous version to the
current version, a more elegant solution would be to have a log of
schema changes coupled with some transformation code that could apply
those changes, as you described in your October email. So how do
those two methods compare? You say:
"With sufficient care and infrastructure support, we can relatively
easily
support manual schema upgrades, in the sense of having installParcel () make
the changes, if we entirely forbid certain classes of schema change
that
could not be implemented in this way. However, the amount of
developer
care required currently appears prohibitive, in the sense that it's
going
to seriously impede our flexibility to refactor."
Do you mean that if we did take the route of putting hand-built
transformation code into installParcel(), the amount of transformation
code would be unwieldy?
I don't know. I don't know what kind of changes we're going to have. The
biggest question of all is, when does this discipline begin? If we don't
need to support upgrades before the release of 0.7, then there's a lot less
to be done, and it's not certain that we need to provide any significant
evolution infrastructure until 0.8. For one thing, we can try to complete
major moves before then, and we can make an effort to document and prepare
for the freezing of key schemas.
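Just to make the hand-built option concrete, here's the kind of thing I
picture a parcel author writing in installParcel(). The item iteration,
the version handling, and the attribute names below are all made up for
illustration, not real Chandler API:

    OLD_VERSION = 1
    NEW_VERSION = 2

    def installParcel(parcel, oldVersion=None):
        if oldVersion is None or oldVersion == NEW_VERSION:
            return  # fresh install, or the data is already current

        if oldVersion == OLD_VERSION:
            # Hand-built transformation: walk the existing items and
            # patch them to match the new schema.
            for item in parcel.iterItems():        # hypothetical iterator
                if hasattr(item, 'fullName'):
                    parts = item.fullName.split(None, 1) + ['']
                    item.firstName = parts[0]      # new attributes
                    item.lastName = parts[1]
                    del item.fullName              # retired attribute
        else:
            raise ValueError("can't upgrade from schema version %r"
                             % oldVersion)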
If we go the schema-change-log route, would developers still have to
create log entries for each change, or are you saying we could build
that log automatically?
That all depends. :) So far, my observation has been that there are
approaches that can automatically detect simple kinds of schema changes,
but more serious changes are harder to handle.
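To show what I mean by a change log, here's a rough sketch: declarative
log entries plus a tiny bit of code to replay them. The entry format and
the item API are invented just for this example:

    SCHEMA_CHANGES = [
        # (fromVersion, toVersion, operation, details)
        (1, 2, 'add_attribute',    {'kind': 'Note', 'name': 'priority',
                                    'default': 0}),
        (2, 3, 'rename_attribute', {'kind': 'Note', 'old': 'body',
                                    'new': 'bodyText'}),
    ]

    def applyChanges(items, fromVersion):
        # Replay every logged change made after 'fromVersion'.
        for frm, to, op, details in SCHEMA_CHANGES:
            if frm < fromVersion:
                continue                  # already applied to this data
            for item in items:
                if item.kindName != details['kind']:   # invented API
                    continue
                if op == 'add_attribute':
                    setattr(item, details['name'], details['default'])
                elif op == 'rename_attribute':
                    setattr(item, details['new'],
                            getattr(item, details['old']))
                    delattr(item, details['old'])

A developer (or a tool that diffs old and new schema definitions) would
append an entry per change; the simple operations above are the kind a
tool could plausibly generate, while anything fancier would need
hand-written transform code.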
In all honesty, the best recommendations I've seen suggest that having a
well-defined externalization of your data (e.g. read/write XML, iCal,
tab-delimited, etc.) is usually the best way to ensure upgradeability. In
effect, the object-oriented approach to databases is a really bad idea,
because it tends to couple internal implementation details to your schema.
I wish I'd known that a few years sooner. :)
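The externalization idea, roughly: upgrade by round-tripping data through
a stable, implementation-neutral format instead of mutating live objects,
so only the mapping code has to know about both schemas. The field names
and item API below are purely illustrative:

    def exportItem(item):
        # Only public, documented fields go into the external form;
        # nothing about the Python classes leaks out.
        return {'kind': item.kindName,
                'displayName': item.displayName,
                'body': item.body}

    def importItem(data, kinds):
        # The *new* code decides how old external fields map onto the
        # new schema -- here, a renamed attribute.
        item = kinds[data['kind']].newItem()
        item.displayName = data.get('displayName', '')
        item.bodyText = data.get('body', '')
        return item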
But clearly, that insight doesn't help us much anyway. :) Our current
situation is more of a chicken-and-egg problem. I don't know what
evolution features we really need, because we haven't yet upgraded users'
data. We haven't yet upgraded users' data because we haven't decided as of
what point we will be *keeping* their data. And we haven't done that,
because we don't know what kind of schema evolution we can support, and so
on. :)
I suppose one way to investigate the issue might be to study the revision
logs of the schema version number, to see what kinds of things it has been
changed for in the past.
There's always the chance that if we do the log-based transformation
system, some parcel writer will want to be able to perform an upgrade
that the transformation system doesn't support. In that case could
they have custom upgrade code in their parcel, or would all upgrading
need to go through the transformation system?
That I don't know. One question that's in my mind about this is whether
these transforms can be incremental and "just-in-time" or whether they have
to be all-at-once. If they're all-at-once, they're going to need a
progress UI, which means there needs to be more of an API structure than
just "we call some code and stuff happens". It'll need to be organized in
a way that allows progress updates to take place.
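For what it's worth, here's some guesswork (not a proposal for the real
API) at what the two shapes might look like -- an all-at-once pass that
reports progress, versus a just-in-time transform applied when an item is
first loaded:

    def upgradeAll(items, transforms, reportProgress):
        # All-at-once: apply every transform to every item, telling the
        # UI how far along we are so it can draw a progress meter.
        total = len(items)
        done = 0
        for item in items:
            for transform in transforms:
                transform(item)
            done += 1
            reportProgress(done, total)

    def upgradeOnLoad(item, transforms):
        # Just-in-time: apply only the transforms this item hasn't seen
        # yet, the first time it's loaded; no progress UI needed, but
        # every code path that touches old data has to go through here.
        version = getattr(item, 'schemaVersion', 0)
        for transform in transforms[version:]:
            transform(item)
        item.schemaVersion = len(transforms)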