At 10:59 AM 5/12/2006 -0700, Morgen Sagen wrote:
On May 11, 2006, at 5:17 PM, Phillip J. Eby wrote:
By "stable external format", I mean a format that does not change
significantly from one Chandler release to the next, and which
allows for version detection of the format itself, as well as
providing version and schema information for the parcels whose data
is contained in the format.
We're also talking about defining a new "sharing format" which has
some of the same requirements that you spell out: needing a
relatively stable format, needing to maintain ref-collection order
even if the format structure is simple and non-nested, avoiding being
bitten by onValueChanged() calls during sync, etc. So perhaps there
is an opportunity for re-use here.
Certainly both systems would benefit from getting rid of
sequence-to-sequence birefs.
At this point I haven't covered much API detail, or anything at all
about the external format itself. I don't care much about the
external format, since it's not a requirement that
it be processed by other programs, and parcel writers will never
see it directly. The API will only expose streams of records of
elementary types, and provide a way for parcel writers to transform
individual records as the streams go by, and to do pre- and post-
processing on the repository contents.
Ah, well, the sharing format is intended to be processed by Cosmo and
other apps, so perhaps that doesn't fit in with your goals. However,
I am hoping that the new sharing format can be something as simple as
a series of RDF triples (it could be represented in XML or not; it
doesn't really matter, as long as we have the equivalent of namespaces
to handle attribute name collisions). What were you thinking your dump/
restore records might look like?
From the POV of the dump/reload API and framework, data will essentially
be composed of tuples of elementary types. You can think of it as
conceptually equivalent to a collection of relational database tables;
the main difference between the dump format and a relational database
is that access is read-only and strictly sequential, with no rewinding.
That is, processing will occur on a stream of records, which may at
certain points be interleaved.
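
To make that a bit more concrete, here's a rough Python sketch. None of
these names (upgrade_event_records, read_dump, load_record, "EventStamp")
are actual API, and the (record-type, item UUID, attribute, value) shape
is just an assumption for illustration; the point is elementary-type
tuples flowing past a parcel writer's per-record transform:

    # Hypothetical sketch only; none of these names are real API.
    # Each record is a tuple of elementary types, tagged by record
    # type -- roughly a row streaming out of one of several "tables".

    def upgrade_event_records(records):
        """A parcel writer's transform: sees each record exactly once,
        in stream order, with no rewinding or random access."""
        for rectype, uuid, attr, value in records:
            if rectype == "EventStamp" and attr == "startTime":
                # e.g. rename an attribute as part of a schema upgrade
                attr = "effectiveStartTime"
            yield (rectype, uuid, attr, value)

    # The framework would just chain such generators over the stream:
    #     for record in upgrade_event_records(read_dump(stream)):
    #         load_record(record)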
That's the information model; how the data is actually stored on disk isn't
particularly important, except that it be reasonably efficient in time and
space (which probably means XML is out ;-)).
The overlap between sharing and dump/reload does worry me a bit. It would
suck for parcels to have to write *two* sets of code to do schema upgrades.