On Wed, October 13, 2010 9:10 am, Tom Morris wrote:

> I thought the whole point of having fields in a database record was to
> avoid having to do string parsing, with all its problems, to recover
> your original data.

Well, that's one of the side effects of good database design, but it's not
the whole point. The main point of database design is to collect similar
data into a single structure, and then connect it to other, related data
structures in such a way that related data can be pulled together quickly,
and errors caused by different data being entered for the same fields
(data anomalies) are minimized.

The OpenLibrary catalog "database" is not a database in the commonly
understood sense (which is why I usually refer to it as a "repository" and
not a "database"). Rather, it is a collection of name/value pairs (the
values of which can be another collection of name/value pairs) all munged
together into a single string.

> If multiple fields are going to be munged together into a single
> string, what are the escaping rules for delimiters contained in the
> original strings?  What are the parsing rules?

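The strings are encoded using JSON (JavaScript Object Notation), which
defines both the escaping and the parsing rules: inside a string, a double
quote or backslash is escaped with a backslash, and control characters are
written as escapes such as \n or \uXXXX. A quick Google search should tell
you more than you ever wanted to know about JSON. Conceptually, it is very
close to XML encoding (nested, named values in a text format), although
the syntax is different. Most data can be converted between the two,
though a round trip is not always lossless, since XML attributes and mixed
content have no direct JSON equivalent. I personally hope JSON eventually
becomes obsolete, but it is what the catalog uses today.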
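
To make the escaping question above concrete, here is a minimal sketch
using Python's standard json module. The record and its field names are
made up for illustration; they are not taken from the actual OpenLibrary
schema. The point is only that delimiters inside the original strings
survive a round trip without any hand-written parsing.

```python
import json

# Hypothetical record: field names and values are illustrative only,
# not the real OpenLibrary schema.
record = {
    "title": 'He said "hi" \\ bye',
    "authors": [{"name": "Doe, Jane"}],  # a value can itself hold name/value pairs
    "notes": "line one\nline two, with {braces} and a comma",
}

# Encoding escapes the double quotes, the backslash and the newline.
encoded = json.dumps(record)

# Decoding applies the same rules in reverse and recovers the original data.
decoded = json.loads(encoded)
assert decoded == record
```

Braces, commas and colons inside a string need no escaping at all, because
the surrounding double quotes already mark where the string begins and
ends.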

When parsing catalog data furnished by OpenLibrary, be aware that the
schema is not fixed; you may (and probably will) encounter fields that are
not documented anywhere. These may hold data that a user interface
collected at some point in the past but no longer collects, or data that
is being collected now but whose purpose and constraints are not yet
documented.
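
One way to cope with a schema that is not fixed is to read only the fields
you understand and set the rest aside instead of failing on them. The
sketch below assumes a hypothetical set of known field names and a
made-up record; neither comes from OpenLibrary documentation.

```python
import json

# Hypothetical list of fields this client knows how to handle.
KNOWN_FIELDS = {"title", "authors", "publish_date"}


def split_record(raw):
    """Parse one JSON record and separate known fields from unknown ones."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    known = {k: v for k, v in data.items() if k in KNOWN_FIELDS}
    unknown = {k: v for k, v in data.items() if k not in KNOWN_FIELDS}
    return known, unknown


# Made-up record containing one field the client has never seen.
raw = '{"title": "Example", "mystery_field": {"x": 1}}'
known, unknown = split_record(raw)

# Use .get() so that a missing field yields None instead of a KeyError.
publish_date = known.get("publish_date")
```

Keeping the unknown fields around (for logging, say) rather than
discarding them makes it easier to notice when new data starts appearing.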

So long as you only use the provided web interface, and do not expect data
integrity, you should be fine.
