> In any case, the trick in my mind is how to represent MARC in JSON
> (disclaimer: I haven't tried to do this yet). Breaking it into pieces that
> index well but which also can be recombined without going through
> contortions doesn't sound easy because the obvious solution of converting
> each field into an object strikes me as more awkward than it should be. My
> gut reaction would be to store the entire MARC record in MARCXML, and
> normalize and index field values to facilitate search/retrieval.
> 
> JSON may be a great data exchange format, but it's not a markup language
> like XML so doing things like preserving field order or just getting a
> bird's eye view of content across multiple fields or subfields becomes more
> complex.

This is exactly my feeling, and I've been struggling with the same idea of 
"storing MARC" in the context of a NoSQL-type (or wide-column or BigTable, 
or...) implementation.  I think this runs squarely up against the data 
structure-vs-serialization issue [1]: in MARC the two are indelibly fused, 
which is limiting.

As someone who's spent a decade learning, defending, and loving the intricacies 
of all that is XML (and XPath, and XSLT, and XSL-FO, and XLink, and…), I used 
to snub JSON because of the things that Kyle mentions.  But JSON is not XML.  
It's a simpler data structure, and in many ways that can be very freeing.

For example, the DCTERMS element set can be represented as a hierarchy from the 
original DCMES (though nobody seems to do this).  The fact that DC is 
data-structure-agnostic means that it can be stored in either XML or JSON 
equally well (and, with some common practice, serialized between the two), 
based on your needs.  You can do this precisely because the data model is 
format-independent.  You can't do this easily (or, possibly, at all) with MARC.
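
As a rough sketch of that point, the same in-memory record can be written out 
as JSON or as XML and read back unchanged.  The sample values, the `record` 
wrapper element, and the function names are made up for illustration; only the 
DCTERMS namespace URI is real.

```python
import json
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/terms/"  # DCTERMS namespace

# The data model: each term maps to a list of values (sample data, invented).
record = {
    "title": ["An Example Title"],
    "creator": ["Example, Author"],
    "subject": ["Cataloging", "Metadata"],
}

def to_json(rec):
    """Serialize the record as a JSON object."""
    return json.dumps(rec, sort_keys=True)

def from_json(text):
    """Parse the JSON form back into the same dict-of-lists model."""
    return json.loads(text)

def to_xml(rec):
    """Serialize the record as one namespaced XML element per value."""
    ET.register_namespace("dcterms", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for term, values in rec.items():
        for value in values:
            ET.SubElement(root, "{%s}%s" % (DC_NS, term)).text = value
    return ET.tostring(root, encoding="unicode")

def from_xml(text):
    """Parse the XML form back into the same dict-of-lists model."""
    rec = {}
    for child in ET.fromstring(text):
        term = child.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
        rec.setdefault(term, []).append(child.text)
    return rec

print(to_json(record))
print(to_xml(record))
```

Both round trips land on the same dict, which is the whole point: the model 
comes first and the serialization is a choice made afterwards.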

Sometimes, worse is better [2].  But, hey, I always catch flak for dissing 
MARC.  ;-)

MJ

PS.  For those RDF-ites among us, I also happen to think that JSON makes a 
great data structure for a triple store, e.g. [3], but I think storing absolute 
URLs as predicates like the N2 spec does is stupid.  
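
To make that concrete, here is a small sketch of a subject -> predicate -> 
list-of-objects structure in the spirit of [3], but with prefixed predicate 
names.  The `PREFIXES` map and the `expand` and `triples` helpers are 
assumptions for illustration, not part of that spec, and the example.org 
resources are invented.

```python
# Hypothetical prefix map; not part of the RDF/JSON spec in [3].
PREFIXES = {"dcterms": "http://purl.org/dc/terms/"}

# Subject -> predicate -> list of object descriptions.
graph = {
    "http://example.org/book/1": {
        "dcterms:title": [
            {"type": "literal", "value": "An Example Title"},
        ],
        "dcterms:creator": [
            {"type": "uri", "value": "http://example.org/person/1"},
        ],
    }
}

def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name to a full URI; pass anything else through."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name

def triples(g):
    """Yield (subject, predicate URI, object value) for every statement."""
    for subject, predicates in g.items():
        for predicate, objects in predicates.items():
            for obj in objects:
                yield (subject, expand(predicate), obj["value"])

for triple in triples(graph):
    print(triple)
```

The full URIs are still recoverable whenever a consumer needs them; they just 
don't have to be repeated as every key in the stored structure.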

1. http://robotlibrarian.billdueber.com/data-structures-and-serializations/
2. http://en.wikipedia.org/wiki/Worse_is_better
3. http://n2.talis.com/wiki/RDF_JSON_Specification
