I'd just like to say a word of thanks to everyone who has contributed so far 
on this thread.  The viewpoints raised certainly help clarify at least my 
understanding of some of the issues and concepts involved.

> MARCXML is a step in the right direction. MODS goes even further. Neither 
> really go far enough.

And just that succinctly, Eric manages to summarize my (and, I strongly 
suspect, many others') sentiment on the issue at hand.  Of course, the natural 
follow-on question is "go far enough for *what*, exactly?", and this is where 
my original question came from.

It sounds like once again we have the issue that our current tools (MODS, 
DCTERMS) "aren't good enough", which means we either have to:

a) stop doing things while we build new, better tools like Karen's 
MARC-in-triples (which seems like a really interesting idea)
or
b) start building imperfect (perhaps highly flawed) things with our current, 
imperfect tools

I'm not nearly smart enough to do a), so my intent is to take a stab at b), or 
else sit back and consider a new line of work entirely (which happens 
distressingly often, usually after reading enough discouraging statements from 
librarians in a given day).

> I think there's a fundamental difference between MODS and DCTERMS that make 
> this nearly impossible. I've sometimes described this as the difference 
> between "metadata as record format" (MARC, oai_dc, MODS, etc) and "metadata 
> as vocabulary" (DCTERMS, DCAM, & RDF Vocabs in general).

This is a great clarification, and it captures one of the main frustrations I 
have with MODS: it is bound nearly inseparably to XML as a format (and this is 
coming from someone who knows and loves XML dearly).  The idea of 
DCTERMS/DC/etc. as a format-independent model seems like a step in the right 
direction, IMO.
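
To make that concrete, here's a rough sketch in Python using rdflib: the same 
DCTERMS description written once and emitted as RDF/XML, Turtle, and JSON-LD. 
The book URI and values are made up, and I believe JSON-LD output is built in 
only as of rdflib 6:

  from rdflib import Graph, Literal, URIRef
  from rdflib.namespace import DCTERMS

  g = Graph()
  book = URIRef("http://example.org/books/1")   # hypothetical identifier
  g.add((book, DCTERMS.title, Literal("Moby Dick")))
  g.add((book, DCTERMS.creator, Literal("Melville, Herman")))
  g.add((book, DCTERMS.issued, Literal("1851")))

  # The description is the set of statements; the format is incidental.
  print(g.serialize(format="xml"))       # RDF/XML
  print(g.serialize(format="turtle"))    # Turtle
  print(g.serialize(format="json-ld"))   # JSON-LD (rdflib 6+)

Nothing about the description itself cares which of the three you pick.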

> RDF's grammar comes from the RDF Data Model, and DC's comes from DCAM as well 
> as directly from RDF. The process that Karen Coyle describes is really the 
> only way forward in making a good faith effort to "put" MARC (the 
> bibliographic data) onto the Semantic Web.

Fair enough.  But I would contend that "putting MARC / bib data on the 
Semantic Web" is just one use case, even though I realize that, to Semantic 
Web advocates, it's the *only* use case worth considering.

I find it difficult to imagine that "building a record format from just a list 
of words" is completely useless, especially given that right now there's next 
to *zero* access to bibliographic data from libraries.  Maybe the way to go is 
to just make the MARCXML available via OAI-PMH and OpenSearch and leave it at 
that.
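
Even that modest goal is scriptable in an afternoon.  Here's a hedged sketch 
of the consumer side: a bare ListRecords harvest using Python's requests and 
ElementTree.  The endpoint URL is invented, and the metadataPrefix for MARCXML 
varies by repository (often "marcxml" or "marc21"), so treat both as 
assumptions:

  import requests
  import xml.etree.ElementTree as ET

  OAI = "{http://www.openarchives.org/OAI/2.0/}"

  resp = requests.get(
      "http://example.org/oai",    # hypothetical OAI-PMH endpoint
      params={"verb": "ListRecords", "metadataPrefix": "marcxml"},
  )
  root = ET.fromstring(resp.content)
  for record in root.iter(OAI + "record"):
      header = record.find(OAI + "header")
      print(header.findtext(OAI + "identifier"))
      # the <metadata> child of each record holds the MARCXML payload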

> A more rational approach, IMO, would create a general description set 
> (probably numbering 20-50), then expanding that for more detail and for 
> different materials. Users of the sets could define the "zones" they wish to 
> use in an application profile, so no one would have to carry around data 
> elements that they are sure they will not use. It would also provide a simple 
> but compatible set for folks who don't want to do the whole "library 
> description" bit.

I agree with this 100%; conceptually, that's what DC and DCTERMS seemed to me 
to be the basis for.  This seems to parallel the MARC approach to refinement, 
which can be expressed either as a hierarchy or as a set of independent 
assertions.  Moreover, it's format-independent, so it could be serialized as 
XML, or RDF, or JSON for that matter.  Is this what the RDA entities are 
supposed to achieve?
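
Just to sketch what that "zones" idea might look like mechanically (all the 
term lists below are invented for illustration, not a proposal): a small core 
set, optional zones per material type, and an application profile that unions 
the ones you actually want.

  CORE = {"title", "creator", "date", "identifier"}
  ZONES = {
      "serials": {"issn", "frequency"},
      "music":   {"key", "instrumentation"},
  }

  def profile(*zones):
      # union the core set with whichever zones the application chooses
      terms = set(CORE)
      for z in zones:
          terms |= ZONES[z]
      return terms

  def unknown_terms(record, allowed):
      # terms a record uses that fall outside the chosen profile
      return set(record) - allowed

  rec = {"title": "Nature", "issn": "0028-0836"}
  print(unknown_terms(rec, profile()))           # {'issn'}: outside core
  print(unknown_terms(rec, profile("serials")))  # set(): fits the profile

No one carries around elements they won't use, yet the core stays compatible 
across profiles.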

Let me give another example: the Open Library API returns a JSON tree, e.g. 
http://openlibrary.org/books/OL1M.json

But what schema is this?  And if it doesn't conform to a standard schema, does 
that make it useless? If it were based on DCTERMS, at least I'd have a 
reference at http://dublincore.org/documents/dcmi-terms/ to define the 
semantics being used (and an RDF namespace at http://purl.org/dc/terms/ to 
boot).
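
And that reference is immediately actionable.  A quick sketch: pull the Open 
Library JSON and map a couple of its ad-hoc keys onto DCTERMS URIs.  The 
key-to-term mapping here is my guess at the semantics, not a published 
crosswalk:

  import requests

  DCT = "http://purl.org/dc/terms/"
  # hypothetical mapping from Open Library keys to DCTERMS URIs
  MAP = {"title": DCT + "title", "publish_date": DCT + "issued"}

  data = requests.get("http://openlibrary.org/books/OL1M.json").json()
  for key, term in MAP.items():
      if key in data:
          print(term, "=", data[key])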

MJ
