Thanks, Ross. For SRU, this is an opportune time to reconcile these differences. Opportune, because we are approaching standardization of SRU/CQL within OASIS, and there will be a number of areas that need to change.

Some observations.

1. The 'ofi' namespace of 'info' has the advantage that the name, "ofi", isn't necessarily tied to a community or application. (I suppose one could claim that the acronym "ofi" means "openURL <something starting with 'f'> for Identifiers", but it doesn't say so anywhere that I can find.) However, the namespace itself (if not the name) is tied to OpenURL: its title is "Namespace of Registry Identifiers used by the NISO OpenURL Framework Registry". That seems like a simple problem to fix; changing that title would not cause any technical problems.

2. In contrast, with the srw namespace, the actual name is "srw". So at least in name, it is tied to an application.

3. On the other side, the srw namespace has the distinct advantage of built-in extensibility. For the URI info:srw/schema/1/onix-v2.0, the "1" is an authority. There are (currently) 15 such authorities; they are listed in the (second) table at http://www.loc.gov/standards/sru/resources/infoURI.html

Authority "1" is the SRU maintenance agency, and the objects registered under that authority are, more-or-less, "public". But objects can be defined under the other authorities with no registration process required.
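To make the authority component concrete, here is a rough sketch in Python of how such a URI could be split apart. It assumes the layout info:srw/<type>/<authority>/<name> suggested by the example above; the names SrwIdentifier, parse_srw_uri and is_maintenance_agency are made up for illustration and are not part of any SRU specification or library.

```python
from typing import NamedTuple


class SrwIdentifier(NamedTuple):
    """Parts of an info:srw URI (hypothetical helper type)."""
    object_type: str   # e.g. "schema"
    authority: str     # e.g. "1" (the SRU maintenance agency)
    name: str          # e.g. "onix-v2.0"


def parse_srw_uri(uri: str) -> SrwIdentifier:
    """Split 'info:srw/<type>/<authority>/<name>' into its parts.

    Assumption: exactly this three-part layout after the 'info:srw/' prefix.
    """
    prefix = "info:srw/"
    if not uri.startswith(prefix):
        raise ValueError(f"not an info:srw URI: {uri!r}")
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected <type>/<authority>/<name>: {uri!r}")
    return SrwIdentifier(*parts)


def is_maintenance_agency(ident: SrwIdentifier) -> bool:
    """True when the object is registered under authority '1'."""
    return ident.authority == "1"


ident = parse_srw_uri("info:srw/schema/1/onix-v2.0")
print(ident.object_type, ident.authority, ident.name)
```

Anything under an authority other than "1" would parse the same way; only the authority field differs.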

4. ofi does not offer this sort of extensibility.


So, if we were going to unify these two systems (and I can't speak for the SRU community and commit to doing so yet), the extensibility offered by the srw approach would be an absolute requirement. If it could somehow be built into ofi, then I would not be opposed to migrating the srw identifiers. Another approach would be to register an entirely new 'info:' URI namespace and migrate all of these identifiers to it.

--Ray


----- Original Message -----
From: "Ross Singer" <rossfsin...@gmail.com>
To: <z...@listserv.loc.gov>
Sent: Thursday, April 30, 2009 2:59 PM
Subject: One Data Format Identifier (and Registry) to Rule Them All


Hello everybody.  I apologize for the crossposting, but this is an
area that could (potentially) affect every one of these groups.  I
realize that not everybody will be able to respond to all lists,
but...

First of all, some back story (Code4Lib subscribers can probably skip ahead):

Jangle [1] requires URIs to explicitly declare the format of the data
it is transporting (binary marc, marcxml, vcard, DLF
simpleAvailability, MODS, EAD, etc.).  In the past, it has used its
own URI structure for this (http://jangle.org/vocab/formats#...) but
this has always been with the intention of moving out of
jangle.org into a more "generic" space so it could be used by other
initiatives.

This same concept came up in UnAPI [2] (I think this thread:
http://old.onebiglibrary.net/yale/cipolo/gcs-pcs-list/2006-March/thread.html#682
discusses it a bit - there is a reference there that it may have come
up before), although it was ultimately rejected in favor of an (optional)
approach more in line with how OAI-PMH disambiguates metadata formats.
That being said, this page used to try to set a sort of convention
around the UnAPI formats:
http://unapi.stikipad.com/unapi/show/existing+formats
But it's now just a squatter page.

Jakob Voss pointed out that SRU has a schema registry and that it
would make sense to coordinate with this rather than mint new URIs for
things that have already been defined there:
http://www.loc.gov/standards/sru/resources/schemas.html

This, of course, made a lot of sense.  It also made me realize that
OpenURL *also* has a registry of metadata formats:
http://alcme.oclc.org/openurl/servlet/OAIHandler?verb=ListRecords&metadataPrefix=oai_dc&set=Core:Metadata+Formats

The problem here is that OpenURL and SRW are using different info URIs
to describe the same things:

info:srw/schema/1/marcxml-v1.1

info:ofi/fmt:xml:xsd:MARC21

or

info:srw/schema/1/onix-v2.0

info:ofi/fmt:xml:xsd:onix

The latter technically isn't the same thing since the OpenURL one
claims it's an identifier for ONIX 2.1, but if I weren't sending this
email now, eventually SRU would have registered
info:srw/schema/1/onix-v2.1

There are several other examples, as well (MODS, ISO20775, etc.) and
it's not a stretch to envision more in the future.
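
As a rough illustration of the problem, a client that wants to treat
both identifiers as the same format currently has to carry its own
crosswalk.  The sketch below uses only the pairs quoted above; the
labels ("marcxml", "onix") and the function names are made up for
illustration, and the ONIX pair is only approximate because the
versions differ.

```python
from typing import Optional

# Crosswalk built from the pairs quoted in this message.
# The keys are hypothetical labels, not registered names.
EQUIVALENTS = {
    "marcxml": {
        "info:srw/schema/1/marcxml-v1.1",
        "info:ofi/fmt:xml:xsd:MARC21",
    },
    # Approximate: the srw URI names ONIX 2.0, the ofi one ONIX 2.1.
    "onix": {
        "info:srw/schema/1/onix-v2.0",
        "info:ofi/fmt:xml:xsd:onix",
    },
}


def format_label(uri: str) -> Optional[str]:
    """Return the local label for a known identifier, else None."""
    for label, uris in EQUIVALENTS.items():
        if uri in uris:
            return label
    return None


def same_format(uri_a: str, uri_b: str) -> bool:
    """True when both identifiers map to the same local label."""
    label = format_label(uri_a)
    return label is not None and label == format_label(uri_b)


print(same_format("info:srw/schema/1/marcxml-v1.1",
                  "info:ofi/fmt:xml:xsd:MARC21"))
```

Every consumer maintaining a table like this is the duplication a
shared registry would remove.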

So there are a couple of questions here.

First, and most importantly, how do we reconcile these different
identifiers for the same thing?  Can we come up with some agreement on
which ones we should really use?

Secondly, and this gets to the reason why any of this was brought up
in the first place, how can we coordinate these identifiers more
effectively and efficiently to reuse among various specs and
protocols, but not:
1) be tied to a particular community
2) require some laborious and lengthy submission and review process to
just say "hey, here's my FOAF available via UnAPI"
3) be so lax that it throws all hope of authority out the window
?

I would expect the various communities to still maintain their own
registries of "approved" data formats (well, OpenURL and SRU, anyway
-- it's not as appropriate to UnAPI or Jangle).

Does something like this interest any of you?  Is there value in such
an initiative?

Thanks,
-Ross.
