Re: [CODE4LIB] XML Schema vs Library APIs (OAI-PMH/SRU/unAPI)

2011-02-25 Thread Jakob Voss

Hi Rob,

 This is just a rehash of a previous discussion on this list, between
 us:

 http://www.mail-archive.com/code4lib@listserv.nd.edu/msg05309.html

 So I guess I'm wasting my time ;)

Thanks, I added a link to the previous discussion. You wrote:


 Referring to your blog post, you can say how the four inter-relate:

 Schema Identifier uniquely identifies the format.
 Schema Location is a non-unique description of the format.
 Schema Name is a short, human readable, non-unique name for the format
 and Namespace is a non-unique namespace used by the format.


These definitions help to clarify things, but they are of little 
practical value. The practical question is how to refer to a particular 
format. If you have to look manually at each server and collection to 
find out which format is *actually* served, then names and identifiers 
are of little help to code against. Both schema identifiers and schema 
names only help you to guess a format. A precise format needs an 
authoritative reference that you can validate against. If an official 
XML Schema exists, this and only this schema defines the format (or the 
commonly agreed-upon subset that you can work with without manually 
adapting to every single data source).
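
To see why names and namespaces alone only let you guess a format, the 
four properties can be sketched as a small lookup table. This is only an 
illustration: the FormatDescription class, the find_by helper, and the 
urn:example / example.org values are made up for this sketch and are not 
taken from SRU, OAI-PMH, or unAPI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatDescription:
    identifier: str  # meant to be unique per format
    location: str    # where one copy of the XML Schema lives (not unique)
    name: str        # short, human-readable, local name (not unique)
    namespace: str   # a namespace used by the format (not unique)


# Hypothetical entries: two different formats that share a name and a namespace.
FORMATS = [
    FormatDescription(
        identifier="urn:example:format:dc-simple",
        location="http://example.org/schemas/dc-simple.xsd",
        name="dc",
        namespace="http://purl.org/dc/elements/1.1/",
    ),
    FormatDescription(
        identifier="urn:example:format:dc-wrapped",
        location="http://example.org/schemas/dc-wrapped.xsd",
        name="dc",
        namespace="http://purl.org/dc/elements/1.1/",
    ),
]


def find_by(field, value):
    """Return every known format whose given field equals value."""
    return [f for f in FORMATS if getattr(f, field) == value]


by_identifier = find_by("identifier", "urn:example:format:dc-simple")
by_name = find_by("name", "dc")
by_namespace = find_by("namespace", "http://purl.org/dc/elements/1.1/")
```

In this toy table only the identifier lookup returns a single entry. The 
name and the namespace each match both formats, so a client still has to 
fetch the schema to know what it will actually receive.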


 A single schema may contain multiple namespaces, and there isn't a
 unique identifier for a schema.  For example, any simple Dublin Core
 based syntax must have at least two Namespaces, Dublin Core and the
 wrapper element. SchemaLocation is not unique as there can be many
 copies of the same schema.  A single schema may define multiple root
 elements, such as MODS does with both item and collection level
 elements.

A unique identifier for a schema is helpful because you do not need to 
actually look up a schema that you already know by its identifier. But 
it is not a must. If there is no single root namespace, you simply 
should not use a namespace to point to a particular format.
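
The Dublin Core case mentioned above can be checked quickly. The 
following is a minimal sketch using only the Python standard library. The 
sample record is invented for illustration, and namespace_of and 
namespaces_used are hypothetical helper names. It lists the namespace of 
the root element and all namespaces used by elements in the record.

```python
import xml.etree.ElementTree as ET

# Invented sample record in the oai_dc style: a wrapper element in one
# namespace, Dublin Core elements in another.
RECORD = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example title</dc:title>
  <dc:creator>Example author</dc:creator>
</oai_dc:dc>"""


def namespace_of(tag):
    """Return the namespace URI of an ElementTree tag '{uri}local', or ''."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def namespaces_used(xml_text):
    """Return (root namespace, sorted list of all element namespaces)."""
    root = ET.fromstring(xml_text)
    root_ns = namespace_of(root.tag)
    all_ns = sorted({namespace_of(el.tag) for el in root.iter()})
    return root_ns, all_ns


root_ns, all_ns = namespaces_used(RECORD)
```

Here the root namespace belongs to the wrapper, while the actual content 
lives in the Dublin Core namespace, so neither namespace alone points to 
the format.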


Ok, enough :-)

Jakob

--
Jakob Voß jakob.v...@gbv.de, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de


Re: [CODE4LIB] XML Schema vs Library APIs (OAI-PMH/SRU/unAPI)

2011-02-24 Thread Robert Sanderson
That is (still) incorrect.

A single schema may contain multiple namespaces, and there isn't a
unique identifier for a schema.  For example, any simple Dublin Core
based syntax must have at least two Namespaces, Dublin Core and the
wrapper element. SchemaLocation is not unique as there can be many
copies of the same schema.  A single schema may define multiple root
elements, such as MODS does with both item and collection level
elements.

Referring to your blog post, you can say how the four inter-relate:

Schema Identifier uniquely identifies the format.
Schema Location is a non-unique description of the format.
Schema Name is a short, human readable, non-unique name for the format
and Namespace is a non-unique namespace used by the format.

This is just a rehash of a previous discussion on this list, between us:

http://www.mail-archive.com/code4lib@listserv.nd.edu/msg05309.html

So I guess I'm wasting my time ;)

Rob Sanderson

On Thu, Feb 24, 2011 at 9:44 AM, Jakob Voss jakob.v...@gbv.de wrote:
 Hi,

 We are developing a general API management tool to provide different APIs
 (unAPI, SRU, OAI-PMH...) with different record formats (MARC, MODS, DC...)
 to our databases. We now stumbled upon some confusion regarding XML formats.
 The basic question is what is a format and how do you refer to it?

 I came to the conclusion that at least SRU schema identifiers are useless.
 In addition you can extract XML namespace URIs from XML Schemas, so all you
 need to identify a format is a link to its XML Schema.

 I wrote a more detailed blog posting about this at
 http://jakoblog.de/2011/02/24/xml-schema-vs-library-apis-oai-pmhsruunapi/

 Does any of you rely on SRU schema identifiers when consuming SRU?
 I think at least for XML-based formats we should only use the XML Schema as
 authoritative reference. Sure there are different applications of variants
 of one schema, but then it makes no sense to use global identifiers in
 addition to local names.

 Jakob

 --
 Jakob Voß jakob.v...@gbv.de, skype: nichtich
 Verbundzentrale des GBV (VZG) / Common Library Network
 Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
 +49 (0)551 39-10242, http://www.gbv.de
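
The claim in the original message that namespace URIs can be extracted 
from an XML Schema can be sketched as follows, again with only the Python 
standard library. The inline schema is a made-up, simplified example (the 
example.org namespace, dc.xsd, and the element names are assumptions), 
and describe_schema is a hypothetical helper. It reads the 
targetNamespace, the imported namespaces, and the top-level element 
declarations, which are the candidates for root elements.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Made-up, simplified schema: one target namespace, one imported
# namespace, and two top-level (possible root) elements.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.org/wrapper/"
    elementFormDefault="qualified">
  <xs:import namespace="http://purl.org/dc/elements/1.1/"
             schemaLocation="dc.xsd"/>
  <xs:element name="record"/>
  <xs:element name="collection"/>
</xs:schema>"""


def describe_schema(xsd_text):
    """Collect namespace and root-element information from an XSD string."""
    root = ET.fromstring(xsd_text)
    return {
        "target_namespace": root.get("targetNamespace"),
        # xs:import children name the other namespaces the schema relies on.
        "imported_namespaces": [
            imp.get("namespace") for imp in root.findall(XS + "import")
        ],
        # Top-level xs:element declarations may serve as document roots.
        "root_elements": [
            el.get("name") for el in root.findall(XS + "element")
        ],
    }


info = describe_schema(SCHEMA)
```

This also shows both objections from the reply: the schema touches more 
than one namespace, and it declares more than one possible root element.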



Re: [CODE4LIB] XML Schema vs Library APIs (OAI-PMH/SRU/unAPI)

2011-02-24 Thread Jonathan Rochkind
I've thought about/messed with this stuff before, and come up with no 
good elegant solution. It is indeed kind of a mess.


On 2/24/2011 12:03 PM, Robert Sanderson wrote:

That is (still) incorrect.

A single schema may contain multiple namespaces, and there isn't a
unique identifier for a schema.  For example, any simple Dublin Core
based syntax must have at least two Namespaces, Dublin Core and the
wrapper element. SchemaLocation is not unique as there can be many
copies of the same schema.  A single schema may define multiple root
elements, such as MODS does with both item and collection level
elements.

Referring to your blog post, you can say how the four inter-relate:

Schema Identifier uniquely identifies the format.
Schema Location is a non-unique description of the format.
Schema Name is a short, human readable, non-unique name for the format
and Namespace is a non-unique namespace used by the format.

This is just a rehash of a previous discussion on this list, between us:

http://www.mail-archive.com/code4lib@listserv.nd.edu/msg05309.html

So I guess I'm wasting my time ;)

Rob Sanderson

On Thu, Feb 24, 2011 at 9:44 AM, Jakob Voss jakob.v...@gbv.de wrote:

Hi,

We are developing a general API management tool to provide different APIs
(unAPI, SRU, OAI-PMH...) with different record formats (MARC, MODS, DC...)
to our databases. We now stumbled upon some confusion regarding XML formats.
The basic question is what is a format and how do you refer to it?

I came to the conclusion that at least SRU schema identifiers are useless.
In addition you can extract XML namespace URIs from XML Schemas, so all you
need to identify a format is a link to its XML Schema.

I wrote a more detailed blog posting about this at
http://jakoblog.de/2011/02/24/xml-schema-vs-library-apis-oai-pmhsruunapi/

Does any of you rely on SRU schema identifiers when consuming SRU?
I think at least for XML-based formats we should only use the XML Schema as
authoritative reference. Sure there are different applications of variants
of one schema, but then it makes no sense to use global identifiers in
addition to local names.

Jakob

--
Jakob Voß jakob.v...@gbv.de, skype: nichtich
Verbundzentrale des GBV (VZG) / Common Library Network
Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
+49 (0)551 39-10242, http://www.gbv.de