On Mon, 2009-05-11 at 11:31 +0100, Jakob Voss wrote:
> A format should be described with a schema (XML Schema, OWL etc.) or at 
> least a standard. Mostly this schema already has a namespace or similar 
> identifier that can be used for the whole format.

This is unfortunately not the case.


> For instance MODS Version 3 (currently 3.0, 3.1, 3.2, 3.4) has the XML 
> Namespace http://www.loc.gov/mods/v3 so this is the best identifier to 
> identify MODS. 

And this is a perfect example of why this is not the case.

The same MODS schema (let alone namespace) defines TWO formats, mods and
modsCollection.


To quote from the schema:
------------------------------------------------
*****  An instance of this schema is 

 (1) a single MODS record:  
         -->
        <xsd:element name="mods" type="modsType"/>
        <!--  
or 

(2) a collection of MODS records: 
 -->
        <xsd:element name="modsCollection">
                <xsd:complexType>
                        <xsd:sequence>
                                <xsd:element ref="mods" maxOccurs="unbounded"/>
                        </xsd:sequence>
                </xsd:complexType>
        </xsd:element>
        <!--  

*****  End of "instance" definition
-------------------------------------------------

So you're using the same identifier to identify two different things at
the same time.
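
To make that concrete, here is a minimal sketch in Python (standard
library only). The two sample documents are made up for illustration and
are not taken from the schema; only the namespace URI and the two root
element names come from the text above. It shows that the namespace alone
cannot tell the two kinds of instance apart, while the root element name
can, at least in this case.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Hypothetical sample instances, one per root element declared in the schema.
single = f'<mods xmlns="{MODS_NS}"><titleInfo><title>A</title></titleInfo></mods>'
collection = (
    f'<modsCollection xmlns="{MODS_NS}">'
    f'<mods><titleInfo><title>A</title></titleInfo></mods>'
    f'<mods><titleInfo><title>B</title></titleInfo></mods>'
    f'</modsCollection>'
)

def split_root(xml_text):
    """Return (namespace, local name) of the document's root element."""
    tag = ET.fromstring(xml_text).tag  # ElementTree spells it '{namespace}local'
    if tag.startswith("{"):
        ns, local = tag[1:].split("}", 1)
        return ns, local
    return "", tag

# Same namespace for both, different root element names.
print(split_root(single))
print(split_root(collection))
```

Note that keying on (namespace, root element) is only a workaround for
this particular schema, not a general identifier for a format.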

We discussed this a lot during the development of SRU and there simply
isn't an existing identifier for an XML 'format'.

Also consider the following more hypothetical, but perfectly feasible
situations:

* One namespace is used to define two _totally_ separate sets of
elements.  There's no reason why this can't be done.

* One namespace defines so many elements that it's meaningless to call
it a format at all.  Even though the top-level tag might be the same,
the contents are so varied that you can't realistically process
them.
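
The first hypothetical can be sketched the same way. Everything here is
invented for illustration: the namespace URI http://example.org/ns/shared
and the invoice and recipe vocabularies are assumptions, not real
formats. The point is only that nothing in XML stops two unrelated sets
of elements from living in one namespace.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/ns/shared"  # hypothetical namespace URI

# Two unrelated, made-up vocabularies that share the one namespace.
invoice = f'<invoice xmlns="{NS}"><total>10</total></invoice>'
recipe = f'<recipe xmlns="{NS}"><ingredient>salt</ingredient></recipe>'

def describe(xml_text):
    """Return (namespaces used, local element names used) for a document."""
    namespaces, names = set(), set()
    for el in ET.fromstring(xml_text).iter():
        if el.tag.startswith("{"):
            ns, local = el.tag[1:].split("}", 1)
        else:
            ns, local = "", el.tag
        namespaces.add(ns)
        names.add(local)
    return namespaces, names

invoice_ns, invoice_names = describe(invoice)
recipe_ns, recipe_names = describe(recipe)

# Identical namespace sets, yet no element name in common.
print(invoice_ns == recipe_ns)
print(invoice_names.isdisjoint(recipe_names))
```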


Rob
