Dear Larry, (copy to the many DSpace users interested in Authority Lists)
I have studied your proposal carefully. The developments we have already
completed are in line with the functional requirements you describe at:
http://wiki.dspace.org/index.php/Authority_Control_of_Metadata_Values
We also provide some important additional features:
* I18n issues are addressed, including non-Latin alphabets
* the stored value (in text_value) is the code (Authority Key) of the
authority entry: this ensures that you can change the preferred terms,
the synonyms, the translations without having to update the DSpace records.
* the indexed values are all the variants of the authority entry: this
ensures that full-text retrieval remains efficient (and becomes more so)
* special indexes can be defined which include all the generics (broader
terms) of an authority entry: this ensures that a search on a generic
concept (when the authority is arranged in a hierarchy, as in a
thesaurus) retrieves all the specifics.
* Authority sources can be:
  * Any SQL database (including a DSpace application: in
    http://www.windmusic.org, the keywords of the Index come from one of
    the DSpace collections; authors from another; publishers likewise),
    loaded statically (authority list loaded at startup) or
    dynamically (entries are retrieved and cached as needed: the
    authority list may be enormous)
  * Any CSV file
  * XML files
  * SKOS/RDF files (to be implemented using RIO)
* Ajax autocomplete is used for metadata updates and for searches: the
user types a few letters and terms (preferred or synonyms) are proposed
in any language.
* We count the uses of each entry: those counts are shown in displays to
give users a clue about how much data is linked to each entry (we also
provide horizontal searches whenever possible)
* The Authority Keys are already "hidden" (translated into their preferred
term in the user's language) in most places. We are finishing this.
* Each usage of an Authority Key can be enriched by a prefix (to
specify, for instance, a "role") and a suffix (to specify, for instance,
a quantity or a level of importance): Lucene proximity search then makes
it possible to find the uses of an Authority Key with a given role
and/or quantity.
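
To illustrate the first points of this list (storing the key, indexing every variant, indexing the generics, autocomplete), here is a minimal Python sketch. It is not the DSpace code, which is Java: the AUTHORITY dictionary, the "instr" scheme, the record id and all function names are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical authority list: key -> preferred labels, synonyms, broader keys.
AUTHORITY = {
    "instr_woodwind": {"pref": {"en": "Woodwind instruments", "fr": "Bois"},
                       "alt": {"en": ["Woodwinds"]},
                       "broader": []},
    "instr_clarinet": {"pref": {"en": "Clarinet", "fr": "Clarinette"},
                       "alt": {"en": ["Clarionet"]},
                       "broader": ["instr_woodwind"]},
}

def variants(key):
    """All labels (preferred and synonyms, every language) of one entry."""
    entry = AUTHORITY[key]
    labels = list(entry["pref"].values())
    for alts in entry["alt"].values():
        labels.extend(alts)
    return labels

def with_generics(key):
    """The key itself plus every broader key reachable in the hierarchy."""
    seen, stack = [], [key]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(AUTHORITY[current]["broader"])
    return seen

def build_indexes(records):
    """records maps a record id to the authority keys stored in it."""
    text_index = defaultdict(set)     # lower-cased variant -> record ids
    generic_index = defaultdict(set)  # key or broader key -> record ids
    for rec_id, keys in records.items():
        for key in keys:
            for label in variants(key):
                text_index[label.lower()].add(rec_id)
            for generic in with_generics(key):
                generic_index[generic].add(rec_id)
    return text_index, generic_index

def display(key, lang):
    """Translate a stored key into its preferred term, else show the key."""
    return AUTHORITY[key]["pref"].get(lang, key)

def suggest(typed):
    """Autocomplete: keys with any variant starting with the typed letters."""
    typed = typed.lower()
    return sorted(key for key in AUTHORITY
                  if any(v.lower().startswith(typed) for v in variants(key)))

# The record stores only the key, never the label.
records = {"item-1": ["instr_clarinet"]}
text_index, generic_index = build_indexes(records)
```

In this sketch a search on "clarinette" or on the generic key instr_woodwind both reach item-1, and changing a preferred term only touches AUTHORITY, never the records.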
Our Authority Keys are very simple: a key is the Scheme code (a SKOS
Scheme is an Authority List), an underscore character, and the authority
key itself (any letters, digits or underscores).
This ensures that one Key is seen by Lucene as ONE word and that precise
search is possible
(precision is THE major request from our users). The DSpace Lucene
tokenizer is modified to accept, and leave untouched, words containing an
underscore.
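
The key format just described can be sketched as follows in Python. This is an illustration only, under several assumptions that are not in the description above: that the Scheme code itself contains no underscore, that letters are ASCII, and that a prefix and a suffix are stored as separate words next to the key. The regular-expression tokenizer is a toy stand-in for the modified Lucene tokenizer, and the adjacency test is a stand-in for a Lucene proximity query. All names are hypothetical.

```python
import re

# Assumed shape: scheme code, one underscore, then letters/digits/underscores.
KEY_RE = re.compile(r"^(?P<scheme>[A-Za-z0-9]+)_(?P<local>[A-Za-z0-9_]+)$")

def make_key(scheme, local):
    """Build an authority key and reject anything outside the assumed shape."""
    key = f"{scheme}_{local}"
    if not KEY_RE.match(key):
        raise ValueError(f"not a valid authority key: {key!r}")
    return key

def split_key(key):
    """Split a key at its first underscore into (scheme code, local key)."""
    match = KEY_RE.match(key)
    if match is None:
        raise ValueError(f"not a valid authority key: {key!r}")
    return match.group("scheme"), match.group("local")

def tokenize(text):
    """Toy tokenizer: a run of letters, digits and underscores is ONE word."""
    return re.findall(r"\w+", text)

def split_on_underscore(text):
    """What a tokenizer that breaks on underscores would produce instead."""
    return re.findall(r"[A-Za-z0-9]+", text)

def used_with_role(text, role, key):
    """True when the role word immediately precedes the key in the text."""
    tokens = tokenize(text)
    return any(a == role and b == key for a, b in zip(tokens, tokens[1:]))
```

With the toy tokenizer, "soloist AGROVOC_1234 2" yields three words and the key stays whole; with the underscore-splitting variant the key falls apart into "AGROVOC" and "1234", which is what makes precise search impossible.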
This work was done on the basis of the DSpace 1.4.1/2 JSP-UI. It is
extensive and touches many DSpace classes: it is impossible for us to
contribute it to the DSpace community without the coaching of a DSpace
committer.
http://www.windmusic.org/dspace
The system is very efficient:
* Autocomplete is responsive. Try with authors in
http://www.windmusic.org/dspace/records-search
* Controlled vocabulary search is now based on a SKOS subject thesaurus:
http://www.windmusic.org/dspace/subject-search
The keywords can come from any source, for instance a DSpace
collection containing the keywords:
http://www.windmusic.org/dspace/handle/68502/22355
* The whole FAO Agrovoc is loaded in 7 seconds from an XML file (more
than 20 languages). Our users need similar performance with the whole
MeSH in 4 languages.
* A dynamic SQL source also makes it possible to manage relations between
DSpace records. For instance, in WindMusic, there are different
collections for CDs and for tracks on CDs: the CD collections are one
SKOS source and the track collections are another.
The relation.ispartof field is controlled by an Authority list coming
dynamically from the CD collections. Example:
http://www.windmusic.org/dspace/handle/68502/41328
* Static menus are dynamically created when an authority list is short:
http://www.windmusic.org/dspace/scores-search
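
The dynamic SQL source mentioned in this list (entries retrieved and cached as needed) can be pictured with the following Python sketch. It is a hedged illustration, not our implementation: the in-memory SQLite database, the cds table, its columns, the sample row and the class name are all invented for the example.

```python
import sqlite3

class DynamicSqlAuthority:
    """Fetches authority entries on demand and caches them by key."""

    def __init__(self, connection, scheme):
        self.connection = connection
        self.scheme = scheme
        self.cache = {}
        self.queries = 0  # database round trips, counted for illustration

    def label(self, local_key):
        """Return the label for scheme_localkey, or None if it is unknown."""
        key = f"{self.scheme}_{local_key}"
        if key not in self.cache:
            self.queries += 1
            row = self.connection.execute(
                "SELECT title FROM cds WHERE id = ?", (local_key,)
            ).fetchone()
            self.cache[key] = row[0] if row else None
        return self.cache[key]

# Hypothetical data standing in for a collection of CDs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cds (id TEXT PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO cds VALUES ('cd1', 'Hypothetical CD title')")

cds = DynamicSqlAuthority(conn, "cds")
first = cds.label("cd1")   # goes to the database
second = cds.label("cd1")  # served from the cache
```

Only the entries actually requested are ever held in memory, which is why this mode suits an authority list that may be enormous.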
The standard SKOS fields (Authority list entry) are provided in the
attached XSD file (look at the "concept" element). Any XML file
conforming to this XSD can be used as an Authority list in DSpace.
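
As an illustration, here is a small sample document written to follow that XSD, read with Python's standard xml.etree.ElementTree. The sample content is invented, and two points are assumptions rather than facts from the schema: that the "about" attribute of a concept holds the complete Authority Key, and that "broader" contains the "about" value of the generic concept (the XSD only types it as a string). The sketch reads the file; it does not validate it against the XSD.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the element order of the XSD.
SAMPLE = """
<conceptScheme about="instr">
  <title lang="en">Instruments</title>
  <concept about="instr_woodwind">
    <prefLabel lang="en">Woodwind instruments</prefLabel>
    <prefLabel lang="fr">Bois</prefLabel>
  </concept>
  <concept about="instr_clarinet">
    <prefLabel lang="en">Clarinet</prefLabel>
    <prefLabel lang="fr">Clarinette</prefLabel>
    <altLabel lang="en">Clarionet</altLabel>
    <broader>instr_woodwind</broader>
  </concept>
</conceptScheme>
"""

def load_concepts(xml_text):
    """Return the scheme code and a dict of concepts keyed by 'about'."""
    root = ET.fromstring(xml_text.strip())
    concepts = {}
    for node in root.findall("concept"):
        concepts[node.get("about")] = {
            "prefLabel": {e.get("lang"): e.text
                          for e in node.findall("prefLabel")},
            "altLabel": [(e.get("lang"), e.text)
                         for e in node.findall("altLabel")],
            "broader": [e.text for e in node.findall("broader")],
        }
    return root.get("about"), concepts

scheme, concepts = load_concepts(SAMPLE)
```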
Our general aim is to ensure that any accessible XML/CSV file or SQL
database (and other formats yet to be developed) can be used by one or
more DSpace applications or other applications (we are also working
on integrating this tool into JSPWiki to control page names, relations
between pages and relations between pages and external applications like
DSpace).
I would be very happy to discuss this further with you: many DSpace
users are eager to see this question solved in a lasting way within
the common DSpace project.
Wishing you a very nice weekend,
Christophe Dupriez
Date: Wed, 13 May 2009 21:38:48 -0400
From: Larry Stone <l...@mit.edu>
Subject: [Dspace-tech] authority control proposal
To: DSpace Tech <dspace-t...@lists.sourceforge.net>,
dspace-devel@lists.sourceforge.net
Message-ID: <a0d0002f-b0b0-4f96-9c0a-9a30c7525...@mit.edu>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
I have to add an authority control mechanism to DSpace for an
institutional repository, so I'm doing it as modification to the 1.5.2
source in the hope it will get adopted into 1.6.
To begin discussion, I put up a wiki page about the design:
http://wiki.dspace.org/index.php/Authority_Control_of_Metadata_Values
Since I have to get this into production locally in the fairly near
future, please read it and respond promptly so there is time to
consider your comments. There are also a few opportunities to fill in
work I will not have time to do (JSPUI support, for example) so let me
know if you're interested in volunteering to help.
-- Larry
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema version="1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="conceptScheme" type="conceptScheme" />
<xs:complexType name="conceptScheme">
<xs:complexContent>
<xs:extension base="noScheme">
<xs:sequence>
<xs:element name="notation" type="notationScheme"
nillable="true" minOccurs="0" maxOccurs="unbounded" />
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:element name="concept" type="concept" />
<xs:element name="metadataProperty" type="metadataProperty" />
</xs:choice>
</xs:sequence>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="term">
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="lang" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:complexType name="url">
<xs:simpleContent>
<xs:extension base="xs:anyURI">
<xs:attribute name="lang" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:complexType name="notationScheme">
<xs:complexContent>
<xs:extension base="noScheme">
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="collectionScheme">
<xs:complexContent>
<xs:extension base="noScheme">
<xs:sequence>
<xs:element name="member" type="xs:string" minOccurs="0"
maxOccurs="unbounded" />
</xs:sequence>
<!-- "about" (xs:ID, required) is inherited from noScheme; declaring it again here would be a duplicate attribute use -->
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="concept">
<xs:sequence>
<xs:element name="prefLabel" type="term" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
<xs:element name="altLabel" type="term" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
<xs:element name="scopeNote" type="term" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
<xs:element name="editorialNote" type="xs:string"
minOccurs="0" />
<xs:element name="broader" type="xs:string" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
<xs:element name="related" type="xs:string" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
<xs:element name="broadMatch" type="xs:string" maxOccurs="unbounded"
minOccurs="0">
</xs:element>
<xs:element name="relatedMatch" type="xs:string"
maxOccurs="unbounded" minOccurs="0">
</xs:element>
<xs:element name="alias" type="xs:string" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
<xs:element name="notation" type="notation" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
<xs:element name="collection" type="xs:string" maxOccurs="unbounded"
minOccurs="0"></xs:element>
<xs:element name="usage" type="referringApplication"
nillable="true" minOccurs="0" maxOccurs="unbounded" />
<xs:element name="narrowerUsage" type="referringApplication"
nillable="true" minOccurs="0" maxOccurs="unbounded" />
<xs:element name="url" type="url" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
</xs:sequence>
<xs:attribute name="about" type="xs:ID" use="required" />
</xs:complexType>
<xs:complexType name="notation">
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="scheme" type="xs:string" use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:complexType name="referringApplication">
<xs:sequence>
<xs:element name="count" type="count" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
</xs:sequence>
<xs:attribute name="application" type="xs:string" use="required" />
</xs:complexType>
<xs:complexType name="count">
<xs:simpleContent>
<xs:extension base="xs:int">
<xs:attribute name="role" type="xs:string" use="required" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:complexType name="metadataProperty">
<xs:complexContent>
<xs:extension base="concept">
<xs:sequence>
<xs:element name="languageCollection" type="xs:string"
maxOccurs="1" minOccurs="0">
</xs:element>
<xs:element name="language" type="xs:string" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
<xs:element name="prefix" type="xs:string" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
<xs:element name="suffix" type="xs:string" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
<xs:element name="defaultScheme" type="xs:string"
minOccurs="0" />
<xs:element name="defaultPrefix" type="xs:string"
maxOccurs="1" minOccurs="0">
</xs:element>
<xs:element name="defaultSuffix" type="xs:string"
maxOccurs="1" minOccurs="0"></xs:element>
<xs:element name="conceptScheme" type="xs:string"
nillable="true" minOccurs="0" maxOccurs="unbounded" />
<xs:element name="notationScheme" type="xs:string"
nillable="true" minOccurs="0" maxOccurs="unbounded" />
<xs:element name="external" type="xs:string" nillable="true"
minOccurs="0" maxOccurs="unbounded" />
</xs:sequence>
<xs:attribute name="propertyClass" type="metadataPropertyClass"
use="required" />
<xs:attribute name="system" type="xs:boolean"></xs:attribute>
<xs:attribute name="mandatory" type="xs:boolean" use="required" />
<xs:attribute name="repeatable" type="xs:boolean"
use="required" />
<xs:attribute name="normalized" type="xs:boolean"></xs:attribute>
<xs:attribute name="minLength" type="xs:int" use="required" />
<xs:attribute name="maxLength" type="xs:int" use="required" />
<xs:attribute name="closed" type="xs:boolean" use="required" />
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="referringApplicationList">
<xs:complexContent>
<xs:extension base="linkedList">
<xs:sequence />
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="linkedList">
<xs:complexContent>
<xs:extension base="abstractSequentialList">
<xs:sequence />
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="abstractSequentialList" abstract="true">
<xs:complexContent>
<xs:extension base="abstractList">
<xs:sequence />
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="abstractList" abstract="true">
<xs:complexContent>
<xs:extension base="abstractCollection">
<xs:sequence />
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="abstractCollection" abstract="true">
<xs:sequence />
</xs:complexType>
<xs:complexType name="termList">
<xs:complexContent>
<xs:extension base="linkedList">
<xs:sequence />
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="urlList">
<xs:complexContent>
<xs:extension base="linkedList">
<xs:sequence />
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="countList">
<xs:complexContent>
<xs:extension base="linkedList">
<xs:sequence />
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="reflector">
<xs:sequence />
</xs:complexType>
<xs:complexType name="schemeUsage">
<xs:sequence>
<xs:element name="searchURL" type="xs:anyURI" minOccurs="0"/>
<xs:element name="previewURL" type="xs:anyURI" minOccurs="0"/>
<xs:element name="completeURL" type="xs:anyURI" minOccurs="0"/>
</xs:sequence>
<xs:attribute name="role" type="xs:string" use="required"/>
</xs:complexType>
<xs:simpleType name="metadataPropertyClass">
<xs:restriction base="xs:string">
<xs:enumeration value="CODE" />
<xs:enumeration value="URI" />
<xs:enumeration value="TEXT" />
<xs:enumeration value="NUMBER" />
<xs:enumeration value="DATE" />
<xs:enumeration value="NAME"></xs:enumeration>
<xs:enumeration value="EMAIL"></xs:enumeration>
</xs:restriction>
</xs:simpleType>
<xs:complexType name="noScheme">
<xs:sequence>
<xs:element name="title" type="term" nillable="true" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="namespace" type="xs:anyURI" minOccurs="0"/>
<xs:element name="help" type="url" nillable="true" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="icon" type="url" nillable="true" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="display" type="url" nillable="true" minOccurs="0" maxOccurs="unbounded"/>
<xs:element name="editorialNote" type="xs:string" minOccurs="0"/>
</xs:sequence>
<xs:attribute name="about" type="xs:ID" use="required"/>
</xs:complexType>
</xs:schema>