Many thanks to Simon for this very useful roundup. It's good to be
assured that there are ways of coping with ordered values in the
representation languages.
So now we "only" need to adjust RDA. I still wonder whether this
apparent gap in the code (unless I've missed something important) was a
deliberate choice?
By the way, I checked the examples for more than one creator in RDA
19.2.1.3. Interestingly, the access points are always listed in the
order in which the corresponding persons appear in the statement of
responsibility - but that doesn't help, of course. A striking example
of the problems we'll have without a way to mark different
levels of responsibility is this one:
Beyard, Michael D.
Braun, Raymond E.
McLaughlin, Herbert
Phillips, Patrick L.
Rubin, Michael S.
Bald, Andre
Fader, Steven, 1951--
Jerschow, Oliver
Lassar, Terry J.
Mulvihill, David A.
Takesuye, David
Authorized access points representing the creators for: Developing
retail entertainment destinations / principal authors, Michael D.
Beyard, Raymond E. Braun, Herbert McLaughlin, Patrick L. Phillips,
Michael S. Rubin ; contributing authors, Andre Bald, Steven Fader,
Oliver Jerschow, Terry Lassar, David Mulvihill, David Takesuye
I've also looked again at the list of relationship designators for
creators (I.2.1), but couldn't find anything suitable. There is,
however, an option to introduce new relationship designators: "If none
of the terms listed in this appendix is appropriate or sufficiently
specific, use a term indicating the nature of the relationship as
concisely as possible." (I.1). But I don't believe this applies to
something like the missing "rank" aspect.
Heidrun
On 06.06.2012 02:43, Simon Spero wrote:
On Tue, Jun 5, 2012 at 11:53 AM, Karen Coyle <li...@kcoyle.net
<mailto:li...@kcoyle.net>> wrote:
Keeping an exact order is less intuitive in RDF. I'm not sure how
that would be done.
This would be a good time to go over some of the ways that one
can represent this kind of ordered value using different types
of knowledge representation languages.
*Introduction*
Instead of just focusing on RDF, I'll also look at some related
languages for networked knowledge representation (KR) - primarily ISO
Common Logic and IKL (with maybe a little CycL). These languages use
parentheses to mark the beginning and end of expressions. The first
thing inside the parentheses is the predicate. Thus, for example, to
say that Gene loves Jezebel one would write:
(loves Gene Jezebel)
In RDF/XML this corresponds to:
<rdf:Description rdf:about="#Gene">
<loves rdf:resource="#Jezebel"/>
</rdf:Description>
In N3 this can be written as:
<#Gene> :loves <#Jezebel> .
*Approach #1: Keep all the values in an ordered list.*
This is very easy to do in the Common Logic style KR languages, as
they support predicates (properties) that can take arbitrary numbers
of arguments.
(authors work1 JohnSmith FredBloggs PaulErdos Golgo13)
In RDF, predicates can only have two arguments (these arguments,
together with the predicate name, are the three parts of the triple).
However, the various syntaxes for RDF have special support for
handling lists or sets of arguments.
In RDF/XML we can build a list using the "Collection" syntax:
<rdf:Description rdf:about="#work1">
<authors rdf:parseType="Collection">
<rdf:Description rdf:about="#JohnSmith"/>
<rdf:Description rdf:about="#FredBloggs"/>
<rdf:Description rdf:about="#PaulErdos"/>
<rdf:Description rdf:about="#Golgo13"/>
</authors>
</rdf:Description>
This states that work1 has a value for authors that is a list of four
names.
However, because RDF only traffics in triples, this notation requires
some transformation. What happens is that the contents of the
"Collection" element are used to build an explicit rdf:List object.
For details see Appendix 1.
The drawback of using a single assertion to maintain the order is that
it becomes much harder to work with the data, as we are no longer
making statements about the relationship between an individual author
and a specific work.
*Approach #2: Use multiple assertions, with the rank of the author
included. *
This approach is very simple to use in Common Logic et al. Since we
can use predicates with more than two arguments, we can define an
author predicate that takes as arguments a work, an author, and the
rank of this author for this work. For example:
(author work1 JohnSmith 1)
(author work1 FredBloggs 2)
(author work1 PaulErdos 3)
(author work1 Golgo13 4)
In RDF the situation is slightly more complicated, since we can only
use predicates with two arguments. However, the situation is not too
bad; we just need to create an extra object for each value:
_:w1a1 :author <#JohnSmith> .
_:w1a1 :rank "1" .
_:w1a2 :author <#FredBloggs> .
_:w1a2 :rank "2" .
_:w1a3 :author <#PaulErdos> .
_:w1a3 :rank "3" .
_:w1a4 :author <#Golgo13> .
_:w1a4 :rank "4" .
<#work1> :rankedAuthor _:w1a1 .
<#work1> :rankedAuthor _:w1a2 .
<#work1> :rankedAuthor _:w1a3 .
<#work1> :rankedAuthor _:w1a4 .
We can use a feature of OWL 2 called property chains to derive
the author values directly from the rankedAuthor objects, without
having to look at those objects explicitly.
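To make this concrete, here is a rough sketch in Python (standard library only) of what working with these intermediate objects involves. Plain tuples stand in for an RDF store, "hasAuthor" is a hypothetical name for the derived direct property, and the join in apply_chain only imitates what an OWL 2 property chain axiom (owl:propertyChainAxiom) would let a reasoner infer. It is an illustration under those assumptions, not a reasoner.

```python
# Triples from the example above as (subject, predicate, object) tuples.
# "hasAuthor" is a hypothetical name for the derived work-to-author link.
TRIPLES = [
    ("_:w1a1", "author", "#JohnSmith"),
    ("_:w1a1", "rank", "1"),
    ("_:w1a2", "author", "#FredBloggs"),
    ("_:w1a2", "rank", "2"),
    ("_:w1a3", "author", "#PaulErdos"),
    ("_:w1a3", "rank", "3"),
    ("_:w1a4", "author", "#Golgo13"),
    ("_:w1a4", "rank", "4"),
    # Deliberately out of rank order: a set of triples has no inherent order.
    ("#work1", "rankedAuthor", "_:w1a3"),
    ("#work1", "rankedAuthor", "_:w1a1"),
    ("#work1", "rankedAuthor", "_:w1a4"),
    ("#work1", "rankedAuthor", "_:w1a2"),
]


def objects(triples, subject, predicate):
    """Return all objects o such that (subject, predicate, o) is asserted."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def apply_chain(triples, first, second, derived):
    """Derive (s, derived, o) wherever s -first-> x and x -second-> o."""
    inferred = set()
    for s, p, x in triples:
        if p == first:
            for o in objects(triples, x, second):
                inferred.add((s, derived, o))
    return inferred


def authors_by_rank(triples, work):
    """Return the authors of a work sorted by their numeric rank."""
    ranked = []
    for node in objects(triples, work, "rankedAuthor"):
        rank = int(objects(triples, node, "rank")[0])
        author = objects(triples, node, "author")[0]
        ranked.append((rank, author))
    return [author for _, author in sorted(ranked)]
```

Calling apply_chain(TRIPLES, "rankedAuthor", "author", "hasAuthor") yields the direct work-to-author statements, while authors_by_rank(TRIPLES, "#work1") recovers the order from the rank values rather than from the order in which the triples happen to be stored.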
It is important to note here that, unlike in the first example, we do
not know that there is nobody behind Golgo 13. This can be handled in
a few different ways.
In Cyc, one can state that the complete extent of a predicate is known,
which means that if the system cannot infer that there are any
more authors, it can infer that there aren't. This "World
Closing" can also be done at query time, using Negation as Failure
semantics (e.g. using the "NOT EXISTS" filter in a SPARQL query).
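As an illustration of Negation as Failure at query time, the following Python sketch treats the list of known facts as complete and concludes that an author is last when it fails to find anyone with a higher rank. The (work, author, rank) tuples and the function name are assumptions made for this example; a SPARQL query would express the same test with a NOT EXISTS filter.

```python
# (work, author, rank) facts, mirroring the ranked author assertions above.
FACTS = [
    ("work1", "JohnSmith", 1),
    ("work1", "FredBloggs", 2),
    ("work1", "PaulErdos", 3),
    ("work1", "Golgo13", 4),
]


def is_last_author(facts, work, author):
    """Closed-world check: True if no *known* author has a higher rank.

    Failing to find a higher-ranked author is taken as proof that none
    exists, which is only sound if the facts are assumed to be complete.
    """
    own_ranks = [r for w, a, r in facts if w == work and a == author]
    if not own_ranks:
        return False  # the author is not known for this work at all
    return not any(w == work and r > own_ranks[0] for w, a, r in facts)
```

Under an open-world reading the same data would justify no conclusion either way; the answer changes only because the query closes the world.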
We can also make explicit assertions; for example, in the CL family,
we can assert that for all works there can only be one
author at each rank, and that for a specific work there is no
author whose rank is greater than 4. In IKL, CycL, and OWL, we can
also state that work1 is something that has exactly four values of
author.
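In a logic these would be axioms that a reasoner enforces. As a loose approximation, the two constraints can be checked against a set of facts in Python; the tuple layout, function name and message wording below are assumptions for illustration only.

```python
from collections import Counter


def constraint_violations(facts, work, max_rank):
    """Check two constraints on (work, author, rank) facts for one work:
    at most one author per rank, and no rank greater than max_rank."""
    ranks = [r for w, a, r in facts if w == work]
    problems = []
    for rank, count in sorted(Counter(ranks).items()):
        if count > 1:
            problems.append(f"rank {rank} has {count} authors")
        if rank > max_rank:
            problems.append(f"rank {rank} exceeds maximum {max_rank}")
    return problems
```

An empty result means the known facts are consistent with both constraints; it does not by itself show that the facts are complete.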
*Approach #3: Use constraints and rules *
In situations where only some authors are given numeric rank, and the
rest are ordered by some other principle (e.g. lexicographic order, or
no order specified), we can just state what the constraints on authorship
are, and leave the ordering to be determined by the computer. We
could then indicate that JohnSmith was principal investigator, that
no-one goes behind Golgo 13, and the relative contributions of all
authors, then calculate appropriately ordered lists of authors based
on context (which might be that of the query, or that of the work, or
some other set of rules).
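One possible reading of this, sketched in Python: authors with an explicit rank come first in rank order, unranked authors follow in lexicographic order, and a designated author is always placed last. The function and parameter names are hypothetical, and the choice of lexicographic order for the unranked group is just one of the principles mentioned above.

```python
def order_authors(authors, ranks, last=None):
    """Order authors from partial constraints.

    authors: iterable of author names
    ranks:   dict mapping some names to an explicit numeric rank
    last:    optional name that must come after everyone else
    """
    def key(name):
        if name == last:
            return (2, 0, name)            # pinned to the end
        if name in ranks:
            return (0, ranks[name], name)  # explicitly ranked, by rank
        return (1, 0, name)                # unranked, lexicographic
    return sorted(authors, key=key)
```

For example, order_authors(["Golgo13", "PaulErdos", "FredBloggs", "JohnSmith"], {"JohnSmith": 1}, last="Golgo13") puts JohnSmith first and Golgo13 last, with the other two in alphabetical order between them. A different context would simply swap in a different key function over the same underlying statements.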
This is where the advantages of representing data as logical
propositions, rather than as strings, should become immediately obvious
to anyone who has ever done work on scientometrics. Also, many
people may be disappointed to learn that their college courses in
philosophy might turn out to be of practical use.
It should be clear why no one should reasonably expect catalogers to
enter this sort of information directly. It should also be clear that
the rules for a Knowledge Base need to be developed with direct input
from Subject Matter Experts who understand the theory behind the
practice. Most important of all, it ought to be obvious that any new
Bibliographic Framework needs to consider all the changes to
workflows and practice that can be helped or hindered by different
choices, and which cost/benefit tradeoffs need to be made.
*References: *
Information about ISO Common Logic and IKL, as well as relevant
portions of RDF can be found in Pat Hayes's guide at
http://www.ihmc.us/users/phayes/ikl/guide/guide.html . There are
several examples of how one can handle ordered lists in the section on
"SEQUENCE MARKERS VS. ARGUMENT LISTS" in the examples in Appendix B.
Information about RDF syntax can be found at
http://www.w3.org/TR/rdf-syntax-grammar/
Golgo 13 eye mask can be found at
http://www.amiami.com/top/detail/detail?oldscode=122694
*Appendix 1: How RDF turns Collections into triples.*
Lists in RDF are a lot like lists in programming languages like Lisp
and Scheme.
Non-empty list content is handled by creating a new List object and
defining two property values for it.
The property rdf:first is set to the first value of the list.
The property rdf:rest is set to point to a List object containing the
rest of the list.
The first value of the rest of the list is the second value in the
collection.
If there is no more content in the collection, the value of rdf:rest
is set to the value rdf:nil. This explicitly states that there are no
more elements in the list that we don't know about; in our example we
can thus be sure that there is no-one behind Golgo 13.
The value of the property on the object we're describing is then set
to point to the first object in the list.
In N3 this becomes:
_:list1 rdf:first <#JohnSmith> .
_:list1 rdf:rest _:list2 .
_:list2 rdf:first <#FredBloggs> .
_:list2 rdf:rest _:list3 .
_:list3 rdf:first <#PaulErdos> .
_:list3 rdf:rest _:list4 .
_:list4 rdf:first <#Golgo13> .
_:list4 rdf:rest rdf:nil .
<#work1> ex:authors _:list1 .
This can become somewhat ungainly.
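Reading the list back means following the rdf:rest links until rdf:nil is reached. A minimal Python sketch, assuming the triples are held as plain (subject, predicate, object) tuples and that each list node has exactly one rdf:first and one rdf:rest value:

```python
# The list triples from the appendix as (subject, predicate, object) tuples.
LIST_TRIPLES = [
    ("_:list1", "rdf:first", "#JohnSmith"),
    ("_:list1", "rdf:rest", "_:list2"),
    ("_:list2", "rdf:first", "#FredBloggs"),
    ("_:list2", "rdf:rest", "_:list3"),
    ("_:list3", "rdf:first", "#PaulErdos"),
    ("_:list3", "rdf:rest", "_:list4"),
    ("_:list4", "rdf:first", "#Golgo13"),
    ("_:list4", "rdf:rest", "rdf:nil"),
    ("#work1", "ex:authors", "_:list1"),
]


def read_rdf_list(triples, head):
    """Follow rdf:first / rdf:rest from head until rdf:nil is reached."""
    lookup = {(s, p): o for s, p, o in triples}
    items = []
    node = head
    while node != "rdf:nil":
        items.append(lookup[(node, "rdf:first")])
        node = lookup[(node, "rdf:rest")]
    return items


def authors_of(triples, work):
    """Find the head of the ex:authors list for a work and read it in order."""
    head = next(o for s, p, o in triples if s == work and p == "ex:authors")
    return read_rdf_list(triples, head)
```

The order comes entirely from the chain of rdf:rest links, so it survives any reshuffling of the triples themselves.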
--
---------------------
Prof. Heidrun Wiesenmueller M.A.
Stuttgart Media University
Faculty of Information and Communication
Wolframstrasse 32, 70191 Stuttgart, Germany
www.hdm-stuttgart.de/bi