dc-rda  

Re: We don't need no stinking Access Points

Rob Styles
Mon, 09 Nov 2009 05:32:06 -0800

Comment posted to Jon's blog:

Jon,

I think the thrust of your argument is bang on the nail. It chimes strongly with work I did some time ago (and am building on further currently). http://events.linkeddata.org/ldow2008/slides/RobStyles_SemanticMarc.pdf

There is a key difference in the semantics of data found in MARC records and the data we would like to publish as Linked Data. The data in the MARC record is a statement of what is printed on the book, not a statement of truth.

So, where it says "Publisher Statement" that's because it's the statement made in the book, which all comes back to book-in-hand cataloging for the purpose of stock management and discovery *within* a library.

This is key because the statement printed in the book will not change over time, whereas names and locations of publishers change as companies merge, split, go broke and re-form.

What is needed is both: literal values for what is printed on the book-in-hand (hence tying it to the manifestation in most cases), but also properties referring to the organizations and people involved. We can start to build on data mining techniques and bring in external data to populate those properties.
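A minimal Python sketch of keeping both kinds of value side by side. The class name, the field names, the `example.org` URI and the assembled statement string are hypothetical and chosen only for illustration; they are not taken from RDA or from any registered vocabulary.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PublicationInfo:
    # Transcribed from the book-in-hand; frozen so it cannot be edited in place.
    transcribed_statement: str
    # Hypothetical links to entity descriptions, populated later (e.g. by data mining).
    publisher_uri: Optional[str] = None
    place_uri: Optional[str] = None


def link_publisher(info: PublicationInfo, publisher_uri: str) -> PublicationInfo:
    """Return a copy with the publisher link set; the transcription is untouched."""
    return replace(info, publisher_uri=publisher_uri)


original = PublicationInfo(
    "Austin, TX : The University of Texas at Austin, College of Liberal Arts, [2001]-"
)
linked = link_publisher(original, "http://example.org/org/ut-austin-liberal-arts")
```

The transcription stays fixed even if the publisher later merges or renames; only the entity behind the URI would need to change.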

Thanks for prompting more thought about this.

rob



On 9 Nov 2009, at 13:03, Haffner, Alexander wrote:

Hello,

really nice post and very interesting heading, and indeed I've got some comments...

To help parse grammatically correct statements into their semantically meaningful constituent parts, you have already specified properties for the individual sub-elements. And of course machines interpret the corresponding metadata records much better without redundant or possibly inaccurate access points. Why can an access point be inaccurate? To detect changes in a linked entity/record, its content has to be checked frequently. So I think a SPARQL query will never retrieve the access point (e.g. of a publicationStatement). Software will always be interested in the linked object (e.g. the placeOfPublication) and consequently in its URI, in order to retrieve up-to-date information.

The idea of statement top-level elements as in XML or MARCXML isn't that bad; it's tried and tested, so why discard it if we can find ways to reflect the aggregation in our ontology too? It's definitely not wrong and will probably be useful in the future. It's understandable that publishersName is still a string and not, as I assumed, a URI linking to the entity of the corporate body. But as a consequence we need an element that links to the corresponding entity. I therefore suggest registering additional roles: rdarole:publisher, rdarole:distributor and rdarole:manufacturer.

I think librarians have to let go of the notion that RDF metadata records are library data: it is data for the Semantic Web, meant to be read by machines! So we reach a point where no one should ever touch a record without software support. Library data will become the data on the screen, not the data in the record. And seriously, isn't this a great revolution for librarians? Why work inside the code? I know it has grown that way historically. However, before I came to the library I taught students principles for increasing the usability of software, and software engineering concepts for human-centred design. So from my point of view, usable library software has to provide librarians with highly efficient user interfaces and smart dialogs. This means librarians can concentrate entirely on content while the applications manage the structuring and encoding of records in conformity with our ontologies. Software solutions are needed anyway to tie data to already existing records, so where is "the one software" for all needs?

Best, Alex


-----Original Message-----
From: List for discussion on Resource Description and Access
(RDA) [mailto:dc-...@jiscmail.ac.uk] On Behalf Of Diane I. Hillmann
Sent: Wednesday, 4 November 2009 22:03
To: DC-RDA@JISCMAIL.AC.UK
Subject: Re: AW: RDA element vocabularies

Alex:

Jon has posted to the "Metadata Matters" blog on this issue,
so you (and others on the list who have an interest in this
topic) might want to wander over there and take a look:

http://managemetadata.org/blog/2009/11/03/we-dont-need-no-stinking-access-points/

Comments are welcome!

Diane

Haffner, Alexander wrote:

Hello Jon,

thanks a lot for this detailed explanation. And yes, all the
questions did make my head hurt a little...

I wasn't aware of the idea of defining a functional "access point"
as a string in aggregation ranges, as in publicationStatement. Of
course, in this case it is necessary to use an SES. But this
information is redundant, so I actually expected upcoming systems
to offer mechanisms that generate access points from the
subordinate elements (sub-elements).
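Generating the access point from its sub-elements could look roughly like the Python sketch below. The function name is hypothetical, and the "Place : Publisher, Date" punctuation is an assumed ISBD-style display convention, not something specified in this thread.

```python
from typing import Optional


def build_publication_statement(
    place: Optional[str], publisher: Optional[str], date: Optional[str]
) -> str:
    """Assemble a display string from sub-elements, skipping missing parts.

    Assumed pattern: "Place : Publisher, Date".
    """
    head = " : ".join(part for part in (place, publisher) if part)
    return ", ".join(part for part in (head, date) if part)


statement = build_publication_statement(
    "Austin, TX",
    "The University of Texas at Austin, College of Liberal Arts",
    "[2001]-",
)
```

Because the string is derived on demand, it never needs to be stored alongside the sub-elements and cannot drift out of sync with them.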

You are exactly right: in future, rda:placeOfPublication as well as
rda:publishersName will link to the corresponding entities
(RDF descriptions), because this is the basic idea of linked data,
and so non-literal values will become more and more important.
Consequently, an SES has to support these circumstances, and if the
SES isn't able to express our linked situation we are probably back
at the point of dynamically generated access points, aren't we?

Unfortunately there is no way to look at the content of your SESs.
So what could a metadata instance look like?

<rdf:Description rdf:about="http://abc.de/record123-e1-m1">
  <rdf:type rdf:resource="http://RDVocab.info/uri/schema/FRBRentitiesRDA/Manifestation" />
  ...
  <rda:publicationStatementManifestation> ??? </rda:publicationStatementManifestation>
  <rda:placeOfPublicationManifestation rdf:resource="http://abc.de/Place/123" />
  <rda:publishersNameManifestation rdf:resource="http://abc.de/CorporateBody/123" />
  <rda:dateOfPublicationManifestation>2000</rda:dateOfPublicationManifestation>
</rdf:Description>
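As a rough illustration of how software might tell linked values from literal ones in an instance like the one above, here is a standard-library Python sketch. The namespace bound to the `rda:` prefix is an assumption based on the element URIs quoted later in this thread, the function name is hypothetical, and the elided and "???" parts of the instance are left out.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDA_NS = "http://RDVocab.info/Elements/"  # assumed binding for the rda: prefix

INSTANCE = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:rda="{RDA_NS}">
  <rdf:Description rdf:about="http://abc.de/record123-e1-m1">
    <rda:placeOfPublicationManifestation rdf:resource="http://abc.de/Place/123" />
    <rda:publishersNameManifestation rdf:resource="http://abc.de/CorporateBody/123" />
    <rda:dateOfPublicationManifestation>2000</rda:dateOfPublicationManifestation>
  </rdf:Description>
</rdf:RDF>
"""


def classify_values(rdf_xml: str) -> dict:
    """Map each property's local name to ("resource", uri) or ("literal", text)."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        for prop in description:
            name = prop.tag.split("}", 1)[1]  # strip the "{namespace}" part
            resource = prop.get(f"{{{RDF_NS}}}resource")
            if resource is not None:
                result[name] = ("resource", resource)
            else:
                result[name] = ("literal", (prop.text or "").strip())
    return result


values = classify_values(INSTANCE)
```

A real application would use a proper RDF parser; this only shows that the literal/non-literal split is visible in the instance itself.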

The declaration of rda-frbr-specific aggregated statements as
properties with a domain of an rda-frbr entity and a range of an SES
is still the critical point. So are you already developing
approaches for alternative specifications of custom datatypes that
allow linked resources to be reflected and, if so, what do they
look like?

Best regards.

Alexander



------------------------------------------------------------------------
   *From:* List for discussion on Resource Description and Access
   (RDA) [mailto:dc-...@jiscmail.ac.uk] *On Behalf Of* Jon Phipps
   *Sent:* Thursday, 22 October 2009 17:44
   *To:* DC-RDA@JISCMAIL.AC.UK
   *Subject:* Re: RDA element vocabularies

   Hi Alexander,

   I've put some responses to your questions in-line below.

   Thanks,
   Jon Phipps
   http://metadataregistry.org
   http://metadataregistry.org/rdabrowse.htm


   On Wed, Oct 21, 2009 at 4:47 AM, Haffner, Alexander
   <a.haff...@d-nb.de> wrote:

       Hello everyone,

       my name is Alexander Haffner and I started working at the German
       National Library in September 2009. I am a research fellow in
       the competence centre for interoperable metadata, and I am
       responsible for investigating how RDA-compliant metadata for the
       Semantic Web can be integrated into the German National Library
       landscape.

       Last week I gave a presentation about RDA and the Semantic Web
       in the Library Community Session at DC-2009 (slides online soon).

   That would be great, we'd love to see them.

       During my preparation I ran into some trouble and
       misunderstandings regarding the RDA Element Sets. In the
       presentation I tried to bridge the ideas in the RDA Element
       Analysis (_www.rda-jsc.org/docs/5rda-elementanalysisrev3.pdf_)
       and the registered elements in the NSDL Registry.

   As you may have noticed, the RDA Element Analysis itself tries to
   bridge RDF, XML, Indecs, and DCAM and isn't yet fully formed or
   formally reviewed (as far as we know). So even though we're paying
   attention to it, we've departed from it in a number of areas as
   we've tried to coerce the RDA "Element Set" into a set of RDFS/OWL
   ontologies. Since the element analysis and the RDA documentation
   haven't always been written with RDFS/OWL or even RDF in mind,
   we've had to make some judgment calls. Many of these decisions try
   to take into account the utility of the RDA elements beyond the
   domain of RDA/FRBR as well as the need to be specific about the
   RDA/FRBR model. We're also thinking about how many of the RDA
   properties might map beyond the obvious MARC21 mappings and about
   how our description of the semantics might enable or restrict such
   mappings. And of course there are some additional challenges, such
   as the fact that there isn't yet a formal FRBR ontology that can
   be incorporated into the model.

       For example, I cannot understand why
       rda:publicationStatementManifestation and
       rda:publishersNameManifestation are specified as properties
       (property and its sub-property) with domain manifestation.
       From my point of view,
       rda:publicationStatementManifestation has to be of type class
       and has to offer a property rda:publishersNameManifestation,
       because the RDA Element Analysis clarifies that these element
       sub-element relationships reflect composite patterns, and so we
       have to find mechanisms to describe these aggregations in our
       schema too.

   The "aggregated" statements pose a particular problem and have
   been the source of much online and offline discussion. One of the
   primary questions is the one you pose and I think there are a
   number of problems with the approach you mention.

   The element analysis clarifies the fact that there is an
   element/sub-element relationship between publicationStatement and
   publishersName. In XML this is very easy to express:

   <Publication statement>
   <Place of publication> Austin, TX </Place of publication>
   <Publisher's name>The University of Texas at Austin, College of
   Liberal Arts </Publisher's name>
   <Date of publication> [2001]- </Date of publication>
   </Publication statement>

   And this maps very nicely to a MARC21 260 field:
   260 $a Austin, TX
   260 $b The University of Texas at Austin, College of Liberal Arts
   260 $c [2001]-
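The XML-to-MARC21 mapping above can be sketched in Python with the standard library. Element names containing spaces are not well-formed XML, so this sketch assumes camelCase equivalents; those names, the lookup table and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed camelCase stand-ins for the element names used in the example above.
SUBFIELD_CODES = {
    "placeOfPublication": "a",
    "publishersName": "b",
    "dateOfPublication": "c",
}

STATEMENT_XML = """<publicationStatement>
  <placeOfPublication> Austin, TX </placeOfPublication>
  <publishersName>The University of Texas at Austin, College of
  Liberal Arts </publishersName>
  <dateOfPublication> [2001]- </dateOfPublication>
</publicationStatement>"""


def to_marc_260(xml_text: str) -> list:
    """Return one "260 $x value" line per recognised sub-element."""
    root = ET.fromstring(xml_text)
    lines = []
    for child in root:
        code = SUBFIELD_CODES.get(child.tag)
        if code is None:
            continue  # ignore sub-elements without a 260 subfield in this sketch
        value = " ".join((child.text or "").split())  # collapse whitespace
        lines.append(f"260 ${code} {value}")
    return lines


marc_lines = to_marc_260(STATEMENT_XML)
```

Note that the parent element contributes nothing of its own: only the children carry values, which matches the observation that neither the main element nor the 260 field itself holds a value.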

   This is clearly the way that the RDA authors are thinking about
   how these aggregated statements will be expressed in instance
   data. Note that in neither instance do the rules for the main
   element -- <Publication statement> or 260 -- permit that element
   to contain a value. It's clean and simple.

   So what's the best way to express this instance data in RDF and
   the semantics in RDFS/OWL? We tried this out in a number of
   different models and found that there are quite a few formal and
   informal inferences that we have to take into consideration.

   Despite the MARC21 rules, can an RDA:publicationStatement contain
   a string? If so, can that string be expected to conform to a
   particular encoding scheme? These aggregated statements are
   functional "access points" in a card catalog and as such they're
   strings that have a clearly specified order of elements -- typed
   literals defined by a syntax encoding scheme. This isn't in the
   rules, but is widely understood. If we disable the ability to use
   such an encoded string as a value for this property, what are the
   consequences for mapping? Does Publication Statement as a distinct
   entity/element have any value at all in the description of a
   resource in the RDF data model if it can't exist independent of
   its constituent parts?
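To make "a string with a clearly specified order of elements" concrete, here is a hedged Python sketch of reading such a typed literal back into its parts. The "Place : Publisher, Date" syntax is an assumption chosen for illustration (the actual encoding scheme is not defined in this thread), and the type and function names are hypothetical.

```python
from typing import NamedTuple


class PublicationStatement(NamedTuple):
    place: str
    publisher: str
    date: str


def parse_publication_statement(value: str) -> PublicationStatement:
    """Split an encoded string of the assumed form "Place : Publisher, Date"."""
    place, separator, rest = value.partition(" : ")
    if not separator:
        raise ValueError("missing ' : ' between place and publisher")
    # The date is taken to be whatever follows the last ", ".
    publisher, separator, date = rest.rpartition(", ")
    if not separator:
        raise ValueError("missing ', ' between publisher and date")
    return PublicationStatement(place.strip(), publisher.strip(), date.strip())


parsed = parse_publication_statement(
    "Austin, TX : The University of Texas at Austin, College of Liberal Arts, [2001]-"
)
```

A value that does not follow the expected order is rejected, which is the practical difference between a typed literal with a syntax encoding scheme and a free-text string.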

   And then of course there are those constituent parts. Do we
   want/need to restrict those to plain literals? Can we assume that
   in an open world the value of rda:placeOfPublication will always
   be the plain literal that the rda:publicationStatement encoding
   scheme requires? It seems that it's just as likely, maybe more
   likely, that rda:placeOfPublication will have a non-literal
   value. How do we conform to the RDA rules and still make the
   ontology extensible and future-proof and relatively domain
   independent? Do we need to be OWL-DL compatible, and does that make
   us DCAM-compatible too? What about OWL2 and DCAM2? What's the best
   way to define a Syntax Encoding Scheme in an ontology?

   Does your head hurt yet?

   Unless there's some serious objection, we've decided that yes,
   aggregated statements have value in the ontology, but primarily as
   a place to store the aggregated syntax-encoding string as it might
   appear in a card catalog or a display. Given the lack of
   semantics, it has limited value, in an RDF query, as a
   superproperty of its constituent parts.

   For the moment, we have decided to...

       * declare a general class of Syntax Encoding Schemes
       * declare a SES subclass for each statement for use later as a
         custom datatype
       * declare each aggregated statement with no domain or range
       * declare each rda-frbr-specific aggregated statement as a
         property with a domain of the rda-frbr entity and a range of
         the SES -- this is incorrect, but right now we need a
         placeholder until we can come up with a good way of
         registering custom datatypes
       * declare each sub-element component of the statement as a
         generic property with no domain or range
       * declare each rda-frbr-specific sub-element component of the
         statement as a subproperty of the generic property with a
         domain of the rda-frbr entity
       * There is no range declared on the rda-frbr-specific
         properties and they're not declared as either an
         owl:ObjectProperty or an owl:DatatypeProperty
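One consequence of declaring each rda-frbr-specific property as a subproperty of a generic one is that a statement made with the specific property also holds for the generic property (the standard rdfs:subPropertyOf entailment). The Python sketch below illustrates this with plain tuples; the prefixed names are shorthand for the registered URIs, the function name is hypothetical, and a real system would use an RDF library with RDFS inference instead.

```python
# Subproperty declarations mirroring the list above (specific -> generic).
SUBPROPERTY_OF = {
    "rda:publicationStatementManifestation": "rda:publicationStatement",
    "rda:publishersNameManifestation": "rda:publishersName",
    "rda:placeOfPublicationManifestation": "rda:placeOfPublication",
}


def with_generic_properties(triples):
    """Add (s, generic, o) for every (s, specific, o).

    The hierarchy here is one level deep, so a single pass is enough.
    """
    inferred = set(triples)
    for subject, predicate, obj in triples:
        generic = SUBPROPERTY_OF.get(predicate)
        if generic is not None:
            inferred.add((subject, generic, obj))
    return inferred


record = "http://abc.de/record123-e1-m1"
triples = {
    (record, "rda:publishersNameManifestation", "http://abc.de/CorporateBody/123"),
    (record, "rda:placeOfPublicationManifestation", "http://abc.de/Place/123"),
}
closure = with_generic_properties(triples)
```

Since no range is declared on the specific sub-element properties, the object may be either a URI, as here, or a plain string, and the inference works the same way in both cases.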

   We welcome further feedback and discussion on this approach --
   it's very clear that this isn't the only possible approach. I've
   inserted the related ontology fragments at the end of this email.

       I discussed my concerns with Alistair Miles and he
       shared my doubts. So I would appreciate some feedback
       regarding your decision making. Furthermore, I want to offer
       my contribution to the finalization of the RDA Element
       Vocabularies. So don't hesitate to involve me.

   Your contributions to the discussion are most welcome, as are
   Alistair's doubts.

       Additionally, I'd like to mention that we intend to develop
       a transformation from MARC to RDA metadata (initially as a
       proof of concept). Alistair has already created an account at
       code4rda. In this context I'd like to know whether there are
       any parallel efforts in this field that I don't know of yet.

       Thanks in advance for your support and best regards from
       Frankfurt,
       Alexander



       --
       Alexander Haffner
       Deutsche Nationalbibliothek
       Informationstechnik
       Adickesallee 1
       D-60322 Frankfurt am Main
       Telefon: +49-69-1525-1766
       Telefax: +49-69-1525-1799
       mailto:a.haff...@d-nb.de
       http://www.d-nb.de


   <!--Class: Publication Statement Encoding Scheme-->
   <rdf:Description rdf:about="http://RDVocab.info/Elements/PublicationStatementEncodingScheme">
     <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
     <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
     <reg:name xml:lang="en">PublicationStatementEncodingScheme</reg:name>
     <rdfs:label xml:lang="en">Publication Statement Encoding Scheme</rdfs:label>
     <skos:definition xml:lang="en">This subclass has been created to define the Syntax Encoding Scheme for the RDA Publication Statement composite string. The Publication Statement is composed of an ordered, concatenated list of properties:
       - Place of publication
       - Parallel place of publication
       - Publisher's name
       - Parallel publisher's name
       - Date of publication
     </skos:definition>
     <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class" />
     <rdfs:subClassOf rdf:resource="http://RDVocab.info/Elements/RDASyntaxEncodingScheme" />
   </rdf:Description>

   <!--Class: RDA Syntax Encoding Scheme-->
   <rdf:Description rdf:about="http://RDVocab.info/Elements/RDASyntaxEncodingScheme">
     <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
     <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
     <reg:name xml:lang="en">RDASyntaxEncodingScheme</reg:name>
     <rdfs:label xml:lang="en">RDA Syntax Encoding Scheme</rdfs:label>
     <skos:definition xml:lang="en">This subclass has been created to gather the Syntax Encoding Schemes used in RDA.</skos:definition>
     <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class" />
     <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Datatype" />
   </rdf:Description>

   <!--Property: Publication statement-->
   <rdf:Description rdf:about="http://RDVocab.info/Elements/publicationStatement">
     <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
     <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
     <reg:name xml:lang="en">publicationStatement</reg:name>
     <rdfs:label xml:lang="en">Publication statement</rdfs:label>
     <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
     <skos:definition xml:lang="en">A statement identifying the place or places of publication, publisher or publishers, and date or dates of publication of a resource.</skos:definition>
   </rdf:Description>

   <!--Property: Publication statement (Manifestation)-->
   <rdf:Description rdf:about="http://RDVocab.info/Elements/publicationStatementManifestation">
     <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
     <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
     <reg:name xml:lang="en">publicationStatementManifestation</reg:name>
     <rdfs:label xml:lang="en">Publication statement (Manifestation)</rdfs:label>
     <skos:definition xml:lang="en">A statement identifying the place or places of publication, publisher or publishers, and date or dates of publication of a resource.</skos:definition>
     <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
     <rdfs:subPropertyOf rdf:resource="http://RDVocab.info/Elements/publicationStatement" />
     <rdfs:domain rdf:resource="http://RDVocab.info/uri/schema/FRBRentitiesRDA/Manifestation" />
     <rdfs:range rdf:resource="http://RDVocab.info/Elements/PublicationStatementEncodingScheme" />
   </rdf:Description>

   <!--Property: Publisher's name-->
   <rdf:Description rdf:about="http://RDVocab.info/Elements/publishersName">
     <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
     <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
     <reg:name xml:lang="en">publishersName</reg:name>
     <rdfs:label xml:lang="en">Publisher's name</rdfs:label>
     <skos:definition xml:lang="en">The name of a person, family, or corporate body responsible for publishing, releasing, or issuing a resource.</skos:definition>
     <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
   </rdf:Description>

   <!--Property: Publisher's name (Manifestation)-->
   <rdf:Description rdf:about="http://RDVocab.info/Elements/publishersNameManifestation">
     <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
     <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
     <reg:name xml:lang="en">publishersNameManifestation</reg:name>
     <rdfs:label xml:lang="en">Publisher's name (Manifestation)</rdfs:label>
     <skos:definition xml:lang="en">The name of a person, family, or corporate body responsible for publishing, releasing, or issuing a resource.</skos:definition>
     <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
     <rdfs:subPropertyOf rdf:resource="http://RDVocab.info/Elements/publishersName" />
     <rdfs:domain rdf:resource="http://RDVocab.info/uri/schema/FRBRentitiesRDA/Manifestation" />
   </rdf:Description>

   <!--Property: Place of publication-->
   <rdf:Description rdf:about="http://RDVocab.info/Elements/placeOfPublication">
     <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
     <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
     <reg:name xml:lang="en">placeOfPublication</reg:name>
     <rdfs:label xml:lang="en">Place of publication</rdfs:label>
     <skos:definition xml:lang="en">A place associated with the publication, release, or issuing of a resource.</skos:definition>
     <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
   </rdf:Description>

   <!--Property: Place of publication (Manifestation)-->
   <rdf:Description rdf:about="http://RDVocab.info/Elements/placeOfPublicationManifestation">
     <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
     <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
     <reg:name xml:lang="en">placeOfPublicationManifestation</reg:name>
     <rdfs:label xml:lang="en">Place of publication (Manifestation)</rdfs:label>
     <skos:definition xml:lang="en">A place associated with the publication, release, or issuing of a resource.</skos:definition>
     <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
     <rdfs:subPropertyOf rdf:resource="http://RDVocab.info/Elements/placeOfPublication" />
     <rdfs:domain rdf:resource="http://RDVocab.info/uri/schema/FRBRentitiesRDA/Manifestation" />
   </rdf:Description>






Rob Styles
tel: +44 (0)870 400 5000
fax: +44 (0)870 400 5001
mobile: +44 (0)7971 475 257
msn: mmmmm...@yahoo.com
irc: irc.freenode.net/mmmmmrob,isnick
web: http://www.talis.com/
blog: http://www.dynamicorange.com/blog/
blog: http://blogs.talis.com/panlibus/
blog: http://blogs.talis.com/nodalities/
blog: http://blogs.talis.com/n2/
