Rob Styles
Mon, 09 Nov 2009 05:32:06 -0800
Comment posted to Jon's blog:

Jon,

I think the thrust of your argument is bang on the nail. It chimes strongly with work I did some time ago (and am currently building on further). http://events.linkeddata.org/ldow2008/slides/RobStyles_SemanticMarc.pdf
There is a key difference in the semantics of data found in MARC records and the data we would like to publish as Linked Data. The data in the MARC record is a statement of what is printed on the book, not a statement of truth.
So, where it says "Publisher Statement", that's because it's the statement made in the book, which all comes back to book-in-hand cataloging for the purpose of stock management and discovery *within* a library.
This is key because the statement printed in the book will not change over time, whereas names and locations of publishers change as companies merge, split, go broke and re-form.
What is needed is both - literal values for what is printed on the book-in-hand (hence tying it to the manifestation in most cases) but also properties referring to the organizations and people involved. We can start to build on data mining techniques and bring in external data to populate those properties.
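A minimal sketch of that dual modelling, using plain Python tuples to stand in for RDF triples. The `ex:` property names, the example.org URIs and the sample values are all hypothetical, invented for illustration; they are not RDA or MARC vocabulary.

```python
# All URIs, property names and values here are hypothetical illustrations.
BOOK = "http://example.org/manifestation/1"
PRESS = "http://example.org/organisation/42"

triples = {
    # Transcribed from the book-in-hand: a literal that never changes.
    (BOOK, "ex:publisherStatement", "London : Example Press, 1999"),
    # A link to the organisation itself, whose description can change.
    (BOOK, "ex:publisher", PRESS),
    (PRESS, "ex:name", "Example Press"),
}


def value(subject, predicate):
    """Return the single object stored for (subject, predicate)."""
    (obj,) = [o for s, p, o in triples if s == subject and p == predicate]
    return obj


def rename(organisation, new_name):
    """Replace the organisation's current name, e.g. after a merger."""
    triples.discard((organisation, "ex:name", value(organisation, "ex:name")))
    triples.add((organisation, "ex:name", new_name))


rename(PRESS, "Example Media Group")
print(value(BOOK, "ex:publisherStatement"))            # the printed text
print(value(value(BOOK, "ex:publisher"), "ex:name"))   # the current name
```

Renaming the organisation touches only the linked entity; the transcribed statement stays exactly as printed, which is the separation argued for above.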
Thanks for prompting more thought about this.

rob

On 9 Nov 2009, at 13:03, Haffner, Alexander wrote:
Hello,

really nice post and a very interesting heading, and indeed I've got some comments...

To help parse grammatically correct statements into their semantically meaningful constituent parts, you have already specified properties for the individual sub-elements. And of course machines interpret the corresponding metadata records much better without redundant or possibly inaccurate access points. Why can an access point be inaccurate? Detecting changes in a linked entity/record requires a frequent check of its record content. Thus, I think a SPARQL query will never retrieve the access point (e.g. of a publicationStatement). Software will always be interested in the linked object (e.g. the placeOfPublication) and consequently in its URI, in order to retrieve up-to-date information in time.

The idea of top-level statement elements as in XML or MARCXML isn't that bad. It's tried and tested, so why discard it if we can find ways to reflect the aggregation in our ontology too? It's definitely not wrong and will probably be useful in the future.

It's understandable that publishersName is still a string and not, as I assumed, a URI linking to the entity of the corporate body. But as a consequence we need an element that links to the corresponding entity. I therefore suggest registering additional roles for rdarole:publisher, rdarole:distributor and rdarole:manufacturer.

I think librarians have to let go of the idea that RDF metadata records are library data: they are data for the Semantic Web, to be read by machines! So we reach a point where no one should ever touch a record without software support. Library data will become the data on the screen and not the data in the record. And seriously, isn't this a great revolution for librarians? Why work inside the code? I know it has grown historically. However, before I came to the library I taught students the principles of improving software usability and software engineering concepts for human-centered design.
So, from my point of view, usable library software has to provide librarians with highly efficient user interfaces and smart dialogs. This means librarians can concentrate entirely on content while the applications manage the structuring and encoding of records in conformity with our ontologies. To tie data to already existing records, software solutions are mandatory anyway, so where is "the one software" for all needs?

Best,
Alex

-----Original Message-----
From: List for discussion on Resource Description and Access (RDA) [mailto:dc-...@jiscmail.ac.uk] On Behalf Of Diane I. Hillmann
Sent: Wednesday, 4 November 2009 22:03
To: DC-RDA@JISCMAIL.AC.UK
Subject: Re: AW: RDA element vocabularies

Alex:

Jon has posted to the "Metadata Matters" blog on this issue, so you (and others on the list who have an interest in this topic) might want to wander over there and take a look:

http://managemetadata.org/blog/2009/11/03/we-dont-need-no-stinking-access-points/

Comments are welcome!

Diane

Haffner, Alexander wrote:

Hello Jon,

thanks a lot for this detailed explanation. And yes, all the questions did make my head hurt a little...

I wasn't aware of the idea of defining a functional "access point" as a string in aggregation ranges, as in publicationStatement. Of course, in this case it is necessary to use an SES. But this information is redundant, so I actually expected mechanisms in upcoming systems that generate access points from the subordinate elements (sub-elements).

You are exactly right: in future rda:placeOfPublication as well as rda:publishersName will link to the corresponding entities (RDF descriptions), because this is the basic idea of linked data, and so non-literal values will become more and more important. Consequently, an SES has to support these circumstances, and if the SES isn't able to express our linked situation, we are probably back at the point of dynamically generated access points, aren't we?

Unfortunately there is no way to look at the content of your SESs.
So what could a metadata instance look like?

    <rdf:Description rdf:about="http://abc.de/record123-e1-m1">
      <rdf:type rdf:resource="http://RDVocab.info/uri/schema/FRBRentitiesRDA/Manifestation" />
      <rda:dateOfPublicationManifestation>2000</rda:dateOfPublicationManifestation>
      ...
      <rda:publicationStatementManifestation> ??? </rda:publicationStatementManifestation>
      <rda:placeOfPublicationManifestation rdf:resource="http://abc.de/Place/123" />
      <rda:publishersNameManifestation rdf:resource="http://abc.de/CorporateBody/123" />
    </rdf:Description>

The declaration of rda-frbr-specific aggregated statements as properties with a domain of an rda-frbr entity and a range of an SES is still the critical point. So are you already developing approaches for alternative specifications of custom datatypes that allow linked resources to be reflected, and if so, what do they look like?

Best regards,
Alexander

------------------------------------------------------------------------
From: List for discussion on Resource Description and Access (RDA) [mailto:dc-...@jiscmail.ac.uk] On Behalf Of Jon Phipps
Sent: Thursday, 22 October 2009 17:44
To: DC-RDA@JISCMAIL.AC.UK
Subject: Re: RDA element vocabularies

Hi Alexander,

I've put some responses to your questions in-line below.

Thanks,
Jon Phipps
http://metadataregistry.org
http://metadataregistry.org/rdabrowse.htm

On Wed, Oct 21, 2009 at 4:47 AM, Haffner, Alexander <a.haff...@d-nb.de> wrote:

Hello everyone,

my name is Alexander Haffner and I started working at the German National Library in September 2009. I am involved as a research fellow in the competence centre for interoperable metadata, and I am responsible for investigating a suitable integration of RDA-compliant metadata for the Semantic Web into the German National Library landscape. Last week I gave a presentation about RDA and the Semantic Web in the Library Community Session at DC-2009 (slides online soon).

That would be great, we'd love to see them.
During my preparation I ran into some trouble and misunderstandings regarding the RDA Element Sets. In the presentation I tried to bridge the ideas of the RDA Element Analysis (www.rda-jsc.org/docs/5rda-elementanalysisrev3.pdf) and the registered elements in the NSDL Registry.

As you may have noticed, the RDA Element Analysis itself tries to bridge RDF, XML, Indecs, and DCAM and isn't yet fully formed or formally reviewed (as far as we know). So even though we're paying attention to it, we've departed from it in a number of areas as we've tried to coerce the RDA "Element Set" into a set of RDFS/OWL ontologies. Since the element analysis and the RDA documentation haven't always been written with RDFS/OWL or even RDF in mind, we've had to make some judgment calls. Many of these decisions try to take into account the utility of the RDA elements beyond the domain of RDA/FRBR as well as the need to be specific about the RDA/FRBR model. We're also thinking about how many of the RDA properties might map beyond the obvious MARC21 mappings and about how our description of the semantics might enable or restrict such mappings. And of course there are some additional challenges, such as the fact that there isn't yet a formal FRBR ontology that can be incorporated into the model.

For example, I cannot understand why rda:publicationStatementManifestation and rda:publishersNameManifestation are specified as properties (a property and its sub-property) with the domain manifestation. From my point of view, rda:publicationStatementManifestation has to be a class and has to offer a property rda:publishersNameManifestation, because the RDA Element Analysis clarifies that these element/sub-element relationships reflect composite patterns, and so we have to find mechanisms to describe these aggregations in our schema too.

The "aggregated" statements pose a particular problem and have been the source of much online and offline discussion.
One of the primary questions is the one you pose, and I think there are a number of problems with the approach you mention. The element analysis clarifies the fact that there is an element/sub-element relationship between publicationStatement and publishersName. In XML this is very easy to express:

    <Publication statement>
      <Place of publication> Austin, TX </Place of publication>
      <Publisher's name>The University of Texas at Austin, College of Liberal Arts </Publisher's name>
      <Date of publication> [2001]- </Date of publication>
    </Publication statement>

And this maps very nicely to a MARC21 260 field:

    260 $a Austin, TX
    260 $b The University of Texas at Austin, College of Liberal Arts
    260 $c [2001]-

This is clearly the way that the RDA authors are thinking about how these aggregated statements will be expressed in instance data. Note that in neither instance do the rules for the main element -- <Publication statement> or 260 -- permit that element to contain a value. It's clean and simple.

So what's the best way to express this instance data in RDF and the semantics in RDFS/OWL? We tried this out in a number of different models and found that there are quite a few formal and informal inferences that we have to take into consideration. Despite the MARC21 rules, can an RDA:publicationStatement contain a string? If so, can that string be expected to conform to a particular encoding scheme? These aggregated statements are functional "access points" in a card catalog and as such they're strings that have a clearly specified order of elements -- typed literals defined by a syntax encoding scheme. This isn't in the rules, but is widely understood. If we disable the ability to use such an encoded string as a value for this property, what are the consequences for mapping? Does Publication Statement as a distinct entity/element have any value at all in the description of a resource in the RDF data model if it can't exist independent of its constituent parts?
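The relationship between the sub-elements and the aggregated string can be sketched in a few lines of Python. This is only an illustration: the thread does not spell out the syntax of the encoding scheme, so the ISBD-style punctuation used here ("Place : Publisher, Date") and the function name are assumptions, not the registered RDA Syntax Encoding Scheme.

```python
def encode_publication_statement(place=None, publisher=None, date=None):
    """Concatenate the sub-elements in a fixed order.

    The " : " and ", " separators follow ISBD-style display punctuation,
    which is an assumption made for this sketch.
    """
    statement = place or ""
    if publisher:
        statement = f"{statement} : {publisher}" if statement else publisher
    if date:
        statement = f"{statement}, {date}" if statement else date
    return statement


# Values taken from the XML / MARC21 260 example in this message.
access_point = encode_publication_statement(
    place="Austin, TX",
    publisher="The University of Texas at Austin, College of Liberal Arts",
    date="[2001]-",
)
print(access_point)
```

Note that the sketch only works while every sub-element is a plain string. Once a sub-element holds a URI rather than a literal, the string has to be generated from whatever label the linked entity currently carries, which is the open-world difficulty raised next.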
And then of course there's those constituent parts. Do we want/need to restrict those to plain literals? Can we assume that in an open world the value of rda:placeOfPublication will always be the plain literal that the rda:publicationStatement encoding scheme requires? It seems that it's just as likely, maybe more likely, that rda:placeOfPublication will have a non-literal value.

How do we conform to the RDA rules and still make the ontology extensible and future-proof and relatively domain independent? Do we need to be OWL-DL compatible, and does that make us DCAM-compatible too? What about OWL2 and DCAM2? What's the best way to define a Syntax Encoding Scheme in an ontology?

Does your head hurt yet?

Unless there's some serious objection, we've decided that yes, aggregated statements have value in the ontology, but primarily as a place to store the aggregated syntax-encoding string as it might appear in a card catalog or a display. Given the lack of semantics, it has limited value, in an RDF query, as a superproperty of its constituent parts.

For the moment, we have decided to...
* declare a general class of Syntax Encoding Schemes
* declare a SES subclass for each statement for use later as a custom datatype
* declare each aggregated statement with no domain or range
* declare each rda-frbr-specific aggregated statement as a property with a domain of the rda-frbr entity and a range of the SES -- this is incorrect, but right now we need a placeholder until we can come up with a good way of registering custom datatypes
* declare each sub-element component of the statement as a generic property with no domain or range
* declare each rda-frbr-specific sub-element component of the statement as a subproperty of the generic property with a domain of the rda-frbr entity
* There is no range declared on the rda-frbr-specific properties and they're not declared as either an owl:ObjectProperty or an owl:DatatypeProperty

We welcome further feedback and discussion on this approach -- it's very clear that this isn't the only possible approach. I've inserted the related ontology fragments at the end of this email.

Together with Alistair Miles I discussed my concerns and he shared my doubts. So I would appreciate some feedback regarding your decision making. Furthermore, I want to offer my contribution to the finalization of the RDA Element Vocabularies. So don't hesitate to involve me.

Your contributions to the discussion are most welcome, as are Alistair's doubts.

Additionally, I'd like to mention that we intend to develop a transformation from MARC to RDA metadata (in the beginning as a proof of concept). Alistair already created an account for code4rda.
In this context I'd like to know if there are any parallel efforts in this field that I don't know of yet.

Thanks in advance for your support and best regards from Frankfurt,
Alexander

--
Alexander Haffner
Deutsche Nationalbibliothek
Informationstechnik
Adickesallee 1
D-60322 Frankfurt am Main
Phone: +49-69-1525-1766
Fax: +49-69-1525-1799
mailto:a.haff...@d-nb.de
http://www.d-nb.de

    <!--Class: Publication Statement Encoding Scheme-->
    <rdf:Description rdf:about="http://RDVocab.info/Elements/PublicationStatementEncodingScheme">
      <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
      <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
      <reg:name xml:lang="en">PublicationStatementEncodingScheme</reg:name>
      <rdfs:label xml:lang="en">Publication Statement Encoding Scheme</rdfs:label>
      <skos:definition xml:lang="en">This subclass has been created to define the Syntax Encoding Scheme for the RDA Publication Statement composite string.
        The Publication Statement is composed of an ordered, concatenated list of properties:
        - Place of publication
        - Parallel place of publication
        - Publisher's name
        - Parallel publisher's name
        - Date of publication
      </skos:definition>
      <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class" />
      <rdfs:subClassOf rdf:resource="http://RDVocab.info/Elements/RDASyntaxEncodingScheme" />
    </rdf:Description>

    <!--Class: RDA Syntax Encoding Scheme-->
    <rdf:Description rdf:about="http://RDVocab.info/Elements/RDASyntaxEncodingScheme">
      <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
      <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
      <reg:name xml:lang="en">RDASyntaxEncodingScheme</reg:name>
      <rdfs:label xml:lang="en">RDA Syntax Encoding Scheme</rdfs:label>
      <skos:definition xml:lang="en">This subclass has been created to gather the Syntax Encoding Schemes used in RDA.
      </skos:definition>
      <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class" />
      <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Datatype" />
    </rdf:Description>

    <!--Property: Publication statement-->
    <rdf:Description rdf:about="http://RDVocab.info/Elements/publicationStatement">
      <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
      <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
      <reg:name xml:lang="en">publicationStatement</reg:name>
      <rdfs:label xml:lang="en">Publication statement</rdfs:label>
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
      <skos:definition xml:lang="en">A statement identifying the place or places of publication, publisher or publishers, and date or dates of publication of a resource.</skos:definition>
    </rdf:Description>

    <!--Property: Publication statement (Manifestation)-->
    <rdf:Description rdf:about="http://RDVocab.info/Elements/publicationStatementManifestation">
      <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
      <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
      <reg:name xml:lang="en">publicationStatementManifestation</reg:name>
      <rdfs:label xml:lang="en">Publication statement (Manifestation)</rdfs:label>
      <skos:definition xml:lang="en">A statement identifying the place or places of publication, publisher or publishers, and date or dates of publication of a resource.</skos:definition>
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
      <rdfs:subPropertyOf rdf:resource="http://RDVocab.info/Elements/publicationStatement" />
      <rdfs:domain rdf:resource="http://RDVocab.info/uri/schema/FRBRentitiesRDA/Manifestation" />
      <rdfs:range rdf:resource="http://RDVocab.info/Elements/PublicationStatementEncodingScheme" />
    </rdf:Description>

    <!--Property: Publisher's name-->
    <rdf:Description rdf:about="http://RDVocab.info/Elements/publishersName">
      <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
      <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
      <reg:name xml:lang="en">publishersName</reg:name>
      <rdfs:label xml:lang="en">Publisher's name</rdfs:label>
      <skos:definition xml:lang="en">The name of a person, family, or corporate body responsible for publishing, releasing, or issuing a resource.</skos:definition>
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
    </rdf:Description>

    <!--Property: Publisher's name (Manifestation)-->
    <rdf:Description rdf:about="http://RDVocab.info/Elements/publishersNameManifestation">
      <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
      <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
      <reg:name xml:lang="en">publishersNameManifestation</reg:name>
      <rdfs:label xml:lang="en">Publisher's name (Manifestation)</rdfs:label>
      <skos:definition xml:lang="en">The name of a person, family, or corporate body responsible for publishing, releasing, or issuing a resource.</skos:definition>
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
      <rdfs:subPropertyOf rdf:resource="http://RDVocab.info/Elements/publishersName" />
      <rdfs:domain rdf:resource="http://RDVocab.info/uri/schema/FRBRentitiesRDA/Manifestation" />
    </rdf:Description>

    <!--Property: Place of publication-->
    <rdf:Description rdf:about="http://RDVocab.info/Elements/placeOfPublication">
      <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
      <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
      <reg:name xml:lang="en">placeOfPublication</reg:name>
      <rdfs:label xml:lang="en">Place of publication</rdfs:label>
      <skos:definition xml:lang="en">A place associated with the publication, release, or issuing of a resource.</skos:definition>
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
    </rdf:Description>

    <!--Property: Place of publication (Manifestation)-->
    <rdf:Description rdf:about="http://RDVocab.info/Elements/placeOfPublicationManifestation">
      <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements" />
      <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002" />
      <reg:name xml:lang="en">placeOfPublicationManifestation</reg:name>
      <rdfs:label xml:lang="en">Place of publication (Manifestation)</rdfs:label>
      <skos:definition xml:lang="en">A place associated with the publication, release, or issuing of a resource.</skos:definition>
      <rdf:type rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property" />
      <rdfs:subPropertyOf rdf:resource="http://RDVocab.info/Elements/placeOfPublication" />
      <rdfs:domain rdf:resource="http://RDVocab.info/uri/schema/FRBRentitiesRDA/Manifestation" />
    </rdf:Description>
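The declaration pattern in these fragments is regular enough to sketch as a small generator. The following Python is a rough illustration only: the function name `declare` and the tuple representation are invented here, and the registry-specific properties (reg:status, reg:name, labels, definitions) are left out. The URIs are the ones used in the fragments.

```python
ELEMENTS = "http://RDVocab.info/Elements/"
MANIFESTATION = "http://RDVocab.info/uri/schema/FRBRentitiesRDA/Manifestation"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def declare(name, ses=None):
    """Return triples for a generic property and its Manifestation subproperty.

    `ses` is the local name of a Syntax Encoding Scheme class; it is given
    only for aggregated statements, which are the only properties that
    receive a range in the pattern described in this thread.
    """
    generic = ELEMENTS + name
    specific = ELEMENTS + name + "Manifestation"
    triples = [
        (generic, RDF + "type", RDF + "Property"),   # no domain, no range
        (specific, RDF + "type", RDF + "Property"),
        (specific, RDFS + "subPropertyOf", generic),
        (specific, RDFS + "domain", MANIFESTATION),
    ]
    if ses is not None:
        triples.append((specific, RDFS + "range", ELEMENTS + ses))
    return triples


ontology = (
    declare("publicationStatement", ses="PublicationStatementEncodingScheme")
    + declare("publishersName")
    + declare("placeOfPublication")
)

# Print the result in an N-Triples-like form.
for s, p, o in ontology:
    print(f"<{s}> <{p}> <{o}> .")
```

Only the aggregated statement ends up with an rdfs:range; the sub-element properties stay open, so their values can be literals or URIs.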
Rob Styles
tel: +44 (0)870 400 5000
fax: +44 (0)870 400 5001
mobile: +44 (0)7971 475 257
msn: mmmmm...@yahoo.com
irc: irc.freenode.net/mmmmmrob,isnick
web: http://www.talis.com/
blog: http://www.dynamicorange.com/blog/
blog: http://blogs.talis.com/panlibus/
blog: http://blogs.talis.com/nodalities/
blog: http://blogs.talis.com/n2/