Re: [CODE4LIB] RDA in RDF, was: Something completely different
Ross Singer wrote:
> So, thanks to the help of my coworkers, here's the RDA Elements schema reformatted in an easier-to-read presentation:
> http://morph.talis.com/?data-uri[]=http%3A%2F%2Frdvocab.info%2FElements.rdfinput=output=exhibitcallback=
>
> I have to say I feel like this schema is trying to do way too much and consequently loses the resource specificity that RDF would be providing.

Absolutely. I think there's a real issue that NO technology folks were involved in the creation of RDA. So this is data from a cataloger's perspective, and from the perspective of guidance rules for creating bibliographic data. I'm pretty sure that we can't create a viable data record using the RDA data elements, and I hate the idea that the data format, once again, is an afterthought rather than integral to the data creation standard.

> For one thing, it seems to reinvent a _lot_ of wheels. Why does it define its own title property instead of using DC's?

Because they wanted their own definition. Everything in the RDA element list has an RDA-specific meaning, which then makes it impossible to use any existing data properties. But there's more: RDA was defining RDA cataloging rules, not a schema or record format. Not only are there multiple data elements where one could do, there are things that are missing. For example, the FRBR place entity can ONLY be used as a subject, so it really means place as subject. There's no general place element that could be used, for example, for place of publication. The latter has no relationship to FRBR place. This is a FRBR problem as much as an RDA problem, but again FRBR functions at a conceptual level and doesn't really provide a schema that one can work with.

> By using properties like titleOfTheWork, dateOfWork and all of the properties that are specifically about TheSeries there is tremendous duplication of text. If Work were its own class, you would only need to say that this manifestation was an embodimentOf it and reuse all of the title-based properties for manifestation.

Exactly. This is what I've been saying (or trying to say) in relation to the bibo discussion. You should be able to use whatever properties you want with the FRBR classes, and not restrict data elements to a single class. This is a big problem in RDA, but I can say that when it was brought up to them (JSC) they strongly defended this choice and would not budge. RDA, to JSC, has a specific relationship to FRBR, and if you use a data element with a different FRBR class, then you are no longer doing RDA.

> What does property 'uri' mean?

Did you look at the rdf/xml? I'm wondering if it isn't the display that's confusing.

> I also can't figure out how people/institutions are modeled in this schema, since none of the elements have ranges. Are they their own resources? If so, what? The way it looks at a glance, they're strings?

EVERYTHING is strings at the moment, with a very very few exceptions (like some dates, I think). Some data elements CAN use a controlled vocabulary, but I believe that all of those are a mixture of uncontrolled and controlled strings. People and institutions are mainly undefined because that is in the FRAD realm. And FRAD hasn't been finalized. Also note that the JSC didn't feel it could do anything that would be too incompatible with the 'legacy' -- that is, with all of our AACR/MARC data.

> It seems to me that very little work was done to find preexisting vocabularies to reuse and this schema still presents a very 'document-centric' or 'record-centric' view of data.

Absolutely. The catalogers are still creating a textual document, not data. At best you can mark up the text, as we do with the MARC record. I worry that we won't be able to mesh the cataloger's view with a data view -- that the two are somehow inherently opposed. I'd like to start modeling a new data format but I can't imagine how we can bridge the gap between the catalogers and the system view. I suppose a very clever interface could hide the data view from the catalogers, but starting from either AACR2 or RDA and trying to get there feels extremely difficult. I guess my fear is that it will require compromises, and those will be hard to negotiate.

kc

p.s. The RDA element analysis is at http://www.collectionscanada.gc.ca/jsc/docs/5rda-elementanalysisrev2.pdf. That was the input to the registry.

--
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596 skype: kcoylenet
fx.: 510-848-3913 mo.: 510-435-8234
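The point raised in this message about making Work its own class, so that a manifestation only has to say it is an embodimentOf a work and both can share one generic title property, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not RDA's or anyone's actual model: the example.org URIs and the titles are hypothetical, and the Dublin Core terms title URI and the vocab.org FRBR namespace are used here only as stand-ins for "an existing, reusable vocabulary".

```python
# Minimal sketch: triples as plain (subject, predicate, object) tuples.
# Assumption: the namespace URIs below stand in for reusable vocabularies;
# the example.org resources and their titles are hypothetical.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_TITLE = "http://purl.org/dc/terms/title"
FRBR = "http://purl.org/vocab/frbr/core#"

work = "http://example.org/work/1"
manifestation = "http://example.org/manifestation/1"

triples = [
    (work, RDF_TYPE, FRBR + "Work"),
    (work, DC_TITLE, "Example work title"),
    (manifestation, RDF_TYPE, FRBR + "Manifestation"),
    (manifestation, DC_TITLE, "Example manifestation title"),
    # The class of the linked resource says which FRBR entity a title
    # belongs to, so no titleOfTheWork-style property is needed.
    (manifestation, FRBR + "embodimentOf", work),
]


def titles_by_class(triples):
    """Map each resource's class to its title, using one shared title property."""
    types = {s: o for s, p, o in triples if p == RDF_TYPE}
    return {types[s]: o for s, p, o in triples if p == DC_TITLE}


def predicates_used(triples):
    """Return the set of distinct predicates in the graph."""
    return {p for _, p, _ in triples}
```

The design choice being illustrated is that the entity type lives on the resource (its class) rather than in the property name, so a single title property serves works, manifestations and series alike.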
Re: [CODE4LIB] RDA in RDF, was: Something completely different
See also the thread, 'RDA: A Standard Nobody Will Notice'.
http://www.mail-archive.com/code4lib@listserv.nd.edu/msg04422.html

A standard nobody will notice ... for good reason.

Rob

On Tue, 2009-04-07 at 18:24 +0100, Eric Lease Morgan wrote:
> On Apr 7, 2009, at 1:15 PM, Karen Coyle wrote:
>> Absolutely. The catalogers are still creating a textual document, not data. At best you can mark up the text, as we do with the MARC record...
>
> Listen... What you hear from over here is the sound of a very heavy sigh coming from a computer type who really wants to help improve the way library data is used in a networked environment, but they can't convince their own to modify the way they encode information.
Re: [CODE4LIB] RDA in RDF, was: Something completely different
On Tue, Apr 7, 2009 at 1:24 PM, Eric Lease Morgan emor...@nd.edu wrote:
> Listen... What you hear from over here is the sound of a very heavy sigh coming from a computer type who really wants to help improve the way library data is used in a networked environment, but they can't convince their own to modify the way they encode information.

See also

Fiander, David J. "Applying XML to the Bibliographic Description." Cataloging and Classification Quarterly 33, no. 2 (2001): 17-28.

Fiander, David J., and D. Grant Campbell. "An XML Definition for an ISBD-Based Encoding Scheme." Journal of Internet Cataloging 6, no. 4 (2003): 29-58.

Which is what happens when a computer type starts de novo with the cataloguing standards and builds simple data structures.
Re: [CODE4LIB] RDA in RDF, was: Something completely different
Karen, thanks for this summary of the process. It's pretty disheartening, sadly.

I got 'uri' wrong, btw; it's 'Uniform resource locator':

    <!--Property: Uniform resource locator-->
    <rdf:Property rdf:about="http://RDVocab.info/Elements/uniformResourceLocator">
      <rdfs:label xml:lang="en">Uniform resource locator</rdfs:label>
      <skos:definition xml:lang="en">The address of a remote access resource.</skos:definition>
      <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements/"/>
      <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002/"/>
    </rdf:Property>

But again, not exactly the best use of the tools at their disposal.

All this being said, it's really not too late to fix any of this, since nobody is implementing this and, realistically, nobody ever will.

-Ross.

On Tue, Apr 7, 2009 at 1:15 PM, Karen Coyle li...@kcoyle.net wrote:
> Ross Singer wrote:
>> So, thanks to the help of my coworkers, here's the RDA Elements schema reformatted in an easier-to-read presentation:
>> http://morph.talis.com/?data-uri[]=http%3A%2F%2Frdvocab.info%2FElements.rdfinput=output=exhibitcallback=
>>
>> I have to say I feel like this schema is trying to do way too much and consequently loses the resource specificity that RDF would be providing.
>
> Absolutely. I think there's a real issue that NO technology folks were involved in the creation of RDA. So this is data from a cataloger's perspective, and from the perspective of guidance rules for creating bibliographic data. I'm pretty sure that we can't create a viable data record using the RDA data elements, and I hate the idea that the data format, once again, is an afterthought rather than integral to the data creation standard.
>
>> For one thing, it seems to reinvent a _lot_ of wheels. Why does it define its own title property instead of using DC's?
>
> Because they wanted their own definition. Everything in the RDA element list has an RDA-specific meaning, which then makes it impossible to use any existing data properties.
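The observation in this exchange that none of the elements declare ranges can be checked mechanically. The following is a minimal sketch, assuming the schema is ordinary RDF/XML with one rdf:Property element per RDA element; the sample is cut down from the property quoted above (the reg:status line is omitted because its namespace URI is not given in the thread), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Assumption: a cut-down sample modelled on the property quoted in the thread.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Property rdf:about="http://RDVocab.info/Elements/uniformResourceLocator">
    <rdfs:label xml:lang="en">Uniform resource locator</rdfs:label>
    <skos:definition xml:lang="en">The address of a remote access resource.</skos:definition>
    <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements/"/>
  </rdf:Property>
</rdf:RDF>"""


def properties_without_range(rdf_xml):
    """Return the URIs of rdf:Property elements that declare no rdfs:range."""
    root = ET.fromstring(rdf_xml)
    missing = []
    for prop in root.findall(f"{{{RDF}}}Property"):
        # A property with no rdfs:range child says nothing about whether its
        # values are literals (strings) or resources of some class.
        if prop.find(f"{{{RDFS}}}range") is None:
            missing.append(prop.get(f"{{{RDF}}}about"))
    return missing
```

Run against the full Elements file, a function like this would list every property whose values are left untyped, which is one way to quantify the "everything is strings" concern.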
Re: [CODE4LIB] RDA in RDF, was: Something completely different
Roy,

That's true. Unfortunately, I missed Kevin's talk at Access '02 in Windsor, and since I wrote the first of those two papers I've mostly been out of the loop, since it's not my area any more.

- David

On Tue, Apr 7, 2009 at 1:48 PM, Roy Tennant tenna...@oclc.org wrote:
> Well, and then you have the XOBIS work from Stanford that ksclarke was involved with.
>
> Roy
>
> On 4/7/09 10:41 AM, David Fiander da...@fiander.info wrote:
>> On Tue, Apr 7, 2009 at 1:24 PM, Eric Lease Morgan emor...@nd.edu wrote:
>>> Listen... What you hear from over here is the sound of a very heavy sigh coming from a computer type who really wants to help improve the way library data is used in a networked environment, but they can't convince their own to modify the way they encode information.
>>
>> See also
>>
>> Fiander, David J. "Applying XML to the Bibliographic Description." Cataloging and Classification Quarterly 33, no. 2 (2001): 17-28.
>>
>> Fiander, David J., and D. Grant Campbell. "An XML Definition for an ISBD-Based Encoding Scheme." Journal of Internet Cataloging 6, no. 4 (2003): 29-58.
>>
>> Which is what happens when a computer type starts de novo with the cataloguing standards and builds simple data structures.
Re: [CODE4LIB] RDA in RDF, was: Something completely different
It's not off-topic, at least I don't think so. And I don't think anybody is asking to give up on catalogers. Just as I don't think anybody would want the technologists to describe the materials, I think the problem is that the catalogers tried to turn their idea of a data model into tangible technology.

Actually, I think the resource sharing argument is a red herring. A shift to resource-centricity (vs. record-centricity) just means that when you grab a new 'manifestation' for your local catalog, you may also have to grab the creator, the publisher, the series, the expression, the work, the subjects, etc. All of these can be bundled in the same xml document, though -- really it's just a different way of looking at the data, but it's not a radical departure in the delivery/discovery.

-Ross.

On Tue, Apr 7, 2009 at 1:44 PM, Anna Headley ahead...@swarthmore.edu wrote:
> And what you hear over here is a plea to not give up on catalogers. Some are beyond ready to move from text to data. Hiding the data view -- do you mean making it look like marc? -- sounds pretty awful. Catalogers who are on board are trapped by the way sharing currently works, i.e. record sharing. If the leaders of the cataloging community are failing, what can catalogers do? This is an honest question, not a throwing-up-of-hands. Though maybe completely off-topic for this list.
>
> ah
>
> Karen Coyle wrote:
>> Absolutely. The catalogers are still creating a textual document, not data. At best you can mark up the text, as we do with the MARC record. I worry that we won't be able to mesh the cataloger's view with a data view -- that the two are somehow inherently opposed. I'd like to start modeling a new data format but I can't imagine how we can bridge the gap between the catalogers and the system view. I suppose a very clever interface could hide the data view from the catalogers, but starting from either AACR2 or RDA and trying to get there feels extremely difficult. I guess my fear is that it will require compromises, and those will be hard to negotiate.
>>
>> kc
>>
>> p.s. The RDA element analysis is at http://www.collectionscanada.gc.ca/jsc/docs/5rda-elementanalysisrev2.pdf. That was the input to the registry.
>
> --
> Anna Headley
> Swarthmore College Library
> 610.690.5781
> ahead...@swarthmore.edu
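The resource-centric sharing described in this message (grab a manifestation, and with it the creator, publisher, series, work and subjects it links to, bundled into one document) amounts to a graph traversal. Here is a minimal sketch, assuming a hypothetical in-memory store in which links between resources are lists of identifiers; every identifier, property name and value below is made up for illustration.

```python
from collections import deque

# Hypothetical resource store: identifier -> properties. List values are
# links to other resources; plain strings are literal values.
STORE = {
    "manifestation:1": {"title": "Example title", "embodimentOf": ["work:1"],
                        "publisher": ["org:1"], "series": ["series:1"]},
    "work:1": {"title": "Example work", "creator": ["person:1"],
               "subject": ["subject:1"]},
    "person:1": {"name": "Example Author"},
    "org:1": {"name": "Example Publisher"},
    "series:1": {"title": "Example Series"},
    "subject:1": {"label": "Example subject"},
    "manifestation:2": {"title": "Unrelated", "embodimentOf": ["work:2"]},
    "work:2": {"title": "Unrelated work"},
}


def bundle(store, start):
    """Collect `start` and every resource reachable from it into one dict."""
    seen = {}
    queue = deque([start])
    while queue:
        rid = queue.popleft()
        if rid in seen or rid not in store:
            continue  # already bundled, or an unresolvable link
        seen[rid] = store[rid]
        for value in store[rid].values():
            if isinstance(value, list):
                queue.extend(value)  # follow links to related resources
    return seen
```

The returned dict could then be serialized as a single document for exchange, which is why the shift need not change delivery much: the unit that travels is still one bundle, it just contains several linked resources instead of one flat record.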
Re: [CODE4LIB] RDA in RDF, was: Something completely different
On Tue, Apr 7, 2009 at 1:44 PM, Anna Headley ahead...@swarthmore.edu wrote:
> And what you hear over here is a plea to not give up on catalogers. Some are beyond ready to move from text to data. Hiding the data view -- do you mean making it look like marc? -- sounds pretty awful. Catalogers who are on board are trapped by the way sharing currently works, i.e. record sharing. If the leaders of the cataloging community are failing, what can catalogers do? This is an honest question, not a throwing-up-of-hands. Though maybe completely off-topic for this list.

Hear, hear. I don't think we'll see a real solution unless we consider both the tech-folks' and the catalogers' concerns. I'm also sympathetic to knowledge domains wanting to have control over the meaning of their data elements (to have a useful and well-defined set). How we move forward when we have so much legacy data (and supporting systems), as Anna said, is a difficult problem.

Thanks for the plug Roy. The check's in the mail. ;-)

Kevin

--
Kevin S. Clarke
Coordinator of Web Services
Belk Library Information Commons
Appalachian State University
218 College Street
Boone, NC 28608
clark...@appstate.edu
(828) 262-8472

There are two kinds of people in the world: those who believe there are two kinds of people and those who know better.
Re: [CODE4LIB] RDA in RDF, was: Something completely different
Well, there's the project by Alistair Miles that Karen alluded to earlier:
http://code.google.com/p/code4rda

The goals of this project are, in my mind, crucial in moving forward, since it's taking our existing corpus of records and turning them into RDA/RDF. Not only is it a good proof of concept to show how these new data models would look and work (esp. how they would work w/r/t current applications/workflows), but, more importantly, it shows it can be done *with our current data*, alleviating the need for some unrealistic retrospective recataloging effort.

I guess the way I look at it is, there's still time to fix this, at least technologically. There is a difference between the standard, the data model and the application. Karen posted a couple of weeks back that UKMARC didn't include punctuation, instead leaving it to technology to add it. This doesn't mean they didn't follow AACR2; they just didn't encode the punctuation explicitly in the data fields. Of course, they gave this up when they adopted MARC21.

Anyway, there's a separation of concerns that is currently being blurred, but doesn't have to be in practice.

-Ross.

On Tue, Apr 7, 2009 at 2:25 PM, Anna Headley ahead...@swarthmore.edu wrote:
> But the first one to take this on has no one to grab from. The sharing argument may be a red herring in that the problem, from some perspectives, isn't so much about sharing one's own work -- it's more about using others' work. Or is there already a community of people doing something like what Ross describes? If so, where can I find out more about who, and how this works?
>
> It seems to me that the best movements forward in this opening of data are centered on translating marc into more web-usable forms. Which is great**... for everyone except catalogers with no love for marc. Jakob makes a good point in the post that Rob pointed out (http://www.mail-archive.com/code4lib@listserv.nd.edu/msg04422.html)... when cataloging can look like librarything, the rules *and, I would add, tools* we use seem incredibly bloated.
>
> ** I do mean great. We have to start somewhere. It's just that the cataloging pieces move so excruciatingly slowly.
>
> ah
>
> --
> Anna Headley
> Swarthmore College Library
> 610.690.5781
> ahead...@swarthmore.edu
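The UKMARC point in Ross's message, that punctuation can be left out of the stored data and added by technology at display time, is a small example of separating the data model from the application. A minimal sketch follows, assuming a hypothetical dict of title-area fields and the usual ISBD separators (space-colon-space before other title information, space-slash-space before the statement of responsibility); the field names are invented for illustration and are not UKMARC or MARC21 tags.

```python
def display_title(fields):
    """Build an ISBD-style title string from unpunctuated stored fields."""
    parts = [fields["title"]]
    # Punctuation is supplied here, at display time, not stored in the data.
    if fields.get("other_title_info"):
        parts.append(" : " + fields["other_title_info"])
    if fields.get("responsibility"):
        parts.append(" / " + fields["responsibility"])
    return "".join(parts)


# Hypothetical stored record: clean values, no trailing " :" or " /".
record = {
    "title": "Example title",
    "other_title_info": "an example subtitle",
    "responsibility": "by A. N. Author",
}
```

Calling display_title(record) gives "Example title : an example subtitle / by A. N. Author", while the stored values stay free of display punctuation and so remain usable as data.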
Re: [CODE4LIB] RDA in RDF, was: Something completely different
But the first one to take this on has no one to grab from. The sharing argument may be a red herring in that the problem, from some perspectives, isn't so much about sharing one's own work -- it's more about using others' work. Or is there already a community of people doing something like what Ross describes? If so, where can I find out more about who, and how this works?

It seems to me that the best movements forward in this opening of data are centered on translating marc into more web-usable forms. Which is great**... for everyone except catalogers with no love for marc. Jakob makes a good point in the post that Rob pointed out (http://www.mail-archive.com/code4lib@listserv.nd.edu/msg04422.html)... when cataloging can look like librarything, the rules *and, I would add, tools* we use seem incredibly bloated.

** I do mean great. We have to start somewhere. It's just that the cataloging pieces move so excruciatingly slowly.

ah

Ross Singer wrote:
> It's not off-topic, at least I don't think so. And I don't think anybody is asking to give up on catalogers. Just as I don't think anybody would want the technologists to describe the materials, I think the problem is that the catalogers tried to turn their idea of a data model into tangible technology.
>
> Actually, I think the resource sharing argument is a red herring. A shift to resource-centricity (vs. record-centricity) just means that when you grab a new 'manifestation' for your local catalog, you may also have to grab the creator, the publisher, the series, the expression, the work, the subjects, etc. All of these can be bundled in the same xml document, though -- really it's just a different way of looking at the data, but it's not a radical departure in the delivery/discovery.
>
> -Ross.

--
Anna Headley
Swarthmore College Library
610.690.5781
ahead...@swarthmore.edu
Re: [CODE4LIB] RDA in RDF, was: Something completely different
Ross Singer wrote:
> Well, there's the project by Alistair Miles that Karen alluded to earlier:
> http://code.google.com/p/code4rda
>
> The goals of this project are, in my mind, crucial in moving forward, since it's taking our existing corpus of records and turning them into RDA/RDF. Not only is it a good proof of concept to show how these new data models would look and work (esp. how they would work w/r/t current applications/workflows), but, more importantly, it shows it can be done *with our current data*, alleviating the need for some unrealistic retrospective recataloging effort.
>
> I guess the way I look at it is, there's still time to fix this, at least technologically. There is a difference between the standard, the data model and the application.

An interesting experiment would be to take the cataloger's use cases that Alistair worked from and model them with bibo+vocab.org/frbr instead of the RDA vocabulary. That would give us something to compare against. If bibo+frbr can do all or even a lot of what RDA does, then we can demonstrate a different model and explain why one is better than the other (or at least that more than one model will work).

kc

--
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596 skype: kcoylenet
fx.: 510-848-3913 mo.: 510-435-8234