Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Karen Coyle

Ross Singer wrote:

> So, thanks to the help of my coworkers, here's the RDA Elements schema
> reformatted in an easier to read presentation:
> http://morph.talis.com/?data-uri[]=http%3A%2F%2Frdvocab.info%2FElements.rdfinput=output=exhibitcallback=
>
> I have to say I feel like this schema is trying to do way too
> much and, as a result, loses the resource specificity that RDF would
> be providing.
  


Absolutely. I think there's a real issue that NO technology folks were 
involved in the creation of RDA. So this is data from a cataloger's 
perspective, and from the perspective of guidance rules for creating 
bibliographic data. I'm pretty sure that we can't create a viable data 
record using the RDA data elements, and I hate the idea that the data 
format, once again, is an afterthought rather than integral to the data 
creation standard.



> For one thing, it seems to reinvent a _lot_ of wheels.  Why does it
> define its own title property instead of using DC's?


Because they wanted their own definition. Everything in the RDA element 
list has an RDA-specific meaning, which then makes it impossible to use 
any existing data properties. But there's more: RDA was defining RDA 
cataloging rules, not a schema or record format. Not only are there 
multiple data elements where one could do, there are things that are 
missing. For example, the FRBR place entity can ONLY be used as a 
subject, so it really means place as subject. There's no general 
place element that could be used, for example, in place of 
publication. The latter has no relationship to FRBR place. This is a 
FRBR problem as much as an RDA problem, but again FRBR functions at a 
conceptual level and doesn't really provide a schema that one can work with.



> By using properties like titleOfTheWork, dateOfWork and all of the
> properties that are specifically about TheSeries there is tremendous
> duplication of text.  If Work was its own class, you would only need to
> say that this manifestation was an embodimentOf it and reuse all of the
> title-based properties for manifestation.


Exactly. This is what I've been saying (or trying to say) in relation to 
the bibo discussion. You should be able to use whatever properties you 
want with the FRBR classes, and not restrict data elements to a single 
class. This is a big problem in RDA, but I can say that when it was 
brought up to them (JSC) they strongly defended this choice and would 
not budge. RDA, to JSC, has a specific relationship to FRBR, and if you 
use a data element with a different FRBR class, then you are no longer 
doing RDA.
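
The modeling point in this exchange can be sketched in a few lines. This is only an illustration under stated assumptions: every "ex:" identifier and every sample value below is made up for the example and is not a term from the RDA registry, Dublin Core, bibo, or vocab.org/frbr. The sketch shows one generic title property used on both a Work and a Manifestation, with an embodimentOf link replacing a separate titleOfTheWork property.

```python
# Hypothetical data: the "ex:" names and sample values are illustrative only.
work = "ex:work/1"
manifestation = "ex:manifestation/1"

triples = {
    (work, "rdf:type", "ex:Work"),
    (work, "ex:title", "An Example Work"),
    (manifestation, "rdf:type", "ex:Manifestation"),
    (manifestation, "ex:title", "An example work : the 2009 printing"),
    (manifestation, "ex:embodimentOf", work),
}


def values(subject, predicate):
    """Return the sorted objects of all triples with this subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def title_of_work(manif):
    """Follow embodimentOf to the work, then read the same generic title property."""
    return [t for w in values(manif, "ex:embodimentOf") for t in values(w, "ex:title")]
```

With this shape, "title of the work" is a path through the graph rather than a separately defined element, which is the reuse Ross describes. Whether that is acceptable under RDA as the JSC defines it is a separate question, as noted above.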


 
> What does property 'uri' mean?
  


Did you look at the rdf/xml? I'm wondering if it isn't the display 
that's confusing.



> I also can't figure out how people/institutions are modeled in this
> schema, since none of the elements have ranges.  Are they their own
> resources?  If so, what?  The way it looks at a glance, they're
> strings?
  


EVERYTHING is strings at the moment, with a very very few exceptions 
(like some dates, I think). Some data elements CAN use a controlled 
vocabulary, but I believe that all of those are a mixture of 
uncontrolled and controlled strings. People and institutions are mainly 
undefined because that is in the FRAD realm. And FRAD hasn't been 
finalized. Also note that the JSC didn't feel it could do anything that 
would be too incompatible with the 'legacy' -- that is, with all of our 
AACR/MARC data.
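
To make the strings-versus-resources distinction concrete, here is a small sketch. It assumes hypothetical "ex:" identifiers and invented sample names; nothing in it comes from the RDA registry. It shows why a creator recorded as a string is harder to match across records than a creator recorded as a resource with its own identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A plain string value, as most RDA element values are described above."""
    text: str


@dataclass(frozen=True)
class Resource:
    """A thing with its own identifier (hypothetical URIs used here)."""
    uri: str


# Two records naming the same person with slightly different strings.
string_records = [
    ("ex:book/1", "ex:creator", Literal("Smith, Jane")),
    ("ex:book/2", "ex:creator", Literal("Smith, Jane, 1950-")),
]

# The same two records pointing at one shared person resource.
resource_records = [
    ("ex:book/1", "ex:creator", Resource("ex:person/jane-smith")),
    ("ex:book/2", "ex:creator", Resource("ex:person/jane-smith")),
]


def books_by(creator, records):
    """Return the subjects whose ex:creator value equals the given value."""
    return sorted(s for s, p, o in records if p == "ex:creator" and o == creator)
```

Looking up `Literal("Smith, Jane")` finds only the first book, while looking up the shared `Resource` finds both. Declaring a range on the property is what would tell a consumer which of the two kinds of value to expect.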



> It seems to me that very little work was done to find preexisting
> vocabularies to reuse and this schema still presents a very
> 'document-centric' or 'record-centric' view of data.
  


Absolutely. The catalogers are still creating a textual document, not 
data. At best you can mark up the text, as we do with the MARC record. I 
worry that we won't be able to mesh the cataloger's view with a data 
view -- that the two are somehow inherently opposed. I'd like to start 
modeling a new data format but I can't imagine how we can bridge the gap 
between the catalogers and the system view. I suppose a very clever 
interface could hide the data view from the catalogers, but starting 
from either AACR2 or RDA and trying to get there feels extremely 
difficult. I guess my fear is that it will require compromises, and 
those will be hard to negotiate.


kc

p.s. The RDA element analysis is at 
http://www.collectionscanada.gc.ca/jsc/docs/5rda-elementanalysisrev2.pdf. 
That was the input to the registry.


--
---
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234



Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Rob Sanderson
See also the thread, 'RDA: A Standard Nobody Will Notice'.

http://www.mail-archive.com/code4lib@listserv.nd.edu/msg04422.html

A standard nobody will notice ... for good reason. 

Rob

On Tue, 2009-04-07 at 18:24 +0100, Eric Lease Morgan wrote:
 On Apr 7, 2009, at 1:15 PM, Karen Coyle wrote:
 
  Absolutely. The catalogers are still creating a textual document, not
  data. At best you can mark up the text, as we do with the MARC  
  record...
 
 
 Listen...  What you hear from over here is the sound of a very heavy  
 sigh coming from a computer type who really wants to help improve the  
 way library data is used in a networked environment, but they can't  
 convince their own to modify the way they encode information.
 


Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread David Fiander
On Tue, Apr 7, 2009 at 1:24 PM, Eric Lease Morgan emor...@nd.edu wrote:
 Listen...  What you hear from over here is the sound of a very heavy sigh
 coming from a computer type who really wants to help improve the way library
 data is used in a networked environment, but they can't convince their own
 to modify the way they encode information.

See also

Fiander, David J. Applying XML to the Bibliographic Description.
Cataloging and Classification Quarterly 33, no. 2 (2001): 17-28.

Fiander, David J., and D. Grant Campbell. An XML Definition for an
ISBD-Based Encoding Scheme. Journal of Internet Cataloging 6, no. 4
(2003): 29-58.

Which is what happens when a computer type starts de novo with the
cataloguing standards and builds simple data structures.


Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Ross Singer
Karen, thanks for this summary of the process.  It's pretty
disheartening, sadly.

I got 'uri' wrong, btw, it's 'Uniform Resource Locator':

<!--Property: Uniform resource locator-->
<rdf:Property rdf:about="http://RDVocab.info/Elements/uniformResourceLocator">
  <rdfs:label xml:lang="en">Uniform resource locator</rdfs:label>
  <skos:definition xml:lang="en">The address of a remote access resource.</skos:definition>
  <rdfs:isDefinedBy rdf:resource="http://RDVocab.info/Elements"/>
  <reg:status rdf:resource="http://metadataregistry.org/uri/RegStatus/1002"/>
</rdf:Property>

But again, not exactly the best use of the tools at their disposal.
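
For anyone who wants to check a property definition like the one above programmatically, here is a rough sketch using only the Python standard library. It makes some assumptions: the embedded XML is a cut-down copy of the property with the standard rdf, rdfs and skos namespace declarations added by hand (the reg:status line is left out), and `describe` is a hypothetical helper name. It reports the label, the definition, and whether an rdfs:range is declared.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Cut-down copy of the property; namespace declarations are an assumption.
doc = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:skos="{SKOS}">
  <rdf:Property rdf:about="http://RDVocab.info/Elements/uniformResourceLocator">
    <rdfs:label xml:lang="en">Uniform resource locator</rdfs:label>
    <skos:definition xml:lang="en">The address of a remote access resource.</skos:definition>
  </rdf:Property>
</rdf:RDF>"""


def describe(xml_text):
    """Map each rdf:Property URI to its label, definition, and range presence."""
    root = ET.fromstring(xml_text)
    result = {}
    for prop in root.findall(f"{{{RDF}}}Property"):
        uri = prop.get(f"{{{RDF}}}about")
        result[uri] = {
            "label": prop.findtext(f"{{{RDFS}}}label", default="").strip(),
            "definition": prop.findtext(f"{{{SKOS}}}definition", default="").strip(),
            "has_range": prop.find(f"{{{RDFS}}}range") is not None,
        }
    return result
```

This is plain XML parsing, not RDF parsing, so it only works when properties are serialized as direct rdf:Property children as in the snippet; a real RDF library would be the safer choice for anything beyond a quick look.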

All this being said, it's really not too late to fix any of this,
since nobody is implementing this and, realistically, nobody ever
will.

-Ross.


Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread David Fiander
Roy,

That's true. Unfortunately, I missed Kevin's talk at Access '02 in
Windsor, and since I wrote the first of those two papers I've mostly
been out of the loop, since it's not my area any more.

- David

On Tue, Apr 7, 2009 at 1:48 PM, Roy Tennant tenna...@oclc.org wrote:
 Well, and then you have the XOBIS work from Stanford that ksclarke was
 involved with.
 Roy





Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Ross Singer
It's not off-topic, at least I don't think so.

And I don't think anybody is asking to give up on catalogers.  Just
like I don't think anybody would want the technologists to describe
the materials, I think the problem is that the catalogers tried to
apply their idea of a data model into tangible technology.

Actually, I think the resource sharing argument is a red herring.  A
shift to resource-centricity (vs. record-centricity) just means that
when you grab a new 'manifestation' for your local catalog, you may
also have to grab the creator, the publisher, the series, the
expression, the work, the subjects, etc.  All of these can be bundled
in the same xml document, though -- really it's just a different way
of looking at the data, but it's not a radical departure in the
delivery/discovery.
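
As a rough sketch of what "grabbing" a manifestation could mean in a resource-centric setup, the following walks outgoing links from one resource and collects everything reachable into a bundle. The graph, the "ex:" identifiers, and the property names are all hypothetical; this is one possible approach, not a description of any existing system.

```python
from collections import deque

# Hypothetical linked data: resource -> {property: [linked resources]}.
graph = {
    "ex:manifestation/1": {
        "ex:embodimentOf": ["ex:expression/1"],
        "ex:publisher": ["ex:org/press"],
        "ex:inSeries": ["ex:series/1"],
    },
    "ex:expression/1": {"ex:realizationOf": ["ex:work/1"]},
    "ex:work/1": {"ex:creator": ["ex:person/1"], "ex:subject": ["ex:concept/1"]},
    "ex:work/2": {"ex:creator": ["ex:person/1"]},  # unrelated, not pulled in
}


def bundle(start):
    """Collect the start resource and everything reachable via outgoing links."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for targets in graph.get(node, {}).values():
            for target in targets:
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
    return seen
```

The resulting set is what would be serialized into a single document for delivery, so from the receiving catalog's side the exchange still looks like fetching one package.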

-Ross.

On Tue, Apr 7, 2009 at 1:44 PM, Anna Headley ahead...@swarthmore.edu wrote:
 And what you hear over here is a plea to not give up on catalogers.  Some
 are beyond ready to move from text to data.  Hiding the data view -- do you
 mean making it look like marc? -- sounds pretty awful.  Catalogers who are
 on board are trapped by the way sharing currently works, i.e. record
 sharing.  If the leaders of the cataloging community are failing, what can
 catalogers do?  This is an honest question, not a throwing-up-of-hands.
  Though maybe completely off-topic for this list.

 ah



 --
 Anna Headley
 Swarthmore College Library
 610.690.5781
 ahead...@swarthmore.edu



Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Kevin S. Clarke
On Tue, Apr 7, 2009 at 1:44 PM, Anna Headley ahead...@swarthmore.edu wrote:

 And what you hear over here is a plea to not give up on catalogers.  Some
 are beyond ready to move from text to data.  Hiding the data view -- do you
 mean making it look like marc? -- sounds pretty awful.  Catalogers who are
 on board are trapped by the way sharing currently works, i.e. record
 sharing.  If the leaders of the cataloging community are failing, what can
 catalogers do?  This is an honest question, not a throwing-up-of-hands.
  Though maybe completely off-topic for this list.


Hear, hear.  I don't think we'll see a real solution unless we consider both
the tech-folks' and the catalogers' concerns.  I'm also sympathetic to
knowledge domains wanting to have control over the meaning of their data
elements (to have a useful and well defined set).  How we move forward when
we have so much legacy data (and supporting systems), as Anna said, is a
difficult problem.

Thanks for the plug, Roy.  The check's in the mail.  ;-)

Kevin

-- 
Kevin S. Clarke
Coordinator of Web Services
Belk Library & Information Commons
Appalachian State University
218 College Street
Boone, NC 28608

clark...@appstate.edu
(828) 262-8472

There are two kinds of people in the world: those who believe there are two
kinds of people and those who know better.


Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Ross Singer
Well, there's the project by Alistair Miles that Karen alluded to earlier:

http://code.google.com/p/code4rda

The goals of this project are, in my mind, crucial in moving forward,
since it's taking our existing corpus of records and turning them into
RDA/RDF.  Not only is it a good proof of concept to show how these new
data models would look and work (esp. how they would work w/r/t
current applications/workflows), but, more importantly, it shows it
can be done *with our current data*, alleviating the need for some
unrealistic retrospective recataloging effort.

I guess the way I look at it is, there's still time to fix this, at
least technologically.  There is a difference between the standard,
the data model and the application.

Karen posted a couple of weeks back that UKMARC didn't include
punctuation, instead leaving it to technology to add it.  This doesn't
mean they didn't follow AACR2, they just didn't encode it into the
data fields, explicitly.  Of course, they gave this up when they
adopted MARC21.
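
The UKMARC approach can be illustrated with a short sketch in which punctuation is generated at display time instead of being stored in the fields. The field names are hypothetical and the punctuation is a simplified subset of ISBD-style conventions; it is not a reconstruction of UKMARC's actual display rules.

```python
def isbd_display(record):
    """Build a display string, adding punctuation that is not stored in the data."""
    text = record["title"]
    if record.get("other_title_info"):
        text += " : " + record["other_title_info"]
    if record.get("responsibility"):
        text += " / " + record["responsibility"]

    # Publication area: place : publisher, date (each part optional).
    publication = record.get("place", "")
    if record.get("publisher"):
        publication += (" : " if publication else "") + record["publisher"]
    if record.get("date"):
        publication += (", " if publication else "") + record["date"]

    if publication:
        text += ". -- " + publication
    return text


# Invented sample record; none of the values carry punctuation of their own.
sample = {
    "title": "An example title",
    "other_title_info": "a subtitle",
    "responsibility": "by A. Cataloger",
    "place": "London",
    "publisher": "Example Press",
    "date": "2009",
}
```

The stored values stay clean, and the cataloging rule lives in one display function, which is the separation between standard, data model and application described here.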

Anyway, there's a separation of concerns that is currently being
blurred, but doesn't have to be in practice.

-Ross.

On Tue, Apr 7, 2009 at 2:25 PM, Anna Headley ahead...@swarthmore.edu wrote:
 But the first one to take this on has no one to grab from.  The sharing
 argument may be a red herring in that the problem, from some perspectives,
 isn't so much about sharing one's own work -- it's more about using others'
 work.  Or is there already a community of people doing something like what
 Ross describes?  If so, where can I find out more about who, and how this
 works?

 It seems to me that the best movements forward in this opening of data are
 centered on translating marc into more web-usable forms.  Which is
 great**... for everyone except catalogers with no love for marc.  Jakob
 makes a good point in the post that Rob pointed out
 (http://www.mail-archive.com/code4lib@listserv.nd.edu/msg04422.html)... when
 cataloging can look like librarything, the rules *and, I would add, tools*
 we use seem incredibly bloated.

 ** I do mean great.  We have to start somewhere.  It's just that the
 cataloging pieces move so excruciatingly slowly.

 ah







Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Anna Headley
But the first one to take this on has no one to grab from.  The sharing 
argument may be a red herring in that the problem, from some 
perspectives, isn't so much about sharing one's own work -- it's more 
about using others' work.  Or is there already a community of people 
doing something like what Ross describes?  If so, where can I find out 
more about who, and how this works?


It seems to me that the best movements forward in this opening of data 
are centered on translating marc into more web-usable forms.  Which is 
great**... for everyone except catalogers with no love for marc.  Jakob 
makes a good point in the post that Rob pointed out 
(http://www.mail-archive.com/code4lib@listserv.nd.edu/msg04422.html)... 
when cataloging can look like librarything, the rules *and, I would add, 
tools* we use seem incredibly bloated.


** I do mean great.  We have to start somewhere.  It's just that the 
cataloging pieces move so excruciatingly slowly.


ah







--
Anna Headley
Swarthmore College Library
610.690.5781
ahead...@swarthmore.edu 


Re: [CODE4LIB] RDA in RDF, was: Something completely different

2009-04-07 Thread Karen Coyle

Ross Singer wrote:

> Well, there's the project by Alistair Miles that Karen alluded to earlier:
>
> http://code.google.com/p/code4rda
>
> The goals of this project are, in my mind, crucial in moving forward,
> since it's taking our existing corpus of records and turning them into
> RDA/RDF.  Not only is it a good proof of concept to show how these new
> data models would look and work (esp. how they would work w/r/t
> current applications/workflows), but, more importantly, it shows it
> can be done *with our current data*, alleviating the need for some
> unrealistic retrospective recataloging effort.
>
> I guess the way I look at it is, there's still time to fix this, at
> least technologically.  There is a difference between the standard,
> the data model and the application.
  


An interesting experiment would be to take the cataloger's use cases 
that Alistair worked from and, instead of using the RDA vocabulary, 
use bibo+vocab.org/frbr. That would give us something comparable to 
look at. If bibo+frbr can do all or even a lot of what RDA does, then we 
can demonstrate a different model and explain why one is better than the 
other (or at least that more than one model will work).


kc

--
---
Karen Coyle / Digital Library Consultant
kco...@kcoyle.net http://www.kcoyle.net
ph.: 510-540-7596   skype: kcoylenet
fx.: 510-848-3913
mo.: 510-435-8234