Re: [ol-tech] Author RDF for testing

Jonathan Rochkind Fri, 04 Jun 2010 17:48:47 -0700

Also, as far as "I don't know how it is covered in RDF", I'm pretty sure the 
answer is "any way you want."  RDF does not really supply a domain model. It 
just says you'll express things using entity-attribute-value modelling. It's 
the _vocabularies_ you choose that will end up supplying the domain model. If 
the vocabularies you are choosing are not sufficient to represent the actual 
semantic information in OL, that is, the internal domain model of OL (whether 
it's formalized and explicit or not, OL has one, by virtue of having a database 
of info), then you need to extend the vocabularies, add additional 
vocabularies, or replace the vocabularies.  The vocabularies are there to serve 
your needs of expressing what you've got in a 'standard' way -- if they are not 
serving that need, then they should not be your master.
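
A minimal sketch of that point, in plain Python rather than any particular RDF 
library. The FOAF and DC Terms namespaces are the real ones; the "OLX" 
namespace, the entity URIs, and the isPseudonym attribute are made up for 
illustration and are not anything OL actually publishes.

```python
# Prefixes for two real vocabularies plus one invented for this sketch.
FOAF = "http://xmlns.com/foaf/0.1/"
DCTERMS = "http://purl.org/dc/terms/"
OLX = "http://example.org/ol-vocab#"  # hypothetical, not a real OL namespace

AUTHOR = "http://example.org/authors/A1"  # hypothetical entity URI
BOOK = "http://example.org/books/B1"      # hypothetical entity URI

# RDF itself only dictates this shape: (entity, attribute, value).
# What the attributes *mean* comes from the vocabularies you pick.
TRIPLES = [
    (BOOK, DCTERMS + "title", "An Example Title"),
    (BOOK, DCTERMS + "creator", AUTHOR),
    (AUTHOR, FOAF + "name", "A. N. Author"),
    # Assumed gap in the chosen vocabularies, filled by extending them.
    (AUTHOR, OLX + "isPseudonym", True),
]


def describe(entity, triples):
    """Collect the attribute -> value pairs asserted about one entity."""
    return {attr: value for ent, attr, value in triples if ent == entity}


def vocabularies_used(triples):
    """List which of the known namespaces the attributes are drawn from."""
    known = (FOAF, DCTERMS, OLX)
    return sorted(
        {ns for _, attr, _ in triples for ns in known if attr.startswith(ns)}
    )
```

Nothing in the triple shape changes when the invented vocabulary is added; 
only the set of attribute names grows.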


The only exception to that might be a vocabulary _so_ standardly accepted that 
you DO want to change the internal OL domain model to meet it. I don't think 
anything in the RDF world is yet at that place, due to just where RDF (and 
standardized domain modelling in general) is in the "maturity curve".   For 
library people, FRBR (and/or the RDA vocabularies which are the best attempt 
yet to actually take FRBR to the next level of formal description) arguably (or 
arguably not) might meet that, being so standardly accepted that you might want 
to change the OL domain model to meet it -- but I feel like OL has already 
decided NOT to do that, which is a reasonable decision. 

Jonathan
________________________________________
From: Jonathan Rochkind
Sent: Friday, June 04, 2010 8:39 PM
To: Open Library -- technical discussion
Subject: RE: [ol-tech] Author RDF for testing

It is definitely useful to link a conference report to the 'entity' for the 
conference, IF your system actually tracks conferences as entities at all.

Whether that 'link' is called 'author' or not is less important. Sure, library 
practice is to consider that the same type of relationship as 'authors'. (Which 
isn't actually an 'author' relationship at all in library practice in the first 
place; it's a "primary responsibility" relationship. Remembering that may make 
it easier to wrap your head around library practice.  Library practice doesn't 
_have_ a controlled relationship for 'author', only for 'primary responsibility' 
and 'other contributors', with both of those _sometimes_ qualified with the 
nature of the contribution.)

Similarly with corporate bodies vs individual authors.  If your system knows 
the difference, then it would be useful to reveal it somehow -- if FOAF isn't 
capable of revealing it, then perhaps find another way. Although it doesn't 
necessarily have to be revealed in the _relationship_; perhaps it's simply 
revealed by following the relationship to its destination and seeing that the 
destination is asserted to be a corporate body vs an individual.
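
A rough sketch of that second option, again in plain Python. It assumes the 
relationship is dcterms:creator and that the destinations are typed with 
foaf:Person or foaf:Organization; both terms exist in those vocabularies, but 
whether they fit OL's data is an assumption, and the "ex:" identifiers are 
invented.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
DC_CREATOR = "http://purl.org/dc/terms/creator"

# Hypothetical data: the relationship is the same for both works;
# only the type asserted about the destination differs.
TRIPLES = {
    ("ex:book1", DC_CREATOR, "ex:agent1"),
    ("ex:report1", DC_CREATOR, "ex:agent2"),
    ("ex:agent1", RDF_TYPE, FOAF + "Person"),
    ("ex:agent2", RDF_TYPE, FOAF + "Organization"),
}


def creator_kinds(work, triples):
    """Follow each creator link from a work and return the types asserted
    about the entity at the other end."""
    creators = [o for s, p, o in triples if s == work and p == DC_CREATOR]
    return {
        c: sorted(o for s, p, o in triples if s == c and p == RDF_TYPE)
        for c in creators
    }
```

A consumer that cares about the distinction calls creator_kinds(); one that 
doesn't can ignore the types and still see a creator.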

But in general, I wouldn't worry too much about copying library practice -- to 
my mind, OpenLibrary has essentially already made the decision to NOT ape 
library domain modelling (whether implicit in library practices, or explicit in 
things like FRBR), by transforming MARC to an internal format which is the 
"real" OpenLibrary data, and making that transformation without worrying about 
being compatible with the library world's domain modelling.  This is a 
reasonable choice to have made (although the opposite would also have been 
reasonable), and it's essentially been made, and now that it's been made, you 
don't need to worry about maintaining compatibility with library domain 
modelling, which actually makes your job easier.

So I'd think about:
1) What semantic data is actually captured in the current OpenLibrary system.

2) How to expose the maximum amount of that semantic data in your 
machine-readable representations.  If it's semantic info captured in the OL 
system, it should be represented. If the vocabularies you are using are not 
sufficient to represent it, then it is appropriate to find new vocabularies, 
extend vocabularies, or even make up your own new vocabularies when necessary. 
The existing vocabularies are a tool, and should not be a straitjacket. The 
goal is expressing all your semantic info.

3) If there is semantic data that is NOT captured in the OL system, then of 
course it cannot be represented. This is fine. The task of representation is 
only to represent what IS there, in the OL system, using the domain modelling 
already implicit in the OL system, with as high fidelity as possible.  Now, 
undertaking this task may give you insight into ways you should _change_ the 
OL domain model, that is, change what semantic info OL keeps or how it keeps 
it. Definitely this can be an iterative process. But I suggest it is helpful 
to keep separate the task of "creating a representation of what is in OL" and 
the task of "changing what is in OL", to keep your tasks manageable and to 
help you make the right decisions. Of course you will get information on how 
"what is in OL" should potentially be changed from places OTHER than the 
insight you gain in the "creating a representation" task: namely, from actual 
USE of OL, which will be the best information you can get.  And use requires 
you to first make some representations available so they can be used. So at 
this point, I'd focus on the "creating the representation" task, based on what 
is actually in OL, and later come back to modifications or enhancements to 
what is in OL.
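
Steps 2 and 3 can be sketched like this, under stated assumptions: the record 
field names are invented (they are not OL's actual schema), and the "OLX" 
namespace is a hypothetical home-grown vocabulary. Fields with a known 
standard term use it; everything else falls back to the local vocabulary 
rather than being dropped, and nothing is emitted that the record does not 
contain.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
OLX = "http://example.org/ol-vocab#"  # hypothetical local vocabulary

# Fields for which an existing vocabulary term is assumed to be adequate.
KNOWN_TERMS = {"name": FOAF + "name"}


def to_triples(subject, record):
    """Represent every field of the record as a triple.

    Fields without a mapping in KNOWN_TERMS get a predicate minted in the
    local vocabulary, so no information in the record is lost.
    """
    triples = []
    for field, value in sorted(record.items()):
        predicate = KNOWN_TERMS.get(field, OLX + field)
        triples.append((subject, predicate, value))
    return triples


# Hypothetical record: "entity_type" has no standard term in KNOWN_TERMS.
example = to_triples("ex:agent2", {"name": "Some Committee", "entity_type": "org"})
```

If real use later shows that a local term duplicates a standard one, only 
KNOWN_TERMS changes; the data kept in the system does not.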

And, like I said, in my opinion the focus of that "creating a representation" 
task should be "how do we represent what is in OL with as high fidelity as 
possible", _without_ worrying about how things are done in library domain 
modelling.  (Now, traditional library domain modelling might give you _useful 
ideas_, since it is based on 100+ years of experience. I'm not saying ignore 
any lessons it might have; I'm saying there is no reason to adhere to it 
solely out of principle, or to worry if the actual data in OL does not allow 
adhering to it. That's fine.)

Jonathan

________________________________________
From: [email protected] [[email protected]] On Behalf Of 
Karen Coyle [[email protected]]
Sent: Friday, June 04, 2010 5:20 PM
To: [email protected]
Subject: Re: [ol-tech] Author RDF for testing

Quoting Jim Pitman <[email protected]>:

>
> The edge case of corporate authors needs to be accommodated. An instructive
> example is Nicolas Bourbaki: http://en.wikipedia.org/wiki/Nicolas_Bourbaki
>
> http://openlibrary.org/search?q=Nicolas+Bourbaki
>
> I note that
>
> http://openlibrary.org/authors/OL5038897A/Bourbaki_Nicolas_pseud.
>
> hints that "Nicolas Bourbaki" is a pseudonym for an organization, while
>
> http://openlibrary.org/authors/OL145730A/Nicolas_Bourbaki
>
> does not.  More straightforwardly, you may have corporate authors
> like Committees, W3C, etc.
> I'd be interested to see how RDF experts would accommodate this fork.

I don't know how it is covered in RDF, but as you know in libraries
corporate authors are not considered an edge case -- they "author"
huge numbers of governmental publications as well as corporate
publications, and rival humans in their output. OL does not store
these as authors, however, so we can be sure that all authors are
persons, or some other entity presenting itself as a person. The FOAF
Person does not imply a natural person, and can be used for any
assertion of person-ness. It does not provide a means to indicate that
the person is a pseudonym for one or more natural persons. I would
need to look at the latest work on the person data being developed in
the library world, but I know that there is a debate on how important
it is to link natural persons to the person representation.

The edge case, in my mind, is the use of conferences as authors, which
is a practice in library data. I still have trouble wrapping my head
around that.

kc
--
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet

_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to 
[email protected]