Also, as far as "I don't know how it is covered in RDF", I'm pretty sure the answer is "any way you want." RDF does not really supply a domain model. It just says you'll express things using entity-attribute-value modelling. It's the _vocabularies_ you choose that will end up supplying the domain model. If the vocabularies you are choosing are not sufficient to represent the actual semantic information in OL, that is, the internal domain model of OL (whether it's formalized and explicit or not, OL has one, by virtue of having a database of info), then you need to extend the vocabularies, add additional vocabularies, or replace the vocabularies. The vocabularies are there to serve your needs of expressing what you've got in a 'standard' way -- if they are not serving that need, then they should not be your master.
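To make the entity-attribute-value point concrete, here is a minimal sketch in plain Python (no RDF library). The FOAF, Dublin Core and RDF namespace URIs are the real ones; the `OLX` namespace, its `isPseudonym` term, the work URI and the title are made up for illustration, and this is not a claim about how OL actually publishes its data.

```python
# Well-known namespaces (real), plus a made-up local one (assumption).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
DCTERMS = "http://purl.org/dc/terms/"
OLX = "http://example.org/ol-vocab/"  # hypothetical local extension vocabulary

author = "http://openlibrary.org/authors/OL5038897A"  # author URL quoted in this thread
work = "http://example.org/works/W1"  # hypothetical work identifier

# RDF is entity-attribute-value: every statement is (subject, predicate, object).
triples = [
    (work, DCTERMS + "title", "Example title"),  # placeholder title
    (work, DCTERMS + "creator", author),
    (author, RDF + "type", FOAF + "Person"),
    (author, FOAF + "name", "Nicolas Bourbaki"),
    # FOAF has no way to say "this is a pseudonym", so a local term carries it.
    (author, OLX + "isPseudonym", "true"),
]


def vocabularies_used(triples):
    """Return the set of namespaces that the predicates are drawn from."""
    namespaces = set()
    for _, predicate, _ in triples:
        # Split at the last '#' or '/' to separate namespace from local name.
        cut = max(predicate.rfind("#"), predicate.rfind("/")) + 1
        namespaces.add(predicate[:cut])
    return namespaces


print(sorted(vocabularies_used(triples)))
```

The triples themselves carry no domain model; everything a consumer can learn comes from which vocabulary terms appear as predicates, and a local term sits alongside the standard ones without any special treatment.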
The only exception to that might be a vocabulary _so_ standardly accepted that you DO want to change the internal OL domain model to meet it. I don't think anything in the RDF world is yet at that place, due to just where RDF (and standardized domain modelling in general) is in the "maturity curve". For library people, FRBR (and/or the RDA vocabularies, which are the best attempt yet to actually take FRBR to the next level of formal description) arguably (or arguably not) might meet that, being so standardly accepted that you might want to change the OL domain model to meet it -- but I feel like OL has already decided NOT to do that, which is a reasonable decision.

Jonathan
________________________________________
From: Jonathan Rochkind
Sent: Friday, June 04, 2010 8:39 PM
To: Open Library -- technical discussion
Subject: RE: [ol-tech] Author RDF for testing

It is definitely useful to link a conference report to the 'entity' for the conference, IF your system actually tracks conferences as entities at all. Whether that 'link' is called 'author' or not is less important. Sure, library practice is to consider that the same type of relationship as 'authors' (which isn't actually an 'author' relationship at all in library practice in the first place, it's a "primary responsibility" relationship; remembering that may make it easier to wrap your head around library practice. Library practice doesn't _have_ a controlled relationship for 'author', only for 'primary responsibility' and 'other contributors', with both of those _sometimes_ qualified with nature of contribution.)

Similar with corporate bodies vs individual authors. If your system knows the difference, then it would be useful to reveal it somehow -- if FOAF isn't capable of revealing it, then perhaps find another way.
Although it doesn't necessarily have to be revealed in the _relationship_, perhaps it's simply revealed by following the relationship to its destination and seeing that the destination is asserted to be a corporate body vs an individual.

But in general, I wouldn't worry too much about copying library practice -- to my mind, OpenLibrary has essentially already made the decision to NOT ape library domain modelling (whether implicit in library practices, or explicit in things like FRBR), by transforming MARC to an internal format which is the "real" OpenLibrary data, and making that transformation without worrying about being compatible with the library world's domain modelling. This is a reasonable choice to have made (although the opposite would also have been reasonable), and it's essentially been made, and now that it's been made, you don't need to worry about maintaining compatibility with library domain modelling, which actually makes your job easier.

So I'd think about:

1) What semantic data is actually captured in the current OpenLibrary system.

2) How to expose the maximum amount of that semantic data in your machine-readable representations. If it's semantic info captured in the OL system, it should be represented. If the vocabularies you are using are not sufficient to represent it, then it is appropriate to find new vocabularies, extend vocabularies, or even make up your own new vocabularies when necessary. The existing vocabularies are a tool, and should not be a straitjacket. The goal is expressing all your semantic info.

3) If there is semantic data that is NOT captured in the OL system, then of course it cannot be represented. This is fine. The task of representation is only to represent what IS there, in the OL system, using the domain modelling already implicit in the OL system, as high-fidelity as possible.
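The three points above can be sketched as a small exporter, again in plain Python. Everything about the record layout is an assumption made for illustration: the field names (`title`, `authors`, `kind`, `physical_format`), the `LOCAL` fallback namespace and the `example.org` identifiers are hypothetical, and whether OL records a person/organization distinction at all is exactly the "IF your system knows the difference" question. Only the FOAF, Dublin Core and RDF URIs are real.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
DCTERMS = "http://purl.org/dc/terms/"
LOCAL = "http://example.org/ol-vocab/"  # hypothetical fallback vocabulary

# Hypothetical mapping from internal field names to standard terms.
STANDARD_TERMS = {
    "title": DCTERMS + "title",
    "authors": DCTERMS + "creator",
    "name": FOAF + "name",
}

# Hypothetical mapping from an internal "kind" value to a class URI.
KIND_TO_CLASS = {
    "person": FOAF + "Person",
    "org": FOAF + "Organization",
}


def predicate_for(field):
    """Use a standard term if one exists, else mint one in the local vocabulary."""
    return STANDARD_TERMS.get(field, LOCAL + field)


def export(uri, record, entities):
    """Turn one record into triples, emitting only fields that are present."""
    triples = []
    for field, value in record.items():
        if value is None:
            continue  # not captured in the system, so not represented
        if field == "kind":
            # The type is asserted on the entity itself, not on the relationship.
            triples.append((uri, RDF_TYPE, KIND_TO_CLASS[value]))
        elif field == "authors":
            for target in value:
                triples.append((uri, predicate_for(field), target))
                triples.extend(export(target, entities[target], entities))
        else:
            triples.append((uri, predicate_for(field), value))
    return triples


# Hypothetical data: one work whose single author is a corporate body.
entities = {
    "http://example.org/authors/A1": {"kind": "org", "name": "Example Committee"},
}
work = {
    "title": "Example report",
    "authors": ["http://example.org/authors/A1"],
    "physical_format": "Paperback",  # no standard term mapped, falls back to LOCAL
    "subtitle": None,                # absent in the system, so never emitted
}
triples = export("http://example.org/works/W1", work, entities)
for triple in triples:
    print(triple)
```

The work-to-author relationship is the same predicate whether the target is a person or an organization; a consumer finds out which by following the link and reading the type asserted at the destination.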
Now, when undertaking this task, it may give you insight into ways you should _change_ the OL domain model, that is, change what semantic info OL keeps or how it keeps it. Definitely this can be an iterative process. But I suggest it is helpful to keep separate the task of "creating a representation of what is in OL" and the task of "changing what is in OL", to keep your tasks manageable and to help you make the right decisions -- because of course you will get information on how "what is in OL" should potentially be changed from places OTHER than the insight you gain in the "creating a representation" task -- namely, you will get information on that from actual USE of OL, and that will be the best information you can get. And use requires you to first make some representations available so they can be used.

So at this point, I'd focus on the "creating a representation" task, based on what is actually in OL, and later come back to modifications or enhancements to what is in OL. And, like I said, in my opinion the focus of that "creating a representation" task should be "how do we represent what is in OL as high-fidelity as possible", _without_ worrying about how things are done in library domain modelling. (Now, traditional library domain modelling might give you _useful ideas_, since it is based on 100+ years of experience. I'm not saying ignore any lessons it might have; I'm saying there is no reason to adhere to it solely out of principle, or reason to worry if the actual data in OL does not allow adhering to it. That's fine.)

Jonathan
________________________________________
From: [email protected] [[email protected]] On Behalf Of Karen Coyle [[email protected]]
Sent: Friday, June 04, 2010 5:20 PM
To: [email protected]
Subject: Re: [ol-tech] Author RDF for testing

Quoting Jim Pitman <[email protected]>:

> The edge case of corporate authors needs to be accommodated. An instructive
> example is Nicolas Bourbaki: http://en.wikipedia.org/wiki/Nicolas_Bourbaki
>
> http://openlibrary.org/search?q=Nicolas+Bourbaki
>
> I note that
>
> http://openlibrary.org/authors/OL5038897A/Bourbaki_Nicolas_pseud.
>
> hints that "Nicolas Bourbaki" is a pseudonym for an organization, while
>
> http://openlibrary.org/authors/OL145730A/Nicolas_Bourbaki
>
> does not. More straightforwardly, you may have corporate authors
> like Committees, W3C, etc.
> I'd be interested to see how RDF experts would accommodate this fork.

I don't know how it is covered in RDF, but as you know, in libraries corporate authors are not considered an edge case -- they "author" huge numbers of governmental publications as well as corporate publications, and rival humans in their output. OL does not store these as authors, however, so we can be sure that all authors are persons, or some other entity presenting itself as a person.

The FOAF Person does not imply a natural person, and can be used for any assertion of person-ness. It does not provide a means to indicate that the person is a pseudonym for one or more natural persons. I would need to look at the latest work on the person data being developed in the library world, but I know that there is a debate on how important it is to link natural persons to the person representation.

The edge case, in my mind, is the use of conferences as authors, which is a practice in library data. I still have trouble wrapping my head around that.

kc

--
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet

_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to [email protected]
