On 27 March 2012 18:05, Karen Coyle <[email protected]> wrote:
> On 3/27/12 3:08 PM, Ben Companjen wrote:
>
>> - we want to keep the distinction between foaf:Person and other
>> entities, so changing the author template to use foaf:Agent (because
>> we cannot tell the difference at the moment) is not accepted.
>> I created issue 145 [1] to 'standardize' the values for entity_type
>> found in author records. Using its value ("person" for humans and
>> pseudonyms, "org" for organization) or its absence, we can choose
>> foaf:Person, foaf:Organization or foaf:Agent.
>
> I wonder if some of this cannot be done using a comparison to VIAF --
> because in VIAF there should be coding to indicate whether it is a
> person or a corporation. Also, some of this information could be
> recovered from the original input records. We could use the MARC coding
> from library data, and any unmatched strings from other sources like
> Amazon could be designated "Agent" until a match is found from a source
> that makes the distinction.
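
The entity_type rule quoted above (issue 145) could be sketched roughly as follows. This is only an illustration: the function name and the dict-shaped author record are assumptions, not Open Library's actual template code. Only the values "person" and "org" and the three FOAF classes come from the thread.

```python
# FOAF namespace; the three class names are the ones mentioned in the thread.
FOAF = "http://xmlns.com/foaf/0.1/"


def foaf_class_for(author: dict) -> str:
    """Pick a FOAF class URI for an author record.

    `author` is a hypothetical dict-shaped record; only its optional
    "entity_type" key is inspected.
    """
    entity_type = author.get("entity_type")
    if entity_type == "person":   # humans and pseudonyms
        return FOAF + "Person"
    if entity_type == "org":      # organizations
        return FOAF + "Organization"
    # Missing or unrecognized value: fall back to the generic class.
    return FOAF + "Agent"
```

Under this sketch, a record without entity_type stays a foaf:Agent until some other source (MARC coding, an authority file) supplies the distinction.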

I think it must be possible, but first I think we should have a way of
storing the information. We can use VIAF if the license terms allow us;
Tom Morris pointed out on GitHub that it may be for non-commercial use
only. The PD/CC0 MARC records are available anyway, so surely they can
be re-read to extract the information.

>>
>> - I'd like to make the distinction between a URI for something that is
>> described by Open Library (Authors, Editions, Works, etc.) and the
>> URIs for the descriptions you get from Open Library (as HTML, RDF,
>> JSON etc.).
>> That's why I have asked to use the URIs without / at the end for the
>> Authors, Editions and Works (in the pull request/issue 136 [2]) and to
>> redirect HTTP agents to a description when they ask for a Work,
>> Edition or Author (since you cannot transfer people and most of the
>> works and books in OL over the internet) in issue 130 [3].
>
> This was discussed at length in the development of the RDF and seems to
> be a philosophical issue. It's the "real world object" issue: some
> people feel that the URI should designate the real world object rather
> than the representation of that object on the web. The thinking is that
> people are interested in the RWO, not a specific representation. YMMV,
> but if anyone has saved that discussion on this list it would be worth
> reviewing.

In July 2010 there was a short discussion, based on a mail from OCLC
forwarded by you. Ross Singer replied [1] that he thought it was
feasible to implement 303 redirects :)
I can't say there is consensus about the whole issue of using RWO and
their descriptions in one document, as since Friday there must have
been 200+ emails sent over the public-lod mailing list with this
subject.

[1] http://www.mail-archive.com/[email protected]/msg00199.html

As far as I can tell, it is still safe to do 303 redirects and keep the
distinction (one URI for the RWO, other URIs for the representations in
HTML, RDF etc.).
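
A minimal sketch of the 303 approach, assuming the description URIs are formed by appending a suffix to the real-world-object URI. The ".rdf" form appears in this thread (OL15120805W.rdf); the ".json" and ".html" suffixes, the function name and the simplified Accept handling (q-values ignored) are assumptions made for illustration only.

```python
from typing import Tuple

# Assumed mapping from requested media type to description suffix.
SUFFIX_BY_MEDIA_TYPE = {
    "application/rdf+xml": ".rdf",
    "application/json": ".json",   # assumption
    "text/html": ".html",          # assumption
}
DEFAULT_SUFFIX = ".html"           # assumption: HTML when nothing matches


def describe_redirect(path: str, accept: str = "") -> Tuple[int, str]:
    """Return (status, Location) for a request on a real-world-object URI.

    The thing itself (a Work, Edition or Author) cannot be sent over
    HTTP, so the answer is always "303 See Other" pointing at a
    description of it. The first media type in the Accept header that
    we recognize wins; q-values are ignored in this sketch.
    """
    for part in accept.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in SUFFIX_BY_MEDIA_TYPE:
            return 303, path + SUFFIX_BY_MEDIA_TYPE[media_type]
    return 303, path + DEFAULT_SUFFIX
```

For example, describe_redirect("/works/OL15120805W", "application/rdf+xml") points the client at "/works/OL15120805W.rdf", while the URI without a suffix keeps identifying the Work itself.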

>
>> - we want identifiers for Authors (such as VIAF) to be treated like
>> identifiers, not like just another link (to the VIAF website). I
>> created issue 144 [4] for this, and I think we're ready to agree on
>> how to store these identifiers. The discussion on GitHub yielded a
>> small list of possible identifiers already.
>
> Will there be any distinction between "same as" and "similar to"? Are we
> ready to declare "same as?"

Good point. We have to determine whether other systems identify
people/organizations/... themselves, or (a collection of) records about
those "entities".
VIAF uses their URIs to identify the person or organization. E.g. the
URI for J.K. Rowling is http://viaf.org/viaf/116796842, found in
http://viaf.org/viaf/116796842/rdf.xml
The Deutsche Nationalbibliothek also uses this approach:
http://d-nb.info/gnd/122340469/about/rdf is the RDF description
(although no statement is made about type),
http://d-nb.info/gnd/122340469 is her URI. Both RDF files use
owl:sameAs to say the URIs identify the same "thing".

If an identifier is not used to identify the person/corporation/...
directly, we can still use or define a subproperty of dc:identifier
that connects the OL Author to the other source using the
number/string, or use dc:identifier directly. I envision ol:isni (or is
there bibo:isni already?).

>
> As for using the VIAF ID rather than the individual ID, I'm not entirely
> sure about that. As VIAF grows, individual library authority identifiers
> can move from one cluster to another. The VIAF id identifies the
> cluster, not the individual heading.
> The cluster itself does not have a string to match against.

I'm not entirely sure what you mean - my understanding of VIAF is that
a VIAF ID identifies a person and that the underlying database connects
IDs from the individual authority files. I don't know whether VIAF IDs
are reused when one becomes obsolete (e.g. after a merge).
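
The choice between owl:sameAs and dc:identifier described above might look like this when written out as N-Triples. This is a sketch under assumptions: the helper name and the boolean flag are hypothetical, and "dc:" is taken to mean the Dublin Core elements 1.1 namespace.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
# Assumption: "dc:identifier" refers to the DC elements 1.1 namespace.
DC_IDENTIFIER = "http://purl.org/dc/elements/1.1/identifier"


def link_triple(author_uri: str, value: str, identifies_entity: bool) -> str:
    """Return one N-Triples line linking an OL Author to an external source.

    If the external system's URI identifies the person or organization
    itself (as the VIAF and DNB URIs above do), state owl:sameAs.
    Otherwise keep the external number/string as a plain literal.
    """
    if identifies_entity:
        return f"<{author_uri}> <{OWL_SAME_AS}> <{value}> ."
    # Escape backslashes and double quotes for the N-Triples literal.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{author_uri}> <{DC_IDENTIFIER}> "{escaped}" .'
```

A subproperty such as the envisioned ol:isni would only change the predicate URI in the second branch.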

I originally proposed creating a special field for the VIAF ID, but
Anand's idea to make it more general and Tom Morris's suggestion that
VIAF may not be compatible with OL's public domain dedication, so that
we cannot rely on VIAF to supply the IDs from individual authorities,
easily convinced me that only supporting VIAF is not enough. I recall
from a blog post that Open Library is about supporting any/every
supplier of information, not just the usual library suppliers.

> Ideally, the name headings from US MARC records would be matched with
> the US name file, the name headings from (say) the National Library of
> Spain would be matched with that name file (all in VIAF), etc.

Do you mean that by matching name headings to authority records
directly, we can circumvent VIAF but still get the identifiers used in
various (national) libraries? That should be possible, I guess.

>>
>> - there is(?) the issue that OL Editions are a combination of FRBR
>> expressions and manifestations. I personally think we can say Work =
>> Work, Edition is Manifestation, and link the two by the RDA property
>> workManifested and not mention Expression, like it is done now. I
>> think Expressions can be added later, if wanted. I can imagine each
>> translation can be its own Expression, but otherwise I'm okay with the
>> current distinction.
>
> Again, there are some folks who feel that the lack of expression is
> problematic, although OL is not the only database to skip that entity. I
> believe that the Edition is expression+edition, and that Work is pretty
> close to FRBR work.

Let those people create their own dataset that connects works and
editions through Expressions that they control :D
Pseudo-triples:
<X Expression> <realizationOf> <OL Work>.
<OL Edition> <embodimentOf> <X Expression>.
Perhaps in the future Open Library allows for creating this within its
own boundaries (/type/expression + links)? Or some fork of OL?
(N.B. I haven't met those people - I haven't really met any of you :) -
and don't intend to offend anyone.)
I don't think anything needs to be changed in respect of RDF FRBR
relations, by the way - I just remembered it being said.

>
> kc

Ben

>
>>
>> I think these were the main topics related to RDF. The topics changed
>> to types and documentation of types, then to finding out what actually
>> _is_ in the data.
>>
>> Yesterday I changed the RDF templates (in my fork) to output correct
>> XML Schema dateTime values, because the Sindice Inspector [5] failed
>> reading the Open Library Work RDF [6].
>>
>> I'd like to hear from others what (else) still needs to be changed
>> before the RDF templates can be updated (or what may be wrong in my
>> thinking).
>>
>> Regards,
>>
>> Ben
>>
>> [1] https://github.com/internetarchive/openlibrary/issues/145
>> [2] https://github.com/internetarchive/openlibrary/pull/136
>> [3] https://github.com/internetarchive/openlibrary/issues/130
>> [4] https://github.com/internetarchive/openlibrary/issues/144
>> [5] http://inspector.sindice.com/
>> [6]
>> http://inspector.sindice.com/inspect?url=http%3A%2F%2Fopenlibrary.org%2Fworks%2FOL15120805W.rdf&content=&contentType=auto
>> _______________________________________________
>> Ol-tech mailing list
>> [email protected]
>> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
>> To unsubscribe from this mailing list, send email to
>> [email protected]
>
> --
> Karen Coyle
> [email protected] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
