On Mar 27, 2012, at 4:06 PM, Ben Companjen wrote:

> On 27 March 2012 18:05, Karen Coyle <[email protected]> wrote:
>> On 3/27/12 3:08 PM, Ben Companjen wrote:
>>
>>> - we want to keep the distinction between foaf:Person and other
>>> entities, so changing the author template to use foaf:Agent (because
>>> we cannot tell the difference at the moment) is not accepted.
>>> I created issue 145 [1] to 'standardize' the values for entity_type
>>> found in author records. Using its value ("person" for humans and
>>> pseudonyms, "org" for organization) or its absence, we can choose
>>> foaf:Person, foaf:Organization or foaf:Agent.
>>
>> I wonder if some of this cannot be done using a comparison to VIAF --
>> because in VIAF there should be coding to indicate whether it is a
>> person or a corporation. Also, some of this information could be
>> recovered from the original input records. We could use the MARC coding
>> from library data, and any unmatched strings from other sources like
>> Amazon could be designated "Agent" until a match is found from a source
>> that makes the distinction.
>
> I think it must be possible, but first I think we should have a way of
> storing the information. We can use VIAF if the license terms allow
> us; Tom Morris pointed out on GitHub that it may be for non-commercial
> use only. The PD/CC0 MARC records are available anyway, so surely they
> can be re-read to extract the information.
>
Hi Ben,

Which MARC records are you referring to here?

-Ross.

>>>
>>> - I'd like to make the distinction between a URI for something that is
>>> described by Open Library (Authors, Editions, Works, etc.) and the
>>> URIs for the descriptions you get from Open Library (as HTML, RDF,
>>> JSON etc.).
>>> That's why I have asked to use the URIs without / at the end for the
>>> Authors, Editions and Works (in the pull request/issue 136 [2]) and to
>>> redirect HTTP agents to a description when they ask for a Work,
>>> Edition or Author (since you cannot transfer people and most of the
>>> works and books in OL over the internet) in issue 130 [3].
>>
>> This was discussed at length in the development of the RDF and seems to
>> be a philosophical issue. It's the "real world object" issue: some
>> people feel that the URI should designate the real world object rather
>> than the representation of that object on the web. The thinking is that
>> people are interested in the RWO, not a specific representation. YMMV,
>> but if anyone has saved that discussion on this list it would be worth
>> reviewing.
>
> In July 2010 there was a short discussion, based on a mail from OCLC
> forwarded by you. Ross Singer replied [1] he thought it was feasible
> to implement 303 redirects :) I can't say there is consensus about the
> whole issue of using RWO and their descriptions in one document, as
> since Friday there must have been 200+ emails sent over the public-lod
> mailing list with this subject.
>
> [1] http://www.mail-archive.com/[email protected]/msg00199.html
>
> As far as I can tell, it is still safe to do 303 redirects and keep
> the distinction (one URI for the RWO, other URIs for the
> representations in HTML, RDF etc.).
>
>>
>>> - we want identifiers for Authors (such as VIAF) to be treated like
>>> identifiers, not like just another link (to the VIAF website). I
>>> created issue 144 [4] for this, and I think we're ready to agree on
>>> how to store these identifiers.
>>> The discussion on GitHub yielded a
>>> small list of possible identifiers already.
>>
>> Will there be any distinction between "same as" and "similar to"? Are we
>> ready to declare "same as?"
>
> Good point. We have to determine whether other systems identify
> people/organizations/... themselves, or (a collection of) records
> about those "entities".
> VIAF uses their URIs to identify the person or organization. E.g. the
> URI for J.K. Rowling is http://viaf.org/viaf/116796842, found in
> http://viaf.org/viaf/116796842/rdf.xml The Deutsche Nationalbibliothek
> also uses this approach: http://d-nb.info/gnd/122340469/about/rdf is
> the RDF description (although no statement is made about type),
> http://d-nb.info/gnd/122340469 is her URI. Both RDF files use
> owl:sameAs to say the URIs identify the same "thing".
> If an identifier is not used to identify the person/corporation/...
> directly, we can still use or define a subproperty of dc:identifier
> that connects the OL Author to the other source using the
> number/string, or use dc:identifier directly. I envision ol:isni (or
> is there bibo:isni already?).
>
>>
>> As for using the VIAF ID rather than the individual ID, I'm not entirely
>> sure about that. As VIAF grows, individual library authority identifiers
>> can move from one cluster to another. The VIAF id identifies the
>> cluster, not the individual heading.
>> The cluster itself does not have a string to match against.
>
> I'm not entirely sure what you mean - my understanding of VIAF is that
> a VIAF ID identifies a person and that the underlying database
> connects IDs from the individual authority files. I don't know whether
> VIAF IDs are reused when one becomes obsolete (e.g. after a merge).
> I originally proposed creating a special field for the VIAF ID, but
> Anand's idea to make it more general and Tom Morris's suggestion that
> VIAF may not be compatible with OL's public domain dedication, so that
> we cannot rely on VIAF to supply the IDs from individual authorities,
> easily convinced me that only supporting VIAF is not enough.
> I recall from a blog post that Open Library is about supporting
> any/every supplier of information, not just the usual library
> suppliers.
>
>> Ideally, the name headings from US MARC records would be matched with
>> the US name file, the name headings from (say) the National Library of
>> Spain would be matched with that name file (all in VIAF), etc.
>
> Do you mean that by matching name headings to authority records
> directly, we can circumvent VIAF but still get the identifiers used in
> various (national) libraries? That should be possible, I guess.
>
>>>
>>> - there is(?) the issue that OL Editions are a combination of FRBR
>>> expressions and manifestations. I personally think we can say Work =
>>> Work, Edition is Manifestation, and link the two by the RDA property
>>> workManifested and not mention Expression, like it is done now. I
>>> think Expressions can be added later, if wanted. I can imagine each
>>> translation can be its own Expression, but otherwise I'm okay with the
>>> current distinction.
>>
>> Again, there are some folks who feel that the lack of expression is
>> problematic, although OL is not the only database to skip that entity. I
>> believe that the Edition is expression+edition, and that Work is pretty
>> close to FRBR work.
>
> Let those people create their own dataset that connects works and
> editions through Expressions that they control :D
> Pseudo-triples:
> <X Expression> <realizationOf> <OL Work>.
> <OL Edition> <embodimentOf> <X Expression>.
>
> Perhaps in the future Open Library allows for creating this within its
> own boundaries (/type/expression + links)? Or some fork of OL?
> (N.B. I haven't met those people - I haven't really met any of you :)
> - and don't intend to offend anyone.)
>
> I don't think anything needs to be changed in respect of RDF FRBR
> relations, by the way - I just remembered it being said.
>>
>> kc
>
> Ben
>>
>>>
>>> I think these were the main topics related to RDF. The topics changed
>>> to types and documentation of types, then to finding out what actually
>>> _is_ in the data.
>>>
>>> Yesterday I changed the RDF templates (in my fork) to output correct
>>> XML Schema dateTime values, because the Sindice Inspector [5] failed
>>> reading the Open Library Work RDF [6].
>>>
>>> I'd like to hear from others what (else) still needs to be changed
>>> before the RDF templates can be updated (or what may be wrong in my
>>> thinking).
>>>
>>> Regards,
>>>
>>> Ben
>>>
>>> [1] https://github.com/internetarchive/openlibrary/issues/145
>>> [2] https://github.com/internetarchive/openlibrary/pull/136
>>> [3] https://github.com/internetarchive/openlibrary/issues/130
>>> [4] https://github.com/internetarchive/openlibrary/issues/144
>>> [5] http://inspector.sindice.com/
>>> [6]
>>> http://inspector.sindice.com/inspect?url=http%3A%2F%2Fopenlibrary.org%2Fworks%2FOL15120805W.rdf&content=&contentType=auto
>>> _______________________________________________
>>> Ol-tech mailing list
>>> [email protected]
>>> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
>>> To unsubscribe from this mailing list, send email to
>>> [email protected]
>>
>> --
>> Karen Coyle
>> [email protected] http://kcoyle.net
>> ph: 1-510-540-7596
>> m: 1-510-435-8234
>> skype: kcoylenet
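For reference, the entity_type rule from Ben's first point in this thread ("person" gives foaf:Person, "org" gives foaf:Organization, anything else or nothing gives foaf:Agent) could look roughly like this in Python. This is only a sketch: the function name foaf_class and the idea that an author record is a plain dict are assumptions for illustration, not how the Open Library RDF templates are actually written.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Values of entity_type named in the thread, mapped to FOAF classes.
ENTITY_TYPE_TO_FOAF = {
    "person": FOAF + "Person",       # humans and pseudonyms
    "org": FOAF + "Organization",
}


def foaf_class(author):
    """Pick a FOAF class URI for an author record.

    `author` is assumed to be a dict (hypothetical shape). A missing or
    unrecognised entity_type falls back to foaf:Agent, since we cannot
    tell the difference in that case.
    """
    entity_type = author.get("entity_type")
    return ENTITY_TYPE_TO_FOAF.get(entity_type, FOAF + "Agent")


if __name__ == "__main__":
    print(foaf_class({"name": "J.K. Rowling", "entity_type": "person"}))
    print(foaf_class({"name": "Some string from Amazon"}))
```

Falling back to foaf:Agent keeps the statement true even when nothing is known, which matches Karen's suggestion to designate unmatched strings "Agent" until a source that makes the distinction is found.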

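The 303 approach discussed in this thread (one URI for the Work, Edition or Author itself, separate URIs for its descriptions) can be sketched as a small function that picks the redirect target. Everything here is an assumption for illustration: the function name, the naive substring check on the Accept header, and the ".json" and ".html" suffixes. Only the ".rdf" suffix appears in the thread, in the Sindice link.

```python
# Media types checked in order; suffixes other than ".rdf" are hypothetical.
SUFFIX_BY_MEDIA_TYPE = [
    ("application/rdf+xml", ".rdf"),
    ("application/json", ".json"),
]
DEFAULT_SUFFIX = ".html"  # hypothetical URL for the HTML description


def redirect_for(thing_path, accept=""):
    """Return (status, location) for a request on a "thing" URI.

    `thing_path` is the path without a trailing slash, for example
    "/works/OL15120805W". The thing itself cannot be sent over HTTP,
    so the answer is always a 303 See Other pointing at a description.
    Real content negotiation would also honour q-values; this does not.
    """
    for media_type, suffix in SUFFIX_BY_MEDIA_TYPE:
        if media_type in accept:
            return 303, thing_path + suffix
    return 303, thing_path + DEFAULT_SUFFIX


if __name__ == "__main__":
    print(redirect_for("/works/OL15120805W", "application/rdf+xml"))
    print(redirect_for("/works/OL15120805W", "text/html"))
```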