Richard, It seems that my copyright has been abused. I don't mind if I don't get money, but they should give details of publication.
Colin Sloss

Richard Light wrote:
>In message
><[email protected]>, Tom
>Morris <[email protected]> writes
>>On Wed, Jul 21, 2010 at 1:56 PM, Edward Betts <[email protected]> wrote:
>>
>>> Library MARC records use birth and death dates to disambiguate authors
>>> with the same name. The problem is that some MARC records aren't that
>>> great, they contain mistakes, or are missing the dates. We also load
>>> data from non-MARC sources. We use some heuristics to try and guess if
>>> the author represents the same person or not. We're always trying to
>>> improve these heuristics. For example we should be looking at the type
>>> of subjects that an author writes about and see if the new book we're
>>> loading matches the profile of an existing author with that name.
>>
>>You should never assume that authors are the same based on name alone.
>
>Agreed. A person's name is just a property of that person: it isn't an
>identity. For a start it's (clearly) not unique; second it's not
>immutable (women's married names; life peers; Charles Dodgson). To have
>a reasonable guarantee of a person's identity you need to match on a
>number of properties.
>
>The larger the pool of people you are dealing with, the more properties
>- and the more precision - you need. Thus in a typical workplace you can
>often get away with just using first names as identifiers (with a
>surname initial as a disambiguator where you have two or more "Richard"s
>etc.). For authors, the conventional wisdom is that name and dates is a
>sufficient set of properties, but again there is the question of
>precision: do you include all names, and do the dates go down to the
>year, or the day, of birth and death?
>
>>It's a lot easier to merge duplicates than it is to tease apart bad
>>merges, so it is, in my opinion, much better to be very conservative
>>in any automatic matching process.
>
>How far you go with this is a matter of choice, but I would certainly
>suggest that you don't limit yourself to MARC practices if you want to
>offer a useful Linked Data resource.
>
>The first question to address is whether you want to create an author
>authority file, or a person authority file which happens to contain lots
>of authors. If it's just an author file, presumably you then require a
>separate file for people who are the subjects of works? So Winston
>Churchill, for example, might then end up with two identifiers, because
>he was both an author and the subject of books, e.g.:
>
>http://openlibrary.org/authors/OL123456A
>http://openlibrary.org/subject/OL456789B
>
>I would argue for designing a person authority framework which allows
>you to record a number of properties (where they are known) about any
>person, and assigns a unique, persistent identifier only when you have
>enough properties to be "sure" of the person's unique identity. As
>properties, I would certainly include name (repeatable, with some sort
>of type qualifier), date _and_place_ of birth and death. Then you can
>choose whether to use this framework for a single person authority or
>for separate author and "people as subject" authorities.
>
>Remember that the author statement can be "this book was written by a
>person whose name was Richard Light", which is true even when you aren't
>sure of the identity of "Richard Light".
>
>Richard
>--
>Richard Light
>_______________________________________________
>Ol-discuss mailing list
>[email protected]
>http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
>To unsubscribe from this mailing list, send email to
>[email protected]
>
>

----
Colin Sloss
[email protected]
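
As an illustration of the matching rule discussed in the quoted thread (match on several properties, never on name alone, and stay conservative about automatic merges), here is a minimal Python sketch. It is not Open Library code: PersonRecord, same_person and the min_agreeing threshold are hypothetical names chosen for this example, and it assumes name forms have already been normalised and that dates are known only to the year.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonRecord:
    """Hypothetical person record: repeatable names plus optional properties."""
    names: frozenset                      # all known name forms (assumed normalised)
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    birth_place: Optional[str] = None
    death_place: Optional[str] = None


# Properties other than the name that are compared when both sides know them.
_PROPERTIES = ("birth_year", "death_year", "birth_place", "death_place")


def same_person(a: PersonRecord, b: PersonRecord, min_agreeing: int = 2) -> bool:
    """Conservative match: a shared name is necessary but never sufficient.

    Returns True only if the records share at least one name form, no
    property known on both sides conflicts, and at least `min_agreeing`
    properties known on both sides agree. The threshold is an assumption.
    """
    if not (a.names & b.names):
        return False
    agreeing = 0
    for prop in _PROPERTIES:
        va, vb = getattr(a, prop), getattr(b, prop)
        if va is None or vb is None:
            continue          # unknown on one side: neither evidence nor conflict
        if va != vb:
            return False      # any conflict blocks an automatic merge
        agreeing += 1
    return agreeing >= min_agreeing


# Two different authors really were both called Winston Churchill.
statesman = PersonRecord(
    frozenset({"Winston Churchill", "Winston S. Churchill"}), 1874, 1965)
novelist = PersonRecord(frozenset({"Winston Churchill"}), 1871, 1947)
name_only = PersonRecord(frozenset({"Winston Churchill"}))
incoming = PersonRecord(frozenset({"Winston S. Churchill"}), 1874, 1965)

print(same_person(statesman, novelist))   # dates conflict
print(same_person(statesman, name_only))  # name alone is not enough
print(same_person(statesman, incoming))   # shared name, both dates agree
```

With these sample records only the last comparison is accepted as a match. A record that carries nothing but a name stays unmerged, which fits the point that a bad merge is harder to undo than a duplicate.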
