Quoting Ross Singer <[email protected]>:

> Ok, I have a followup on this.
>
> I've made an analogous web service to OCLC's X-Identifier or
> LibraryThing's ThingISBN with Open Library's data:
This is great, Ross. I hope folks have time to play with it a bit and
see what it reveals.

>
> http://ol-identifier.heroku.com/
>
> One of the things that becomes apparent with this is how many
> duplicate editions there are (multiple "owl:sameAs"es means the
> identifier in question appears in multiple records):

Editions get de-duped, but there are a lot of dupes that I think are a
result of either a bug or a failure of the de-dupe program to run for
some time. I will, however, take a look at some of the dupes and see
if they look like an algorithm problem or a data problem. There have
been a lot of problems with Amazon data, which almost always contains
an ISBN but often either has crappy data or puts junk in the title
field. But I think you're just seeing a result of a bug/error.

kc

>
> http://ol-identifier.heroku.com/oclc/792033:
> <rdf:Description rdf:about="http://ol-identifier.heroku.com/oclc/792033">
>   <owl:sameAs rdf:resource="http://openlibrary.org/books/OL14464770M"/>
>   <owl:sameAs rdf:resource="http://openlibrary.org/books/OL24210371M"/>
> </rdf:Description>
>
> http://ol-identifier.heroku.com/isbn/006251587X:
> <rdf:Description rdf:about="http://ol-identifier.heroku.com/isbn/006251587X">
>   <owl:sameAs rdf:resource="http://openlibrary.org/books/OL22359132M"/>
>   <owl:sameAs rdf:resource="http://openlibrary.org/books/OL9245413M"/>
>   <owl:sameAs rdf:resource="http://openlibrary.org/books/OL38986M"/>
>   <owl:sameAs rdf:resource="http://openlibrary.org/books/OL7290708M"/>
> </rdf:Description>
>
> http://ol-identifier.heroku.com/lccn/00004240:
> <rdf:Description rdf:about="http://ol-identifier.heroku.com/lccn/00004240">
>   <owl:sameAs rdf:resource="http://openlibrary.org/books/OL23358959M"/>
>   <owl:sameAs rdf:resource="http://openlibrary.org/books/OL6774976M"/>
> </rdf:Description>
>
> So, my question would be, do editions get merged? If so, is there
> some log of these merges?
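
For anyone who wants to look for these dupes themselves, here is a minimal Python sketch of the check Ross describes: count the owl:sameAs targets per identifier and flag any identifier that points at more than one edition record. The rdf:RDF wrapper and namespace declarations around the quoted fragment are an assumption (the thread only shows the rdf:Description elements), and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Fragment copied from the thread. The <rdf:RDF> root and the xmlns
# declarations are an assumption, added so the fragment parses on its own.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
  <rdf:Description rdf:about="http://ol-identifier.heroku.com/oclc/792033">
    <owl:sameAs rdf:resource="http://openlibrary.org/books/OL14464770M"/>
    <owl:sameAs rdf:resource="http://openlibrary.org/books/OL24210371M"/>
  </rdf:Description>
</rdf:RDF>"""


def same_as_targets(rdf_xml):
    """Map each rdf:about URI to the list of its owl:sameAs resources."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for desc in root.iter(f"{{{RDF}}}Description"):
        about = desc.get(f"{{{RDF}}}about")
        result[about] = [
            el.get(f"{{{RDF}}}resource")
            for el in desc.findall(f"{{{OWL}}}sameAs")
        ]
    return result


def duplicated_identifiers(rdf_xml):
    """Keep only identifiers that appear in more than one edition record."""
    return {
        about: targets
        for about, targets in same_as_targets(rdf_xml).items()
        if len(targets) > 1
    }


if __name__ == "__main__":
    for about, targets in duplicated_identifiers(SAMPLE).items():
        print(about, "->", len(targets), "edition records")
```

This only tells you that an identifier is shared by several records; whether those records are true duplicates or a data problem still needs a look at the records themselves.
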
>
> By the way, I should have something to show with a feed of deprecated
> work IDs tomorrow.
>
> Thanks!
> -Ross.
>
> On Mon, Dec 13, 2010 at 4:12 PM, Ross Singer <[email protected]> wrote:
>> On Mon, Dec 13, 2010 at 3:13 PM, Edward Betts <[email protected]> wrote:
>>> It doesn't create a new work, it uses ID of the existing work with the
>>> most editions. If there is a draw it picks the lowest work ID.
>>>
>>> The other works get turned into redirects to the chosen work.
>>>
>> Ok, good -- the redirect is a good safety net, if nothing else.
>>
>>> The changes are also visible here: http://openlibrary.org/people/WorkBot
>>>
>>>> Is there a way to track the changes to the Work identifiers? The API
>>>> only seems to show the edition ids.
>>>
>>> We don't have a good feed for watching the changes to work identifiers.
>>> What would you use the feed for?
>>>
>> Well, it would be a lot easier to update anything with the old work id
>> to use the new one than to match every edition identifier and change
>> it. In the end, though, it's not a huge deal.
>>
>>> Currently it is only WorkBot that is merging works, in the future we'll
>>> let humans do it as well, as we build that feature we can include a feed
>>> of merged works.
>>>
>> Good to know!
>>
>> Thanks for all of the pointers,
>> -Ross.
>>
> _______________________________________________
> Ol-tech mailing list
> [email protected]
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
> To unsubscribe from this mailing list, send email to
> [email protected]
>

--
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
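
The work-merge rule Edward describes in the quoted thread (keep the existing work with the most editions, break a draw by the lowest work ID, turn the rest into redirects) can be sketched roughly as below. This is not WorkBot's code. The function names are hypothetical, and reading "lowest work ID" as a numeric comparison of IDs shaped like OL123W is an assumption.

```python
import re


def work_number(work_id):
    """Extract the numeric part of a work ID such as 'OL123W'.

    Assumption: work IDs have the form OL<digits>W and 'lowest' means
    lowest number, not lowest string.
    """
    match = re.fullmatch(r"OL(\d+)W", work_id)
    if match is None:
        raise ValueError(f"unexpected work ID: {work_id!r}")
    return int(match.group(1))


def pick_surviving_work(edition_counts):
    """Choose the work to keep and the redirects for the others.

    edition_counts maps work ID -> number of editions (hypothetical input).
    Returns (winner, redirects) where redirects maps each losing work ID
    to the winner.
    """
    if not edition_counts:
        raise ValueError("no works to merge")
    # Most editions first; on a draw, the lowest work number wins.
    winner = min(
        edition_counts,
        key=lambda w: (-edition_counts[w], work_number(w)),
    )
    redirects = {w: winner for w in edition_counts if w != winner}
    return winner, redirects


if __name__ == "__main__":
    counts = {"OL200W": 3, "OL15W": 3, "OL9W": 1}
    winner, redirects = pick_surviving_work(counts)
    print(winner, redirects)
```

A consumer holding old work IDs could apply the redirects mapping directly, which is the use Ross gives for a feed of merged works.
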
