Ben, would it be easy for you to give some of the titles? Since the works don't appear in alphabetical order, it isn't easy to find the dups by doing a regular search in the UI. Not all of the titles, but maybe for a couple of the authors.

I did find some dups in A A Milne that look like a bug. I can find works that appear to have no editions:

http://openlibrary.org/works/OL15649027W/
http://openlibrary.org/works/OL15652657W/
http://openlibrary.org/works/OL15645558W/

The editions all seem to be attached to this work:

http://openlibrary.org/works/OL15658624W/

I think your stats are turning up some bugs, but I don't know where to take it from here. Does anyone think there is a way to find Works with no Editions? I assume those would all be errors.

What is particularly worrisome about this particular example is that the Daisy book is linked to a work with no editions, but not linked to any of the editions. When you click on the Daisy link you get a work with no editions, and thus no Daisy ebook. This could be a real mess to sort out.

kc

On 5/20/12 2:21 PM, Ben Companjen wrote:
>
> I have one more list of duplicates, this time it's work records.
> (Links to lists of duplicate authors were in a mail to OL-discuss
> [1].)
>
> http://companjen.name/ol/dupe_works.html
>
> The author with the most duplicate work records (counting only the
> works with title slug, subtitle slug and first author the same in at
> least 5 records), is Plutarch.
>
> http://openlibrary.org/authors/OL58120A/Plutarch : 7,881 works, of
> which at least 7100 are duplicate, i.e. are very similar in title and
> subtitle to another work.
>
> Before any attempt to merge these records, more information about the
> works is needed. Tropical Snow is also in this list, of which it is
> known that the duplicate work records are a bad import. I don't know
> what needs to happen with those records, but merging is probably not a
> good idea.
> Oh, I also don't know whether some duplicate works are actually
> multiple volumes of the same work/edition. The works of "United
> States. Immigration and Naturalization Service" appear to be volumes
> rather than true duplicates (although the discussion on multivolume
> works wasn't conclusive about what to do with them, I believe). This
> could be checked when counting duplicate Editions.
>
> Regards,
>
> Ben
>
> [1] http://www.mail-archive.com/[email protected]/msg00668.html
>
> P.S. http://companjen.name/ol/dupe_works.csv has the same data, but
> without the notes.
> _______________________________________________
> Ol-tech mailing list
> [email protected]
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
> To unsubscribe from this mailing list, send email to
> [email protected]
>

--
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
