Ben, would it be easy for you to give some of the titles? Since the 
works don't appear in alphabetical order, it isn't easy to find the dups 
by doing a regular search in the UI. Not all of the titles, but maybe 
those for a couple of the authors.

I did find some dups in A A Milne that look like a bug. I can find works 
that appear to have no editions:

http://openlibrary.org/works/OL15649027W/
http://openlibrary.org/works/OL15652657W/
http://openlibrary.org/works/OL15645558W/

The editions all seem to be attached to this work:

http://openlibrary.org/works/OL15658624W/

I think your stats are turning up some bugs, but I don't know where to 
take it from here. Does anyone know of a way to find Works with no 
Editions? I assume those would all be errors.
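
One possible approach, offered only as a rough sketch: collect every work key, collect every work key that some edition points to, and report the difference. The sketch below assumes the data comes from tab-separated dump files in which each line holds the record type, the key, a revision, a timestamp, and the record as JSON, and that editions list their works under a "works" field. Both of those are assumptions on my part and should be checked against the real data. The function name and the sample keys are made up for illustration.

```python
import json


def works_without_editions(work_lines, edition_lines):
    """Return the sorted keys of works that no edition record points to.

    Assumed (not confirmed) line layout, tab-separated:
        type, key, revision, last_modified, JSON record
    """
    work_keys = set()
    for line in work_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[0] == "/type/work":
            work_keys.add(fields[1])

    referenced = set()
    for line in edition_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5 or fields[0] != "/type/edition":
            continue
        record = json.loads(fields[4])
        # Assumed shape: "works": [{"key": "/works/..."}]
        for ref in record.get("works", []):
            if isinstance(ref, dict) and "key" in ref:
                referenced.add(ref["key"])

    return sorted(work_keys - referenced)


# Made-up sample records, for illustration only.
sample_works = [
    "/type/work\t/works/OL1W\t1\t2012-05-20\t{}\n",
    "/type/work\t/works/OL2W\t1\t2012-05-20\t{}\n",
]
sample_editions = [
    '/type/edition\t/books/OL1M\t1\t2012-05-20\t'
    '{"works": [{"key": "/works/OL1W"}]}\n',
]
print(works_without_editions(sample_works, sample_editions))
```

With real data the two arguments could be open file handles, since the function only iterates over lines. Holding all the work keys in memory at once may or may not be practical, depending on how many records there are.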

What is especially worrisome about the A A Milne example is that the 
Daisy book is linked to a work with no editions, but not linked to any 
of the editions. When you click on the Daisy link you get a work with no 
editions, and thus no Daisy ebook. This could be a real mess to sort out.

kc

On 5/20/12 2:21 PM, Ben Companjen wrote:
>
> I have one more list of duplicates, this time it's work records.
> (Links to lists of duplicate authors were in a mail to OL-discuss
> [1].)
>
> http://companjen.name/ol/dupe_works.html
>
> The author with the most duplicate work records (counting only the
> works with title slug, subtitle slug and first author the same in at
> least 5 records), is Plutarch.
>
> http://openlibrary.org/authors/OL58120A/Plutarch : 7,881 works, of
> which at least 7,100 are duplicates, i.e. are very similar in title and
> subtitle to another work.
>
> Before any attempt to merge these records, more information about the
> works is needed. Tropical Snow is also in this list, and its duplicate
> work records are known to be a bad import. I don't know what needs to
> happen with those records, but merging is probably not a good idea.
> Oh, I also don't know whether some duplicate works are actually
> multiple volumes of the same work/edition. The works of "United
> States. Immigration and Naturalization Service" appear to be volumes
> rather than true duplicates (although the discussion on multivolume
> works wasn't conclusive about what to do with them, I believe). This
> could be checked when counting duplicate Editions.
>
> Regards,
>
> Ben
>
> [1] http://www.mail-archive.com/[email protected]/msg00668.html
>
> P.S. http://companjen.name/ol/dupe_works.csv has the same data, but
> without the notes.
> _______________________________________________
> Ol-tech mailing list
> [email protected]
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
> To unsubscribe from this mailing list, send email to 
> [email protected]
>

-- 
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet