I uploaded new link pages, which no longer contain links to merge
authors with different dates.
Since I have no way to distinguish between human and corporate
authors, there isn't really a way to exclude human authors without
dates while still including corporate merge candidates.
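
For anyone who wants to reproduce the date check, here is a minimal
sketch in Python. It is not the script that generated the pages: the
Author record, its field names and the example keys are all assumptions
made for illustration. It also assumes that undated authors stay in a
group, since they cannot be told apart from corporate authors.

```python
from typing import NamedTuple, Optional, Sequence


class Author(NamedTuple):
    """Hypothetical author record; the field names are assumptions."""
    key: str
    name: str
    birth_date: Optional[str] = None
    death_date: Optional[str] = None


def dates_conflict(authors: Sequence[Author]) -> bool:
    """True if the group holds two or more distinct birth or death dates.

    Missing dates are ignored, so an undated author never causes a conflict.
    """
    births = {a.birth_date for a in authors if a.birth_date}
    deaths = {a.death_date for a in authors if a.death_date}
    return len(births) > 1 or len(deaths) > 1


def keep_candidate(authors: Sequence[Author]) -> bool:
    """Keep a merge candidate only if it has 2+ authors and no date conflict."""
    return len(authors) > 1 and not dates_conflict(authors)


# Made-up keys and dates, purely for illustration.
conflicting = [
    Author("OL1A", "John Smith", birth_date="1901"),
    Author("OL2A", "John Smith", birth_date="1954"),
    Author("OL3A", "John Smith"),
]
compatible = [
    Author("OL4A", "John Smith", birth_date="1901"),
    Author("OL5A", "John Smith"),
]
```

With these inputs keep_candidate(conflicting) is False and
keep_candidate(compatible) is True. The second group still needs a human
review, because the undated record may belong to a different person.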

There are 68,611 links in 687 files.
http://companjen.name/ol/ol_merge_links_1.html
through
http://companjen.name/ol/ol_merge_links_687.html
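
As a rough illustration of how the link counts and file names fit
together, the sketch below builds a merge URL in the form quoted further
down in this thread and splits a list of links into numbered pages of
100. The function names are hypothetical, and the page size of 100 is
taken from the earlier batch and only assumed to apply here.

```python
import math
from typing import Iterator, List, Sequence, Tuple
from urllib.parse import urlencode

MERGE_BASE = "http://openlibrary.org/authors/merge"
LINKS_PER_FILE = 100  # assumption: same page size as the earlier batch


def merge_url(keys: Sequence[str]) -> str:
    """Build a merge URL with one repeated 'key' parameter per author key."""
    return MERGE_BASE + "?" + urlencode([("key", k) for k in keys])


def paginate(links: Sequence[str],
             per_file: int = LINKS_PER_FILE) -> Iterator[Tuple[str, List[str]]]:
    """Yield (file name, links) pairs, numbering the files from 1."""
    for start in range(0, len(links), per_file):
        page_number = start // per_file + 1
        yield f"ol_merge_links_{page_number}.html", list(links[start:start + per_file])


def file_count(n_links: int, per_file: int = LINKS_PER_FILE) -> int:
    """Number of files needed when the last file may be partly filled."""
    return math.ceil(n_links / per_file)
```

Under that assumption, file_count(68611) gives 687, which matches the
numbers above: 686 full pages plus one page with the remaining 11 links.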

Ben

On 18 May 2012 23:43, Ben Companjen <[email protected]> wrote:
> On 18 May 2012 23:12, Tom Morris <[email protected]> wrote:
>> On Thu, May 17, 2012 at 9:14 PM, Ben Companjen <[email protected]> wrote:
>>> So for those who like to take on a 'challenge': I just uploaded 1098
>>> files containing 100 merge links each. These are the authors with "en"
>>> somewhere in their names, sorted by number of possible duplicates. I
>>> removed the Shirley conference (10046 duplicates), since the URL was
>>> too long (~140kB).
>>>
>>> Since these files contain a lot more personal names than the file of
>>> United States names, please note that these names are more likely to
>>> belong to multiple people (i.e. "duplicate authors" may be different
>>> authors). My strategy for when I'm uncertain whether some name belongs
>>> to multiple people, is to not merge those. There is enough to do
>>> anyway :)
>>
>> It's hugely dangerous to be proposing author merges based on name
>> alone.  OpenLibrary has enough conflated author records without adding
>> to the mess!
>
> That's true, and it's the main reason for me to start with
> organizations like parts of US government and conferences. I try my
> best to watch out when reviewing proposed people merges and hope, by
> issuing warnings in my emails, that others do so too.
>>
>> For example, this URL
>> http://openlibrary.org/authors/merge?key=OL4313974A&key=OL4718276A&key=OL5123244A&key=OL5654080A&key=OL5757638A&key=OL6996482A&;
>>
>> proposes to merge six different authors, of whom five have distinct
>> birth dates (and the last is undated).
>
> I would back away from that one :)
>>
>> Birth and death dates should be used where they are available and
>> authors without them shouldn't be merged automatically at all.
>
> It's not all automatic: you choose a link
> (unitedstatescongresssenatecommitteeoninteroceaniccanals is very
> likely safe to merge, for example), review the proposed merge, tick
> the boxes (or have them ticked using the bookmarklet), click "merge"
> and finally click "yes".
>
> That said, it makes sense to not propose obviously different authors.
> I'll update my scripts, but don't expect new files in the next hour :)
>
> Ben
>>
>> Tom
>> _______________________________________________
>> Ol-discuss mailing list
>> [email protected]
>> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
>> To unsubscribe from this mailing list, send email to 
>> [email protected]