I uploaded new link pages, which no longer propose merging authors with different dates. Since I have no way to distinguish between human and corporate authors, there isn't really a way to exclude human authors without date(s) while still including corporate merge candidates.
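A minimal sketch of the filter described above, not the actual script behind these pages. It assumes each author record is a dict with a `name`, a `key` (a bare ID such as `OL4313974A`) and optional `birth_date` / `death_date` fields. The `normalize` rule (lowercase, letters and digits only) is a guess based on link names like `unitedstatescongresssenatecommitteeoninteroceaniccanals`, and all function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlencode

# Merge endpoint as it appears in the URLs quoted in this thread.
MERGE_BASE = "http://openlibrary.org/authors/merge"


def normalize(name):
    """Assumed grouping key: lowercase, keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def dates_conflict(authors):
    """True if the group holds two different known birth or death dates.

    Undated authors never cause a conflict, so they stay in the group.
    """
    births = {a["birth_date"] for a in authors if a.get("birth_date")}
    deaths = {a["death_date"] for a in authors if a.get("death_date")}
    return len(births) > 1 or len(deaths) > 1


def merge_links(authors):
    """Build one merge URL per name group without conflicting dates."""
    groups = defaultdict(list)
    for author in authors:
        groups[normalize(author["name"])].append(author)

    links = []
    for group in groups.values():
        # Skip singletons and groups that are clearly different people.
        if len(group) < 2 or dates_conflict(group):
            continue
        query = urlencode([("key", a["key"]) for a in group])
        links.append(MERGE_BASE + "?" + query)
    return links
```

Because undated records are kept, a group of undated personal names still produces a link, which is exactly the limitation mentioned above: without a human/corporate flag the filter cannot tell them apart from corporate candidates.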
There are 68,611 links, in 687 files.
http://companjen.name/ol/ol_merge_links_1.html through
http://companjen.name/ol/ol_merge_links_687.html

Ben

On 18 May 2012 23:43, Ben Companjen <[email protected]> wrote:
> On 18 May 2012 23:12, Tom Morris <[email protected]> wrote:
>> On Thu, May 17, 2012 at 9:14 PM, Ben Companjen <[email protected]>
>> wrote:
>>> So for those who like to take on a 'challenge': I just uploaded 1098
>>> files containing 100 merge links each. These are the authors with "en"
>>> somewhere in their names, sorted by number of possible duplicates. I
>>> removed the Shirley conference (10046 duplicates), since the URL was
>>> too long (~140kB).
>>>
>>> Since these files contain a lot more personal names than the file of
>>> United States names, please note that these names are more likely to
>>> belong to multiple people (i.e. "duplicate authors" may be different
>>> authors). My strategy for when I'm uncertain whether some name belongs
>>> to multiple people, is to not merge those. There is enough to do
>>> anyway :)
>>
>> It's hugely dangerous to be proposing author merges based on name
>> alone. OpenLibrary has enough conflated author records without adding
>> to the mess!
>
> That's true, and it's the main reason for me to start with
> organizations like parts of US government and conferences. I try my
> best to watch out when reviewing proposed people merges and hope, by
> issuing warnings in my emails, that others do so too.
>>
>> For example, this URL
>> http://openlibrary.org/authors/merge?key=OL4313974A&key=OL4718276A&key=OL5123244A&key=OL5654080A&key=OL5757638A&key=OL6996482A&
>>
>> proposes to merge six different authors, of whom five have distinct
>> birth dates (and the last is undated).
>
> I would back away from that one :)
>>
>> Birth and death dates should be used where they are available and
>> authors without them shouldn't be merged automatically at all.
>
> It's not all automatic: you choose a link
> (unitedstatescongresssenatecommitteeoninteroceaniccanals is very
> likely safe to merge, for example), review the proposed merge, tick
> the boxes (or have them ticked using the bookmarklet), click "merge"
> and finally click "yes".
>
> That said, it makes sense to not propose obviously different authors.
> I'll update my scripts, but don't expect new files in the next hour :)
>
> Ben
>>
>> Tom
>> _______________________________________________
>> Ol-discuss mailing list
>> [email protected]
>> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
>> To unsubscribe from this mailing list, send email to
>> [email protected]
