Karen,

Do you have a sense of how well it actually works? Is Open Library implementing it?

Mike Beccaria
Systems Librarian
Head of Digital Initiative
Paul Smith's College
518.327.6376
[email protected]
Become a friend of Paul Smith's Library on Facebook today!

-----Original Message-----
From: Code for Libraries [mailto:[email protected]] On Behalf Of Karen Coyle
Sent: Thursday, August 22, 2013 11:53 AM
To: [email protected]
Subject: Re: [CODE4LIB] De-dup MARC Ebook records

The record matching algorithm used by the Open Library is available here:
https://github.com/openlibrary/openlibrary/tree/master/openlibrary/catalog/merge

The original spec, which may have changed in the implementation, is here:
http://kcoyle.net/merge.html

kc

On 8/22/13 8:07 AM, Michael Beccaria wrote:
> Steve,
> I don't think it's so much finding a control field (however, the closest
> match I can use is ISBN or eISBN, which has its issues) but also normalizing
> the data in the fields so that matches are produced. It will no doubt take
> some time to figure out.
>
> Mike Beccaria
> Systems Librarian
> Head of Digital Initiative
> Paul Smith's College
> 518.327.6376
> [email protected]
> Become a friend of Paul Smith's Library on Facebook today!
>
>
> -----Original Message-----
> From: Code for Libraries [mailto:[email protected]] On Behalf
> Of McDonald, Stephen
> Sent: Friday, August 16, 2013 8:16 AM
> To: [email protected]
> Subject: Re: [CODE4LIB] De-dup MARC Ebook records
>
> Michael Beccaria said:
>> Thanks for the replies. To clarify, I am working with 2 (or more in
>> the future) MARC records outside of the ILS. I've tried using
>> MarcEdit but my usage did vary...not much overlap with the control
>> fields that were available to me. I have a feeling they are a bit
>> varied. I'm also messing around with marcXimiL a little but I'm
>> having trouble getting it to output any records at all. I also was
>> looking at the XC aggregation module but I was having trouble getting
>> that to work properly as well and the listserv was unresponsive. It
>> seemed like good software but it required me to set up an OAI harvest
>> source to allow it to ingest the records and that...well...enough is
>> enough... I think I will probably need to write something, and at
>> least that way I know what it will be doing rather than plowing
>> through software that has little to no support. Please feel free to let me
>> know of a particular strategy you think might work best in this regard...
> If you couldn't get adequate deduping from the control fields available in
> MarcEdit deduping, what control fields do you think you need to dedup on?
> You can actually specify any arbitrary field and subfield for deduping in
> MarcEdit.
>
> Steve McDonald
> [email protected]

--
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
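
Editorial sketch of the "normalize the fields so that matches are produced" idea from the thread. This is not the Open Library merge algorithm and not anything MarcEdit does; it is only a minimal illustration, assuming the ISBN strings have already been pulled out of the records (for example from 020 $a). The function names normalize_isbn and group_by_isbn are hypothetical. ISBN-10 values are converted to ISBN-13 so that print-style and ebook-style numbers for the same identifier land on one key; ISBN-10 check digits are not verified here.

```python
import re


def normalize_isbn(raw):
    """Return a 13-digit ISBN string, or None if raw holds no usable ISBN."""
    # Take the leading run of ISBN characters; values often carry trailing
    # qualifiers such as "(ebook)" that must not take part in matching.
    match = re.match(r"[0-9Xx-]+", (raw or "").strip())
    if not match:
        return None
    digits = match.group(0).replace("-", "").upper()
    if len(digits) == 10:
        core = "978" + digits[:9]   # drop the ISBN-10 check character
    elif len(digits) == 13:
        core = digits[:12]
    else:
        return None
    if not core.isdigit():
        return None
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ...
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = str((10 - total % 10) % 10)
    if len(digits) == 13 and digits[12] != check:
        return None                 # reject ISBN-13 with a bad check digit
    return core + check


def group_by_isbn(records):
    """Map normalized ISBN -> record ids, keeping only ISBNs seen on 2+ records.

    records is an iterable of (record_id, [raw ISBN strings]) pairs.
    """
    groups = {}
    for rec_id, isbns in records:
        keys = {normalize_isbn(value) for value in isbns} - {None}
        for key in keys:
            groups.setdefault(key, []).append(rec_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

An ISBN match alone only proposes candidate duplicates. As noted in the thread, ISBN and eISBN have their issues, so candidates would still need a second comparison (title, author, and so on) before records are merged.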
