Karen,
Do you have a sense of how well it actually works? Is Open Library implementing 
it?

Mike Beccaria
Systems Librarian
Head of Digital Initiative
Paul Smith's College
518.327.6376
[email protected]
Become a friend of Paul Smith's Library on Facebook today!


-----Original Message-----
From: Code for Libraries [mailto:[email protected]] On Behalf Of Karen Coyle
Sent: Thursday, August 22, 2013 11:53 AM
To: [email protected]
Subject: Re: [CODE4LIB] De-dup MARC Ebook records

The record matching algorithm used by the Open Library is available here:
https://github.com/openlibrary/openlibrary/tree/master/openlibrary/catalog/merge

The original spec, which may have changed in the implementation, is here:

http://kcoyle.net/merge.html

kc


On 8/22/13 8:07 AM, Michael Beccaria wrote:
> Steve,
> I don't think it's so much finding a control field (the closest match I 
> can use is ISBN or eISBN, which has its issues) as normalizing the data 
> in the fields so that matches are produced. It will no doubt take some time 
> to figure out.
>
> Mike Beccaria
> Systems Librarian
> Head of Digital Initiative
> Paul Smith's College
> 518.327.6376
> [email protected]
> Become a friend of Paul Smith's Library on Facebook today!
>
>
> -----Original Message-----
> From: Code for Libraries [mailto:[email protected]] On Behalf Of McDonald, Stephen
> Sent: Friday, August 16, 2013 8:16 AM
> To: [email protected]
> Subject: Re: [CODE4LIB] De-dup MARC Ebook records
>
> Michael Beccaria said:
>> Thanks for the replies. To clarify, I am working with 2 (or more in 
>> the future) MARC records outside of the ILS. I've tried using 
>> MarcEdit but my usage did vary...not much overlap with the control 
>> fields that were available to me. I have a feeling they are a bit 
>> varied. I'm also messing around with marcXimiL a little but I'm 
>> having trouble getting it to output any records at all. I also was 
>> looking at the XC aggregation module but I was having trouble getting 
>> that to work properly as well and the listserv was unresponsive. It 
>> seemed like good software but it required me to set up an OAI harvest 
>> source to allow it to ingest the records and that...well...enough is 
>> enough... I think I will probably need to write something, and at 
>> least that way I know what it will be doing rather than plowing 
>> through software that has little to no support. Please feel free to let me 
>> know of a particular strategy you think might work best in this regard...
> If you couldn't get adequate deduping from the control fields available in 
> MarcEdit deduping, what control fields do you think you need to dedup on?  
> You can actually specify any arbitrary field and subfield for deduping in 
> MarcEdit.
>
>                                       Steve McDonald
>                                       [email protected]
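
The thread above mentions matching on ISBN or eISBN and normalizing field data so that matches are produced. The following is only a minimal sketch of that idea, not the Open Library algorithm and not MarcEdit's behavior. It assumes the ISBN strings (for example from MARC 020 $a) and titles have already been extracted as plain text; the function names normalize_isbn, normalize_title and duplicate_candidates are hypothetical, chosen for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def _valid_isbn10(chars):
    # ISBN-10: weights 10..1, "X" (value 10) allowed only as the last character.
    if not re.fullmatch(r"[0-9]{9}[0-9X]", chars):
        return False
    total = sum((10 - i) * (10 if c == "X" else int(c))
                for i, c in enumerate(chars))
    return total % 11 == 0


def _isbn13_check_digit(first12):
    # ISBN-13: alternating weights 1 and 3 over the first 12 digits.
    total = sum(int(c) * (1 if i % 2 == 0 else 3)
                for i, c in enumerate(first12))
    return str((10 - total % 10) % 10)


def normalize_isbn(raw):
    """Return a hyphen-free ISBN-13, or None if no valid ISBN is found.

    Assumption: the ISBN is the first whitespace-delimited token, with any
    qualifier such as "(electronic bk.)" following it.
    """
    parts = raw.strip().split()
    if not parts:
        return None
    chars = re.sub(r"[^0-9Xx]", "", parts[0]).upper()
    if len(chars) == 10 and _valid_isbn10(chars):
        core = "978" + chars[:9]
        return core + _isbn13_check_digit(core)
    if (len(chars) == 13 and chars.isdigit()
            and _isbn13_check_digit(chars[:12]) == chars[12]):
        return chars
    return None


def normalize_title(title):
    """Lowercase, strip diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", no_marks.lower()).split())


def duplicate_candidates(records):
    """Group record ids that share a normalized ISBN.

    records: iterable of (record_id, [raw ISBN strings]) pairs.
    Returns {isbn13: [record_id, ...]} for ISBNs seen on 2+ records.
    """
    groups = defaultdict(list)
    for record_id, raw_isbns in records:
        # A set avoids listing one record twice under the same key.
        for key in {normalize_isbn(r) for r in raw_isbns} - {None}:
            groups[key].append(record_id)
    return {k: v for k, v in groups.items() if len(v) > 1}
```

Because print and electronic editions often carry different ISBNs, and some records list both, a shared ISBN is better treated as a candidate match to confirm with a second key (such as the normalized title) than as proof of duplication.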

--
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
