Hi Sarah,

I agree with Kate that you may have to extract some information from the
description field into new field(s) before you can begin removing duplicate
entries. This data cleaning will help you decide which of the duplicate
entries to get rid of. I have done this type of work for 60 small museums
over the past five years, and it is somewhat tedious.
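
As a rough illustration of that kind of extraction, here is a minimal Python
sketch. The label patterns (format words, "master", "viewing copy", and so on)
and the function name extract_fields are assumptions made up for the example,
not anything taken from your database; real labels will need their own rules.

```python
import re

# Hypothetical label vocabulary; adjust to what the labels actually say.
FORMAT_RE = re.compile(r"\b(vhs|betacam|dvd|cd|audiotape|videotape)\b", re.IGNORECASE)
COPY_RE = re.compile(r"\b(master|clone|dub|viewing copy)\b", re.IGNORECASE)


def extract_fields(description):
    """Pull format and copy-type hints out of a free-text description."""
    fmt = FORMAT_RE.search(description)
    copy = COPY_RE.search(description)
    # Remove the hints, then collapse punctuation and case so that what
    # remains can serve as a rough stand-in for the title.
    rest = COPY_RE.sub(" ", FORMAT_RE.sub(" ", description))
    rest = re.sub(r"[^a-z0-9]+", " ", rest.lower()).strip()
    return {
        "format": fmt.group(1).lower() if fmt else None,
        "copy_type": copy.group(1).lower() if copy else None,
        "title_guess": rest,
    }


row = extract_fields("Annual Gala 1998 - VHS - Viewing Copy")
print(row)
```

Once the copy type sits in its own field, it becomes easier to see which of
two matching entries is the master and which is the viewing copy.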

Contact me at richard at heritageinformation.ca if you would like to bounce
ideas outside of the listserv.

Richard Cloutier

-----Original Message-----
From: mcn-l-bounces at mcn.edu On Behalf Of
sarah johnson
Sent: Friday, May 11, 2007 4:37 PM
To: undisclosed-recipients:
Subject: [MCN-L] database sorting questions

I recently became involved with a project attempting to sort through a 
rather large (50,000 or so) collection of multimedia assets (videotapes, 
audiotapes, cds, dvds), looking for duplicate copies.  I would like to at 
least get a start on this programmatically, using the database they were 
recorded in (the database has been exported to an excel file).

However, I am having some trouble coming up with a method to query and sort 
the data.

Each piece was logged separately, with no mention of whether it was actually
a clone or viewing copy of a previously entered piece.

For the most part, all of the information on the label was typed into a 
single 'description' field.  Runtimes and dates are both in separate fields,
as is title (although the separate title field was rarely used).  
Unfortunately, the labels did not always read exactly alike even when the 
videos were identical, nor did the layout of the entries always match.  
Runtimes are also often off by a few seconds.

I really need a method of automatically pulling up all assets with about an 
80% match between fields.  Is there a way to do this?  Or am I approaching 
this problem all wrong?  Any help at all would be greatly, greatly 
appreciated!
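
One possible starting point for this kind of approximate match, sketched in
Python with the standard library's difflib module. It assumes the Excel export
has already been loaded into a list of dicts with 'id', 'description' and
'runtime' (in seconds) keys; those key names, the 5-second runtime tolerance,
and the function names are assumptions for illustration only. The 0.8
threshold is a similarity ratio between the two description strings, which is
only one way of reading "about an 80% match".

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so layout differences matter less."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Return a ratio between 0.0 and 1.0 for two description strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_duplicates(records, threshold=0.8, runtime_slack=5):
    """Return (id, id, score) for pairs with close runtimes and similar text."""
    # Sorting by runtime means each record is only compared with neighbours
    # whose runtime is within the slack, instead of with every other record.
    ordered = sorted(records, key=lambda r: r["runtime"])
    pairs = []
    for i, a in enumerate(ordered):
        j = i + 1
        while j < len(ordered) and ordered[j]["runtime"] - a["runtime"] <= runtime_slack:
            b = ordered[j]
            score = similarity(a["description"], b["description"])
            if score >= threshold:
                pairs.append((a["id"], b["id"], round(score, 2)))
            j += 1
    return pairs


records = [
    {"id": 1, "description": "Annual Gala 1998 master tape", "runtime": 3600},
    {"id": 2, "description": "Annual gala 1998  master tape", "runtime": 3603},
    {"id": 3, "description": "Board meeting March 2001", "runtime": 3601},
    {"id": 4, "description": "Annual Gala 1998 master tape", "runtime": 5000},
]
print(likely_duplicates(records))
```

The pairs this returns are candidates for a person to review, not confirmed
duplicates. Records with a missing or wrong runtime would never be compared
under this scheme, so they would need a separate pass.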

Sarah Johnson
sarah at dvs.com
