RE: Duplicate matching

2014-11-30 Thread Adrian Halid
Hi Greg, Instead of using the filename to determine duplicate audio files have you considered using an audio fingerprint? I have used this software in the past to automatically tag my music. https://picard.musicbrainz.org/ “Picard uses AcoustID audio fingerprints, allowing files to be

Re: Duplicate matching

2014-11-30 Thread Greg Keogh
Instead of using the filename to determine duplicate audio files have you considered using an audio fingerprint? ... Apparently it uses http://acoustid.org/ which is an open source library. This is an interesting lateral-thinking idea. That's an ambitious and scientifically interesting

RE: Duplicate matching

2014-11-30 Thread Adrian Halid
On Monday, 1 December 2014 8:35 AM, Greg Keogh wrote: Instead of using the filename to determine duplicate audio files have you considered using an audio fingerprint? ... Apparently it uses http://acoustid.org/ which is an open source library

Re: Duplicate matching

2014-11-28 Thread Greg Harris
Hi Greg, Please find below what I have used in the past. It is very expensive, but I cannot see a better way of doing it. It returns an integer which is the sum of:
- the number of times the same letter appears in both strings
- 10 times the number of times the same two letters appear in

Re: Duplicate matching

2014-11-28 Thread Greg Harris
Hi Greg, I should look at my code before I write comments from memory... The result is a *double* value being the sum of:
- the number of times the same letter appears in both strings
- 10 times the number of times the same two letters appear in both strings
- 100 times
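The weighting described above can be sketched roughly as follows. This is not Greg Harris's code, which is not shown in the thread; it is a minimal Python illustration, and the names `ngram_counts` and `weighted_similarity`, the case-insensitive comparison, and the reading of "the same letter appears in both strings" as a multiset intersection are all assumptions. Only the two weights that survive in the snippet (1 and 10) are used by default; the third term is cut off after "100 times", so it is left for the caller to supply through `weights`.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count every run of n consecutive characters in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def weighted_similarity(a, b, weights=None):
    """Score two strings by the character runs they share.

    weights maps run length -> weight. The default covers only the two
    terms visible in the thread (single letters x1, letter pairs x10).
    """
    if weights is None:
        weights = {1: 1.0, 2: 10.0}
    a, b = a.lower(), b.lower()  # assumption: comparison ignores case
    score = 0.0
    for n, weight in weights.items():
        # Counter & Counter keeps the minimum count of each shared run.
        shared = ngram_counts(a, n) & ngram_counts(b, n)
        score += weight * sum(shared.values())
    return score


print(weighted_similarity("abc", "abd"))  # shares a, b and the pair "ab"
```

Every pair of names has to be scored against every other, which fits the remark that the approach is very expensive.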

Re: Duplicate matching

2014-11-28 Thread Greg Keogh
Thanks Greg H, the weighting is a very interesting idea. I'm running some simple experiments now with a word list and an inverted list of file names, just to help me picture the problem in my head. The problem with a weighting comparison is that I don't know what to compare with what, comparing
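One way to picture a word list with an inverted list of file names is sketched below. It is an illustration only, not the experiment Greg Keogh describes: the tokenising rule, the function names, and the `min_shared` threshold are all hypothetical. The idea is that the inverted list narrows down which names are worth comparing at all, since only names that share words become candidate pairs.

```python
import re
from collections import defaultdict
from itertools import combinations


def words_in(name):
    """Lower-case alphanumeric words in a file name, extension dropped."""
    stem = name.rsplit(".", 1)[0]
    return set(re.findall(r"[a-z0-9]+", stem.lower()))


def build_inverted_index(names):
    """Map each word to the set of file names that contain it."""
    index = defaultdict(set)
    for name in names:
        for word in words_in(name):
            index[word].add(name)
    return index


def candidate_pairs(names, min_shared=2):
    """Return pairs of names sharing at least min_shared words."""
    shared = defaultdict(int)
    for files in build_inverted_index(names).values():
        for pair in combinations(sorted(files), 2):
            shared[pair] += 1
    return {pair for pair, count in shared.items() if count >= min_shared}


names = [
    "Abba - Waterloo.mp3",
    "abba_waterloo_128k.mp3",
    "Queen - Bohemian Rhapsody.mp3",
]
print(candidate_pairs(names))
```

A weighted comparison could then be run on the candidate pairs only, rather than on every name against every other.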

Re: Duplicate matching

2014-11-28 Thread Stephen Price
Am curious, is the idea of the exercise to write your own code to solve the problem, or to solve the problem? I've used TreeSize Pro to find file duplicates in the past. Also have used Directory Opus to find duplicates. Great for finding identical files with different names. Probably won't help if

Re: Duplicate matching

2014-11-28 Thread Greg Keogh
Hi Stephen, I wrote a utility in Framework 1.0 that finds duplicate files by content (builds a dictionary of checksums). In this case the files with similar names might be the same recording at different bitrates, making them binary different. So it's a bit fuzzy what I'm looking for. Off the cuff
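The "dictionary of checksums" approach can be sketched as follows. The original utility was written for .NET Framework 1.0 and is not shown in the thread, so this Python version is only an illustration: the choice of SHA-256, the chunk size, and the function names are assumptions.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Hash a file's bytes in chunks so large files are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by checksum; return groups of two or more."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

As the message notes, this only catches byte-identical files. The same recording encoded at two bitrates produces different checksums, so it would not be reported here.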

RE: Duplicate matching

2014-11-28 Thread ILT (O)
Stephen Price wrote: Am curious, is the idea of the exercise to write your own code to solve the problem, or to solve the problem? I've used TreeSize Pro to find file duplicates in the past. Also have used Directory Opus to find duplicates. Great for finding

Re: Duplicate matching

2014-11-28 Thread Stephen Price
On Saturday, November 29, 2014 12:30 PM, Stephen Price wrote: Am curious, is the idea of the exercise to write your own code to solve the problem, or to solve the problem? I've used