Greg Keogh
Sent: Monday, 1 December 2014 8:35 AM
To: ozDotNet
Subject: Re: Duplicate matching
>
> Instead of using the filename to determine duplicate audio files have you
> considered using an audio fingerprint? ... Apparently it uses
> http://acoustid.org/ which is an open source library.
>
This is an interesting lateral-thinking idea. That's an ambitious and
scientifically interesting p
: ozDotNet
Subject: Duplicate matching
Folks, I was about to write some utility code to search through my 20,000
audio files looking for probable duplicates. I say "probable" because I found
file names like these:
Lovelock - Trumpet Concerto (SSO Concert).mp3
Trumpet Concerto (Willia
dotnet.com
] *On Behalf Of *Stephen Price
*Sent:* Saturday, November 29, 2014 12:30 PM
*To:* ozDotNet
*Subject:* Re: Duplicate matching
Am curious, is the idea of the exercise to write your own code to solve the
problem, or to solve the problem? I've used Treesize pro to find file
duplicates in
Hi Stephen, I wrote a utility in Framework 1.0 that finds duplicate files
by content (builds a dictionary of checksums). In this case the files with
"similar" names might be the same recording at different bitrates, making
them binary different. So it's a bit fuzzy what I'm looking for. Off the
cuf
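
For reference, the "dictionary of checksums" approach mentioned above can be sketched roughly as follows. This is only an illustration in Python, not the original Framework 1.0 utility; the function name find_identical_files and the choice of SHA-256 are assumptions. As noted above, it only catches byte-identical files, so the same recording at two bitrates would not be grouped.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_identical_files(root):
    """Group files under root by content digest (hypothetical helper).

    Returns a list of groups; each group is a sorted list of paths
    whose bytes are identical.
    """
    by_digest = defaultdict(list)
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large audio files are not loaded whole.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        by_digest[digest.hexdigest()].append(path)
    # Only digests shared by two or more files are duplicates.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Comparing file sizes first and hashing only files of equal size would probably save a lot of reading on a 20,000 file collection.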
Am curious, is the idea of the exercise to write your own code to solve the
problem, or to solve the problem? I've used Treesize pro to find file
duplicates in the past. Also have used Directory Opus to find duplicates.
Great for finding identical files with different names. Probably won't help
if
Thanks Greg H, the "weighting" is a very interesting idea. I'm running some
simple experiments now with a word list and an inverted list of file names,
just to help me picture the problem in my head. The problem with a
weighting comparison is that I don't know what to compare with what,
comparing 2
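
A word list plus an inverted list of file names, as described above, might look something like the sketch below. It is a guess at the shape of the experiment, not the actual code: the names build_inverted_index and candidate_pairs and the min_shared threshold are all assumptions. The point is that the index suggests what to compare with what, because only names sharing some words become candidate pairs.

```python
import re
from collections import defaultdict
from itertools import combinations


def build_inverted_index(names):
    """Map each lower-cased word to the set of file names containing it."""
    index = defaultdict(set)
    for name in names:
        stem = name.rsplit(".", 1)[0]  # drop the extension, if any
        for word in re.findall(r"[a-z0-9]+", stem.lower()):
            index[word].add(name)
    return index


def candidate_pairs(index, min_shared=2):
    """Count shared words per pair of names; keep pairs at or above min_shared."""
    shared = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            shared[(a, b)] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared}


names = [
    "Lovelock - Trumpet Concerto (SSO Concert).mp3",
    "Trumpet Concerto (William Lovelock).mp3",
]
pairs = candidate_pairs(build_inverted_index(names))
```

With the two example names from this thread, the pair shares three words (lovelock, trumpet, concerto). Very common words such as "concerto" would generate a lot of pairs in a large collection, so they may need to be skipped or down-weighted.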
Hi Greg,
I should look at my code before I write comments from memory...
The result is a *double* value being the sum of:
· number of times the same letter appears in both strings
· 10 times the number of times the same two letters appear in both
strings
· 100 times t
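
One possible reading of the weighting described above is sketched below in Python. It is an interpretation, not the original code: "the same letter appears in both strings" is taken to mean the overlap of letter counts, and "the same two letters" the overlap of adjacent letter pairs. Only the 1x and 10x tiers are implemented because the rest of the list is cut off here. The names ngram_counts and weighted_overlap and the weights parameter are assumptions.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count every run of n consecutive characters in text (case-insensitive)."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def weighted_overlap(a, b, weights=None):
    """Sum, over each n-gram length, weight * number of n-grams common to a and b."""
    if weights is None:
        # Assumed from the description: single letters x1, letter pairs x10.
        weights = {1: 1.0, 2: 10.0}
    score = 0.0
    for n, weight in weights.items():
        # Counter & Counter keeps the minimum count of each shared n-gram.
        shared = ngram_counts(a, n) & ngram_counts(b, n)
        score += weight * sum(shared.values())
    return score
```

For example, weighted_overlap("abc", "abd") shares two single letters and one pair, giving 2 + 10 = 12.0. Spaces and punctuation are counted like letters in this sketch, and longer strings score higher simply by being longer, so some normalisation may be wanted before comparing scores across pairs.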
Hi Greg,
Please find following what I have used in the past.
It is very expensive, but I cannot see a better way of doing it.
It returns an integer which is the sum of:
- number of times the same letter appears in both strings
- 10 times the number of times the same two letters appear in
Folks, I was about to write some utility code to search through my 20,000
audio files looking for probable duplicates. I say "probable" because I
found file names like these:
Lovelock - Trumpet Concerto (SSO Concert).mp3
Trumpet Concerto (William Lovelock).mp3
There are many other duplicates wi
11 matches