Re: [c-prog] Comparing audio files

Thomas Hruska Thu, 22 Nov 2007 09:14:49 -0800

Michael Ballen wrote:
> I'm trying to compare two bird calls (same bird at different times)  
> for identification purposes. Anyone have experience writing code to do  
> this? Thanks,
> 
> Mike


No specific experience, but I am familiar with most AI techniques.  You 
are entering the domain of voice recognition.  Your first task should be 
to read just about everything you can on "neural networks" and "Fast 
Fourier Transforms" (FFTs).  Search Google for both phrases.

Both are pretty dense reading involving heavy math and statistics.  Hope 
you have a good background in both...you'll need it.  The idea behind 
_rudimentary_ voice recognition is to sample a sound (e.g. microphone 
input), run an FFT over small, discrete windows of the sound, and 
train/use a neural network with the output of the FFT.

As with most voice recognition software, distinguishing the voice of the 
subject from things like background noise is, well, complicated.  You'll 
probably find that this sort of project quickly gets out of control 
after you discover that you _maybe_ get 80% accuracy (if you are lucky).  
Commercial voice recognition software is tweaked and tuned for years 
just for people in a specific _locale_.  Even then, it is still 
frustrating to use because the tool is less than 100% accurate.  Here's 
what happens when a commercial tool is NOT tweaked:

http://video.google.com/videoplay?docid=-1123221217782777472
http://www.youtube.com/user/scrubadub1

IMO, the current approach to voice recognition is wrong.  Statistics do 
not equate to voice recognition.

-- 
Thomas Hruska
CubicleSoft President
Ph: 517-803-4197

*NEW* MyTaskFocus 1.1
Get on task.  Stay on task.

http://www.CubicleSoft.com/MyTaskFocus/
