Yeah.  Trying to define music is essentially pointless.  Defining speech would 
be easier.

However, since you mentioned neural nets, you could try to train a net on 
speech and music (there's an ANN object out there that I've tested) and see 
what happens.  That would be a fun experiment; I have no idea how well it would 
work...  you'd need a pretty huge training set for it to be even remotely 
accurate (I would think).
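
For anyone who wants to try that experiment outside Pd, here is a minimal Python/NumPy sketch of the low-level features suggested in the aubio advice quoted below (energy in low and high frequency bands, plus spectral spread). It is only an illustration under assumptions: the function names `frame_features` and `signal_features` are hypothetical, the 1000 Hz band split and the frame/hop sizes are arbitrary choices, and nothing here comes from aubio or from any Pd object.

```python
import numpy as np

def frame_features(frame, sr, split_hz=1000.0):
    """Return (low_energy, high_energy, spectral_spread) for one mono frame.

    split_hz is an arbitrary, assumed boundary between the "low" and
    "high" bands; tune it for your material.
    """
    # Window the frame to reduce spectral leakage, then take the power spectrum.
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)

    low_energy = power[freqs < split_hz].sum()
    high_energy = power[freqs >= split_hz].sum()

    total = power.sum()
    if total == 0.0:
        # Silent frame: no meaningful centroid or spread.
        return 0.0, 0.0, 0.0

    # Spectral centroid, then spread as the power-weighted standard
    # deviation of frequency around that centroid.
    centroid = (freqs * power).sum() / total
    spread = np.sqrt((((freqs - centroid) ** 2) * power).sum() / total)
    return float(low_energy), float(high_energy), float(spread)

def signal_features(signal, sr, frame_len=1024, hop=512):
    """Stack per-frame features into an array of shape (n_frames, 3)."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array([frame_features(signal[i:i + frame_len], sr) for i in starts])
```

Each row of the returned array is one feature vector; rows computed from labelled speech and music recordings could then serve as the training set for a neural net or another classifier. Whether these three features alone separate the two classes well is untested here.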

——t3db0t

On Feb 7, 2011, at 1:38 PM, Pedro Lopes wrote:

> First of all, I would take it from another angle:
> 
> <this is one possible way, out of zillions>
> if it is speech or not. Thus if the speech recognizer has an X % recognition 
> rate, you inherit that percentage. Now you heavily depend on the recognizer: 
> some recognizers, like the default Windows one, always try to match the input 
> to some string, so they are a bit of garbage in academic terms. What you need 
> is a strong open recognizer that can tell you how similar (in %) the sentence 
> is to a target sentence in a database. 
> 
> Why do I suggest this angle?
> - 'Cause I don't wanna think about "what is music". Speech is a language, it 
> is defined, it is easily structured. Music? Noise is music, drone is music, 
> ambient can be non-rhythmical, and what about a cappella singing? Will it be 
> music? And all those inherited philosophical issues. Furthermore, if you need 
> more help, maybe explaining the context will aid us, because if you only care 
> about certain "music" it can be easier. ALSO: if you have access to the audio 
> data, you can always extract (filter) the music. 
> 
> </this is one possible way, out of zillions>
> 
> best,
> pedro
> 
> 
> On Mon, Feb 7, 2011 at 5:43 PM, patrick <[email protected]> wrote:
> would it be possible to detect if the incoming audio is music or speech? i 
> guess it's very hard, but i was thinking about some methods:
> 
> using some kind of frequency detection
> using bonk (if the tempo is stable = music)
> env~ (most music is compressed nowadays)
> training a voice (using neural network?!?)
> 
> 
> From the author of aubio:
> Use a few low-level features, such as the energy of low and high frequency 
> bands and spectral spread. In a second step, these approaches are often 
> refined using machine learning techniques such as Bayesian networks or 
> support vector machines.
> 
> See for instance these papers:
> http://cobweb.ecn.purdue.edu/~malcolm/interval/1996-085/
> http://www.aclweb.org/anthology/O/O08/O08-1015.pdf
> http://www.hindawi.com/journals/asp/2009/628570.html
> 
> i would like to achieve > 90% accuracy if possible. any suggestions are 
> welcome!
> 
> _______________________________________________
> [email protected] mailing list
> UNSUBSCRIBE and account-management -> 
> http://lists.puredata.info/listinfo/pd-list
> 
> 
> 
> -- 
> Pedro Lopes (MSc)
> contact: [email protected]
> website: http://web.ist.utl.pt/Pedro.Lopes / 
> http://pedrolopesresearch.wordpress.com/ | http://twitter.com/plopesresearch
