Hi all, I received numerous replies to my query. I can't thank everyone individually, so I want to thank everyone who has replied. I am now looking through the information and links that you have provided. Many thanks for all your help!
Rishabh

"Rishabh Gupta" <[EMAIL PROTECTED]> wrote in message news:a4eje9$ip8$[EMAIL PROTECTED]...
> Hi All,
> I'm a research student at the Department Of Electronics, University Of
> York, UK. I'm working on a project related to music analysis and
> classification. I am at the stage where I perform some analysis on music
> files (currently only in MIDI format) and extract about 500 variables that
> are related to music properties like pitch, rhythm, polyphony and volume. I
> am performing basic analysis like mean and standard deviation, but I
> also perform more elaborate analysis like measuring the complexity of melody
> and rhythm.
>
> The aim is that the variables obtained can be used to perform a number of
> different operations.
>   - The variables can be used to classify / categorise each piece of
>     music, on its own, in terms of some meta classifier (e.g. rock, pop,
>     classical).
>   - The variables can be used to perform a comparison between two files. A
>     variable from one music file can be compared to the equivalent variable
>     in the other music file. By comparing all the variables in one file with
>     the equivalent variables in the other file, an overall similarity
>     measurement can be obtained.
>
> The next stage is to test the ability of the variables obtained to
> perform the classification / comparison. I need to identify variables that
> are redundant (redundant in the sense of 'they do not provide any
> information' and 'they provide the same information as another variable')
> so that they can be removed, and I need to identify variables that are
> distinguishing (provide the most information).
>
> My Basic Questions Are:
>   - What are the best statistical techniques / methods that should be
>     applied here? E.g. I have looked at Principal Component Analysis; this
>     would be a good method to remove the redundant variables and hence
>     reduce the amount of data that needs to be processed. Can anyone suggest
>     any other sensible statistical analysis methods?
>   - What are the ideal tools / software to perform the clustering /
>     classification? I have access to SPSS software but I have never used it
>     before and am not really sure how to apply it or whether it is any good
>     when dealing with 100s of variables.
>
> So far I have been analysing each variable on its own 'by eye' by plotting
> the mean and sd for all music files. However, this approach is not feasible
> in the long term since I am dealing with such a large number of variables.
> In addition, by looking at each variable on its own, I do not find clusters
> / patterns that are only visible through multivariate analysis. If anyone
> can recommend a better approach I would greatly appreciate it.
>
> Any help or suggestion that can be offered will be greatly appreciated.
>
> Many Thanks!
>
> Rishabh Gupta
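
For anyone following the thread, the two kinds of redundancy described in the quoted message (a variable that provides no information, and a variable that provides the same information as another) can be screened for with a very simple first pass before trying PCA. What follows is only a minimal sketch in plain Python, not a recommended final method. The function names, the 0.95 correlation threshold, and the toy variable names and values are all assumptions made up for illustration; none of them come from the actual project.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def find_redundant(columns, corr_threshold=0.95, var_tol=1e-12):
    """Screen variables for the two kinds of redundancy.

    columns: dict mapping variable name -> list of values (one per music file).
    Returns (constant, kept, duplicates):
      constant   - names whose variance is (near) zero, i.e. no information
      kept       - names that survived both checks
      duplicates - (kept_name, dropped_name, r) for near-duplicate pairs
    The threshold values are illustrative assumptions, not tuned settings.
    """
    constant, kept, duplicates = [], [], []
    for name, values in columns.items():
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        if variance <= var_tol:
            constant.append(name)
            continue
        for other in kept:
            r = pearson(columns[other], values)
            if abs(r) >= corr_threshold:
                # Same information as a variable we already kept.
                duplicates.append((other, name, r))
                break
        else:
            kept.append(name)
    return constant, kept, duplicates


# Hypothetical toy data: five "files", four made-up variables.
toy = {
    "mean_pitch": [60.0, 64.0, 67.0, 72.0, 55.0],
    "mean_pitch_doubled": [120.0, 128.0, 134.0, 144.0, 110.0],  # 2 * mean_pitch
    "always_one": [1.0, 1.0, 1.0, 1.0, 1.0],                    # constant
    "rhythm_complexity": [0.2, 0.9, 0.4, 0.1, 0.7],
}

constant, kept, duplicates = find_redundant(toy)
print("no information:", constant)
print("kept:", kept)
print("near-duplicates:", [(a, b) for a, b, _ in duplicates])
```

Note that this only catches pairwise linear redundancy, and which variable of a correlated pair is kept depends on the order of the dictionary. PCA handles the same situation differently: it combines correlated variables into components rather than dropping one of them.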