Thank you Andreas and Emanuele for the replies. Indeed, there might be better forums for finding absolute machine-learning knowledge, but well, at times we might be helpful too ;)
On Thu, 25 Nov 2010, Emanuele Olivetti wrote:

> About PCA, you are not double-dipping, since PCA uses just brain
> data, not stimuli (or whatever you want to predict) as input. It
> is an "unsupervised" method, so it is safe to use it on the whole dataset.

Agree! In particular if the goal is just a generalization estimate. I am
not sure it would be ok if I were interested in the "relevance" of any
particular feature as diagnosed by classifier sensitivity and the
corresponding loadings on the PCA components.

To make it clear why it is ok for generalization assessment and why there
is no double-dipping: imagine I create a classifier which just stores the
labeled data obtained during training, and when new data is provided for
prediction it takes both the training data (without labels) and the new
data (no labels provided), computes PCA on them together, and only then
takes the PCA-transformed training data plus labels to train the
corresponding classifier, applying the same PCA projection for prediction
as well. As you can see, it would be just a memory-based classifier with
cheap training and expensive prediction ;) (a rough sketch is appended at
the end of this message)

On Thu, 25 Nov 2010, Andreas Mueller wrote:

> > accuracy, i am looking at transformations (such as time-frequency
> > decomposition) on the data prior to feeding it to the classifier.

BTW, if you have the pywt module available, you could give
WaveletPacketMapper and WaveletTransformationMapper a try (second sketch
appended below).

> > PCA, CSP (common spatial patterns), DSP (discriminative spatial
> > patterns) and the like.
> As far as I know, PCA is mainly used to reduce the dimensionality and
> therefore the computational cost of the SVM.
> Since this is only a linear transform, I doubt that it will improve results.

Actually, it depends... E.g. if the underlying classifier's regularization
is invariant to the transformation (e.g. margin width), then yeap -- there
should be no effect. But if it is sensitive to it (e.g. feature selection,
as in SMLR), then you might gain an advantage, since, as in the case of
SMLR, the goal of having fewer important features might then be achieved
together with better generalization (third sketch appended below).

--
=------------------------------------------------------------------=
Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic
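Here is a minimal sketch of that memory-based classifier in plain numpy
(not an actual PyMVPA class; the name MemorizingPCAClassifier and the
fit/predict interface of base_clf are just assumptions for illustration):
training only memorizes the labeled data, and at prediction time PCA is
estimated from the pooled unlabeled training + testing data before the
real classifier is trained on the projected training samples.

import numpy as np

class MemorizingPCAClassifier(object):
    """Cheap train / expensive predict: PCA is computed at prediction
    time from the pooled unlabeled training and testing data
    (hypothetical class, for illustration only)."""

    def __init__(self, base_clf, ncomponents=10):
        self.base_clf = base_clf      # anything with fit(X, y)/predict(X)
        self.ncomponents = ncomponents

    def train(self, X_train, y_train):
        # 'training' just memorizes the labeled samples
        self._X, self._y = np.asarray(X_train), np.asarray(y_train)

    def predict(self, X_test):
        X_test = np.asarray(X_test)
        # PCA from pooled *unlabeled* data -- no labels are involved here
        pooled = np.vstack((self._X, X_test))
        mean = pooled.mean(axis=0)
        _, _, Vt = np.linalg.svd(pooled - mean, full_matrices=False)
        W = Vt[:self.ncomponents].T   # projection onto leading components
        # only now the underlying classifier gets trained, on projected data
        self.base_clf.fit(np.dot(self._X - mean, W), self._y)
        return self.base_clf.predict(np.dot(X_test - mean, W))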
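And a rough sketch of what such a wavelet mapping amounts to, using pywt
directly rather than the PyMVPA mappers (see their docstrings for the
actual interface; the 'db4' wavelet, the decomposition level, and the
per-sample layout below are just assumptions):

import numpy as np
import pywt

def wavelet_features(X, wavelet='db4', level=3):
    """Map each row (one sample's timeseries) to the concatenated
    coefficients of its multilevel wavelet decomposition."""
    out = []
    for row in np.atleast_2d(X):
        coeffs = pywt.wavedec(row, wavelet, level=level)
        out.append(np.concatenate(coeffs))
    return np.array(out)

# e.g. X of shape (nsamples, ntimepoints):
# X_wav = wavelet_features(X)   # feed X_wav to the classifier instead of X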
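Finally, a toy illustration of the last point, using scikit-learn instead
of PyMVPA just to keep it short (an L1-penalized logistic regression
stands in for a sparse classifier like SMLR; the data are synthetic, and
the direction of the effect on real data will of course differ): a
full-rank PCA rotation leaves an L2-regularized margin classifier
essentially unchanged, while a sparsity-regularized classifier is
sensitive to it.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.randn(200, 50)
y = (X[:, :5].sum(axis=1) > 0).astype(int)   # only 5 informative features

X_rot = PCA().fit_transform(X)               # full-rank rotation, no reduction

svm = LinearSVC(C=1.0)                                             # L2 margin
l1 = LogisticRegression(penalty='l1', solver='liblinear', C=1.0)   # sparse

for name, Z in (('original', X), ('PCA-rotated', X_rot)):
    print(name,
          'SVM: %.2f' % cross_val_score(svm, Z, y, cv=5).mean(),
          'L1: %.2f' % cross_val_score(l1, Z, y, cv=5).mean())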