Hi all,

I'm using PyMVPA to classify EEG data with an SVM. To improve accuracy, I am looking at transformations of the data (such as time-frequency decomposition) before feeding it to the classifier. I stumbled upon some methods used mostly in the BCI domain, such as PCA, CSP (common spatial patterns), DSP (discriminative spatial patterns) and the like. I now have two questions:

1. Running PCA, CSP, etc. on the whole dataset _prior_ to feeding it to a classifier looks to me like a case of 'double-dipping', as all trials (training and test) are used to identify the components. Thus all trials in the dataset given to the classifier are actually interdependent. Am I right there?

2. If 1. is true, then one could still run the PCA (etc.) on just the training set in each split*, and then run an SVM. Does this make any sense, or does a suitable SVM kernel already take care of this?

Thanks for any comments or thoughts on this.

Greetings,
Jakob

* Something like:

    from mvpa.suite import *

    # PCA is wrapped inside the classifier, so it is fit on the training split only
    clf = MappedClassifier(LinearCSVMC(), PCAMapper())
    cv = CrossValidatedTransferError(
        TransferError(clf), NFoldSplitter(), enable_ca=['results'])
    cv(dataset)
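
P.S. To make question 2 concrete, here is a minimal sketch of what "PCA on the training set only, in each split" would look like outside of PyMVPA. This is plain NumPy, not the PyMVPA API. All function names (fit_pca, apply_pca, nearest_centroid_predict, cross_validate) are made up for illustration, a nearest-centroid rule stands in for the SVM, and the data are synthetic. The only point is that the PCA mean and components are estimated without ever touching the test trials.

```python
import numpy as np

def fit_pca(X_train, n_components):
    """Estimate the PCA mean and principal axes from training trials only."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:n_components]

def apply_pca(X, mean, components):
    """Project trials onto axes that were estimated elsewhere."""
    return (X - mean) @ components.T

def nearest_centroid_predict(Z_train, y_train, Z_test):
    """Stand-in classifier (assumption: replaces the SVM to keep this short)."""
    labels = np.unique(y_train)
    centroids = np.array([Z_train[y_train == c].mean(axis=0) for c in labels])
    dists = ((Z_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return labels[dists.argmin(axis=1)]

def cross_validate(X, y, n_folds=5, n_components=3):
    """N-fold cross-validation with the PCA re-fit inside every split."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    accuracies = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        # The test trials play no part in estimating the components.
        mean, comps = fit_pca(X[train_mask], n_components)
        Z_train = apply_pca(X[train_mask], mean, comps)
        Z_test = apply_pca(X[test_idx], mean, comps)
        pred = nearest_centroid_predict(Z_train, y[train_mask], Z_test)
        accuracies.append(np.mean(pred == y[test_idx]))
    return float(np.mean(accuracies))

# Synthetic stand-in for EEG features: 100 trials x 20 features, two classes.
rng = np.random.default_rng(0)
y = np.tile([0, 1], 50)       # alternating labels, so every fold has both classes
X = rng.normal(size=(100, 20))
X[y == 1, :3] += 2.0          # class difference in the first three features
accuracy = cross_validate(X, y)
print(accuracy)
```

The variant I am worried about in question 1 would call fit_pca(X, ...) once on all trials before the loop, so the test trials would already have shaped the components.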