Hypothesis: you have a very small dataset, and leaving out data creates a distribution shift between the train and the test sets. A simplified example: 20 samples, 10 of class a and 10 of class b. Leave-one-out cross-validation produces a training set with 10 samples of one class and 9 of the other, and the single test sample always belongs to the class that is in the minority on the training set. A model biased toward the training majority is therefore systematically wrong on the held-out sample, which shows up as anti-correlation between predictions and true values.
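A minimal sketch of this effect, with made-up data and a dummy majority-class predictor standing in for any model biased toward the training majority (both are assumptions for illustration, not your SVR setup):

import numpy as np
from scipy.stats import kendalltau, pearsonr
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import LeaveOneOut

rng = np.random.RandomState(0)
X = rng.randn(20, 5)               # 20 samples, 5 uninformative features
y = np.array([0] * 10 + [1] * 10)  # perfectly balanced: 10 per class

pred = np.empty_like(y)
for train, test in LeaveOneOut().split(X):
    # The training fold holds 10 samples of one class and 9 of the other;
    # the held-out sample always comes from the 9-sample minority class.
    clf = DummyClassifier(strategy="most_frequent").fit(X[train], y[train])
    pred[test] = clf.predict(X[test])

# Every prediction is the training-fold majority, i.e. the wrong class:
print("Pearson R:   %.3f" % pearsonr(y, pred)[0])    # -1.000
print("Kendall tau: %.3f" % kendalltau(y, pred)[0])  # -1.000

The same mechanism, in a softer form, can flip the sign of the test-set correlations for a regressor trained on a very small dataset.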
On Tue, Sep 26, 2017 at 06:10:39PM +0200, Thomas Evangelidis wrote:
> Greetings,

> I don't know if anyone has encountered this before, but sometimes I get
> anti-correlated predictions from the SVR that I am training. Namely, the
> Pearson's R and Kendall's tau are negative when I compare the predictions
> on the external test set with the true values. However, the SVR
> predictions on the training set have positive correlations with the
> experimental values, and hence I can't think of a way to know in advance
> whether the trained SVR will produce anti-correlated predictions, in
> order to change their sign and avoid the disaster. Here is an example of
> what I mean:

> Training set predictions:      R=0.452422,  tau=0.333333
> External test set predictions: R=-0.537420, tau=-0.300000

> Obviously, in a real-case scenario where I wouldn't have the external
> test set, I would have used the worst observations instead of the best
> ones. Has anybody any idea about how I could prevent this?

> Thanks in advance,
> Thomas

--
Gael Varoquaux
Researcher, INRIA Parietal
NeuroSpin/CEA Saclay, Bat 145, 91191 Gif-sur-Yvette France
Phone: ++ 33-1-69-08-79-68
http://gael-varoquaux.info
http://twitter.com/GaelVaroquaux