I have very small training sets (10-50 observations). Currently, I am
working with 16 observations for training and 25 for validation (an
external test set). I am doing regression, not classification, hence SVR
rather than SVC.
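
For what it's worth, the distribution-shift hypothesis quoted below seems to
carry over to regression. Here is a minimal sketch, assuming synthetic data
(the features, targets and random seed are made up for illustration and are
not my actual dataset) and a DummyRegressor standing in for a weakly fitted
SVR. Leaving one observation out pushes the training mean away from the
held-out value, so a model that predicts roughly the training mean produces
held-out predictions that are anti-correlated with the true values.

```python
import numpy as np
from scipy.stats import kendalltau, pearsonr
from sklearn.dummy import DummyRegressor
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Hypothetical data: 16 observations, as in the training set mentioned above.
rng = np.random.RandomState(0)
n_samples = 16
X = rng.normal(size=(n_samples, 5))  # made-up features, unrelated to y
y = rng.normal(size=n_samples)       # made-up target values

# Stand-in for a model that has learned almost nothing:
# it always predicts the mean of the targets it was trained on.
model = DummyRegressor(strategy="mean")

# Every prediction comes from a model that never saw that observation.
y_pred = cross_val_predict(model, X, y, cv=LeaveOneOut())

# With leave-one-out, prediction i equals (sum(y) - y[i]) / (n_samples - 1),
# which decreases as y[i] increases.
r, _ = pearsonr(y, y_pred)
tau, _ = kendalltau(y, y_pred)
print(f"leave-one-out predictions: R={r:.3f}, tau={tau:.3f}")
```

In this toy case both coefficients come out at (numerically) -1, even though
the model is not "wrong" in any systematic way. It does not prove that this
is what happens with my SVR, but it suggests that a negative correlation on
held-out data can simply mean the model has learned very little.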


On 26 September 2017 at 18:21, Gael Varoquaux <gael.varoqu...@normalesup.org> wrote:

> Hypothesis: you have a very small dataset, and when you leave out data
> you create a distribution shift between the train and test sets. A
> simplified example: 20 samples, 10 of class a and 10 of class b. A
> leave-one-out cross-validation will create a training set with 10 samples
> of one class and 9 of the other, and the test sample belongs to the class
> that is in the minority in the training set.
>
> G
>
> On Tue, Sep 26, 2017 at 06:10:39PM +0200, Thomas Evangelidis wrote:
> > Greetings,
>
> > I don't know if anyone has encountered this before, but sometimes I get
> > anti-correlated predictions from the SVR that I am training. Namely,
> > Pearson's R and Kendall's tau are negative when I compare the predictions
> > on the external test set with the true values. However, the SVR
> > predictions on the training set correlate positively with the
> > experimental values, so I can't think of a way to know in advance whether
> > the trained SVR will produce anti-correlated predictions, in which case I
> > could flip their sign and avoid the disaster. Here is an example of what
> > I mean:
>
> > Training set predictions: R=0.452422, tau=0.333333
> > External test set predictions: R=-0.537420, tau=-0.300000
>
> > Obviously, in a real-case scenario where I wouldn't have the external
> > test set, I would have used the worst observations instead of the best
> > ones. Does anybody have any idea how I could prevent this?
>
> > thanks in advance
> > Thomas
> --
>     Gael Varoquaux
>     Researcher, INRIA Parietal
>     NeuroSpin/CEA Saclay , Bat 145, 91191 Gif-sur-Yvette France
>     Phone:  ++ 33-1-69-08-79-68
>     http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux
> _______________________________________________
> scikit-learn mailing list
> scikit-learn@python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>



-- 

======================================================================

Dr Thomas Evangelidis

Post-doctoral Researcher
CEITEC - Central European Institute of Technology
Masaryk University
Kamenice 5/A35/2S049,
62500 Brno, Czech Republic

email: tev...@pharm.uoa.gr

          teva...@gmail.com


website: https://sites.google.com/site/thomasevangelidishomepage/
