I am not surprised to observe these variations. First, the OOB score tends to 
overestimate the statistical performance of the model. Second, the last score 
is an evaluation without cross-validation: you trained a single model on the 
full x1 and tested it on x3. In cross-validation you evaluate 3 models, each 
trained on a different subset of x1, so each model sees less data, and I would 
expect the last score to be potentially better.
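
A minimal sketch of the three quantities being compared, with an assumed 
dataset and estimator since the original code is not shown here (x1/x3 mirror 
the names above):

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_regression(n_samples=400, n_features=10, noise=10.0, random_state=0)
# x1 is the training set, x3 the held-out set.
x1, x3, y1, y3 = train_test_split(X, y, test_size=0.25, random_state=0)

forest = RandomForestRegressor(oob_score=True, random_state=0).fit(x1, y1)
print("OOB score:", forest.oob_score_)                      # from bootstrap leftovers
print("CV scores:", cross_val_score(forest, x1, y1, cv=3))  # 3 models, each on 2/3 of x1
print("Final score:", forest.score(x3, y3))                 # one model on full x1, tested on x3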

The remaining variation is across the folds in the first score. This can 
happen when you use a KFold that does not shuffle the data and there is 
structure in the ordering of the data. Shuffling the data will break this 
structure and, most probably, make the target easier to predict without this 
fold-to-fold variation.

What is important, however, is to know whether this structure is supposed to 
exist or not. If it is, then shuffling should not be done and the original 
estimate is what you should look at. An example of such wrong shuffling is a 
time series: shuffling breaks the ordering, while you certainly want to split 
with this time structure in mind.
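
Concretely, a minimal sketch assuming a generic dataset and estimator:

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, TimeSeriesSplit, cross_val_score

X, y = make_regression(n_samples=300, n_features=10, random_state=0)

# Unshuffled KFold (the default): contiguous blocks, so any structure in the
# row order shows up as fold-to-fold variation.
print(cross_val_score(Ridge(), X, y, cv=KFold(n_splits=3), scoring="r2"))

# Shuffled KFold: breaks the order structure, usually reducing that variation.
cv = KFold(n_splits=3, shuffle=True, random_state=0)
print(cross_val_score(Ridge(), X, y, cv=cv, scoring="r2"))

# For time-ordered data, do not shuffle; use a splitter that respects the
# temporal ordering instead.
print(cross_val_score(Ridge(), X, y, cv=TimeSeriesSplit(n_splits=3), scoring="r2"))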

> On 27 Dec 2021, at 04:18, Haylee Miller <mh.nw...@gmail.com> wrote:
> 
> 
> I don’t know whether my previous email was sent successfully, so I am 
> sending it again now. I’m sorry to disturb you.
> 
> ---------- Forwarded message ---------
> From: Haylee Miller <mh.nw...@gmail.com>
> Date: Fri, 24 Dec 2021, 21:17
> Subject: There is a problem with using "r2" to calculate cross_val_score and 
> GridSearchCV scores
> To: <scikit-learn@python.org>
> 
> 
> Dear sklearn developers:
> First of all, thank you for developing this module; it is very useful. 
> However, we recently found a small problem in the use of cross_val_score and 
> GridSearchCV.
> Using scoring="r2" to calculate the cross_val_score and GridSearchCV scores 
> gives results inconsistent with those calculated using metrics.r2_score.
> 
> Following the principle of k-fold cross-validation, we performed a manual 
> 3-fold cross-validation, and there was a big gap between our per-fold scores 
> and the results of cross_val_score.
> Below is the code and results of our manual verification process.
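> 
> The code and results themselves were attached as images; a minimal sketch of 
> the manual check described above, with an assumed estimator and dataset in 
> place of the originals, could look like this:
> 
> from sklearn.datasets import make_regression
> from sklearn.linear_model import Ridge
> from sklearn.metrics import r2_score
> from sklearn.model_selection import KFold, cross_val_score
> 
> X, y = make_regression(n_samples=300, n_features=10, random_state=0)
> 
> # Manual 3-fold: fit on two folds, score the held-out fold with r2_score.
> for train_idx, test_idx in KFold(n_splits=3).split(X):
>     model = Ridge().fit(X[train_idx], y[train_idx])
>     print(r2_score(y[test_idx], model.predict(X[test_idx])))
> 
> # These should match the per-fold values from cross_val_score with
> # scoring="r2", provided the same (unshuffled) splits are used.
> print(cross_val_score(Ridge(), X, y, cv=KFold(n_splits=3), scoring="r2"))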
> 
> Theoretically, the three values in results 1-3 should be similar to the 
> three values in cross_val_score 1 and cross_val_score 2. 
> However, only the first value in cross_val_score 1 and cross_val_score 2 is 
> close to results 1-3 in the figures.
> Why is this so? Looking forward to your reply!
> Finally, Merry Christmas!
> Best wishes,
> Ma Hui