[scikit-learn] Scores in Cross Validation

Raga Markely Thu, 26 Jan 2017 08:05:00 -0800

Hello,

I have 2 questions regarding cross_val_score.
1. Do the scores returned by cross_val_score correspond to only the test
set or to the whole data set (training and test sets)?
I tried to look at the source code, and it looks like it returns the score
of only the test set (line 145: "return_train_score=False"), but I am not
sure I am reading the code properly.
https://github.com/scikit-learn/scikit-learn/blob/14031f6/sklearn/model_selection/_validation.py#L36
I came across the paper below, in which the authors use the score of the
whole dataset when they perform repeated nested loops, grid search CV,
etc. For example, see Algorithm 1 (line 1c) and Algorithm 2 (line 2d) on
page 3.
https://jcheminf.springeropen.com/articles/10.1186/1758-2946-6-10
I wonder what the pros and cons are of using the accuracy score of the
whole dataset vs. just the test set. Any thoughts?
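
To make the question concrete, here is a minimal sketch of how the two
kinds of score could be compared empirically. The dataset (iris), the
estimator (LogisticRegression) and the 5-fold KFold splitter are my own
assumptions for illustration and are not taken from the paper. The sketch
refits a clone of the estimator on each training fold by hand, scores it
on the held-out fold and on the whole dataset, and compares the held-out
scores with what cross_val_score returns.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# Illustrative data and model (assumptions, not from the paper).
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)

# A fixed random_state makes the folds identical every time split() is called.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Scores as reported by cross_val_score.
cv_scores = cross_val_score(clf, X, y, cv=cv)

# The same folds done by hand: score on the held-out fold only,
# and on the whole dataset (training + test rows) for comparison.
test_scores, whole_scores = [], []
for train_idx, test_idx in cv.split(X):
    model = clone(clf).fit(X[train_idx], y[train_idx])
    test_scores.append(model.score(X[test_idx], y[test_idx]))
    whole_scores.append(model.score(X, y))

print("cross_val_score :", np.round(cv_scores, 3))
print("held-out folds  :", np.round(test_scores, 3))
print("whole dataset   :", np.round(whole_scores, 3))
print("matches held-out:", np.allclose(cv_scores, test_scores))
```

If the last line prints True, the values from cross_val_score are the
held-out (test) fold scores. The whole-dataset score includes the rows the
model was fitted on, so I would expect it to be the more optimistic of the
two in general.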


2. On line 283 of the cross_val_score source code, there is a function
_score. However, I can't find where this function is called. Could you let
me know where this function is called?
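
In case it helps, this is a rough sketch of how I might search for the
callers myself. It assumes that the private module
sklearn.model_selection._validation can be imported, that its source file
is available, and that it still defines _score in the installed version.
Private modules can change between releases, so the result may differ
from the commit linked above.

```python
import ast
import inspect

# Private module: its layout is an implementation detail (assumption
# that it exists under this name in the installed version).
from sklearn.model_selection import _validation

source = inspect.getsource(_validation)
tree = ast.parse(source)

# Collect every function whose body contains a direct call to the
# bare name `_score` (not `_fit_and_score`, `cross_val_score`, etc.).
callers = set()
for func in ast.walk(tree):
    if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
        continue
    for node in ast.walk(func):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "_score"
        ):
            callers.add(func.name)

print(sorted(callers))
```

Matching on the parsed call nodes rather than on the text "_score(" avoids
false hits on names that merely end in "_score".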

Thank you very much!
Raga

_______________________________________________
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn
