You are welcome! And in addition, if you select among different algorithms,
here are some more suggestions:
a) don’t do it based on your independent test set if this is going to be your
final model performance estimate, or be aware that the estimate would be overly optimistic
b) also, it’s not the best idea
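To make suggestion a) concrete, here is a hedged sketch (the dataset, the two candidate classifiers, and all parameter values are illustrative assumptions, not from the thread): select among algorithms using cross-validation on the training data only, and touch the independent test set just once for the final estimate.

```python
# Sketch: model selection via CV on the training set only; the held-out
# test set is used exactly once, at the very end.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(random_state=0),
}

# Compare algorithms on the training set via cross-validation ...
cv_means = {name: cross_val_score(est, X_train, y_train, cv=5).mean()
            for name, est in candidates.items()}
best_name = max(cv_means, key=cv_means.get)

# ... and report the test-set score of the winner only once.
final_score = candidates[best_name].fit(X_train, y_train).score(X_test, y_test)
```

Because the test set played no role in choosing `best_name`, `final_score` remains an unbiased performance estimate.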
Ahh.. nice.. I will use that.. thanks a lot, Sebastian!
Best,
Raga
On Thu, Jan 26, 2017 at 6:34 PM, Sebastian Raschka wrote:
Hi, Raga,
I think that if GridSearchCV is used for classification, the stratified k-fold
doesn’t do shuffling by default.
Say you do 20 grid search repetitions, you could then do something like:

from sklearn.model_selection import StratifiedKFold

for i in range(n_reps):
    # a different seed per repetition gives a different shuffle each time
    k_fold = StratifiedKFold(n_splits=10, shuffle=True, random_state=i)
Okay, I didn't see anything equivalent in the issue tracker, so I submitted a
pull request.
Jeremiah
===
Jeremiah W. Johnson, Ph.D.
Assistant Professor of Data Science
Analytics Bachelor of Science Program Coordinator
University of New Hampshire
Thank you, Guillaume.
1. I agree with you - that's what I have been learning and makes sense.. I
was a bit surprised when I read the paper today..
2. Ah.. thank you.. I got to change my glasses :P
Best,
Raga
*Guillaume Lemaître* g.lemaitre58 at gmail.com
Hello,
I have 2 questions regarding cross_val_score.
1. Do the scores returned by cross_val_score correspond to only the test
set or the whole data set (training and test sets)?
I tried to look at the source code, and it looks like it returns the score
of only the test set (line 145:
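A quick empirical check of that reading (the classifier choice here is an illustrative assumption): each value returned by `cross_val_score` is the score on the fold held out of training, not a score over the whole dataset.

```python
# cross_val_score reports one score per CV split, computed on the
# held-out (test) fold of that split.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
# An unpruned tree scores 1.0 on the data it was trained on, so any
# returned score below 1.0 can only have come from a held-out fold.
print(scores)
```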
Hey all.
I created a survey to prioritize and justify (to people that give me
money) future scikit-learn development.
It would be great if you could answer it; it should be pretty short (it's
10 questions, mostly multiple choice).
Feel free to share, more replies are better ;)