Hi,

As far as I understand, GridSearchCV selects the parameter value with the
lowest score. When several parameter values share the same minimum score,
which is not infrequent on small datasets, it deterministically selects
the first one of the group. This is usually the smallest value among the
equivalents, because we usually feed ordered parameter grids to
GridSearchCV. Isn't this a biased way of doing the selection? What about
picking one at random (among the equivalents) each time, in order to avoid
bias through stochasticity?

The related code is here:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/grid_search.py#L352
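
To make the proposal concrete, here is a minimal sketch of the two tie-breaking rules. It is not the GridSearchCV code: `pick_best_index`, `minimize` and `tol` are hypothetical names of mine, and I am assuming the per-candidate mean scores are available as a flat sequence in the same order as the parameter grid.

```python
import numpy as np


def pick_best_index(scores, rng=None, minimize=True, tol=0.0):
    """Return the index of the best score, with optional random tie-breaking.

    Hypothetical helper for illustration only, not part of scikit-learn.
    scores   : one mean CV score per parameter setting, in grid order
    rng      : None -> always take the first tied candidate (deterministic);
               a numpy Generator -> draw uniformly among the tied candidates
    minimize : True if a lower score is better (assumption matching the post)
    tol      : scores within `tol` of the best are treated as equivalent
    """
    scores = np.asarray(scores, dtype=float)
    best = scores.min() if minimize else scores.max()
    # Indices of every candidate whose score ties with the best one.
    tied = np.flatnonzero(np.abs(scores - best) <= tol)
    if rng is None:
        return int(tied[0])
    return int(rng.choice(tied))


# Three candidates (indices 1, 2 and 4) share the minimum score.
scores = [0.2, 0.1, 0.1, 0.3, 0.1]
first = pick_best_index(scores)  # deterministic: first of the tied group
rng = np.random.default_rng(0)
draws = [pick_best_index(scores, rng=rng) for _ in range(200)]
```

With `rng=None` the lowest-indexed candidate always wins, which on an ordered grid means the smallest parameter value. Passing a seeded generator spreads the choice over all tied candidates while keeping a run reproducible.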
 Best, 
Emanuele


_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
