2012/2/3 Mathias Verbeke <[email protected]>:
> Hi Andreas,
>
> Thanks a lot; that answers my questions. Just a quick check to be sure I
> understand it correctly: the results in the classification report for the
> best classifier are the ones on the test set, right?
It prints the performance of the best classifier found on the
training set (also known as the development set), measured on the
test set (also known as the evaluation set).
If you do the parameter selection and the evaluation on the same
dataset, you are likely to overfit the hyperparameter settings, and
your performance estimate will be an overestimate of the true
generalization performance.
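Concretely, the pattern looks roughly like this (an untested sketch on
the digits dataset; the import locations are those of current
scikit-learn, older releases have GridSearchCV in sklearn.grid_search
and train_test_split in sklearn.cross_validation):

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import classification_report

X, y = load_digits(return_X_y=True)

# keep half of the data aside as the evaluation (test) set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

# hyperparameter selection by cross-validation on the development set only
param_grid = {'kernel': ['rbf'], 'gamma': [1e-3, 1e-4],
              'C': [1, 10, 100, 1000]}
clf = GridSearchCV(SVC(), param_grid, cv=5)
clf.fit(X_train, y_train)

# the classification report of the best classifier, measured on the test set
print(classification_report(y_test, clf.predict(X_test)))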
> And another small question: could you tell me how/where I need to set the
> class_weight parameter, since this doesn't seem to work in the regular way
> in the fit method? Would it furthermore be possible to - besides 'auto' -
> tune this as well with GridSearch?
You can extend the grid search as follows (that will double the running time):
tuned_parameters = [
    {'kernel': ['rbf'], 'gamma': [1e-3, 1e-4],
     'C': [1, 10, 100, 1000], 'class_weight': [None, 'auto']},
    {'kernel': ['linear'], 'C': [1, 10, 100, 1000],
     'class_weight': [None, 'auto']}
]
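For completeness, this grid is simply passed to a GridSearchCV
wrapped around an SVC: class_weight is a constructor parameter of
SVC, which is why the grid search can tune it even though setting it
in fit() does not work. Rough, untested sketch reusing the
development set from above (with the scikit-learn of this thread the
import is sklearn.grid_search, and on recent releases the 'auto'
heuristic is spelled 'balanced'):

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# GridSearchCV instantiates SVC(..., class_weight=...) for every grid
# point, so the weighting is set on the estimator rather than in fit()
clf = GridSearchCV(SVC(), tuned_parameters, cv=5)
clf.fit(X_train, y_train)   # X_train / y_train from the sketch above
print(clf.best_params_)     # the winning combination, including class_weight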
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel