Hi Andreas,
Thanks a lot; that answers my questions. Just a quick check to be sure I
understand it correctly: the results in the classification report for the
best classifier are the ones on the test set, right?
And another small question: could you tell me how and where I need to set the
class_weight parameter, since it doesn't seem to work the usual way via the
fit method? Would it also be possible to tune this parameter with GridSearch,
besides using 'auto'?
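For reference, this is roughly what I was hoping would work -- a minimal sketch
on a made-up imbalanced dataset, using current import paths, assuming
class_weight can simply be listed in the parameter grid like any other
estimator constructor parameter (in recent releases the 'auto' value is
spelled 'balanced'):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical imbalanced toy data standing in for my real set
X, y = make_classification(n_samples=200, weights=[0.9, 0.1], random_state=0)

# class_weight is a constructor parameter of the estimator, so it can be
# searched over in the grid alongside the other hyperparameters
param_grid = {
    "C": [0.1, 1.0],
    "class_weight": ["balanced", {0: 1, 1: 5}, None],
}
grid = GridSearchCV(SVC(), param_grid, cv=3)
grid.fit(X, y)
print(grid.best_params_)
```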
Thanks,
Mathias
On Fri, Feb 3, 2012 at 11:03 AM, Andreas <[email protected]> wrote:
> Hi Mathias.
> First, please note that you are looking at an "old" version of the docs.
> We are in the process of including a warning.
> Please refer to
> http://scikit-learn.org/stable/auto_examples/grid_search_digits.html instead.
>
> For your first question:
> I didn't write the example but this is how I understood it:
>
> Usually, when evaluating a machine learning method, you are given a
> "training" set and a "test" set: you train on the training set and
> evaluate on the test set.
> If you want to adjust the hyperparameters of the method, a common way is
> to do cross-validation on the training set.
> Afterwards you still need to evaluate on an independent test set, to see
> how well your chosen parameters generalize to unseen data.
> As the digits dataset does not come split into a training and a test
> part, the StratifiedKFold split is used to simulate this.
> Does this answer your question?
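The workflow described above -- tune via cross-validation on the training part
only, then score once on the held-out part -- can be sketched as follows, using
the current import paths (sklearn.model_selection rather than the 0.9-era
sklearn.grid_search); the split ratio and parameter values here are
illustrative, not the example's exact settings:

```python
from sklearn.datasets import load_digits
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
# Digits has no predefined split, so hold out half as a simulated test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y)

# Cross-validate hyperparameters on the training half only...
grid = GridSearchCV(SVC(), {"C": [1, 10], "gamma": [1e-3, 1e-4]},
                    cv=StratifiedKFold(3))
grid.fit(X_train, y_train)

# ...then report generalization on the untouched test half
print(classification_report(y_test, grid.predict(X_test)))
```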
>
>
> For your second question:
> There is a parameter "refit" of GridSearchCV (see the reference:
> http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html#sklearn.grid_search.GridSearchCV)
> that decides exactly that.
> It is "True" by default.
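A minimal sketch of what refit=True buys you -- after the search, the best
parameter setting is retrained on all the data passed to fit, so the fitted
grid object can predict directly (the estimator and grid values below are
illustrative only):

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_digits(return_X_y=True)
grid = GridSearchCV(LogisticRegression(max_iter=2000),
                    {"C": [0.1, 1.0]}, cv=3)  # refit=True is the default
grid.fit(X, y)

# best_estimator_ is already refit on the whole of (X, y),
# so predictions work without any extra training step
predictions = grid.predict(X[:5])
print(predictions)
```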
>
> Cheers,
> Andy
>
>
> On 02/03/2012 10:54 AM, Mathias Verbeke wrote:
>
> Hi all,
>
> I'm currently looking at the GridSearch example (
> http://scikit-learn.org/0.9/auto_examples/grid_search_digits.html), and I
> don't completely get the point of using cross-validation twice. Why aren't
> the parameters and the classifier selected in one cross-validation step?
>
> Furthermore, I was wondering: if I do a refit at the end of the GridSearch
> procedure, will it train the model on the complete dataset, so that it can
> be applied to the test set afterwards?
>
> Best and thanks,
>
> Mathias
>
>
> ------------------------------------------------------------------------------
> Try before you buy = See our experts in action!
> The most comprehensive online learning library for Microsoft developers
> is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
> Metro Style Apps, more. Free future releases when you subscribe
> now! http://p.sf.net/sfu/learndevnow-dev2
>
>
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>
>
>