Btw, there was a branch by me doing exactly that: https://github.com/scikit-learn/scikit-learn/pull/1742 I don't really remember why it wasn't merged (it is now hopelessly out of date, I think).

On 05/02/2014 08:17 AM, Robert McGibbon wrote:
> There have been previous attempts to incorporate training score, but there's a
> general open question of how best to return Grid Search results: The current
> format cv_scores_ is not really extensible, which seems to have stalled many of
> these issues. Input on this issue is welcome. Otherwise, for the moment, you
> will have to roll your own implementation (and I should note that
> _fit_and_score is a fairly recent invention).
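
For readers wondering what "rolling your own" could look like, here is a minimal sketch in plain Python. It is not scikit-learn code: the toy 1-D nearest-neighbour regressor, the function names, and the result keys are all hypothetical, chosen only to show a parameter loop that records both training and validation scores for every fold.

```python
from statistics import mean


def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points (1-D toy model)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)


def mse(train, points, k):
    """Mean squared error of the toy model fitted on `train`, evaluated on `points`."""
    return mean((knn_predict(train, x, k) - y) ** 2 for x, y in points)


def grid_search_with_train_scores(data, ks, n_folds=4):
    """Hypothetical hand-rolled search: one result dict per value of k."""
    results = []
    for k in ks:
        train_scores, validation_scores = [], []
        for fold in range(n_folds):
            # Simple strided split: every n_folds-th point is held out.
            test = data[fold::n_folds]
            train = [p for i, p in enumerate(data) if i % n_folds != fold]
            # Scores are negated errors, so that larger is better.
            train_scores.append(-mse(train, train, k))
            validation_scores.append(-mse(train, test, k))
        results.append({
            "k": k,
            "cv_train_scores": train_scores,
            "cv_validation_scores": validation_scores,
        })
    return results


# Made-up data: a noisy-looking but deterministic 1-D relationship.
data = [(x, (x % 3) + 0.5 * x) for x in range(12)]
results = grid_search_with_train_scores(data, ks=[1, 3])
```

With k=1 every training point is its own nearest neighbour, so the training error is zero while the validation error is not, which is the gap the training-versus-test comparison is meant to expose.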

What about adding more fields to the _CVScoreTuple namedtuple (GridSearchCV.grid_scores_ is a list of these namedtuples)? If things are added at the end of the list, it should have a pretty small chance of breaking backward compatibility. The current field names (`parameters`, `mean_validation_score`, `cv_validation_scores`) are quite specific, so for example adding `cv_train_scores` could be an option.
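
To make the compatibility point concrete, here is a small sketch using stand-in namedtuples. `OldScore` and `NewScore` are hypothetical names, not the real `_CVScoreTuple`; the field names are the ones listed above plus the proposed `cv_train_scores`. It suggests that attribute and index access keep working when a field is appended, while code that unpacks each entry into exactly three names would not.

```python
from collections import namedtuple

# Hypothetical stand-in for the existing three-field tuple.
OldScore = namedtuple(
    "OldScore",
    ["parameters", "mean_validation_score", "cv_validation_scores"],
)

# Same fields, with the proposed extra field appended at the end.
NewScore = namedtuple("NewScore", OldScore._fields + ("cv_train_scores",))

# Made-up values, for illustration only.
old = OldScore({"C": 1.0}, 0.9, [0.88, 0.92])
new = NewScore({"C": 1.0}, 0.9, [0.88, 0.92], [0.97, 0.99])

# Attribute access and positional indexing behave as before.
same_by_name = new.mean_validation_score == old.mean_validation_score
same_by_index = new[0] == old[0] and new[2] == old[2]


def unpack_three(score):
    """Mimics user code that unpacks a score entry into exactly three names."""
    params, mean_score, cv_scores = score
    return params, mean_score, cv_scores


# Three-way unpacking is the one pattern that breaks with a fourth field.
try:
    unpack_three(new)
    unpacking_still_works = True
except ValueError:
    unpacking_still_works = False
```

So the "small chance" of breakage comes down to how common the three-way unpacking idiom is in user code.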

I'm not too familiar with the history of the project or what has been tried previously on this issue, so apologies if this is obviously incorrect.

FWIW, I put together the code + tests for this change:
https://github.com/rmcgibbo/scikit-learn/compare/scikit-learn:master...rmcgibbo:grid-search-train-error
Happy to file a PR if this is worthwhile for others.

-Robert


On Thu, May 1, 2014 at 10:10 PM, Joel Nothman <joel.noth...@gmail.com> wrote:

    There have been previous attempts to incorporate training score,
    but there's a general open question of how best to return Grid
    Search results: The current format cv_scores_ is not really
    extensible, which seems to have stalled many of these issues.
    Input on this issue is welcome. Otherwise, for the moment, you
    will have to roll your own implementation (and I should note that
    _fit_and_score is a fairly recent invention).


    On 2 May 2014 13:34, Robert McGibbon <rmcgi...@gmail.com> wrote:

        Hi all,

        Is there any way to get the score on the training data for each
        parameter set (and each fold) when running GridSearchCV? While
        I haven't looked too closely at the code, it appears that
        BaseSearchCV
        <https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/grid_search.py#L378>
        uses the _fit_and_score
        <https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cross_validation.py#L1118>
        method, which does have the ability to calculate and return
        scores on the training data, but that this functionality isn't
        exposed in GridSearchCV.

        The use case for this would be to compare training and test
        error (à la the classic training error and test error vs. model
        complexity plot
        <http://link.springer.com/protocol/10.1007%2F978-1-60327-429-6_15/fulltext.html#Fig3_15>).

        -Robert

        
------------------------------------------------------------------------------
        "Accelerate Dev Cycles with Automated Cross-Browser Testing -
        For FREE
        Instantly run your Selenium tests across 300+ browser/OS
        combos.  Get
        unparalleled scalability from the best Selenium testing
        platform available.
        Simple to use. Nothing to install. Get started now for free."
        http://p.sf.net/sfu/SauceLabs
        _______________________________________________
        Scikit-learn-general mailing list
        Scikit-learn-general@lists.sourceforge.net
        <mailto:Scikit-learn-general@lists.sourceforge.net>
        https://lists.sourceforge.net/lists/listinfo/scikit-learn-general



    



