Hi Ralf.
Sorry, I'm tired and didn't see the attachment.
Andy


On 10/22/2013 08:48 PM, Ralf Gunter wrote:
Hi Andreas,

I'm not sure what you mean by "more comprehensive"; the gist in the first message should reproduce the problem -- if not, then it might be something in my local configuration (Python, NumPy, etc.). The script is exactly the same one I'm using in "production", just with a much bigger dataset. Please let me know what kind of extra information you're looking for.

Thanks!


2013/10/22 Andreas Mueller <[email protected]>

    Hi Ralf.

    Can you give a more comprehensive gist maybe? https://gist.github.com/
    My first intuition would be that you are in fact using the r2
    score, not the MSE, when outputting these numbers.
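
    For instance, a quick check along these lines (just a sketch -- clf,
    X_test and y_test stand in for whatever names your script actually
    uses) would show whether the reported number matches r2 or the MSE:

      from sklearn.metrics import mean_squared_error, r2_score

      # clf is the fitted GridSearchCV object; X_test/y_test are held-out data
      y_pred = clf.predict(X_test)
      print("grid best_score_: %f" % clf.best_score_)
      print("manual MSE:       %f" % mean_squared_error(y_test, y_pred))
      print("manual r2:        %f" % r2_score(y_test, y_pred))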

    Cheers,
    Andy



    On 10/22/2013 07:20 PM, Ralf Gunter wrote:
    Hello,

    I'm testing a few regression algorithms to map ndarrays of
    eigenvalues to floats, using StratifiedKFold + GridSearchCV for
    cross-validation & hyperparameter estimation, with some code
    borrowed from [1]. Although GridSearchCV appears to be working as
    advertised (i.e. the "best_estimator_" is much better than the
    baseline), it's giving a negative "mean_score" both for the
    "mean_squared_error" metric and for my manually implemented RMS
    error function. Code & sample datasets are at [2]. Here's a
    trimmed sample output:


      ~/a/a/g/test $ python regression.py
      ...
      Tuning hyper-parameters for mean_squared_error
      Best parameters set found on development set:
      SVR(C=100.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma=0.0,
          kernel=rbf, max_iter=-1, probability=False, random_state=None,
          shrinking=True, tol=0.001, verbose=False)
      Grid scores on development set:
      ...
      -122.503 (+/-21.396) for {'epsilon': 0.01, 'C': 0.001, 'kernel': 'rbf'}
      -122.503 (+/-21.396) for {'epsilon': 0.10000000000000001, 'C': 0.001, 'kernel': 'rbf'}
      ...
      RMSE: 0.100012
      MAE:  0.100012
      RMSE: 0.099933
      MAE:  0.099933
      ...
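
    For reference, the relevant part of the script is roughly as follows
    (a simplified sketch of the setup described above, not the literal
    code from [2]; the data loading and the exact parameter grid are
    assumed):

      from sklearn.cross_validation import StratifiedKFold
      from sklearn.grid_search import GridSearchCV
      from sklearn.svm import SVR

      # X_train: ndarray of eigenvalues, y_train: float targets (loaded elsewhere)
      param_grid = {'kernel': ['rbf'],
                    'C': [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0],
                    'epsilon': [0.01, 0.1, 1.0]}
      cv = StratifiedKFold(y_train, n_folds=5)
      clf = GridSearchCV(SVR(), param_grid, scoring='mean_squared_error', cv=cv)
      clf.fit(X_train, y_train)
      for params, mean_score, scores in clf.grid_scores_:
          print("%0.3f (+/-%0.03f) for %r" % (mean_score, scores.std() / 2, params))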


    Currently I'm testing both 0.14.1 and Mathieu Blondel's
    "kernel_ridge" branch with Python 2.7.5 on Arch Linux. The above
    phenomenon happens with both versions for SVR and (naturally) in
    the dev branch for KernelRidge. As you can see, the exact same
    custom metric applied manually (lines 117-122) gives appropriate,
    positive errors, whereas the one printed in lines 113-114 does not.
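
    The manual check is essentially this (again simplified; y_test and
    y_pred are assumed names):

      import numpy as np

      # RMS and mean absolute errors computed by hand on the held-out set
      y_pred = clf.best_estimator_.predict(X_test)
      rmse = np.sqrt(np.mean((y_test - y_pred) ** 2))
      mae = np.mean(np.abs(y_test - y_pred))
      print("RMSE: %f" % rmse)
      print("MAE:  %f" % mae)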

    Why are these numbers negative? Am I missing something obvious
    here? I'm a bit concerned about trusting these estimators (or
    rather, their optimality) because of this oddity. A quick google
    search only came up with similar "problems" with r2 (which aren't
    really problems, since r2 can be negative, unlike
    "mean_squared_error").

    Thanks!

    [1] - http://scikit-learn.org/dev/_downloads/grid_search_digits.py
    [2] -
    https://gist.github.com/anonymous/1af53a1da1357a6a97c3 (sorry for
    the mangled code -- it's gone through a botched anonymization
    procedure)


    