Re: [Scikit-learn-general] [scikit-learn-general] Possible bug in RFECV.fit?

Andy Thu, 23 Jul 2015 04:23:34 -0700

I don't think it is feasible to try to fail fast for all possible cases.

I think users should try their code with something that runs through quickly before doing a multi-day run.

I do that for all my experiments.
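
A minimal sketch of such a dry run, assuming synthetic data from make_classification, a LogisticRegression estimator, and an F2 scorer; none of these come from the original report, and build_selector is a hypothetical helper. The idea is to build the selector exactly as for the real run, but fit it first on a small slice with few folds so that any error in the scoring setup surfaces in seconds:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, make_scorer

# Synthetic stand-in for the real data (assumption, not from the thread).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=5, random_state=0
)


def build_selector(cv):
    """Hypothetical helper: same configuration for the dry run and the real run."""
    return RFECV(
        LogisticRegression(max_iter=1000),
        step=1,
        cv=cv,
        scoring=make_scorer(fbeta_score, beta=2),
    )


# Dry run: a small slice of the data and only 2 folds.
X_small, y_small = X[:200], y[:200]
dry_run = build_selector(cv=2).fit(X_small, y_small)
print("dry run selected", dry_run.n_features_, "features")

# Only once the dry run finishes cleanly, start the expensive fit:
# full_run = build_selector(cv=10).fit(X, y)
```

The dry run exercises the whole path, including the final scoring step where the reported traceback was raised, so it should catch this class of error before the long job starts.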


On 07/22/2015 02:38 PM, Joel Nothman wrote:

This isn't directly a problem with RFECV, it's a problem with what you provided as an argument to `scoring`. I suspect you provided a function with signature fn(y_true, y_pred) -> score, where what is required is a function fn(estimator, X, y_true) -> score. See http://scikit-learn.org/stable/modules/model_evaluation.html#the-scoring-parameter-defining-model-evaluation-rules
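
To illustrate the two signatures, here is a small sketch assuming synthetic data, a LogisticRegression estimator, and beta=2 (the original report does not say which estimator or beta was used). make_scorer wraps a metric of the form fn(y_true, y_pred) into a scorer of the form fn(estimator, X, y_true):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, make_scorer

# Synthetic data and estimator chosen only for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
est = LogisticRegression(max_iter=1000).fit(X, y)

# Metric signature: fn(y_true, y_pred) -> score
metric_value = fbeta_score(y, est.predict(X), beta=2)

# Scorer signature: fn(estimator, X, y_true) -> score
f2_scorer = make_scorer(fbeta_score, beta=2)
scorer_value = f2_scorer(est, X, y)

# The scorer calls est.predict(X) internally, so both values agree.
print(metric_value, scorer_value)
```

Passing f2_scorer (rather than fbeta_score itself) as `scoring` is what the scorer interface expects. When the bare metric is passed instead, it is called as fbeta_score(estimator, X_test, y_test), which would explain the traceback: in that release the third positional argument was beta, so y_test ended up in the `if beta <= 0` check.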

Perhaps we should be failing faster in such a case. We could, for instance, extend check_scoring to smoke-test scoring(estimator, X, y_true), at a cost that we hope is small relative to fitting.
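
A rough sketch of what such a smoke test might look like in user code. smoke_test_scoring is a hypothetical helper written for this example, not an existing scikit-learn function and not how check_scoring behaves; the data, estimator, and sample count are assumptions as well:

```python
import numbers

from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, make_scorer


def smoke_test_scoring(estimator, X, y, scoring, n_samples=50):
    """Hypothetical helper: fit a clone on a few rows and call the scorer once."""
    X_small, y_small = X[:n_samples], y[:n_samples]
    fitted = clone(estimator).fit(X_small, y_small)
    try:
        score = scoring(fitted, X_small, y_small)
    except Exception as exc:
        raise ValueError(
            "scoring(estimator, X, y_true) failed; was a metric with "
            "signature fn(y_true, y_pred) passed instead of a scorer?"
        ) from exc
    if not isinstance(score, numbers.Number):
        raise ValueError("scoring must return a single number, got %r" % type(score))
    return score


X, y = make_classification(n_samples=200, n_features=8, random_state=0)
estimator = LogisticRegression(max_iter=1000)

# A proper scorer passes the check and returns a number.
good_score = smoke_test_scoring(estimator, X, y, make_scorer(fbeta_score, beta=2))

# A bare metric is rejected immediately instead of after the full fit.
try:
    smoke_test_scoring(estimator, X, y, fbeta_score)
    bad_scorer_rejected = False
except ValueError:
    bad_scorer_rejected = True
```

The underlying exception for the bare metric depends on the scikit-learn version, which is why the helper catches any exception and re-raises it with a single message.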

And where is that parallelism happening? It looks like the RFECV code could be parallelised, but is not atm.

On 22 July 2015 at 21:34, Dale Smith <[email protected]> wrote:


    Hello,

    I just ran a four-day fit using RFECV. At the end I got the
    following message. My question is whether this is a bug? If so,
    I’ll write some reproducible code (if I can) and submit a report.

    I have searched for similar messages but didn’t find anything.

    I am using Windows Server 8 R2 Enterprise with Anaconda 2.2.0
    64-bit. I haven’t patched scikit-learn or any dependencies.

    ……………………………

    Fitting estimator with 4 features.
    [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.6s remaining:  3.5min
    [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:   10.6s finished
    Fitting estimator with 3 features.
    [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.4s remaining:  2.6min
    [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    7.1s finished
    Fitting estimator with 2 features.
    [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.6s remaining:  3.5min
    [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    8.2s finished
    [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.5s remaining:  2.9min
    [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    8.6s finished
    [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.5s remaining:  3.2min
    [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    8.6s finished

    Traceback (most recent call last):
      File "test_rfecv.py", line 62, in <module>
        churn.rfe()
      File "D:\Research\Churn\python\churn.py", line 805, in rfe
        print(r"%s" % traceback.format_exc())
      File "C:\Anaconda3\lib\site-packages\sklearn\feature_selection\rfe.py", line 382, in fit
        score = _score(estimator, X_test[:, indices], y_test, scorer)
      File "C:\Anaconda3\lib\site-packages\sklearn\cross_validation.py", line 1534, in _score
        score = scorer(estimator, X_test, y_test)
      File "C:\Anaconda3\lib\site-packages\sklearn\metrics\classification.py", line 676, in fbeta_score
        sample_weight=sample_weight)
      File "C:\Anaconda3\lib\site-packages\sklearn\metrics\classification.py", line 855, in precision_recall_fscore_support
        if beta <= 0:
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()


    *Dale Smith, Ph.D.*
    Data Scientist
    
    d. 404.495.7220 x 4008 | f. 404.795.7221
    Nexidia Corporate | 3565 Piedmont Road, Building Two, Suite 400 |
    Atlanta, GA 30305



    
------------------------------------------------------------------------------
    Don't Limit Your Business. Reach for the Cloud.
    GigeNET's Cloud Solutions provide you with the tools and support that
    you need to offload your IT needs and focus on growing your business.
    Configured For All Businesses. Start Your Cloud Today.
    https://www.gigenetcloud.com/
    _______________________________________________
    Scikit-learn-general mailing list
    [email protected]
    <mailto:[email protected]>
    https://lists.sourceforge.net/lists/listinfo/scikit-learn-general



