I don't think it is feasible to fail fast in every possible case.
I think users should first try their code on something that runs through quickly before starting a multi-day run.
I do that for all my experiments.

On 07/22/2015 02:38 PM, Joel Nothman wrote:
This isn't directly a problem with RFECV, it's a problem with what you provided as an argument to `scoring`. I suspect you provided a function with signature `fn(y_true, y_pred) -> score`, where what is required is a function `fn(estimator, X, y_true) -> score`. See http://scikit-learn.org/stable/modules/model_evaluation.html#the-scoring-parameter-defining-model-evaluation-rules
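
To make the two signatures concrete, here is a minimal pure-Python sketch. The names `accuracy`, `as_scorer` and `MajorityClassifier` are hypothetical stand-ins invented for this illustration; in real code `sklearn.metrics.make_scorer` does this wrapping for you and forwards keyword arguments such as `beta` to the metric.

```python
def accuracy(y_true, y_pred):
    """Metric-style function: fn(y_true, y_pred) -> score."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def as_scorer(metric, **metric_kwargs):
    """Wrap a metric into a scorer: fn(estimator, X, y_true) -> score.

    Hypothetical helper, shown only to illustrate the idea behind
    sklearn.metrics.make_scorer; it is not scikit-learn's implementation.
    """
    def scorer(estimator, X, y_true):
        # The scorer is responsible for producing the predictions itself.
        y_pred = estimator.predict(X)
        return metric(y_true, y_pred, **metric_kwargs)
    return scorer


class MajorityClassifier:
    """Toy estimator (hypothetical) that always predicts the most common label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


X = [[0], [1], [2], [3]]
y = [1, 1, 1, 0]

clf = MajorityClassifier().fit(X, y)
scorer = as_scorer(accuracy)
score = scorer(clf, X, y)  # called the way RFECV calls it: (estimator, X, y_true)
print(score)
```

If a bare metric such as `fbeta_score` is passed instead, the call `scorer(estimator, X_test, y_test)` in the traceback would put `y_test` in the `beta` position, which is consistent with the error being raised at `if beta <= 0`.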

Perhaps we should be failing faster in such a case. We could, for instance, extend `check_scoring` to smoke-test `scoring(estimator, X, y_true)`, at a cost that we hope is small relative to fitting.
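
A rough sketch of what such a smoke test might look like, assuming we are allowed to fit on a handful of samples up front. `smoke_test_scorer`, `ConstantClassifier`, `good_scorer` and `metric_passed_by_mistake` are hypothetical names made up for this example; none of this is existing scikit-learn code, and a real version would have to fit a clone so the caller's estimator is left untouched.

```python
import numbers


def smoke_test_scorer(scorer, estimator, X, y, n_samples=10):
    """Fit on a small slice, call scorer(estimator, X, y) once, check the result.

    Hypothetical helper, not part of scikit-learn.
    """
    X_small, y_small = X[:n_samples], y[:n_samples]
    estimator.fit(X_small, y_small)
    try:
        score = scorer(estimator, X_small, y_small)
    except Exception as exc:
        raise ValueError(
            "scoring failed the smoke test; expected a callable with "
            "signature scorer(estimator, X, y_true) -> number"
        ) from exc
    if not isinstance(score, numbers.Number):
        raise ValueError("scorer returned %r, expected a number" % (score,))
    return score


class ConstantClassifier:
    """Toy estimator (hypothetical) that always predicts the first training label."""

    def fit(self, X, y):
        self.label_ = y[0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def good_scorer(estimator, X, y_true):
    y_pred = estimator.predict(X)
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def metric_passed_by_mistake(y_true, y_pred, beta):
    # Stand-in for a metric with signature fn(y_true, y_pred, beta).
    # Called as a scorer, the labels end up bound to `beta`.
    if beta <= 0:
        raise ValueError("beta should be > 0")
    return 0.0


X = [[0], [1], [2], [3]]
y = [1, 1, 0, 1]

ok_score = smoke_test_scorer(good_scorer, ConstantClassifier(), X, y)

try:
    smoke_test_scorer(metric_passed_by_mistake, ConstantClassifier(), X, y)
    failed_fast = False
except ValueError:
    failed_fast = True  # the mistake surfaces in seconds, not after days
```

The extra cost is one fit and one scoring call on a few samples, which is the trade-off mentioned above.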

And where is that parallelism happening? It looks like the RFECV code could be parallelised, but is not at the moment.


On 22 July 2015 at 21:34, Dale Smith <dsm...@nexidia.com> wrote:

    Hello,

    I just ran a four-day fit using RFECV. At the end I got the
    following message. My question is whether this is a bug. If so,
    I’ll write some reproducible code (if I can) and submit a report.

    I have searched for similar messages but didn’t find anything.

    I am using Windows Server 8 R2 Enterprise with Anaconda 2.2.0
    64-bit. I haven’t patched scikit-learn or any dependencies.

    ……………………………

    Fitting estimator with 4 features.
    [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.6s remaining:  3.5min
    [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:   10.6s finished
    Fitting estimator with 3 features.
    [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.4s remaining:  2.6min
    [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    7.1s finished
    Fitting estimator with 2 features.
    [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.6s remaining:  3.5min
    [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    8.2s finished
    [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.5s remaining:  2.9min
    [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    8.6s finished
    [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.5s remaining:  3.2min
    [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    8.6s finished

    Traceback (most recent call last):
      File "test_rfecv.py", line 62, in <module>
        churn.rfe()
      File "D:\Research\Churn\python\churn.py", line 805, in rfe
        print(r"%s" % traceback.format_exc())
      File "C:\Anaconda3\lib\site-packages\sklearn\feature_selection\rfe.py", line 382, in fit
        score = _score(estimator, X_test[:, indices], y_test, scorer)
      File "C:\Anaconda3\lib\site-packages\sklearn\cross_validation.py", line 1534, in _score
        score = scorer(estimator, X_test, y_test)
      File "C:\Anaconda3\lib\site-packages\sklearn\metrics\classification.py", line 676, in fbeta_score
        sample_weight=sample_weight)
      File "C:\Anaconda3\lib\site-packages\sklearn\metrics\classification.py", line 855, in precision_recall_fscore_support
        if beta <= 0:
    ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()


    *Dale Smith, Ph.D.*
    Data Scientist


    
------------------------------------------------------------------------------
    Don't Limit Your Business. Reach for the Cloud.
    GigeNET's Cloud Solutions provide you with the tools and support that
    you need to offload your IT needs and focus on growing your business.
    Configured For All Businesses. Start Your Cloud Today.
    https://www.gigenetcloud.com/
    _______________________________________________
    Scikit-learn-general mailing list
    Scikit-learn-general@lists.sourceforge.net
    <mailto:Scikit-learn-general@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/scikit-learn-general




