Re: [Scikit-learn-general] [scikit-learn-general] Possible bug in RFECV.fit?

Joel Nothman Wed, 22 Jul 2015 05:45:53 -0700

This isn't directly a problem with RFECV; it's a problem with what you
provided as the `scoring` argument. I suspect you passed a metric function
with signature fn(y_true, y_pred) -> score, whereas what is required is a
scorer with signature fn(estimator, X, y_true) -> score. See
http://scikit-learn.org/stable/modules/model_evaluation.html#the-scoring-parameter-defining-model-evaluation-rules
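
To illustrate the difference between the two signatures, here is a minimal
sketch that wraps a metric with `make_scorer` before handing it to RFECV. The
toy data, the estimator and beta=0.5 are assumptions for illustration only;
the thread does not say what was actually used.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, make_scorer

# Toy data standing in for the real problem (assumption, not from the thread).
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=3, random_state=0
)

# fbeta_score is a metric: fbeta_score(y_true, y_pred, beta=...) -> score.
# make_scorer wraps it into a scorer: scorer(estimator, X, y_true) -> score.
f_half_scorer = make_scorer(fbeta_score, beta=0.5)

# Pass the scorer, not the bare metric, as `scoring`.
selector = RFECV(
    LogisticRegression(max_iter=1000), step=1, cv=3, scoring=f_half_scorer
)
selector.fit(X, y)

print(selector.n_features_)  # number of features kept
print(selector.support_)     # boolean mask over the original features
```

If the bare metric is passed instead, it gets called as
fbeta_score(estimator, X_test, y_test), which is consistent with the
traceback below failing inside fbeta_score at `if beta <= 0`.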


Perhaps we should be failing faster in such a case. We could, for instance,
extend check_scoring to smoke-test scoring(estimator, X, y_true), at a cost
that we hope is small relative to fitting.
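
A rough sketch of what such a smoke test might look like follows. The helper
name `smoke_test_scorer`, its `n_samples` parameter and the stub classifier
are hypothetical; this is not what check_scoring does today, only one way the
idea could be tried out in user code before starting a long fit.

```python
import copy
import numbers


def smoke_test_scorer(scorer, estimator, X, y, n_samples=20):
    """Hypothetical helper: call scorer(estimator, X, y) once on a small
    sample so that a wrong signature fails before any long-running fit."""
    X_small, y_small = X[:n_samples], y[:n_samples]
    # Fit a copy so the caller's estimator is left untouched.
    probe = copy.deepcopy(estimator).fit(X_small, y_small)
    try:
        score = scorer(probe, X_small, y_small)
    except Exception as exc:
        raise ValueError(
            "scoring must be callable as scorer(estimator, X, y_true); "
            "the smoke test failed with: %r" % (exc,)
        ) from exc
    if not isinstance(score, numbers.Number):
        raise ValueError("scorer returned %r, expected a single number" % (score,))
    return score


class MajorityClassifier:
    """Stub estimator for the demo: always predicts the most common label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def accuracy_metric(y_true, y_pred):
    # Metric signature: (y_true, y_pred) -> score.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy_scorer(estimator, X, y_true):
    # Scorer signature: (estimator, X, y_true) -> score.
    return accuracy_metric(y_true, estimator.predict(X))


X_demo = [[0], [1], [2], [3]]
y_demo = [0, 0, 0, 1]

# The scorer passes the smoke test.
print(smoke_test_scorer(accuracy_scorer, MajorityClassifier(), X_demo, y_demo))

# The bare metric is rejected immediately instead of after the whole fit.
try:
    smoke_test_scorer(accuracy_metric, MajorityClassifier(), X_demo, y_demo)
except ValueError as err:
    print("rejected:", err)
```

The sketch takes the first n_samples rows for simplicity; a real check would
need to make sure the sample contains every class.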

And where is that parallelism happening? It looks like the RFECV code could
be parallelised, but currently is not.


On 22 July 2015 at 21:34, Dale Smith <[email protected]> wrote:

>  Hello,
>
>
>
> I just ran a four-day fit using RFECV. At the end I got the following
> message. My question is whether this is a bug? If so, I’ll write some
> reproducible code (if I can) and submit a report.
>
>
>
> I have searched for similar messages but didn’t find anything.
>
>
>
> I am using Windows Server 8 R2 Enterprise with Anaconda 2.2.0 64-bit. I
> haven’t patched scikit-learn or any dependencies.
>
>
>
> ……………………………
>
> Fitting estimator with 4 features.
>
> [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.6s remaining:
> 3.5min
>
> [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:   10.6s finished
>
> Fitting estimator with 3 features.
>
> [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.4s remaining:
> 2.6min
>
> [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    7.1s finished
>
> Fitting estimator with 2 features.
>
> [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.6s remaining:
> 3.5min
>
> [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    8.2s finished
>
> [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.5s remaining:
> 2.9min
>
> [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    8.6s finished
>
> [Parallel(n_jobs=20)]: Done   1 out of 300 | elapsed:    0.5s remaining:
> 3.2min
>
> [Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed:    8.6s finished
>
> Traceback (most recent call last):
>
>   File "test_rfecv.py", line 62, in <module>
>
>     churn.rfe()
>
>   File "D:\Research\Churn\python\churn.py", line 805, in rfe
>
>     print(r"%s" % traceback.format_exc())
>
>   File "C:\Anaconda3\lib\site-packages\sklearn\feature_selection\rfe.py",
> line 382, in fit
>
>     score = _score(estimator, X_test[:, indices], y_test, scorer)
>
>   File "C:\Anaconda3\lib\site-packages\sklearn\cross_validation.py", line
> 1534, in _score
>
>     score = scorer(estimator, X_test, y_test)
>
>   File "C:\Anaconda3\lib\site-packages\sklearn\metrics\classification.py",
> line 676, in fbeta_score
>
>     sample_weight=sample_weight)
>
>   File "C:\Anaconda3\lib\site-packages\sklearn\metrics\classification.py",
> line 855, in precision_recall_fscore_support
>
>     if beta <= 0:
>
> ValueError: The truth value of an array with more than one element is
> ambiguous.
>
> Use a.any() or a.all()
>
>
>
>
> *Dale Smith, Ph.D.*
> Data Scientist
> 
> * d.* 404.495.7220 x 4008   *f.* 404.795.7221
> Nexidia Corporate | 3565 Piedmont Road, Building Two, Suite 400 | Atlanta,
> GA 30305
>
>
>
>
> ------------------------------------------------------------------------------
> Don't Limit Your Business. Reach for the Cloud.
> GigeNET's Cloud Solutions provide you with the tools and support that
> you need to offload your IT needs and focus on growing your business.
> Configured For All Businesses. Start Your Cloud Today.
> https://www.gigenetcloud.com/
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
>
