I don't think it is feasible to try to fail fast for all possible cases.
I think users should try their code with something that runs through
quickly before doing a multi-day run.
I do that for all my experiments.
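A quick dry run could look roughly like the sketch below: the same kind of pipeline, but on a small stratified subsample with two folds and a coarse step, so that every stage (fitting, scoring, feature elimination) is exercised once in seconds. The synthetic data, the LogisticRegression estimator, beta=2 and the sample sizes are all placeholders of mine, not taken from the original run, and the imports assume a recent scikit-learn release.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, make_scorer
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the real (large) dataset.
X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=5, random_state=0)

# Keep a small stratified sample so the dry run finishes quickly.
X_small, _, y_small, _ = train_test_split(
    X, y, train_size=200, stratify=y, random_state=0)

# Same pipeline shape as the long run, but with few folds and a coarse step.
selector = RFECV(LogisticRegression(max_iter=1000), step=5, cv=2,
                 scoring=make_scorer(fbeta_score, beta=2))
selector.fit(X_small, y_small)
print("dry run finished, features kept:", selector.n_features_)
```

If this completes without an exception, the long run is at least not going to fail on a signature or configuration mistake at the very end.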
On 07/22/2015 02:38 PM, Joel Nothman wrote:
This isn't directly a problem with RFECV, it's a problem with what you
provided as an argument to `scoring`. I suspect you provided a
function with signature fn(y_true, y_pred) -> score, where what is
required is a function fn(estimator, X, y_true) -> score. See
http://scikit-learn.org/stable/modules/model_evaluation.html#the-scoring-parameter-defining-model-evaluation-rules
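To make the two signatures concrete, here is a minimal sketch. It assumes a recent scikit-learn release, uses synthetic data and a LogisticRegression as stand-ins, and picks beta=2 arbitrarily; none of these come from the original report. make_scorer wraps a metric of the form fn(y_true, y_pred) into a scorer of the form fn(estimator, X, y_true), and a hand-written scorer with the same signature gives the same value.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, make_scorer

# Placeholder data and estimator, for illustration only.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# A metric has the signature fn(y_true, y_pred) -> score.
metric_value = fbeta_score(y, clf.predict(X), beta=2)

# make_scorer wraps it into fn(estimator, X, y_true) -> score,
# which is what the `scoring` argument expects.
f2_scorer = make_scorer(fbeta_score, beta=2)
scorer_value = f2_scorer(clf, X, y)


# An equivalent scorer written by hand.
def f2_by_hand(estimator, X, y_true):
    return fbeta_score(y_true, estimator.predict(X), beta=2)


print(metric_value, scorer_value, f2_by_hand(clf, X, y))
```

Passing f2_scorer (or f2_by_hand) as `scoring` should avoid the error in the traceback, where the metric itself appears to have received the estimator, X and y as its positional arguments.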
Perhaps we should be failing faster in such a case. We could, for
instance, extend check_scoring to smoke-test scoring(estimator, X,
y_true), at a cost that we hope is small relative to fitting.
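A rough sketch of such a smoke test follows. smoke_test_scorer is a hypothetical helper, not part of scikit-learn and not what check_scoring currently does; the data, estimator and n_samples value are placeholders. It fits a clone on a few rows and calls the scorer once, so a wrong signature surfaces before the expensive fit starts.

```python
import numbers

from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, make_scorer


def smoke_test_scorer(estimator, X, y, scorer, n_samples=50):
    """Hypothetical helper: fit a clone on a few rows and call the scorer once."""
    X_small, y_small = X[:n_samples], y[:n_samples]
    fitted = clone(estimator).fit(X_small, y_small)
    score = scorer(fitted, X_small, y_small)
    if not isinstance(score, numbers.Number):
        raise ValueError("scorer returned %r, expected a number" % type(score))
    return score


# Synthetic stand-in data; rows are shuffled by default, so the first
# 50 should contain both classes.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
estimator = LogisticRegression(max_iter=1000)

# A proper scorer passes the check quickly.
print(smoke_test_scorer(estimator, X, y, make_scorer(fbeta_score, beta=2)))

# A bare metric passed where a scorer is expected fails immediately.
try:
    smoke_test_scorer(estimator, X, y, fbeta_score)
except (TypeError, ValueError) as exc:
    print("scoring rejected:", type(exc).__name__)
```

Taking the first rows is a naive way to subsample; a stratified sample would be safer for imbalanced data, at the cost of a little more code.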
And where is that parallelism happening? It looks like the RFECV code
could be parallelised, but is not at the moment.
On 22 July 2015 at 21:34, Dale Smith <dsm...@nexidia.com> wrote:
Hello,
I just ran a four-day fit using RFECV. At the end I got the
following message. Is this a bug? If so, I’ll write some
reproducible code (if I can) and submit a report.
I have searched for similar messages but didn’t find anything.
I am using Windows Server 8 R2 Enterprise with Anaconda 2.2.0
64-bit. I haven’t patched scikit-learn or any dependencies.
……………………………
Fitting estimator with 4 features.
[Parallel(n_jobs=20)]: Done 1 out of 300 | elapsed: 0.6s remaining: 3.5min
[Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed: 10.6s finished
Fitting estimator with 3 features.
[Parallel(n_jobs=20)]: Done 1 out of 300 | elapsed: 0.4s remaining: 2.6min
[Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed: 7.1s finished
Fitting estimator with 2 features.
[Parallel(n_jobs=20)]: Done 1 out of 300 | elapsed: 0.6s remaining: 3.5min
[Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed: 8.2s finished
[Parallel(n_jobs=20)]: Done 1 out of 300 | elapsed: 0.5s remaining: 2.9min
[Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed: 8.6s finished
[Parallel(n_jobs=20)]: Done 1 out of 300 | elapsed: 0.5s remaining: 3.2min
[Parallel(n_jobs=20)]: Done 300 out of 300 | elapsed: 8.6s finished
Traceback (most recent call last):
  File "test_rfecv.py", line 62, in <module>
    churn.rfe()
  File "D:\Research\Churn\python\churn.py", line 805, in rfe
    print(r"%s" % traceback.format_exc())
  File "C:\Anaconda3\lib\site-packages\sklearn\feature_selection\rfe.py", line 382, in fit
    score = _score(estimator, X_test[:, indices], y_test, scorer)
  File "C:\Anaconda3\lib\site-packages\sklearn\cross_validation.py", line 1534, in _score
    score = scorer(estimator, X_test, y_test)
  File "C:\Anaconda3\lib\site-packages\sklearn\metrics\classification.py", line 676, in fbeta_score
    sample_weight=sample_weight)
  File "C:\Anaconda3\lib\site-packages\sklearn\metrics\classification.py", line 855, in precision_recall_fscore_support
    if beta <= 0:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Dale Smith, Ph.D.
Data Scientist
d. 404.495.7220 x 4008 | f. 404.795.7221
Nexidia Corporate | 3565 Piedmont Road, Building Two, Suite 400 |
Atlanta, GA 30305
------------------------------------------------------------------------------
Don't Limit Your Business. Reach for the Cloud.
GigeNET's Cloud Solutions provide you with the tools and support that
you need to offload your IT needs and focus on growing your business.
Configured For All Businesses. Start Your Cloud Today.
https://www.gigenetcloud.com/
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general