My experiences with parallel GridSearchCV and RFECV have not been pleasant.
Memory usage was a huge problem: with an out-of-the-box scikit-learn
installation under Anaconda 3, each job apparently received its own copy of the
data. No matter how I set pre_dispatch, I could not get n_jobs=2 to work, even
with no one else using a 100 GB, 24-core Windows box.
I can create some reproducible code if anyone has time to work on it.
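A rough sketch of the kind of reproduction I have in mind (the array sizes,
estimator, and parameter grid below are placeholders, not my actual workload):

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.grid_search import GridSearchCV

# Placeholder data, large enough that a per-worker copy shows up in memory.
X = np.random.rand(200000, 200)
y = np.random.randint(0, 2, size=200000)

grid = GridSearchCV(RandomForestClassifier(),
                    param_grid={"n_estimators": [10, 50],
                                "max_depth": [None, 10]},
                    cv=3,
                    n_jobs=2,                 # two worker processes
                    pre_dispatch="2*n_jobs")  # limit how many fits are queued up front
grid.fit(X, y)  # watch resident memory per process while this runs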
Dale Smith, Ph.D.
Data Scientist
From: Clyde Fare [mailto:clyde.f...@gmail.com]
Sent: Thursday, September 24, 2015 8:38 AM
To: scikit-learn-general@lists.sourceforge.net
Subject: [Scikit-learn-general] GridSearchCV using too many cores?
Hi,
I'm trying to run GridSearchCV on a computational cluster but my jobs keep
failing with an error from the queuing system claiming I'm using too many cores.
If I set n_jobs equal to 1, the job doesn't fail, but with any value greater
than one it does.
In the example below I've set n_jobs to 6 and pre_dispatch to 12, and asked for
8 processors from the queue. I got the following error after ~10 minutes: "PBS:
job killed: ncpus 19.73 exceeded limit 8 (sum)"
I've tried playing around with pre_dispatch, but it makes no difference. There
will be other people running calculations on these nodes, so might there be
some kind of interference between GridSearchCV and the other jobs?
Anyone come across anything like this before?
Cheers
Clyde
import dill
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.grid_search import GridSearchCV

label = 'test_grdsrch3'

X_train = np.random.rand(971, 276)
y_train = np.random.rand(971)

kr = GridSearchCV(KernelRidge(), cv=10,
                  param_grid={"kernel": ['rbf', 'laplacian'],
                              "alpha": [2**i for i in np.arange(-40, -5, 0.5)],   # alpha = lambda
                              "gamma": [1/(2.**(2*i)) for i in np.arange(5, 18, 0.5)]},  # gamma = 1/sigma^2
                  pre_dispatch=12,
                  n_jobs=6)
kr.fit(X_train, y_train)

with open(label + '.pkl', 'wb') as data_f:  # binary mode for pickling
    dill.dump(kr, data_f)
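One thing I still want to rule out (this is just a guess on my part): if numpy
is linked against a multi-threaded BLAS such as MKL or OpenBLAS, each of the 6
worker processes may spin up several linear-algebra threads of its own, which
would explain a core count well above n_jobs. Capping the BLAS thread count
before numpy is imported should show whether that is what's happening:

import os
# Cap BLAS/OpenMP threading per process; which variable matters depends on
# the BLAS that numpy is linked against.
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np  # imported only after the thread caps are in place

If the ncpus figure reported by PBS then stays close to n_jobs, the overrun is
coming from the BLAS threads rather than from GridSearchCV itself.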