My experiences with parallel GridSearchCV and RFECV have not been pleasant.
Memory usage was a huge problem: with an out-of-the-box scikit-learn
installation under Anaconda 3, each job apparently received its own copy of the
data. No matter how I set pre_dispatch, I could not get n_jobs=2 to work, even
with no one else using a 100 GB, 24-core Windows box.
I can create some reproducible code if anyone has time to work on it.
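A rough sketch of the kind of reproduction I have in mind (the array sizes,
estimator, and parameter grid below are placeholders, not my actual workload):

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.grid_search import GridSearchCV

# Placeholder data, large enough that a per-worker copy shows up in memory.
X = np.random.rand(200000, 200)
y = np.random.randint(0, 2, size=200000)

grid = GridSearchCV(RandomForestClassifier(),
                    param_grid={"n_estimators": [10, 50],
                                "max_depth": [None, 10]},
                    cv=3,
                    n_jobs=2,                 # two worker processes
                    pre_dispatch="2*n_jobs")  # limit how many fits are queued up front
grid.fit(X, y)  # watch resident memory per process while this runs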
Dale Smith, Ph.D.
Data Scientist
From: Clyde Fare [mailto:clyde.f...@gmail.com]
Sent: Thursday, September 24, 2015 8:38 AM
To: scikit-learn-general@lists.sourceforge.net
Subject: [Scikit-learn-general] GridSearchCV using too many cores?
Hi,
I'm trying to run GridSearchCV on a computational cluster but my jobs keep
failing with an error from the queuing system claiming I'm using too many cores.
If I set n_jobs equal to 1, the job doesn't fail, but with any value greater
than one it does.
In the example below I've set n_jobs to 6 and pre_dispatch to 12, and asked for
8 processors from the queue. I got the following error after ~10 minutes: "PBS:
job killed: ncpus 19.73 exceeded limit 8 (sum)"
I've tried playing around with pre_dispatch, but it makes no difference. There
will be other people running calculations on these nodes, so might there be
some kind of interference between GridSearchCV and the other jobs?
Anyone come across anything like this before?
Cheers
Clyde
import dill
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.grid_search import GridSearchCV

label = 'test_grdsrch3'

X_train = np.random.rand(971, 276)
y_train = np.random.rand(971)

kr = GridSearchCV(KernelRidge(), cv=10,
                  param_grid={"kernel": ['rbf', 'laplacian'],
                              "alpha": [2**i for i in np.arange(-40, -5, 0.5)],   # alpha = lambda
                              "gamma": [1/(2.**(2*i)) for i in np.arange(5, 18, 0.5)]},  # gamma = 1/sigma^2
                  pre_dispatch=12,
                  n_jobs=6)
kr.fit(X_train, y_train)

with open(label + '.pkl', 'wb') as data_f:  # binary mode for pickling
    dill.dump(kr, data_f)
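One thing I still want to rule out (this is just a guess on my part): if numpy
is linked against a multi-threaded BLAS such as MKL or OpenBLAS, each of the 6
worker processes may spin up several linear-algebra threads of its own, which
would explain a core count well above n_jobs. Capping the BLAS thread count
before numpy is imported should show whether that is what's happening:

import os
# Cap BLAS/OpenMP threading per process; which variable matters depends on
# the BLAS that numpy is linked against.
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np  # imported only after the thread caps are in place

If the ncpus figure reported by PBS then stays close to n_jobs, the overrun is
coming from the BLAS threads rather than from GridSearchCV itself.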