Hi,
I am using GridSearchCV to optimize the values of two parameters.
In total, I have 8 parameter combinations and am doing 4-fold cross-validation.
I want to run it in a parallel environment.
My questions are:
1. What should the n_jobs value be: 8, or 8 * 4 = 32?
(I know I can specify n_jobs=-1, but for technical reasons I want
to know how many jobs GridSearchCV will start.)
2. If I use a classifier such as RandomForestClassifier, where 'n_jobs'
can also be specified, will it make any difference if I set 'n_jobs' at the
classifier level as well?
>>> clf = RandomForestClassifier(n_jobs=-1)
>>> grid_search = GridSearchCV(clf, param_grid, n_jobs=-1)
Will this be faster compared to GridSearchCV(RandomForestClassifier())?
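For reference, here is a minimal sketch of the arithmetic behind question 1. It assumes the search schedules one fit per (parameter combination, fold) pair, which is my understanding rather than something I have confirmed. The parameter names and values are hypothetical stand-ins for my actual grid, and max_concurrent is an illustrative helper, not a scikit-learn function.

```python
from itertools import product

# Hypothetical grid: two parameters, 4 x 2 = 8 combinations.
param_grid = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [5, 10],
}
n_folds = 4

# Enumerate every parameter combination (Cartesian product of the value lists).
names = sorted(param_grid)
combinations = [
    dict(zip(names, values))
    for values in product(*(param_grid[name] for name in names))
]

# One fit task per (combination, fold) pair. This is an assumption about
# how the search is scheduled, not something read from the library source.
tasks = [(params, fold) for params in combinations for fold in range(n_folds)]


def max_concurrent(n_jobs, n_tasks):
    """Upper bound on tasks running at once for a positive n_jobs value."""
    return min(n_jobs, n_tasks)


print(len(combinations))               # 8 parameter combinations
print(len(tasks))                      # 32 fit tasks in total
print(max_concurrent(8, len(tasks)))   # at most 8 fits at a time
print(max_concurrent(64, len(tasks)))  # capped by the 32 available tasks
```

If that assumption holds, n_jobs only caps how many of the 32 fits run at the same time, and any value above 32 would leave workers idle. The count above does not include a final refit on the full data, if one is performed.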
Thanks
--
Sheila
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general