There are two cases: n_jobs > 1 works when the data is smaller, i.e. when the training docs numpy array is 15MB. It does not work when the training matrix is 100MB. My Mac has 16GB of RAM.
In the second case the jobs die within seconds, and the main Python process also appears to die (minimal CPU usage). A popup says 'python processes appear to have died' - that is when I run Python from the bash command line. When I run in the IDLE GUI, a message pops up: 'your program is still running, sure you want to close window'.

What are these jobs anyway? Are they the parameter combinations from param_grid, or lower-level jobs created by the compiler etc.? Does each job replicate the training data in RAM? (A sketch of the pre_dispatch setting I plan to try next is below the quoted message.)

regards

On Sun, Jan 7, 2018 at 11:35 AM, Sumeet Sandhu <sumeet.k.san...@gmail.com> wrote:
> Hi,
>
> I was able to run this with n_jobs=-1, and the Activity Monitor does show
> all 8 CPUs engaged, but the jobs start to die out one by one. I tried with
> n_jobs=2, same story.
> The only option that works is n_jobs=1.
> I played around with 'pre_dispatch' a bit - unclear what that does.
>
> GRID = GridSearchCV(LogisticRegression(), param_grid, scoring=None,
>     fit_params=None, n_jobs=1, iid=True, refit=True, cv=10, verbose=0,
>     error_score=0, return_train_score=False)
> GRID.fit(trainDocumentV, trainLabelV)
>
> How can I sustain at least 3-4 parallel jobs?
>
> thanks,
> Sumeet
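Here is a minimal sketch of what I plan to try next, based on my reading of the pre_dispatch parameter (the param_grid values are placeholders for illustration; trainDocumentV/trainLabelV are the arrays from my script):

    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV

    # Placeholder grid for illustration - my real grid is larger.
    param_grid = {'C': [0.01, 0.1, 1.0, 10.0]}

    GRID = GridSearchCV(LogisticRegression(), param_grid, scoring=None,
                        n_jobs=3,                 # 3-4 workers instead of all 8 (-1)
                        pre_dispatch='2*n_jobs',  # cap how many jobs are queued at once,
                                                  # so fewer copies of the data sit in RAM
                        iid=True, refit=True, cv=10, verbose=0,
                        error_score=0, return_train_score=False)
    GRID.fit(trainDocumentV, trainLabelV)

If memory is still the problem, I will drop pre_dispatch to 'n_jobs' or a small integer so each worker only gets one queued job at a time.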