> scikit-learn is really bad when n_jobs=10.

To avoid the memory copy you can try my branch of joblib:
https://github.com/joblib/joblib/pull/44

You need to hack the X_argsorted generation code to generate a memmap array instead of a regular numpy array (I am planning to add a helper in joblib to make that easier); there is a sketch of the idea below. Once you have this, the memory usage is much better.

You can clone my branch and replace the sklearn/externals/joblib folder with a symlink to your clone. Then you can try to run:

    python benchmarks/bench_covertype.py --n-jobs=4 --classifiers=ExtraTrees

However, there is still a penalty when first accessing the memmap data from freshly spawned multiprocessing workers. When using IPython parallel with workers spawned ahead of time, I can work around this issue. I need to investigate more with a profiler to understand what is happening, but it is not trivial to profile multiprocessing-related code... (a crude per-worker profiling trick is sketched below as well).

I think joblib.Parallel could be extended to accept a pre-allocated multiprocessing.Pool instance as an alternative to an integer value for n_jobs; the last sketch below shows what that could look like. I will discuss this with Gael at some point to design it. It would be great to also be able to pass an IPython.parallel.Client instance as the value for n_jobs too.
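For the memmap hack, here is a minimal sketch of the idea. The temp-file path, array shape and dtype/layout are illustrative assumptions, not the actual forest code:

    import os
    import tempfile

    import numpy as np

    # Stand-in for the training data; in the real code X_argsorted is
    # computed once and then shared with every worker process.
    X = np.random.rand(1000, 10)
    argsorted = np.asfortranarray(np.argsort(X, axis=0).astype(np.int32))

    # Dump the array to a file and reopen it as a memmap: the workers
    # can then map the same pages instead of receiving a pickled copy.
    filename = os.path.join(tempfile.mkdtemp(), 'X_argsorted.mmap')
    X_argsorted = np.memmap(filename, dtype=np.int32, mode='w+',
                            shape=argsorted.shape, order='F')
    X_argsorted[:] = argsorted
    X_argsorted.flush()

    # Reopen read-only in the parent before dispatching to the workers.
    X_argsorted = np.memmap(filename, dtype=np.int32, mode='r',
                            shape=argsorted.shape, order='F')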
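For the profiling problem, one crude trick is to profile inside each worker function and dump per-PID stats. A minimal sketch (the profiled decorator is a name I made up, not a joblib helper):

    import cProfile
    import os

    def profiled(func):
        # Wrap a worker function so each call is profiled and the stats
        # are written to a per-process file that can be inspected later
        # with pstats.Stats('worker-<pid>.prof').
        def wrapper(*args, **kwargs):
            profiler = cProfile.Profile()
            result = profiler.runcall(func, *args, **kwargs)
            profiler.dump_stats('worker-%d.prof' % os.getpid())
            return result
        return wrapper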
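And for the pre-allocated pool proposal, a hypothetical sketch of what the extended API could look like. The n_jobs=pool call is the proposal, not something joblib supports today, hence it is commented out:

    from math import sqrt
    from multiprocessing import Pool

    from sklearn.externals.joblib import Parallel, delayed

    # Today an integer n_jobs makes Parallel spawn a fresh pool for the
    # call and tear it down afterwards.
    results = Parallel(n_jobs=4)(delayed(sqrt)(i ** 2) for i in range(10))

    # Proposal (hypothetical): accept a pre-allocated pool so the workers
    # survive across calls and the memmap pages stay warm in them.
    pool = Pool(4)
    # results = Parallel(n_jobs=pool)(delayed(sqrt)(i ** 2) for i in range(10))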
