There are smarter ways to speed up SVM with parallel computation by
changing the algorithm, e.g.:

http://www.cs.utexas.edu/~cjhsieh/dcsvm/

But this algorithm is not implemented in scikit-learn, and it is too
recent to be implemented and maintained as part of scikit-learn.
However, it could be implemented as a third-party project following the
same API and conventions as scikit-learn if some volunteers are willing
to do it.

A couple of notes on the threading vs forking problems:

- joblib (used in scikit-learn) does not leverage the copy-on-write
capabilities of multiprocessing fork, as it uses the
multiprocessing.Pool API, which copies the data (via pickle and
pipes) under both POSIX and Windows.

- calling POSIX thread operations after a call to "fork" and before
calling "exec" is a violation of the POSIX standard, as those functions
are not part of the limited list of async-signal-safe functions [1].
As the fork mode of multiprocessing (the only mode available under POSIX
for Python < 3.4) never calls "exec", this might crash any program that
also uses threads (and therefore some OpenMP implementations that rely
on thread pools internally).

- in practice, here are some known libraries that use threads and can
crash or hang when used in a program that does a fork without exec:

  - CUDA
  - various MPI runtimes
  - the GCC OpenMP implementation (and this is not likely to change: see [2])
  - OpenBLAS (compiled with and without the OpenMP flag)
  - Apple OSX Accelerate framework

There is some ongoing work [3] by the OpenBLAS maintainer to
experiment with a fork-robust mode for OpenBLAS, but not all problems
have been solved yet.
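
To make the first point above (data copied via pickle) more concrete,
here is a minimal single-process sketch. It does not start any worker;
it only mimics the serialize / deserialize round trip that an argument
goes through when it is sent to a pool worker. The helper name
send_through_pipe is hypothetical and not part of joblib or
multiprocessing.

```python
import pickle


def send_through_pipe(obj):
    """Mimic the pickle round trip an argument takes on its way to a worker.

    Hypothetical helper: a real pool would write the payload to a pipe and
    the worker process would unpickle it on the other side.
    """
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return pickle.loads(payload), len(payload)


data = list(range(1000))
received, n_bytes = send_through_pipe(data)

# The rebuilt object has the same content but is a distinct copy in memory.
print("equal content:", received == data)
print("same object:", received is data)
print("bytes that would cross the pipe:", n_bytes)

# Mutating the copy leaves the original untouched: nothing is shared.
received[0] = -1
print("original first item:", data[0])
```

The same applies to large arrays passed as arguments: each one is
serialized and rebuilt on the worker side, so the memory savings one
might expect from fork's copy-on-write do not apply to them.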

So my advice for now is to not mess with OpenMP as part of the
scikit-learn code base; otherwise bug reports from users who don't read
the doc will explode on the mailing list and GitHub issue tracker.

For the longer term (e.g. ~1 year) we might be able to improve joblib
to not rely on the fork mode of multiprocessing by default. We will
need to refactor the multiprocess pool management and interprocess job
queuing and dispatching to implement nested parallelism.
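
As a rough illustration of what "not relying on the fork mode" can look
like with the standard library alone (Python >= 3.4), here is a small
sketch that requests the "spawn" start method through a multiprocessing
context. This is not joblib code and not a proposal for its design; the
function name map_in_fresh_processes is made up for the example.

```python
import math
import multiprocessing


def map_in_fresh_processes(func, items, n_workers=2):
    """Map func over items in workers started with 'spawn' instead of 'fork'.

    With 'spawn', each worker is a freshly started interpreter, so it does
    not inherit the thread state of the parent process.
    """
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(processes=n_workers) as pool:
        return pool.map(func, items)


if __name__ == "__main__":
    # func must be picklable by reference (importable), e.g. math.sqrt.
    print(map_in_fresh_processes(math.sqrt, [1.0, 4.0, 9.0]))
```

The __main__ guard matters here: spawned workers re-import the main
module, so unguarded code that creates a pool would be executed again in
each child. Starting workers this way is slower than fork, and arguments
are still pickled as before.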

[1] http://man7.org/linux/man-pages/man7/signal.7.html
[2] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58378
[3] https://github.com/xianyi/OpenBLAS/issues/294#issuecomment-33536895

-- 
Olivier
