Hi all,

I have just pushed a fix to make joblib.Parallel work on the
development version of Python, which adds more flexibility to the way
multiprocessing spawns its worker processes. See:
http://docs.python.org/dev/library/multiprocessing.html#contexts-and-start-methods
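
For readers who have not looked at the new API yet, here is a minimal
sketch of what the linked documentation describes. It uses only the
standard library; the helper names (square, run_with_start_method) are
made up for illustration and are not part of joblib or of the fix
itself.

```python
import multiprocessing as mp


def square(x):
    # Defined at module level so that worker processes can import it.
    return x * x


def run_with_start_method(method, values):
    # A context object exposes the same API as the multiprocessing
    # module, but is bound to a single start method.
    ctx = mp.get_context(method)
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # The available start methods depend on the platform.
    print(mp.get_all_start_methods())
    print(run_with_start_method("spawn", [1, 2, 3]))
```

The point is that the start method is selected per context rather than
being fixed for the whole interpreter, so a library can pick one
without affecting the rest of the program.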

I think this, along with the memory mapping support, the threading
backend and other fixes, deserves to be tested by a broader community
of users, in particular as part of the scikit-learn project. Hence
I would like to do an alpha release of joblib as soon as possible and
embed it in the master branch of scikit-learn prior to the 0.15
release scheduled for January 2014.

The detailed list of changes since 0.7 is available here:

  https://github.com/joblib/joblib/blob/master/CHANGES.rst

Gilles (from scikit-learn) confirmed that the threading backend
works as expected when parallelizing the fit of large forests of
randomized trees. In my own experience, this completely fixes the
memory copy issue and also removes some of the pickling overhead
caused by the communication between the parent process and its child
workers.
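
To illustrate why threads avoid both the copy and the pickling, here
is a small sketch. Note that it uses the standard library's
ThreadPoolExecutor as a stand-in and is not joblib's implementation;
the helper names (column_sum, sums_with_threads) and the toy data are
made up for the example.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor


def column_sum(data, col):
    # Runs in a worker thread: `data` is the caller's object, not a copy.
    return id(data), sum(row[col] for row in data)


def sums_with_threads(data, n_cols, n_workers=2):
    # Submit one task per column; arguments are passed by reference.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(column_sum, data, c) for c in range(n_cols)]
        return [f.result() for f in futures]


data = [[1, 2], [3, 4], [5, 6]]
results = sums_with_threads(data, n_cols=2)

# Every worker saw the very same object as the parent.
shared = all(obj_id == id(data) for obj_id, _ in results)
totals = [total for _, total in results]

# A process-based backend has to serialize the arguments instead,
# which yields an equal but distinct object in the worker.
copy = pickle.loads(pickle.dumps(data))
```

Threads only help when the work releases the GIL for most of its run
time, so whether this is a win depends on the code being parallelized.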

Please let me know quickly if you have any objection to this plan.

Regards,

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
