On Fri, Jul 20, 2012 at 5:16 PM, Olivier Grisel <[email protected]> wrote:
>
>
> It depends on what you want to achieve. Some tasks in machine learning
> are embarrassingly parallel (e.g. grid searching optimal parameters with
> cross validation for model selection, or fitting random forests); others
> are not that easily parallelizable (e.g. fitting a model with stochastic
> gradient descent, since you need synchronization steps, a.k.a. inter-node
> communication, to average the parameters while learning); and others are
> not parallelizable at all (e.g. fitting a kernel SVM with the SMO
> algorithm, AFAIK).
>
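For the embarrassingly parallel case Olivier mentions, here is a minimal
sketch (dataset and parameter grid are made up for illustration; note
that GridSearchCV has moved between modules across scikit-learn versions,
the import below is the current location):

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)

# Every (parameter setting, CV fold) pair is an independent fit, so
# n_jobs=-1 fans the fits out across all available cores.
search = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1]},
    n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_)
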
A simple strategy is to train multiple SGD classifiers on different subsets
of the full training set and then combine them, e.g. as a weighted mixture
of the learned parameters or by a majority vote over their predictions:
http://www.ryanmcd.com/papers/efficient_maxentNIPS2009.pdf
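
Here is a minimal sketch of the majority-vote variant using scikit-learn's
SGDClassifier (shard count and dataset are made up for illustration; the
paper above mixes the learned parameters instead of voting over
predictions):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=10000, n_features=20, random_state=0)

# Train one SGDClassifier per disjoint shard of the training set.
n_shards = 5
clfs = []
for X_shard, y_shard in zip(np.array_split(X, n_shards),
                            np.array_split(y, n_shards)):
    clf = SGDClassifier(random_state=0)
    clf.fit(X_shard, y_shard)
    clfs.append(clf)

# Combine by majority vote over the per-shard predictions.
votes = np.array([clf.predict(X) for clf in clfs]).astype(int)
y_pred = np.apply_along_axis(
    lambda col: np.bincount(col).argmax(), axis=0, arr=votes)

Since the shards are disjoint, the per-shard fits are independent and can
run in parallel.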
Mathieu