Re: [Scikit-learn-general] Using sklearn in Hadoop

Robert Kern Mon, 04 Feb 2013 07:01:22 -0800

On Mon, Feb 4, 2013 at 2:50 PM, Nick Pentreath <[email protected]> wrote:
> @Robert sorry for the delay in responding, I was away on vacation.
>
> Here's a link to a gist of a very simple implementation of parallelized SGD
> using Spark (https://gist.github.com/4707012). It basically replicates the
> existing Spark logistic regression example, but using sklearn's linear_model
> module. However, the approach used is iterative parameter mixtures (where the
> local weight vectors are averaged and the resulting weight vector
> rebroadcast) as opposed to distributed gradient descent (where the local
> gradients are aggregated, a gradient step taken on the master and the weight
> vector rebroadcast) - see
> http://faculty.utpa.edu/reillycf/courses/CSCI6175-F11/papers/nips2010mannetal.pdf
> for some details.
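
For readers following along, the difference between the two schemes in the
quoted paragraph can be sketched in a few lines. This is only a toy
illustration under stated assumptions, not the code from the gist: it uses
no Spark and no sklearn, plain Python lists stand in for the partitions of a
distributed dataset, a hand-written logistic-regression SGD step stands in
for sklearn's linear_model, and every function name below (local_sgd,
local_gradient, parameter_mixing, distributed_gd, make_partitions, accuracy)
is hypothetical. It also assumes equally sized partitions, so a plain
average of local results is used in both schemes.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    """P(y = 1 | x) under a logistic model with weights w."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))


def local_sgd(w, rows, lr):
    """Worker step for parameter mixing: one SGD pass over a partition."""
    w = list(w)  # copy, so the broadcast weights are not mutated
    for x, y in rows:
        err = predict_proba(w, x) - y
        w = [wj - lr * err * xj for wj, xj in zip(w, x)]
    return w


def local_gradient(w, rows):
    """Worker step for distributed GD: mean gradient at fixed weights."""
    grad = [0.0] * len(w)
    for x, y in rows:
        err = predict_proba(w, x) - y
        grad = [gj + err * xj for gj, xj in zip(grad, x)]
    return [gj / len(rows) for gj in grad]


def average(vectors):
    """Element-wise mean of equally weighted vectors (the 'reduce' step)."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def parameter_mixing(partitions, dim, rounds=20, lr=0.1):
    """Average the locally trained weight vectors, then rebroadcast."""
    w = [0.0] * dim
    for _ in range(rounds):
        local_ws = [local_sgd(w, part, lr) for part in partitions]
        w = average(local_ws)
    return w


def distributed_gd(partitions, dim, rounds=20, lr=0.5):
    """Aggregate local gradients, take one step on the master, rebroadcast."""
    w = [0.0] * dim
    for _ in range(rounds):
        grads = [local_gradient(w, part) for part in partitions]
        g = average(grads)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    return w


def make_partitions(n_parts=4, rows_per_part=50, seed=0):
    """Synthetic 1-D data plus a bias feature; label is 1 when x1 > 0."""
    rng = random.Random(seed)
    parts = []
    for _ in range(n_parts):
        rows = []
        for _ in range(rows_per_part):
            x1 = rng.uniform(0.2, 1.0) * rng.choice([-1.0, 1.0])
            rows.append(([x1, 1.0], 1 if x1 > 0 else 0))
        parts.append(rows)
    return parts


def accuracy(w, partitions):
    """Fraction of rows classified correctly at a 0.5 threshold."""
    rows = [r for part in partitions for r in part]
    hits = sum((predict_proba(w, x) >= 0.5) == (y == 1) for x, y in rows)
    return hits / len(rows)


if __name__ == "__main__":
    parts = make_partitions()
    for name, fit in [("mixing", parameter_mixing), ("gradient", distributed_gd)]:
        w = fit(parts, dim=2)
        print(name, w, accuracy(w, parts))
```

The only structural difference is what each worker sends back: a full
weight vector after local training in the first case, a gradient evaluated
at the shared weights in the second. In a real Spark job the list
comprehensions over partitions would presumably become a map over the
distributed dataset followed by a reduce, with the weights broadcast each
round.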


Very cool. Thanks!

--
Robert Kern

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
