On Mon, Feb 4, 2013 at 2:50 PM, Nick Pentreath <[email protected]> wrote:
> @Robert sorry for the delay in responding, I was away on vacation.
>
> Here's a link to a gist of a very simple implementation of parallelized SGD
> using Spark (https://gist.github.com/4707012). It basically replicates the
> existing Spark logistic regression example, but using sklearn's linear_model
> module. However the approach used is iterative parameter mixtures (where the
> local weight vectors are averaged and the resulting weight vector
> rebroadcast) as opposed to distributed gradient descent (where the local
> gradients are aggregated, a gradient step taken on the master and the weight
> vector rebroadcast) - see
> http://faculty.utpa.edu/reillycf/courses/CSCI6175-F11/papers/nips2010mannetal.pdf
> for some details.
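
To make the distinction in the quoted message concrete, here is a minimal sketch of one round of each scheme for logistic regression. It is not the code from the gist: it assumes plain Python lists standing in for Spark partitions, uses hand-written SGD instead of sklearn's linear_model, and every function name, the synthetic data, and the learning rates are hypothetical choices made for illustration.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def gradient(w, rows):
    # Sum of log-loss gradients over one partition; labels are 0 or 1.
    g = [0.0] * len(w)
    for x, y in rows:
        err = predict(w, x) - y
        for j, xj in enumerate(x):
            g[j] += err * xj
    return g

def local_sgd(w, rows, lr):
    # One pass of plain SGD over a partition, starting from the broadcast w.
    w = list(w)
    for x, y in rows:
        err = predict(w, x) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def parameter_mixture_round(w, partitions, lr):
    # Each worker trains locally; the master averages the local weight vectors.
    local = [local_sgd(w, part, lr) for part in partitions]
    return [sum(col) / len(local) for col in zip(*local)]

def distributed_gradient_round(w, partitions, lr):
    # Each worker returns a gradient at the shared w; the master sums the
    # gradients and takes a single step on the mean.
    grads = [gradient(w, part) for part in partitions]
    n = sum(len(part) for part in partitions)
    total = [sum(col) for col in zip(*grads)]
    return [wi - lr * gi / n for wi, gi in zip(w, total)]

def accuracy(w, rows):
    hits = sum((predict(w, x) >= 0.5) == (y == 1) for x, y in rows)
    return hits / len(rows)

def make_data(n, seed=0):
    # Synthetic data (an assumption): two Gaussian blobs plus a bias feature.
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        c = 2.0 if y == 1 else -2.0
        rows.append(([rng.gauss(c, 1.0), rng.gauss(c, 1.0), 1.0], y))
    return rows

rows = make_data(400)
partitions = [rows[i::4] for i in range(4)]  # stand-in for 4 Spark partitions

w_mix = [0.0, 0.0, 0.0]
w_dgd = [0.0, 0.0, 0.0]
for _ in range(20):  # each iteration corresponds to one "rebroadcast"
    w_mix = parameter_mixture_round(w_mix, partitions, lr=0.1)
    w_dgd = distributed_gradient_round(w_dgd, partitions, lr=0.5)

print(accuracy(w_mix, rows), accuracy(w_dgd, rows))
```

In a Spark version, the list comprehension over partitions would presumably become a per-partition map followed by a reduce, with w shipped as a broadcast variable. The visible difference between the two schemes is what the workers send back: a full locally trained weight vector in the first case, a gradient evaluated at the shared weights in the second.
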
Very cool. Thanks!

-- 
Robert Kern

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
