On a slightly related note, I am trying to write a distributed PCA based on http://biglearn.org/2013/files/papers/biglearning2013_submission_18.pdf

The algorithm works by computing the SVD locally and then broadcasting the locally computed principal components. Does anyone have a recommendation for a Scala-native implementation of SVD?

C
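For concreteness, here is a rough sketch of the local-SVD-then-combine idea described above. It is not the paper's exact algorithm and not Spark code. It is plain Python with no dependencies, it assumes the data is already centered, and it swaps in power iteration on the Gram matrix X^T X for a real SVD routine (the right singular vectors of X are the eigenvectors of X^T X, and the singular values are the square roots of its eigenvalues). The names `gram`, `top_eigenpairs`, `local_summary` and `global_components` are made up for illustration.

```python
import math
import random


def gram(rows):
    """Return X^T X for a matrix given as a list of equal-length rows."""
    d = len(rows[0])
    g = [[0.0] * d for _ in range(d)]
    for r in rows:
        for i in range(d):
            for j in range(d):
                g[i][j] += r[i] * r[j]
    return g


def top_eigenpairs(sym, k, iters=200, seed=0):
    """Top-k (eigenvalue, unit eigenvector) pairs of a symmetric PSD matrix.

    Power iteration with deflation, used here as a stand-in for an SVD library.
    """
    rng = random.Random(seed)
    d = len(sym)
    a = [row[:] for row in sym]  # work on a copy; deflation mutates it
    pairs = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        lam = 0.0
        for _ in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # nothing left to extract
                lam = 0.0
                break
            v = [x / norm for x in w]
            lam = norm  # ||A v|| tends to the top eigenvalue for PSD A
        pairs.append((lam, v))
        # Deflate: subtract the component that was just found.
        for i in range(d):
            for j in range(d):
                a[i][j] -= lam * v[i] * v[j]
    return pairs


def local_summary(rows, k):
    """Per-node step: top-k right singular vectors scaled by singular values.

    Returns k rows of length d (the rows of S_k V_k^T); only these are sent.
    """
    pairs = top_eigenpairs(gram(rows), k)
    return [[math.sqrt(max(lam, 0.0)) * x for x in v] for lam, v in pairs]


def global_components(summaries, k):
    """Coordinator step: stack all local summaries and take their top-k directions."""
    stacked = [row for summary in summaries for row in summary]
    return [v for _, v in top_eigenpairs(gram(stacked), k)]
```

With the data split into `partitions` (each a list of rows), the global components would be `global_components([local_summary(p, k) for p in partitions], k)`. Each node ships only k vectors of length d rather than its full block of rows, which is the point of doing the SVD locally. The sign of each returned component is arbitrary.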
On Thu, Jan 2, 2014 at 7:06 PM, Ameet Talwalkar <[email protected]> wrote:

> Hi Deb,
>
> Thanks for your email. We currently do not have a DSGD implementation in
> MLlib. Also, just to clarify, DSGD is not a variant of ALS, but rather a
> different algorithm for solving the same bi-convex objective
> function.
>
> It would be a good thing to add, but to the best of my knowledge, no
> one is actively working on this right now.
>
> Also, as you mentioned, the ALS implementation in mllib is more
> robust/scalable than the one in spark.examples.
>
> -Ameet
>
>
> On Thu, Jan 2, 2014 at 3:16 PM, Debasish Das <[email protected]> wrote:
>
>> Hi,
>>
>> I am not noticing any DSGD implementation of ALS in Spark.
>>
>> There are two ALS implementations.
>>
>> org.apache.spark.examples.SparkALS does not run on large matrices and
>> seems more like demo code.
>>
>> org.apache.spark.mllib.recommendation.ALS looks like a more robust version
>> and I am experimenting with it.
>>
>> References here are Jellyfish, Twitter's implementation of Jellyfish
>> called Scalafish, a Google paper called Sparkler, and a similar idea put
>> forward in an IBM paper by Gemulla et al. (large-scale matrix factorization
>> with distributed stochastic gradient descent)
>>
>> https://github.com/azymnis/scalafish
>>
>> Are there any plans for adding DSGD to Spark, or are there any existing
>> JIRAs?
>>
>> Thanks.
>> Deb
>>
>>
>

--
- Charles
