On Fri, Jan 3, 2014 at 10:28 AM, Sebastian Schelter <[email protected]> wrote:
> > I wonder if anyone might have a recommendation on a Scala-native
> > implementation of SVD.
>
> Mahout has a Scala implementation of an SVD variant called Stochastic SVD:
>
> https://svn.apache.org/viewvc/mahout/trunk/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/SSVD.scala?view=markup

Mahout also has SVD and eigendecompositions mapped to Scala as svd() and
eigen(). Unfortunately I have not put this on the wiki yet, but a summary is
available here:
https://issues.apache.org/jira/browse/MAHOUT-1297

Mahout also has a distributed PCA implementation (which is based on
distributed Stochastic SVD and has special provisions for sparse matrix
cases). Unfortunately our wiki is in flux right now due to the migration from
Confluence to CMS, and the SSVD page has not been migrated yet, so the
Confluence version is here:
https://cwiki.apache.org/confluence/display/MAHOUT/Stochastic+Singular+Value+Decomposition

> Otherwise, all the major Java math libraries (mahout math, jblas,
> commons-math) should provide an implementation that you can use in Scala.
>
> --sebastian
>
> > C
> >
> > On Thu, Jan 2, 2014 at 7:06 PM, Ameet Talwalkar <[email protected]> wrote:
> >
> >> Hi Deb,
> >>
> >> Thanks for your email. We currently do not have a DSGD implementation in
> >> MLlib. Also, just to clarify, DSGD is not a variant of ALS, but rather a
> >> different algorithm for solving the same bi-convex objective function.
> >>
> >> It would be a good thing to add, but to the best of my knowledge, no
> >> one is actively working on this right now.
> >>
> >> Also, as you mentioned, the ALS implementation in mllib is more
> >> robust/scalable than the one in spark.examples.
> >>
> >> -Ameet
> >>
> >>
> >> On Thu, Jan 2, 2014 at 3:16 PM, Debasish Das <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> I am not noticing any DSGD implementation of ALS in Spark.
> >>>
> >>> There are two ALS implementations.
> >>>
> >>> org.apache.spark.examples.SparkALS does not run on large matrices and
> >>> seems more like demo code.
> >>>
> >>> org.apache.spark.mllib.recommendation.ALS looks like a more robust
> >>> version, and I am experimenting with it.
> >>>
> >>> References here are Jellyfish, Twitter's implementation of Jellyfish
> >>> called Scalafish, a Google paper called Sparkler, and a similar idea put
> >>> forward in an IBM paper by Gemulla et al. (large-scale matrix
> >>> factorization with distributed stochastic gradient descent).
> >>>
> >>> https://github.com/azymnis/scalafish
> >>>
> >>> Are there any plans to add DSGD to Spark, or is there an existing
> >>> JIRA?
> >>>
> >>> Thanks.
> >>> Deb
