Re: Spark Matrix Factorization

Charles Earl Fri, 03 Jan 2014 10:18:06 -0800

On a slightly related note, I am trying to write a distributed PCA based on
http://biglearn.org/2013/files/papers/biglearning2013_submission_18.pdf
The algorithm works by computing an SVD locally and then broadcasting the
locally computed principal components.
Can anyone recommend a native Scala implementation of SVD?
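
For illustration, the local-then-combine idea can be sketched roughly as
follows. This is only one possible reading, not the method from the paper:
it assumes the rows are already mean-centered, keeps a single component per
partition, uses plain power iteration as a stand-in for a real SVD routine,
and assumes the combine step stacks the scaled local components and
decomposes them again. It is written in dependency-free Python rather than
Scala, and the names top_component and distributed_top_component are
hypothetical.

```python
import math


def top_component(rows, iters=200):
    """Leading singular value and right singular vector of `rows`.

    Uses power iteration on rows^T * rows as a stand-in for a full SVD.
    """
    d = len(rows[0])
    v = [1.0 / math.sqrt(d)] * d  # deterministic starting vector
    for _ in range(iters):
        # One multiplication by A^T A, done as A^T (A v).
        av = [sum(r[j] * v[j] for j in range(d)) for r in rows]
        w = [sum(rows[i][j] * av[i] for i in range(len(rows)))
             for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v  # all-zero input: no direction to report
        v = [x / norm for x in w]
    av = [sum(r[j] * v[j] for j in range(d)) for r in rows]
    sigma = math.sqrt(sum(x * x for x in av))  # sigma = ||A v||
    return sigma, v


def distributed_top_component(partitions):
    """Combine per-partition components into one global direction.

    Assumed combine step: each partition is summarized by the single
    vector sigma * v, and the stacked summaries are decomposed again.
    """
    summaries = []
    for part in partitions:  # in Spark this loop would run per partition
        sigma, v = top_component(part)
        summaries.append([sigma * x for x in v])
    _, v = top_component(summaries)
    return v


# Toy data: two centered partitions that both vary mostly along (3, 4).
partitions = [
    [[3.0, 4.0], [-3.0, -4.0], [6.0, 8.0], [-6.0, -8.0],
     [0.4, -0.3], [-0.4, 0.3]],
    [[1.5, 2.0], [-1.5, -2.0]],
]
direction = distributed_top_component(partitions)
```

On this toy input the recovered direction should be parallel to (3, 4). With
more than one component per partition, top_component would have to be
replaced by a proper truncated SVD.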
C





On Thu, Jan 2, 2014 at 7:06 PM, Ameet Talwalkar <[email protected]> wrote:

> Hi Deb,
>
> Thanks for your email.  We currently do not have a DSGD implementation in
> MLlib. Also, just to clarify, DSGD is not a variant of ALS, but rather a
> different algorithm for solving the same bi-convex objective
> function.
>
> It would be a good thing to add, but to the best of my knowledge, no
> one is actively working on this right now.
>
> Also, as you mentioned, the ALS implementation in mllib is more
> robust/scalable than the one in spark.examples.
>
> -Ameet
>
>
> On Thu, Jan 2, 2014 at 3:16 PM, Debasish Das <[email protected]> wrote:
>
>> Hi,
>>
>> I do not see any DSGD implementation of ALS in Spark.
>>
>> There are two ALS implementations.
>>
>> org.apache.spark.examples.SparkALS does not run on large matrices and
>> seems more like demo code.
>>
>> org.apache.spark.mllib.recommendation.ALS looks like a more robust
>> version, and I am experimenting with it.
>>
>> The references here are Jellyfish, Twitter's implementation of Jellyfish
>> called Scalafish, the Google paper called Sparkler, and a similar idea put
>> forward in the IBM paper by Gemulla et al. (large-scale matrix
>> factorization with distributed stochastic gradient descent).
>>
>> https://github.com/azymnis/scalafish
>>
>> Are there any plans to add DSGD to Spark, or is there an existing
>> JIRA?
>>
>> Thanks.
>> Deb
>>
>>
>


-- 
- Charles
