Hi Ameet,

Matrix factorization is a non-convex problem; ALS tackles it by alternating
between two convex least-squares subproblems, while DSGD runs stochastic
gradient descent and converges to a local minimum.
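
Roughly, the alternation looks like this (a toy rank-1 sketch on a small dense
in-memory matrix, only to show the two convex solves; MLlib's ALS handles
general rank, regularization, and sparse data, and all values below are made
up):

// Toy rank-1 ALS, purely to illustrate alternating between two convex
// least-squares subproblems; the data and iteration count are arbitrary.
object ToyALS {
  def main(args: Array[String]): Unit = {
    // R(i)(j): observed ratings (dense here for simplicity)
    val R = Array(
      Array(5.0, 3.0, 1.0),
      Array(4.0, 2.0, 1.0),
      Array(1.0, 1.0, 5.0)
    )
    val m = R.length
    val n = R(0).length

    // Rank-1 factors: R ~ u * v^T
    val u = Array.fill(m)(1.0)
    val v = Array.fill(n)(1.0)

    for (iter <- 1 to 20) {
      // Fix v, solve for each u(i): a one-dimensional least-squares problem (convex)
      for (i <- 0 until m)
        u(i) = (0 until n).map(j => R(i)(j) * v(j)).sum / v.map(x => x * x).sum
      // Fix u, solve for each v(j): again convex
      for (j <- 0 until n)
        v(j) = (0 until m).map(i => R(i)(j) * u(i)).sum / u.map(x => x * x).sum
    }

    // Reconstruction error after the alternating updates
    val err = (for (i <- 0 until m; j <- 0 until n)
      yield math.pow(R(i)(j) - u(i) * v(j), 2)).sum
    println(s"squared error: $err")
  }
}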

I am experimenting with Spark's parallel ALS, but I intend to port Scalafish
(https://github.com/azymnis/scalafish) to Spark as well.
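
For reference, this is roughly how I am driving the MLlib ALS (the input path,
rank, iteration count, and lambda below are placeholders, not my actual
settings):

// Minimal sketch of calling org.apache.spark.mllib.recommendation.ALS;
// file name and parameter values are illustrative only.
import org.apache.spark.SparkContext
import org.apache.spark.mllib.recommendation.{ALS, Rating}

object MLlibALSSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[4]", "ALSSketch")

    // Expect lines of the form "user,product,rating" (hypothetical file)
    val ratings = sc.textFile("ratings.csv").map { line =>
      val Array(u, p, r) = line.split(',')
      Rating(u.toInt, p.toInt, r.toDouble)
    }

    // rank = 10, iterations = 20, lambda = 0.01, picked arbitrarily for the sketch
    val model = ALS.train(ratings, 10, 20, 0.01)

    // Predict a single (user, product) pair
    println(model.predict(1, 1))

    sc.stop()
  }
}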

For bigger matrices the jury is still out on which algorithm reaches a better
local optimum within a fixed iteration budget. I believe it is also highly
dependent on the dataset.
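
For anyone curious about the DSGD side, here is a rough single-machine sketch
of the stratification idea from the Gemulla et al. paper (block scheduling
only; in the real distributed version the blocks of a stratum run on different
workers, and every name and value below is made up):

// Hash users/items into d groups; in each sub-epoch run SGD on a stratum of
// d blocks that touch disjoint users and items, so their updates would not
// conflict if run in parallel. Processed sequentially here for illustration.
object DSGDSketch {
  case class Entry(user: Int, item: Int, rating: Double)

  def main(args: Array[String]): Unit = {
    val d = 3                      // number of row/column groups
    val rank = 5
    val lr = 0.01                  // learning rate
    val reg = 0.05                 // L2 regularization

    val entries: Seq[Entry] = Seq( // tiny made-up dataset
      Entry(0, 0, 5.0), Entry(0, 4, 1.0), Entry(1, 1, 4.0),
      Entry(2, 3, 2.0), Entry(3, 2, 3.0), Entry(4, 5, 5.0)
    )
    val numUsers = entries.map(_.user).max + 1
    val numItems = entries.map(_.item).max + 1

    val rnd = new scala.util.Random(42)
    val U = Array.fill(numUsers, rank)(rnd.nextDouble() * 0.1)
    val V = Array.fill(numItems, rank)(rnd.nextDouble() * 0.1)

    // Block (rowGroup, colGroup) for each entry
    def block(e: Entry) = (e.user % d, e.item % d)
    val blocks = entries.groupBy(block)

    for (epoch <- 1 to 10; s <- 0 until d) {
      // Stratum s: blocks (b, (b + s) % d) cover disjoint users and items
      for (b <- 0 until d; e <- blocks.getOrElse((b, (b + s) % d), Seq.empty)) {
        val pred = (0 until rank).map(k => U(e.user)(k) * V(e.item)(k)).sum
        val err = e.rating - pred
        for (k <- 0 until rank) {
          val uk = U(e.user)(k)
          U(e.user)(k) += lr * (err * V(e.item)(k) - reg * uk)
          V(e.item)(k) += lr * (err * uk - reg * V(e.item)(k))
        }
      }
    }
    println("done")
  }
}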

Thanks.
Deb



On Thu, Jan 2, 2014 at 4:06 PM, Ameet Talwalkar <[email protected]> wrote:

> Hi Deb,
>
> Thanks for your email.  We currently do not have a DSGD implementation in
> MLlib. Also, just to clarify, DSGD is not a variant of ALS, but rather a
> different algorithm for solving the same bi-convex objective
> function.
>
> It would be a good thing to add, but to the best of my knowledge, no
> one is actively working on this right now.
>
> Also, as you mentioned, the ALS implementation in mllib is more
> robust/scalable than the one in spark.examples.
>
> -Ameet
>
>
> On Thu, Jan 2, 2014 at 3:16 PM, Debasish Das <[email protected]> wrote:
>
>> Hi,
>>
>> I do not see any DSGD implementation of ALS in Spark.
>>
>> There are two ALS implementations.
>>
>> org.apache.spark.examples.SparkALS does not run on large matrices and
>> seems more like demo code.
>>
>> org.apache.spark.mllib.recommendation.ALS looks like a more robust version,
>> and I am experimenting with it.
>>
>> The references here are Jellyfish, Twitter's implementation of Jellyfish
>> called Scalafish, the Google paper on Sparkler, and a similar idea put
>> forward in the IBM paper by Gemulla et al. (Large-Scale Matrix Factorization
>> with Distributed Stochastic Gradient Descent)
>>
>> https://github.com/azymnis/scalafish
>>
>> Are there any plans to add DSGD to Spark, or is there an existing JIRA?
>>
>> Thanks.
>> Deb
>>
>>
>
