[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191551#comment-14191551 ]
Debasish Das commented on SPARK-2426:
-------------------------------------

[~mengxr] The Matlab comparison scripts are open sourced here:
https://github.com/debasish83/ecos/blob/master/matlab/admm/qprandom.m
https://github.com/debasish83/ecos/blob/master/matlab/pdco4/code/pdcotestQP.m

The detailed comparisons are in the README.md; please look at the section on Matlab comparisons. In a nutshell: for bounds, MOSEK and ADMM are similar; for elastic net, Proximal is 10x faster than MOSEK; for equality, MOSEK is 2-3x faster than Proximal, but both PDCO and ECOS produce much worse results than ADMM. Accelerated ADMM also did not work as well as default ADMM. Increasing the over-relaxation parameter helped accelerated ADMM, but I have not explored that further yet. ADMM and PDCO are pure Matlab, while ECOS and MOSEK both use mex files, so they are expected to be more efficient.

Next I will add the performance results of running positivity, box, sparse coding / regularized LSI, and robust PLSA on the MovieLens dataset, and validate the product recommendations using the MAP measure. In terms of RMSE, default < positive < sparse coding. What are the largest datasets the LDA PRs are running on?
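For context on why the proximal approach wins on elastic net: the elastic-net penalty has a closed-form proximal operator (soft-thresholding plus L2 shrinkage), so each inner step is O(n) instead of a full interior-point QP solve. A minimal pure-Python sketch; the function name and signature are mine for illustration, not from the PR:

```python
def prox_elastic_net(v, step, lam, beta):
    """Proximal operator of lam * (beta*||x||_1 + (1-beta)/2*||x||_2^2).

    Soft-threshold for the L1 part, then shrink for the L2 part.
    This closed form is what lets a proximal method skip the full
    QP solve that an interior-point solver like MOSEK performs.
    """
    l1 = step * lam * beta          # effective L1 threshold
    l2 = step * lam * (1.0 - beta)  # effective L2 shrinkage
    out = []
    for vi in v:
        # soft-threshold: sign(vi) * max(|vi| - l1, 0)
        s = max(abs(vi) - l1, 0.0)
        s = s if vi >= 0 else -s
        # L2 shrinkage
        out.append(s / (1.0 + l2))
    return out
```

With lambda=1.0 and beta=0.99 (the settings used in the run below), entries with magnitude under 0.99 are zeroed, which is what drives the sparsity in the factors.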
I would like to try that on sparse coding as well. From these papers, sparse coding / RLSI should give results on par with LDA:
https://www.cs.cmu.edu/~xichen/images/SLSA-sdm11-final.pdf
http://web.stanford.edu/group/mmds/slides2012/s-hli.pdf

The same randomized matrices can be generated and run in the PR as follows:

./bin/spark-class org.apache.spark.mllib.optimization.QuadraticMinimizer 1000 1 1.0 0.99

rank=1000, equality=1.0, lambda=1.0, beta=0.99
L1 regularization = lambda*beta
L2 regularization = lambda*(1-beta)

Generating randomized QPs with rank 1000 equalities 1
sparseQp      88.423 ms  iterations 45    converged true
posQp        181.369 ms  iterations 121   converged true
boundsQp     175.733 ms  iterations 121   converged true
Qp Equality 2805.564 ms  iterations 2230  converged true

> Quadratic Minimization for MLlib ALS
> ------------------------------------
>
> Key: SPARK-2426
> URL: https://issues.apache.org/jira/browse/SPARK-2426
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Affects Versions: 1.0.0
> Reporter: Debasish Das
> Assignee: Debasish Das
> Original Estimate: 504h
> Remaining Estimate: 504h
>
> Current ALS supports least squares and nonnegative least squares.
> I presented ADMM and IPM based Quadratic Minimization solvers to be used for the following ALS problems:
> 1. ALS with bounds
> 2. ALS with L1 regularization
> 3. ALS with Equality constraint and bounds
> Initial runtime comparisons are presented at Spark Summit:
> http://spark-summit.org/2014/talk/quadratic-programing-solver-for-non-negative-matrix-factorization-with-spark
> Based on Xiangrui's feedback I am currently comparing the ADMM based Quadratic Minimization solvers with IPM based QpSolvers and the default ALS/NNLS. I will keep updating the runtime comparison results.
> For integration the detailed plan is as follows:
> 1. Add QuadraticMinimizer and Proximal algorithms in mllib.optimization
> 2.
> Integrate QuadraticMinimizer in mllib ALS
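The "ALS with bounds" case above (the boundsQp row in the benchmark) reduces each per-factor subproblem to a box-constrained QP, min 0.5 x'Qx + c'x s.t. lo <= x <= hi. A minimal pure-Python sketch of the standard ADMM splitting for that problem, under my own naming; this is illustrative only, not the code in the PR:

```python
def solve(A, b):
    """Tiny dense linear solve via Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def admm_box_qp(Q, c, lo, hi, rho=1.0, iters=200):
    """ADMM for min 0.5 x'Qx + c'x  s.t.  lo <= x <= hi.

    x-update: solve (Q + rho*I) x = -c + rho*(z - u)
    z-update: project x + u onto the box (clipping, i.e. the proximal
              step of the box indicator)
    u-update: scaled dual ascent.
    """
    n = len(c)
    x, z, u = [0.0] * n, [0.0] * n, [0.0] * n
    A = [[Q[i][j] + (rho if i == j else 0.0) for j in range(n)] for i in range(n)]
    for _ in range(iters):
        rhs = [-c[i] + rho * (z[i] - u[i]) for i in range(n)]
        x = solve(A, rhs)
        z = [min(max(x[i] + u[i], lo[i]), hi[i]) for i in range(n)]
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z  # z is always box-feasible
```

For example, with Q = 2I and c = (-2, -2) the unconstrained minimizer is (1, 1); with box [0, 0.5] the iterate z converges to (0.5, 0.5). Note that the factorization of Q + rho*I can be cached across iterations, which is one reason ADMM is competitive with interior-point solvers on these subproblems.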