[ https://issues.apache.org/jira/browse/SPARK-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191551#comment-14191551 ]
Debasish Das commented on SPARK-2426:
-------------------------------------

[~mengxr] The Matlab comparison scripts are open sourced here:
https://github.com/debasish83/ecos/blob/master/matlab/admm/qprandom.m
https://github.com/debasish83/ecos/blob/master/matlab/pdco4/code/pdcotestQP.m

The detailed comparisons are in the README.md; please look at the section on Matlab comparisons. In a nutshell: for bounds, MOSEK and ADMM are similar; for elastic net, Proximal is 10x faster than MOSEK; for equality, MOSEK is 2-3x faster than Proximal, but both PDCO and ECOS produce much worse results than ADMM. Accelerated ADMM also did not work as well as default ADMM. Increasing the over-relaxation parameter helped accelerated ADMM, but I have not explored that further yet. ADMM and PDCO are pure Matlab, while ECOS and MOSEK both use mex files, so they are expected to be more efficient.

Next I will add the performance results of running positivity, box, sparse coding / regularized LSI, and robust PLSA on the MovieLens dataset, and validate the product recommendations using the MAP measure. In terms of RMSE, default < positive < sparse coding. What are the largest datasets the LDA PRs are running on?
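For context on why the proximal approach wins on elastic net: the elastic-net penalty has a closed-form proximal operator (soft-thresholding plus L2 shrinkage), so each inner step is O(n) instead of a full interior-point QP solve. A minimal pure-Python sketch; the function name and signature are mine for illustration, not from the PR:

```python
def prox_elastic_net(v, step, lam, beta):
    """Proximal operator of lam * (beta*||x||_1 + (1-beta)/2*||x||_2^2).

    Soft-threshold for the L1 part, then shrink for the L2 part.
    This closed form is what lets a proximal method skip the full
    QP solve that an interior-point solver like MOSEK performs.
    """
    l1 = step * lam * beta          # effective L1 threshold
    l2 = step * lam * (1.0 - beta)  # effective L2 shrinkage
    out = []
    for vi in v:
        # soft-threshold: sign(vi) * max(|vi| - l1, 0)
        s = max(abs(vi) - l1, 0.0)
        s = s if vi >= 0 else -s
        # L2 shrinkage
        out.append(s / (1.0 + l2))
    return out
```

With lambda=1.0 and beta=0.99 (the settings used in the run below), entries with magnitude under 0.99 are zeroed, which is what drives the sparsity in the factors.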
I would like to try that on sparse coding as well. From these papers, sparse coding / RLSI should give results on par with LDA:
https://www.cs.cmu.edu/~xichen/images/SLSA-sdm11-final.pdf
http://web.stanford.edu/group/mmds/slides2012/s-hli.pdf

The same randomized matrices can be generated and run in the PR as follows:

./bin/spark-class org.apache.spark.mllib.optimization.QuadraticMinimizer 1000 1 1.0 0.99

rank=1000, equality=1.0, lambda=1.0, beta=0.99
L1 regularization = lambda*beta
L2 regularization = lambda*(1-beta)

Generating randomized QPs with rank 1000 equalities 1
sparseQp      88.423 ms  iterations 45    converged true
posQp        181.369 ms  iterations 121   converged true
boundsQp     175.733 ms  iterations 121   converged true
Qp Equality 2805.564 ms  iterations 2230  converged true

> Quadratic Minimization for MLlib ALS
> ------------------------------------
>
> Key: SPARK-2426
> URL: https://issues.apache.org/jira/browse/SPARK-2426
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Affects Versions: 1.0.0
> Reporter: Debasish Das
> Assignee: Debasish Das
> Original Estimate: 504h
> Remaining Estimate: 504h
>
> Current ALS supports least squares and nonnegative least squares.
> I presented ADMM and IPM based Quadratic Minimization solvers to be used for the following ALS problems:
> 1. ALS with bounds
> 2. ALS with L1 regularization
> 3. ALS with Equality constraint and bounds
> Initial runtime comparisons are presented at Spark Summit:
> http://spark-summit.org/2014/talk/quadratic-programing-solver-for-non-negative-matrix-factorization-with-spark
> Based on Xiangrui's feedback I am currently comparing the ADMM based Quadratic Minimization solvers with IPM based QpSolvers and the default ALS/NNLS. I will keep updating the runtime comparison results.
> For integration the detailed plan is as follows:
> 1. Add QuadraticMinimizer and Proximal algorithms in mllib.optimization
> 2.
> Integrate QuadraticMinimizer in mllib ALS
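The "ALS with bounds" case above (the boundsQp row in the benchmark) reduces each per-factor subproblem to a box-constrained QP, min 0.5 x'Qx + c'x s.t. lo <= x <= hi. A minimal pure-Python sketch of the standard ADMM splitting for that problem, under my own naming; this is illustrative only, not the code in the PR:

```python
def solve(A, b):
    """Tiny dense linear solve via Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def admm_box_qp(Q, c, lo, hi, rho=1.0, iters=200):
    """ADMM for min 0.5 x'Qx + c'x  s.t.  lo <= x <= hi.

    x-update: solve (Q + rho*I) x = -c + rho*(z - u)
    z-update: project x + u onto the box (clipping, i.e. the proximal
              step of the box indicator)
    u-update: scaled dual ascent.
    """
    n = len(c)
    x, z, u = [0.0] * n, [0.0] * n, [0.0] * n
    A = [[Q[i][j] + (rho if i == j else 0.0) for j in range(n)] for i in range(n)]
    for _ in range(iters):
        rhs = [-c[i] + rho * (z[i] - u[i]) for i in range(n)]
        x = solve(A, rhs)
        z = [min(max(x[i] + u[i], lo[i]), hi[i]) for i in range(n)]
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z  # z is always box-feasible
```

For example, with Q = 2I and c = (-2, -2) the unconstrained minimizer is (1, 1); with box [0, 0.5] the iterate z converges to (0.5, 0.5). Note that the factorization of Q + rho*I can be cached across iterations, which is one reason ADMM is competitive with interior-point solvers on these subproblems.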