[ 
https://issues.apache.org/jira/browse/SPARK-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuo Xiang updated SPARK-1543:
------------------------------

    Description: 
This PR introduces the Alternating Direction Method of Multipliers (ADMM) for 
solving Lasso (elastic net, in fact) in mllib. 

ADMM is capable of solving a class of composite minimization problems in a 
distributed way. Specifically for Lasso (if only L1-regularization) or 
elastic-net (both L1- and L2- regularization), in each iteration, it requires 
solving independent systems of linear equations on each partition and a 
subsequent soft-threholding operation on the driver machine. Unlike SGD, it is 
a deterministic algorithm (except for the random partition). Details can be 
found in the [S. Boyd's 
paper](http://www.stanford.edu/~boyd/papers/admm_distr_stats.html).

The linear algebra operations mainly rely on the Breeze library, particularly, 
it applies `breeze.linalg.cholesky` to perform cholesky decomposition on each 
partition to solve the linear system.

I tried to follow the organization of existing Lasso implementation. However, 
as ADMM is also a good fit for similar optimization problems, e.g., (sparse) 
logistic regression, it may be worth reorganizing and putting ADMM into a 
separate section.


  was:
This PR introduces the Alternating Direction Method of Multipliers (ADMM) for 
solving Lasso (elastic net, in fact) in mllib. 

ADMM is capable of solving a class of composite minimization problems in a 
distributed way. Specifically for Lasso (if only L1-regularization) or 
elastic-net (both L1- and L2- regularization), it requires solving independent 
systems of linear equations on each partition and a soft-threholding operation 
on the driver. Unlike SGD, it is a deterministic algorithm (except for the 
random partition). Details can be found in the [S. Boyd's 
paper](http://www.stanford.edu/~boyd/papers/admm_distr_stats.html).

The linear algebra operations mainly rely on the Breeze library, particularly, 
it applies `breeze.linalg.cholesky` to perform cholesky decomposition on each 
partition to solve the linear system.

I tried to follow the organization of existing Lasso implementation. However, 
as ADMM is also a good fit for similar optimization problems, e.g., (sparse) 
logistic regression, it may worth to re-organize and put ADMM into a separate 
section.

PR: https://github.com/apache/spark/pull/458



> Add ADMM for solving Lasso (and elastic net) problem
> ----------------------------------------------------
>
>                 Key: SPARK-1543
>                 URL: https://issues.apache.org/jira/browse/SPARK-1543
>             Project: Spark
>          Issue Type: New Feature
>            Reporter: Shuo Xiang
>            Priority: Minor
>              Labels: features
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> This PR introduces the Alternating Direction Method of Multipliers (ADMM) for 
> solving Lasso (elastic net, in fact) in mllib. 
> ADMM is capable of solving a class of composite minimization problems in a 
> distributed way. Specifically for Lasso (if only L1-regularization) or 
> elastic-net (both L1- and L2- regularization), in each iteration, it requires 
> solving independent systems of linear equations on each partition and a 
> subsequent soft-threholding operation on the driver machine. Unlike SGD, it 
> is a deterministic algorithm (except for the random partition). Details can 
> be found in the [S. Boyd's 
> paper](http://www.stanford.edu/~boyd/papers/admm_distr_stats.html).
> The linear algebra operations mainly rely on the Breeze library, 
> particularly, it applies `breeze.linalg.cholesky` to perform cholesky 
> decomposition on each partition to solve the linear system.
> I tried to follow the organization of existing Lasso implementation. However, 
> as ADMM is also a good fit for similar optimization problems, e.g., (sparse) 
> logistic regression, it may be worth reorganizing and putting ADMM into a 
> separate section.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to