[ 
https://issues.apache.org/jira/browse/MAHOUT-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134388#comment-13134388
 ] 

Jonathan Traupman commented on MAHOUT-672:
------------------------------------------

> Jonathan, what size inputs have you run this on, with what running time in 
> comparison to the other algorithms we have? From what I can see, this looks 
> good to commit as well.

The largest dataset I've run it on was a synthetic set of about the same size 
as the old distributed Lanczos test matrix (before it was made smaller to speed 
up the tests). I ended up using a smaller matrix in the test because it was 
taking too long to run. I haven't tested it on truly huge matrices, but I can 
do that at some point if people are interested.

As for comparisons to other algorithms, both CG and LSMR ran faster than the QR 
decomp/solve that's already in Mahout on most of the test inputs I was playing 
with. They also ran faster than a Cholesky decomp/solve that I had implemented 
but haven't submitted. I don't have numbers on me right now, though. 

Between the two, LSMR often converges faster than CG, and when it does, it's 
faster. For problems where they both take the same number of steps, performance 
is about the same, with CG sometimes being a bit quicker since it does less 
computation per iteration. The big advantage of the CG implementation is that 
it can use m/r, so should be scalable to larger matrices.

                
> Implementation of Conjugate Gradient for solving large linear systems
> ---------------------------------------------------------------------
>
>                 Key: MAHOUT-672
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-672
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Math
>    Affects Versions: 0.5
>            Reporter: Jonathan Traupman
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: 0001-MAHOUT-672-LSMR-iterative-linear-solver.patch, 
> 0001-MAHOUT-672-LSMR-iterative-linear-solver.patch, MAHOUT-672.patch, 
> mahout-672-111023.patch, mahout-672.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> This patch contains an implementation of conjugate gradient, an iterative 
> algorithm for solving large linear systems. In particular, it is well suited 
> for large sparse systems where a traditional QR or Cholesky decomposition is 
> infeasible. Conjugate gradient only works for matrices that are square, 
> symmetric, and positive definite (basically the same types where Cholesky 
> decomposition is applicable). Systems like these commonly occur in statistics 
> and machine learning problems (e.g. regression). 
> Both a standard (in memory) solver and a distributed hadoop-based solver 
> (basically the standard solver run using a DistributedRowMatrix a la 
> DistributedLanczosSolver) are included.
> There is already a version of this algorithm in taste package, but it doesn't 
> operate on standard mahout matrix/vector objects, nor does it implement a 
> distributed version. I believe this implementation will be more generically 
> useful to the community than the specialized one in taste.
> This implementation solves the following types of systems:
> Ax = b, where A is square, symmetric, and positive definite
> A'Ax = b where A is arbitrary but A'A is positive definite. Directly solving 
> this system is more efficient than computing A'A explicitly then solving.
> (A + lambda * I)x = b and (A'A + lambda * I)x = b, for systems where A or A'A 
> is singular and/or not full rank. This occurs commonly if A is large and 
> sparse. Solving a system of this form is used, for example, in ridge 
> regression.
> In addition to the normal conjugate gradient solver, this implementation also 
> handles preconditioning, and has a sample Jacobi preconditioner included as 
> an example. More work will be needed to build more advanced preconditioners 
> if desired.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to