[ https://issues.apache.org/jira/browse/MAHOUT-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134327#comment-13134327 ]

Dmitriy Lyubimov commented on MAHOUT-672:
-----------------------------------------


bq. Basically, at the core of the CG solver is a matrix/vector multiply, so we 
get the map/reduce implementation by using a DistributedRowMatrix instead of an 
in-memory matrix. Since we do one matrix/vector multiply per iteration, it will 
require one map/reduce job per iteration, which somewhat limits its performance 
– there's a large range of data sizes that could benefit from distributed 
computation but that get bogged down by Hadoop's slow job setup/teardown. 
Essentially, we're looking at many of the same tradeoffs we have with the 
distributed Lanczos decomposition stuff.

Thank you, Jonathan.

Yeah, so I figured; that's my concern. That's the Achilles' heel of much of the 
distributed stuff in Mahout. I.e. the number of iterations (I feel) must stay 
very close to O(1), otherwise it severely hurts running time. Even using side 
information seems not that painful compared to iteration growth. That severely 
limits practical use.
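
To make the per-iteration cost concrete, here is roughly the shape of the CG 
loop (a plain-array sketch under my own assumptions, not the actual patch 
code): the single mat-vec in each pass is exactly what turns into one full 
map/reduce job when A is a DistributedRowMatrix.

{code:java}
// Plain CG for SPD A. One matrix/vector product per iteration ("ap" below);
// in the distributed solver that product is one map/reduce pass over A.
static double[] cg(double[][] a, double[] b, int maxIter, double tol) {
  int n = b.length;
  double[] x = new double[n];              // start from x0 = 0
  double[] r = b.clone();                  // r0 = b - A*x0 = b
  double[] p = r.clone();
  double rsOld = dot(r, r);
  for (int k = 0; k < maxIter && Math.sqrt(rsOld) > tol; k++) {
    double[] ap = times(a, p);             // the one mat-vec per iteration
    double alpha = rsOld / dot(p, ap);
    for (int i = 0; i < n; i++) { x[i] += alpha * p[i]; r[i] -= alpha * ap[i]; }
    double rsNew = dot(r, r);
    for (int i = 0; i < n; i++) { p[i] = r[i] + (rsNew / rsOld) * p[i]; }
    rsOld = rsNew;
  }
  return x;
}

static double dot(double[] u, double[] v) {
  double s = 0;
  for (int i = 0; i < u.length; i++) { s += u[i] * v[i]; }
  return s;
}

static double[] times(double[][] a, double[] v) {
  double[] out = new double[a.length];
  for (int i = 0; i < a.length; i++) { out[i] = dot(a[i], v); }
  return out;
}
{code}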

My thinking is that we need to hold the algorithms we recommend accountable to 
some standard. My understanding is that there's a similar problem with the 
ALS-WR implementation right now: we can have it in the codebase, but Sebastien 
stops short of recommending it to folks on the list.

That's kind of the problem I touched on recently: Mahout work is different in 
the sense that it requires a deeper investigation of the best parallelization 
strategy than a just-throw-it-at-it approach. Even with matrix multiplication 
there are formulations that increase computation time tenfold compared to the 
DRM approach for cases that are less than general but surprisingly common in 
practice (the multiplication steps in SSVD are an example). Hadoop sorting is 
not as inexpensive as its pitch suggests.

Anyway, I am OK with committing this with @Maturity(Experimental) until we 
confirm running time on some input. I will probably have to check this out 
soon anyway; I may have a use case for it.

                
> Implementation of Conjugate Gradient for solving large linear systems
> ---------------------------------------------------------------------
>
>                 Key: MAHOUT-672
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-672
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Math
>    Affects Versions: 0.5
>            Reporter: Jonathan Traupman
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: 0001-MAHOUT-672-LSMR-iterative-linear-solver.patch, 
> 0001-MAHOUT-672-LSMR-iterative-linear-solver.patch, MAHOUT-672.patch, 
> mahout-672-111023.patch, mahout-672.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> This patch contains an implementation of conjugate gradient, an iterative 
> algorithm for solving large linear systems. In particular, it is well suited 
> for large sparse systems where a traditional QR or Cholesky decomposition is 
> infeasible. Conjugate gradient only works for matrices that are square, 
> symmetric, and positive definite (basically the same types where Cholesky 
> decomposition is applicable). Systems like these commonly occur in statistics 
> and machine learning problems (e.g. regression). 
> Both a standard (in-memory) solver and a distributed Hadoop-based solver 
> (basically the standard solver run using a DistributedRowMatrix, a la 
> DistributedLanczosSolver) are included.
>
> There is already a version of this algorithm in the taste package, but it 
> doesn't operate on standard Mahout matrix/vector objects, nor does it 
> implement a distributed version. I believe this implementation will be more 
> generally useful to the community than the specialized one in taste.
>
> This implementation solves the following types of systems:
> - Ax = b, where A is square, symmetric, and positive definite.
> - A'Ax = b, where A is arbitrary but A'A is positive definite. Solving this 
> system directly is more efficient than computing A'A explicitly and then 
> solving.
> - (A + lambda * I)x = b and (A'A + lambda * I)x = b, for systems where A or 
> A'A is singular or rank-deficient, as commonly happens when A is large and 
> sparse. Solving a system of this form arises, for example, in ridge 
> regression.
>
> In addition to the plain conjugate gradient solver, this implementation also 
> supports preconditioning and includes a sample Jacobi preconditioner as an 
> example. More work will be needed to build more advanced preconditioners 
> if desired.
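
Two of the pieces described above are simple enough to sketch in a few lines 
(plain Java arrays under my own assumptions, not the patch's actual API; the 
names normalEqTimes and applyJacobi are purely illustrative): solving 
A'Ax = b by computing A'(Ap) each iteration without ever forming A'A, and the 
Jacobi preconditioner, which amounts to an element-wise divide by the diagonal 
of A.

{code:java}
// Implicit normal-equations product: computes A'(A p) one factor at a time,
// so A'A is never materialized. For sparse A this is two cheap passes
// instead of one potentially dense product.
static double[] normalEqTimes(double[][] a, double[] p) {
  int m = a.length, n = p.length;
  double[] ap = new double[m];             // ap = A * p
  for (int i = 0; i < m; i++) {
    for (int j = 0; j < n; j++) { ap[i] += a[i][j] * p[j]; }
  }
  double[] atap = new double[n];           // atap = A' * ap
  for (int i = 0; i < m; i++) {
    for (int j = 0; j < n; j++) { atap[j] += a[i][j] * ap[i]; }
  }
  return atap;
}

// Jacobi preconditioner: M = diag(A), so applying M^-1 to the residual is
// just a divide by A's diagonal (positive whenever A is SPD).
static double[] applyJacobi(double[][] a, double[] r) {
  double[] z = new double[r.length];
  for (int i = 0; i < r.length; i++) { z[i] = r[i] / a[i][i]; }
  return z;
}
{code}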
