PageRank and RandomWalkWithRestart also rely on the problematic iterative matrix-vector multiplications...

On 24.10.2011 at 20:32, "Dmitriy Lyubimov (Issue Comment Edited) (JIRA)" <j...@apache.org> wrote:
> [ https://issues.apache.org/jira/browse/MAHOUT-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134327#comment-13134327 ]
>
> Dmitriy Lyubimov edited comment on MAHOUT-672 at 10/24/11 6:28 PM:
> -------------------------------------------------------------------
>
> bq. Basically, at the core of the CG solver is a matrix/vector multiply, so
> we get the map/reduce implementation by using a DistributedRowMatrix instead
> of an in-memory matrix. Since we do one matrix/vector multiply per
> iteration, it will require one map/reduce job per iteration, which somewhat
> limits its performance -- there's a large range of data sizes that could
> benefit from distributed computation but that get bogged down by Hadoop's
> slow job setup/teardown. Essentially, we're looking at many of the same
> tradeoffs we have with the distributed Lanczos decomposition stuff.
>
> Thank you, Jonathan.
>
> Yeah, so I figured; that's my concern. That's the Achilles' heel of much of
> the distributed stuff in Mahout. I.e., the number of iterations (I feel)
> must be very close to O(1), otherwise it severely affects things. Even
> using side information seems less painful compared to iteration growth.
> That severely decreases practical use.
>
> My thinking is that we need to keep the algorithms we recommend accountable
> to some standard. My understanding is that there's a similar problem with
> the ALS-WR implementation right now. I.e., we can have it in the codebase,
> but Sebastien stops short of recommending it to folks on the list.
>
> That's kind of the problem I touched on recently: Mahout stuff is different
> in the sense that it requires deeper investigation of the best
> parallelization strategy than a just-throw-it-at-it approach.
> Even with matrix multiplications there are notions that decrease
> computation time tenfold compared to the DRM approach, for some cases that
> are less than general but in practice surprisingly common (examples of
> such are the multiplication steps in SSVD). Hadoop sorting is not as
> inexpensive as its pitch may suggest. And tenfold is sort of
> interesting... it's the difference between 10 hours and 1 hour.
>
> Anyway, I am OK with committing this with @Maturity(Experimental) until we
> confirm running time on some input. I will probably even have to check
> this out soon; I may have a use case for it soon.
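Jonathan's point about one matrix/vector multiply per iteration is visible in the CG recurrence itself. Below is a minimal in-memory NumPy sketch (illustrative only, not the patch's API or Mahout code): the matrix is touched exactly once per loop turn, and that single product is what becomes one map/reduce pass over a DistributedRowMatrix in the distributed version.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve Ax = b for a symmetric positive definite A.

    The only use of A is the product A @ p below -- one matrix/vector
    multiply per iteration, which is why the distributed version costs
    one map/reduce job per iteration.
    """
    x = np.zeros_like(b)
    r = b.copy()                        # residual b - Ax; with x = 0 it is just b
    p = r.copy()                        # initial search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p                      # the single mat-vec of this iteration
        alpha = rs_old / (p @ Ap)       # step length along the search direction
        x += alpha * p
        r -= alpha * Ap                 # update residual without a second mat-vec
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p   # next A-conjugate search direction
        rs_old = rs_new
    return x

# Small SPD test system: M'M + I is symmetric positive definite.
M = np.random.default_rng(0).standard_normal((5, 5))
A = M.T @ M + np.eye(5)
b = np.ones(5)
x = conjugate_gradient(A, b)
```

In exact arithmetic CG converges in at most n iterations for an n-by-n system, but the iteration count a real solver needs depends on conditioning, which is exactly why an O(iterations) number of Hadoop jobs is the concern here.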
> > Implementation of Conjugate Gradient for solving large linear systems
> > ---------------------------------------------------------------------
> >
> >                 Key: MAHOUT-672
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-672
> >             Project: Mahout
> >          Issue Type: New Feature
> >          Components: Math
> >    Affects Versions: 0.5
> >            Reporter: Jonathan Traupman
> >            Priority: Minor
> >             Fix For: 0.6
> >
> >         Attachments: 0001-MAHOUT-672-LSMR-iterative-linear-solver.patch,
> > 0001-MAHOUT-672-LSMR-iterative-linear-solver.patch, MAHOUT-672.patch,
> > mahout-672-111023.patch, mahout-672.patch
> >
> >   Original Estimate: 48h
> >  Remaining Estimate: 48h
> >
> > This patch contains an implementation of conjugate gradient, an iterative
> > algorithm for solving large linear systems. In particular, it is well
> > suited for large sparse systems where a traditional QR or Cholesky
> > decomposition is infeasible. Conjugate gradient only works for matrices
> > that are square, symmetric, and positive definite (basically the same
> > types where Cholesky decomposition is applicable). Systems like these
> > commonly occur in statistics and machine learning problems (e.g.
> > regression).
> > Both a standard (in-memory) solver and a distributed Hadoop-based solver
> > (basically the standard solver run using a DistributedRowMatrix, a la
> > DistributedLanczosSolver) are included.
> >
> > There is already a version of this algorithm in the taste package, but it
> > doesn't operate on standard Mahout matrix/vector objects, nor does it
> > implement a distributed version. I believe this implementation will be
> > more generically useful to the community than the specialized one in
> > taste.
> >
> > This implementation solves the following types of systems:
> >
> > Ax = b, where A is square, symmetric, and positive definite.
> >
> > A'Ax = b, where A is arbitrary but A'A is positive definite. Directly
> > solving this system is more efficient than computing A'A explicitly and
> > then solving.
> >
> > (A + lambda * I)x = b and (A'A + lambda * I)x = b, for systems where A or
> > A'A is singular and/or not full rank. This occurs commonly if A is large
> > and sparse. Solving a system of this form is used, for example, in ridge
> > regression.
> >
> > In addition to the normal conjugate gradient solver, this implementation
> > also handles preconditioning, and a sample Jacobi preconditioner is
> > included as an example. More work will be needed to build more advanced
> > preconditioners if desired.
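The "solve A'Ax = b without computing A'A" trick the issue describes amounts to applying the operator as two multiplies, p -> A'(Ap), inside the same CG loop. A rough NumPy sketch of that idea under illustrative names (not the patch's actual API), including the ridge-regression variant (A'A + lambda*I)x = b:

```python
import numpy as np

def cg_normal_equations(A, b, lam=0.0, tol=1e-10, max_iter=1000):
    """Solve (A'A + lam*I) x = b without ever forming A'A.

    A may be rectangular; A'A is symmetric positive (semi-)definite,
    and lam > 0 keeps the system well-posed even when A is rank
    deficient -- the ridge regression case mentioned in the issue.
    """
    def apply(p):
        # Two sparse-friendly mat-vecs replace one multiply by the
        # (potentially dense) explicit product A'A.
        return A.T @ (A @ p) + lam * p

    x = np.zeros(A.shape[1])
    r = b.copy()                        # residual; apply(0) = 0, so r = b
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = apply(p)
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Tall matrix with full column rank: A'A + lam*I is SPD, so CG applies.
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 4))
b = rng.standard_normal(4)
x = cg_normal_equations(A, b, lam=0.1)
```

This also shows why the implicit form is cheaper for sparse A: A'A of a sparse matrix can be much denser than A itself, while the two-multiply form only ever streams over A's rows, which maps naturally onto a DistributedRowMatrix.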