On Mon, Dec 12, 2011 at 10:00 AM, Danny Bickson <[email protected]> wrote:

> Also don't forget that when you compute timesSquared() it results in two
> map-reduce phases, where in each of them the matrix A is fully read from
> disk. So I would not call it a single pass.

This is not true. timesSquared() reads the data in *one* map-reduce pass.

> On Mon, Dec 12, 2011 at 7:58 PM, Danny Bickson <[email protected]> wrote:
>
> > K passes over the data - where in each pass you multiply once by A and
> > once by A' - I call 2K passes over the data.
> >
> > On Mon, Dec 12, 2011 at 7:48 PM, Jake Mannix <[email protected]> wrote:
> >
> >> On Mon, Dec 12, 2011 at 9:10 AM, Danny Bickson <[email protected]> wrote:
> >>
> >> > I meant to write: twice in the case of a rectangular matrix.
> >> > By the way, if you want both side matrices, [U,D,V] = svd(A),
> >> > you will need to run Lanczos twice: once with A and another time
> >> > with A'. So the run time should be doubled.
> >>
> >> Neither of these statements is actually correct: for a rectangular
> >> matrix, if you want the top K singular vectors and values, you will
> >> make K passes over the data (not 2K), each one being an operation of
> >> (A'A)*v without ever computing A'A itself. This operation,
> >> timesSquared(Vector), for a matrix with row i having d_i nonzero
> >> entries, will scale like sum_i(d_i^2), but is still only one pass
> >> over the data.
> >>
> >> Also, once you have run Lanczos over A and gotten the matrix V out,
> >> you can recover U in O(1) map-reduce operations by use of the
> >> identity: U = A * V * D^-1
> >>
> >>   -jake
> >>
> >> > On Mon, Dec 12, 2011 at 7:08 PM, Danny Bickson <[email protected]> wrote:
> >> >
> >> > > In each Lanczos iteration you multiply by the matrix A (in the case
> >> > > of a square matrix), or twice, by the matrices A' and A.
> >> > > Multiplication is linear in the number of nonzero entries.
> >> > > See http://en.wikipedia.org/wiki/Lanczos_algorithm
> >> > > Finally, there is a decomposition of a tridiagonal matrix T for
> >> > > extracting the eigenvalues. I think it is also linear in the number
> >> > > of iterations (since the size of T is the number of iterations + 1).
> >> > > Note that this code is not distributed, since it can be done
> >> > > efficiently on a single node.
> >> > > The last step is the Ritz transformation - a product of the
> >> > > intermediate vectors v with the eigenvectors. This step may be
> >> > > heavy, since those matrices are typically dense.
> >> > >
> >> > > Best,
> >> > >
> >> > > DB
> >> > >
> >> > > 2011/12/12 Fernando Fernández <[email protected]>
> >> > >
> >> > >> Hi all,
> >> > >>
> >> > >> This is a question for everybody, though it may be better answered
> >> > >> by Jake Mannix. Do you know what the complexity of the Lanczos SVD
> >> > >> algorithm implemented in Mahout is? Linear, quadratic, etc.?
> >> > >>
> >> > >> Thanks in advance!!
> >> > >> Fernando.
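[Editor's note] Jake's one-pass point is easy to make concrete. The sketch below, in NumPy rather than Mahout's Java, shows the arithmetic that timesSquared(Vector) performs: each row r of A contributes r * (r . v) to the accumulator, so (A'A)v is obtained in a single sweep over the rows without ever forming A'A. The map-reduce plumbing and Mahout's Vector types are omitted; this is an illustration of the math, not of Mahout's actual API.

```python
import numpy as np

def times_squared(rows, v):
    """Compute (A' A) v in one pass over the rows of A,
    without materializing A' A itself."""
    result = np.zeros_like(v, dtype=float)
    for r in rows:                  # single sweep over the data
        result += r * np.dot(r, v)  # row's contribution: r * (r . v)
    return result

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 0.0]])
v = np.array([1.0, 1.0])

# Agrees with forming A' A explicitly; both give [49., 34.]
print(times_squared(A, v))
print(A.T @ A @ v)
```

In the distributed version, each mapper holds v and emits its rows' contributions, and the reducer sums them, which is why one Lanczos iteration on a rectangular matrix still costs only one pass over A.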
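[Editor's note] The identity U = A * V * D^-1 that Jake uses to recover the left singular vectors can be checked in a few lines. Here NumPy's own SVD stands in for the output of Lanczos, and the dense matrix products stand in for the O(1) map-reduce jobs; it is a sketch of the identity, not Mahout code.

```python
import numpy as np

def recover_u(A, V, sigma):
    """Recover the left singular vectors via U = A * V * D^-1:
    divide column j of A @ V by the singular value sigma_j."""
    return (A @ V) / sigma

A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])

# Reference decomposition (standing in for Lanczos output V, sigma).
U_ref, sigma, Vt = np.linalg.svd(A, full_matrices=False)
U = recover_u(A, Vt.T, sigma)

print(np.allclose(U, U_ref))            # matches the reference U
print(np.allclose(U.T @ U, np.eye(2)))  # columns are orthonormal
```

The identity follows directly from A = U D V': multiplying both sides on the right by V gives A V = U D, and dividing out the singular values yields U.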
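[Editor's note] Danny's description of the serial tail of the algorithm - the small tridiagonal matrix T (size = number of iterations), its cheap single-node eigen-decomposition, and the final Ritz transformation - can be sketched as plain single-node Lanczos. This omits the reorthogonalization and all distribution that a production solver needs, and uses NumPy rather than Mahout's Java, so it only illustrates the structure of the computation:

```python
import numpy as np

def lanczos(matvec, n, k, seed=0):
    """Plain Lanczos sketch (no reorthogonalization): k iterations,
    each needing exactly one multiply by the symmetric operator."""
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    Q, alphas, betas = [q], [], []
    beta, q_prev = 0.0, np.zeros(n)
    for j in range(k):
        w = matvec(q) - beta * q_prev   # the one pass over the data
        alpha = np.dot(w, q)
        w -= alpha * q
        alphas.append(alpha)
        if j < k - 1:
            beta = np.linalg.norm(w)
            betas.append(beta)
            q_prev, q = q, w / beta
            Q.append(q)
    # Small k x k tridiagonal T: eigen-decomposed cheaply on one node.
    T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    theta, S = np.linalg.eigh(T)
    # Ritz transformation: combine the Lanczos basis vectors with the
    # eigenvectors of T to approximate eigenvectors of the operator.
    ritz_vectors = np.column_stack(Q) @ S
    return theta, ritz_vectors

# With k = n and a symmetric matrix, the Ritz values match the spectrum.
M = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
theta, _ = lanczos(lambda v: M @ v, 3, 3)
print(np.sort(theta))
print(np.linalg.eigvalsh(M))
```

For SVD of a rectangular A, the `matvec` here is exactly the timesSquared operation discussed above, applied to the implicit symmetric operator A'A.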
