Also, don't forget that when you compute timesSquared() it results in two map-reduce phases, and in each of them the matrix A is fully read from disk. So I would not call it a single pass.
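For reference, the operation being discussed is y = A'(A*v), which timesSquared(Vector) computes without ever forming A'A. The following is only a conceptual single-node sketch in plain Java (dense double[][] arrays are an assumption for illustration; the real job runs over a distributed, row-partitioned matrix), showing how the product can be accumulated row by row:

// Conceptual sketch of the timesSquared(Vector) operation, i.e.
// y = A' * (A * v), accumulated row by row in one sweep over A's rows.
// Dense arrays are used here purely for illustration.
static double[] timesSquaredSketch(double[][] a, double[] v) {
  int numCols = v.length;
  double[] y = new double[numCols];
  for (double[] row : a) {
    // dot product of this row with v ...
    double d = 0.0;
    for (int j = 0; j < numCols; j++) {
      d += row[j] * v[j];
    }
    // ... then accumulate (row . v) * row into y
    for (int j = 0; j < numCols; j++) {
      y[j] += d * row[j];
    }
  }
  return y;
}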
On Mon, Dec 12, 2011 at 7:58 PM, Danny Bickson <[email protected]> wrote:

> K passes over the data - where in each pass you multiply once by A and
> once by A' - I call 2K passes over the data.
>
> On Mon, Dec 12, 2011 at 7:48 PM, Jake Mannix <[email protected]> wrote:
>
>> On Mon, Dec 12, 2011 at 9:10 AM, Danny Bickson <[email protected]> wrote:
>>
>> > I meant to write: twice in the case of a rectangular matrix.
>> > By the way, if you want to have the two side matrices, [U,D,V] = svd(A),
>> > you will need to run Lanczos twice: once with A and another time with A'.
>> > So the run time should be doubled.
>>
>> Neither of these statements is actually correct: for a rectangular matrix,
>> if you want the top K singular vectors and values, you will make K passes
>> over the data (not 2K), each one being an operation of (A'A)*v without ever
>> computing A'A itself. This operation, "timesSquared(Vector)", for a matrix
>> whose row i has d_i nonzero entries, will scale like sum_i(d_i^2), but it is
>> still only one pass over the data.
>>
>> Also, once you have run Lanczos over A and gotten the matrix V out, you can
>> recover U in O(1) map-reduce operations by use of the identity U = A * V * D^-1.
>>
>> -jake
>>
>> > On Mon, Dec 12, 2011 at 7:08 PM, Danny Bickson <[email protected]> wrote:
>> >
>> > > In each Lanczos iteration you multiply by the matrix A (in the case of a
>> > > square matrix), or twice, by the matrices A' and A. Multiplication is
>> > > linear in the number of non-zero entries.
>> > > See http://en.wikipedia.org/wiki/Lanczos_algorithm
>> > > Finally, there is a decomposition of a tridiagonal matrix T for extracting
>> > > the eigenvalues. I think it is also linear in the number of iterations
>> > > (since the size of T is the number of iterations + 1). Note that this code
>> > > is not distributed, since it can be done efficiently on a single node.
>> > > The last step is the Ritz transformation - a product of the intermediate
>> > > vectors v with the eigenvectors. This step may be heavy, since those
>> > > matrices are typically dense.
>> > >
>> > > Best,
>> > >
>> > > DB
>> > >
>> > > 2011/12/12 Fernando Fernández <[email protected]>
>> > >
>> > >> Hi all,
>> > >>
>> > >> This is a question for everybody, though it may be better answered by
>> > >> Jake Mannix. Do you guys know what the complexity of the Lanczos SVD
>> > >> algorithm implemented in Mahout is? Linear, quadratic, etc.?
>> > >>
>> > >> Thanks in advance!!
>> > >> Fernando.
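For completeness, here is a minimal single-node sketch of the U = A * V * D^-1 recovery Jake mentions above, again with plain dense Java arrays rather than the distributed map-reduce job (the method name and dense layout are illustrative assumptions):

// Single-node sketch of U = A * V * D^-1, where V holds the right singular
// vectors as columns and sigma holds the singular values (the diagonal of D).
// Dense double[][] arrays are an assumption for illustration only.
static double[][] recoverU(double[][] a, double[][] v, double[] sigma) {
  int m = a.length;        // rows of A
  int n = a[0].length;     // cols of A = rows of V
  int k = sigma.length;    // number of singular vectors kept
  double[][] u = new double[m][k];
  for (int i = 0; i < m; i++) {
    for (int c = 0; c < k; c++) {
      double s = 0.0;
      for (int j = 0; j < n; j++) {
        s += a[i][j] * v[j][c];   // (A * V)[i][c]
      }
      u[i][c] = s / sigma[c];     // scale by D^-1
    }
  }
  return u;
}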
