Re: Mahout Lanczos SVD complexity

Dmitriy Lyubimov Mon, 12 Dec 2011 11:49:26 -0800

without computing it is it is 2 jobs, one of them is map-only.

U+V computations run in separate jobs but in parallel so sequentially
it is +1 more step.



With power iterations it is + 2 more for each new power iteration.
Power iterations seem to be expensive in cases when A is very sparse
(size of (A) ~= size of (B) then power iteration essentially is
equivalent to computing AA' although i believe i manage to do it a
little bit more efficient here with MAHOUT-922 then
DRM.timesSquaired(A) would do).

If A is dense, then power iterations make much more sense and not that
expensive.



On Mon, Dec 12, 2011 at 11:07 AM, Ted Dunning <[email protected]> wrote:
> For reference, the SSVD runs in a fixed number of map-reduce phases
> (Dmitriy can say exactly how many, but it is on the order of 3-4 without
> computing U or V and without power iterations).  I think that the cost of
> each map-reduce is roughly O(N d^2).
>
> On Mon, Dec 12, 2011 at 11:57 AM, Jake Mannix <[email protected]> wrote:
>
>> On Mon, Dec 12, 2011 at 10:46 AM, Danny Bickson <[email protected]
>> >wrote:
>> >
>> >
>> > Regarding the Ritz transformation - it depends on the sparsity of A and
>> the
>> > number of iterations whether the final product for computing U should be
>> > done using your technique of
>> > U ~= A V D^-1 or not.
>> >
>>
>> Almost everything in DistributedRowMatrix assumes that matrices are sparse.
>> If they are dense, many of these operations will blow up with OOM in
>> unexpected
>> places if the dimensions are at all large, but I don't know, I don't ever
>> run on
>> completely dense matrices.
>>
>> Mahout SVD is optimized for input matrix being bounded in numCols, small in
>> truncated rank, and sparse.  numRows can be effectively unbounded, given
>> enough hardware.  But numCols * truncatedRank must < RAM of the launching
>> JVM (not mappers/reducers).
>>
>>  -jake
>>
>>
>> > On Mon, Dec 12, 2011 at 8:08 PM, Jake Mannix <[email protected]>
>> > wrote:
>> >
>> > > For reference, look in DistributedRowMatrix#timesSquared(Vector), which
>> > is
>> > > contained on lines 227-247.  JobClient.runJob() is called only one
>> time,
>> > > running
>> > > the TimesSquaredJob (a single map-reduce job).'
>> > >
>> > >  -jake
>> > >
>> > > On Mon, Dec 12, 2011 at 9:58 AM, Danny Bickson <
>> [email protected]
>> > > >wrote:
>> > >
>> > > > K passes over the data - where in each pass you multiply once by A
>> and
>> > > once
>> > > > by A' I call 2K passes over the data.
>> > > >
>> > > > On Mon, Dec 12, 2011 at 7:48 PM, Jake Mannix <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > On Mon, Dec 12, 2011 at 9:10 AM, Danny Bickson <
>> > > [email protected]
>> > > > > >wrote:
>> > > > >
>> > > > > > I meant to write: twice in case of a rectangular matrix.
>> > > > > > By the way, if you want to have the two sides matrices
>> > [U,D,V]=svd(A)
>> > > > > > You will need to run Lanczos twice: once with A and another time
>> > with
>> > > > A'.
>> > > > > > So run time should be doubled.
>> > > > > >
>> > > > >
>> > > > > Neither of these statements are actually correct: for a rectangular
>> > > > matrix,
>> > > > > if you want the
>> > > > > top K singular vectors and values, you will make K passes over the
>> > data
>> > > > > (not 2K) each
>> > > > > one being an operation of (A'A)*v without ever computing A'A
>> itself.
>> > > >  This
>> > > > > operation
>> > > > > "timesSquared(Vector)", for a matrix with row i having d_i nonzero
>> > > > entries,
>> > > > > will scale like
>> > > > > sum_i(d_i^2), but still only one pass over the data.
>> > > > >
>> > > > > Also, once you have run Lanczos over A, and gotten the matrix V
>> out,
>> > > you
>> > > > > can recover
>> > > > > U in O(1) map-reduce operations, by use of the identity: U = A * V
>> *
>> > > D^-1
>> > > > >
>> > > > >  -jake
>> > > > >
>> > > > >
>> > > > > >
>> > > > > > On Mon, Dec 12, 2011 at 7:08 PM, Danny Bickson <
>> > > > [email protected]
>> > > > > > >wrote:
>> > > > > >
>> > > > > > > In each Lanczos iteration you multiply by the matrix A (in case
>> > of
>> > > a
>> > > > > > > square matrix)
>> > > > > > > or twice, by the matrix A' and A. Multiplication is linear in
>> the
>> > > > > number
>> > > > > > > of non zero edges.
>> > > > > > > See http://en.wikipedia.org/wiki/Lanczos_algorithm
>> > > > > > > Finally a decomposition of a tridiagonal matrix T for
>> extracting
>> > > the
>> > > > > > > eigenvalues.
>> > > > > > > I think it is also linear in the number of iterations (since
>> the
>> > > size
>> > > > > of
>> > > > > > T
>> > > > > > > is number of iterations+1). Note that this code is not
>> > distributed
>> > > > > since
>> > > > > > it
>> > > > > > > can be efficiently done on a single node.
>> > > > > > > The last step is the Ritz transformation - a product of the
>> > > > > intermediate
>> > > > > > > vectors v
>> > > > > > > with the eigenvectors. This step may be heavy since those
>> > matrices
>> > > > are
>> > > > > > > typically dense.
>> > > > > > >
>> > > > > > > Best,
>> > > > > > >
>> > > > > > > DB
>> > > > > > >
>> > > > > > >
>> > > > > > > 2011/12/12 Fernando Fernández <
>> > > [email protected]
>> > > > >
>> > > > > > >
>> > > > > > >> Hi all,
>> > > > > > >>
>> > > > > > >> This is a question for everybody, though it may be better
>> > answered
>> > > > by
>> > > > > > Jake
>> > > > > > >> Mannix. Do you guys know what is the complexity of the
>> algorithm
>> > > > > > >> implemented in mahout for Lancos SVD? Linear, quadratic, etc..
>> > > > > > >>
>> > > > > > >>
>> > > > > > >> Thanks in advance!!
>> > > > > > >> Fernando.
>> > > > > > >>
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>

Re: Mahout Lanczos SVD complexity

Reply via email to