Radim, I looked at the discussion that you referenced and have a few reservations about your conclusions.
You are correct that the randomized projection methods are of limited accuracy when the singular values are not decreasing rapidly, but I think that you were asking for too many singular values (500) from a relatively small matrix for the results to make much sense. My argument for saying this is that the singular values at this point have become quite small and are likely to be modeling the random variation in your corpus rather than the systematic patterns of occurrence. As such, the "accuracy" of the decomposition is of little interest.

It may well be that using a high number of singular values increases apparent accuracy in your application over using a smaller number, but it is likely that augmenting a small decomposition with random vectors instead of true singular vectors would provide virtually identical performance. If so, whether these extra singular vectors are computed more or less accurately is hardly of the essence. In order to determine whether this is the case, the interesting comparison to make is whether you get higher application accuracy on held-out data from, say, 100 singular vectors + 400 random vectors or from your 500 singular vectors. It is common with LSI-like applications to test 100 singular vectors against 500 singular vectors, but that does not really measure how much information is being extracted by the singular value decomposition. (A rough sketch of this comparison appears below the quoted message.)

Regarding the accuracy of the decomposition, your choice of k is well below the bound suggested by the theoretical considerations in the Halko et al. paper. For small matrices, with sizes measured in the thousands or tens of thousands, and very tight desired accuracies, that bound is not all that helpful, and it isn't surprising that you may have some problems.

In any case, I think that your experience is divergent enough from the applications being considered in Mahout that conclusions should be drawn with some caution.

On Sat, Dec 17, 2011 at 10:00 PM, Radim Rehurek <[email protected]> wrote:

> Ted Dunning wrote:
> > Also, it isn't entirely clear yet whether power iterations are more
> > efficient than simply increasing the fudge factor p. Power iterations are
> > very effective, and increasing p increases costs in the cube, but running
> > MR passes is expensive enough that some increase in p might be sufficient
> > and still faster than a power iteration.
>
> Don't even think about it -- on large matrices (significant truncation),
> the randomized SVD without power iterations is complete rubbish, regardless
> of any sane value of `p`.
>
> (assuming your fudge factor p = oversampling parameter `p` from Halko et
> al.).
>
> You can see some accuracy experiments (wikipedia, images) here:
> http://groups.google.com/group/gensim/msg/1f3106b85837ce9c and here
> http://groups.google.com/group/gensim/msg/240a348b70f18b30 .
>
> Also see the end of that (longish) thread for some notes on accuracy in
> face of many power iterations (~ beware of numeric overflows).
>
> Best,
> Radim
>
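As a minimal sketch of the comparison suggested above, assuming a terms x documents matrix `X`: build one projection from 500 computed singular vectors and another from 100 singular vectors padded out to 500 with random orthonormal directions, then score both on held-out data. The names `X`, `X_heldout`, and `application_accuracy` are placeholders for your own data and evaluation metric, not anything from Mahout or gensim.

    import numpy as np
    from scipy.sparse.linalg import svds

    def augmented_basis(X, k_true, k_total, seed=0):
        """Basis of k_total columns: k_true left singular vectors of X plus
        (k_total - k_true) random directions, orthonormalized together."""
        U, _, _ = svds(X, k=k_true)              # truncated SVD; U is terms x k_true
        rng = np.random.default_rng(seed)
        R = rng.standard_normal((X.shape[0], k_total - k_true))
        Q, _ = np.linalg.qr(np.hstack([U, R]))   # orthonormalize [U | R]
        return Q

    # 500 "true" singular vectors vs. 100 true + 400 random
    U500, _, _ = svds(X, k=500)                  # X: terms x documents (placeholder)
    basis_mixed = augmented_basis(X, k_true=100, k_total=500)

    # Project held-out documents into both spaces and score them with the
    # application's own metric (retrieval precision, classification, ...);
    # application_accuracy is a placeholder for that metric.
    proj_full  = U500.T @ X_heldout
    proj_mixed = basis_mixed.T @ X_heldout
    print(application_accuracy(proj_full), application_accuracy(proj_mixed))

If the two scores are essentially the same, the extra 400 singular vectors are not adding information that the application can use, and their numerical accuracy is beside the point.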

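For reference, a minimal sketch of the randomized SVD under discussion, following the structure in Halko et al., with both knobs exposed: the oversampling parameter `p` and the number of power iterations `q`. This is an illustration only, not the Mahout or gensim implementation; the QR re-orthonormalization between power-iteration steps is the usual guard against the numeric overflow mentioned at the end of the quoted thread.

    import numpy as np

    def randomized_svd(A, k, p=10, q=0, seed=0):
        """Rank-k SVD of A via a randomized range finder with oversampling p
        and q power iterations."""
        rng = np.random.default_rng(seed)
        m, n = A.shape
        Y = A @ rng.standard_normal((n, k + p))   # sample the range of A
        Q, _ = np.linalg.qr(Y)
        for _ in range(q):                        # power iterations sharpen the
            Q, _ = np.linalg.qr(A.T @ Q)          # estimate when singular values
            Q, _ = np.linalg.qr(A @ Q)            # decay slowly
        B = Q.T @ A                               # small (k+p) x n matrix
        Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
        return (Q @ Ub)[:, :k], s[:k], Vt[:k, :]

The trade-off in the quoted exchange is visible in the shape of the computation: each power iteration needs two more passes over A (extra MR passes in Mahout's setting), while increasing p only widens the sample block, at a cost that, as noted above, grows much faster than linearly in k + p.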