OK. I guess we'll have to see how it plays out at scale. The current version
does computation on Q blocks that have to be k+p wide. With the Hadoop default
child process setting, which I think is -Xmx200M, and the constraint m >= n
for a Q block, that puts the upper limit on k+p in the area of ~1.4K for
completely square dense Q blocks, other expenses notwithstanding. I'd guess
that's certainly going to be enough for my personal purposes :-). I'll wait
for somebody to correct me on that with respect to Mahout's goals.
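As a back-of-envelope check on that ~1.4K figure, here's a tiny sketch of the heap arithmetic. The 8-byte double size is standard; the working-copy overhead factor is purely my assumption (a factor of around 12 happens to land near 1.4K, but the actual memory expenses in the job are not spelled out in the thread):

```python
# Back-of-envelope bound on k+p for a completely square dense Q block
# that must fit in the default mapper heap (-Xmx200M).
# overhead_factor = assumed number of double-sized working copies the
# computation effectively needs; this factor is a guess, not measured.
import math

def max_square_block_side(heap_bytes, bytes_per_double=8, overhead_factor=12):
    """Largest n such that overhead_factor * n*n doubles fit in heap_bytes."""
    return int(math.sqrt(heap_bytes / (bytes_per_double * overhead_factor)))

print(max_square_block_side(200 * 1024 * 1024))  # roughly 1.4K with these assumptions
```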

On Thu, Nov 18, 2010 at 11:41 AM, Ted Dunning <[email protected]> wrote:

> There is an ironic tension with these.  Using the power iterations is
> generally bad numerically, but having a small p is much worse for
> accuracy.  That means that factoring (A' A)^q A will get much more
> accurate values for the same value of p.  Alternately phrased, getting
> the same accuracy would require a much larger value of p and thus would
> overcome the cost of the initial power iteration.
>
> How this works out in practice on truly massive scale is totally up in
> the air.  The result of the stochastic projection can actually be
> *larger* than the original sparse matrix, which would seem to imply
> that the power method might actually save time sometimes.
>
> On Thu, Nov 18, 2010 at 11:07 AM, Dmitriy Lyubimov <[email protected]> wrote:
>
> > Further work on this may include implementation of power iterations
> > (although I doubt there's much to be had from them on such big volumes).
> >
>
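For reference, the power-iteration variant Ted is describing can be sketched on a single machine in a few lines of NumPy, in the spirit of the usual randomized range finder. This is an illustrative sketch, not the Mahout code; the function name and the tiny demo matrix are mine, and only q, k, and p come from the thread:

```python
# Rough single-machine sketch of randomized range finding with q power
# iterations: project (A A')^q A onto a random k+p wide test matrix,
# then orthonormalize. Not the distributed Mahout implementation.
import numpy as np

def randomized_range(A, k, p, q=0, seed=0):
    """Return an orthonormal Q block approximating the range of A.

    Each power iteration multiplies by A A', which sharpens the spectrum
    so that a smaller oversampling p gives the same accuracy.
    """
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((A.shape[1], k + p))  # random test matrix
    y = A @ omega
    for _ in range(q):
        y = A @ (A.T @ y)  # one power iteration step
    q_block, _ = np.linalg.qr(y)  # orthonormalize; Q block is k+p wide
    return q_block

# Tiny demo on a rank-1 matrix: the projection should capture A exactly.
A = np.outer(np.arange(1, 6.0), np.arange(1, 9.0))  # 5x8, rank 1
Q = randomized_range(A, k=1, p=2, q=1)
err = np.linalg.norm(A - Q @ (Q.T @ A))
```

(A production version would re-orthonormalize between power iteration steps to avoid the numerical trouble Ted mentions; this sketch skips that.)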
