On Tue, Jul 31, 2012 at 6:00 PM, Aniruddha Basak <[email protected]> wrote:
> Hi,
>
> I am working on Spectral KMeans, which involves an eigen-decomposition
> step using Lanczos. As I did not get the results I expected, I tried to
> understand the implementation.
>
> I have one question about "LanczosSolver.java":
>
> In the "solve" method, while building the tridiagonal matrix, there is a
> step just after the multiplication job (performed on Hadoop as
> TimesSquaredJob):
>
> ------------
> nextVector.assign(new Scale(1.0 / state.getScaleFactor()));
> ------------
>
> I could not understand why this scaling is performed?

It's to help with floating-point overflow issues in naive Lanczos
implementations: each time you multiply the basis vectors by the matrix^2,
they grow roughly like the largest singular value of the matrix (squared).
Do this fifty or sixty times and you blow past 10^308 and you're screwed.
Scaling the basis vectors down after each iteration is equivalent to
uniformly normalizing the input matrix to have smaller singular values, and
it turns floating-point overflow problems into underflow problems.
Thankfully, when you have entries in your basis vectors which are close to
zero, it's ok to have them be *actually* zero -- unless you do it to all of
them, naturally. Either way, uniform scaling doesn't change the output
singular vectors, so the result is unchanged as long as you don't run into
floating-point precision issues.

> (When I compared the results on a small matrix to an equivalent Matlab
> script, I found the results are exactly the same WITHOUT this scaling.
> Including this scaling makes the results different from the Matlab
> results.)
>
> Thanks,
> Aniruddha

  -- -jake
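To make the overflow argument concrete, here is a minimal sketch (plain
Python, not Mahout code; the matrix, the max-abs scale factor, and the
helper names are all illustrative stand-ins for what state.getScaleFactor()
does). It runs a power-iteration-style loop with and without per-step
scaling and checks that both converge to the same direction, since uniform
scaling only controls the magnitude of the iterate, not the eigenvector it
points at:

```python
# Hypothetical illustration: rescaling each iterate prevents the norm from
# growing like (largest singular value)^k, while leaving the direction --
# the eigenvector estimate -- unchanged.

def mat_vec(A, v):
    # Plain dense matrix-vector product.
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def rescale(v):
    # Divide by the max-abs entry, playing the role of the scale factor
    # that LanczosSolver divides out after each multiplication job.
    s = max(abs(x) for x in v)
    return [x / s for x in v]

A = [[3.0, 1.0],
     [1.0, 2.0]]  # symmetric; dominant eigenvalue (5 + sqrt(5))/2 ~ 3.618

v_scaled = [1.0, 0.0]
v_raw = [1.0, 0.0]
for _ in range(60):
    v_scaled = rescale(mat_vec(A, v_scaled))  # stays bounded forever
    v_raw = mat_vec(A, v_raw)                 # norm grows ~3.618x per step

# Same direction either way; with a larger dominant eigenvalue the raw
# iterate would overflow to inf long before 60 steps.
print([round(x, 6) for x in v_scaled])
print([round(x, 6) for x in rescale(v_raw)])
```

With a dominant eigenvalue of only ~3.618 the unscaled iterate merely
reaches ~10^33 here; scale the matrix up and it hits inf, which is exactly
the failure mode the scaling step guards against.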
