I think that the solver actually does an SVD, but most of what you say
follows.

There is one oddity, I think: DistributedRowMatrix.times appears to be
doing a transposeTimes operation, not the normal times.
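
To make the difference concrete, here is a minimal pure-Python sketch of the
two products using the shapes from the example below. It is not Mahout code:
the helpers times, transpose_times and shape are made up for illustration, A
is filled with arbitrary numbers, and P is just the first 7 columns of an
identity matrix standing in for real eigenvectors.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def shape(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(m), len(m[0]))


def times(a, b):
    """Ordinary product a * b; columns of a must match rows of b."""
    if len(a[0]) != len(b):
        raise ValueError("shape mismatch: %s times %s" % (shape(a), shape(b)))
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]


def transpose_times(a, b):
    """Transposed product a' * b; rows of a must match rows of b."""
    return times(transpose(a), b)


# Assumed stand-ins: 15 samples with 39 terms, and 7 "eigenvector" columns.
A = [[float(i + j) for j in range(39)] for i in range(15)]
P = [[1.0 if i == j else 0.0 for j in range(7)] for i in range(39)]

projected = times(A, P)       # A P, the projection of the data
gram = transpose_times(A, A)  # A' A

print(shape(projected))  # (15, 7)
print(shape(gram))       # (39, 39)
```

With the normal product, A P comes out 15x7 as in Jeff's example. With the
transposed product the same operands do not even line up (A' is 39x15 and P
is 39x7), so which of the two a method called times really performs matters.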

Jake should comment.

On Thu, Sep 2, 2010 at 8:28 PM, Jeff Eastman <[email protected]> wrote:

>  On 9/2/10 7:41 PM, Jeff Eastman wrote:
>
>>  Hopefully answering my own question here but ending up with another. The
>> svd matrix I'd built from the eigenvectors is the wrong shape as I built it.
>> Taking Jake's "column space" literally and building a matrix where each of
>> the columns is one of the eigenvectors does give a matrix of the correct
>> shape. The math works with DenseMatrix, producing a new data matrix which is
>> 15x7; a significant dimensionality reduction from 15x39.
>>
>> In this example, with 15 samples having 39 terms and 7 eigenvectors:
>>    A = [15x39]
>>    P = [39x7]
>>    A P = [15x7]
>> <snip>
>>
> Representing the eigen decomposition math in the above notation, A P is the
> projection of the data set onto the eigenvector basis:
>
> If:
> A = original data matrix
> P = eigenvector column matrix
> D = eigenvalue diagonal matrix
>
> Then:
> A P = P D => A = P D P'
>
> Since we have A and P is already calculated by DistributedLanczosSolver it
> is easy to compute A P and we don't need the eigenvalues at all. This is
> good because the DLS does not output them. Is this why it doesn't bother?
>
>
