Thank you, Nathan.

On Wed, Feb 22, 2012 at 7:01 PM, Nathan Halko <nat...@spotinfluence.com> wrote:
> Hi Dmitriy,
>
>  Just a few comments:
>
> --the computed factors are approximate  A \approx U \Sigma V^{T}

Thanks, agreed.

>
> -- the projection steps seemed transposed to me but they are consistent
> throughout ie.
> (2)  \tilde{u} = \tilde{c}_{r} V \Sigma^{-1}

Yes, this is probably an earlier error, but the fold-in expressions in
section 3 should, I believe, be correct. I assume the convention is
that all vectors in equations have columnar orientation (i.e. x'x is
always an inner product and xx' is always an outer product). I will check it.
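
For illustration, a minimal NumPy sketch of the two orientations (not
Mahout code; the random matrix A and all variable names are assumptions
made up for this example). It folds an existing row of A back in, using
the row form from (2) and the equivalent column form
u = \Sigma^{-1} V^{T} c, and checks that both recover the matching row of U:

```python
import numpy as np

# Hypothetical data: 6 data points (rows) in 3 dimensions.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 3))

# Thin SVD: A = U diag(s) V'
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T
sigma_inv = np.diag(1.0 / s)

# Take the first row of A as an explicit column vector (3 x 1).
c_col = A[0].reshape(-1, 1)

# Columnar convention: x'x is an inner product, xx' is an outer product.
inner = (c_col.T @ c_col).item()   # scalar
outer = c_col @ c_col.T            # 3 x 3 matrix

# Fold-in, column form: u = Sigma^{-1} V' c   (3 x 1)
u_col = sigma_inv @ V.T @ c_col

# Fold-in, row form as in (2): u' = c' V Sigma^{-1}   (1 x 3)
u_row = c_col.T @ V @ sigma_inv

# The two forms are transposes of each other, and both reproduce U[0].
print(np.allclose(u_col.T, u_row), np.allclose(u_row.ravel(), U[0]))
```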

>
> p. 3:  transpose \xi to emphasize row vector
>
> - 'mean of all rows' is a bit misleading, \xi entries are the mean of each
> column  (column-wise mean as you state below)
>

Yeah, this keeps coming up. The mean of rows is the same thing as the
column mean. "Column mean" seems to sound more familiar to people, but
"mean of rows" seems more visual: if we have a bunch of data points in
multiple dimensions and compute their center (mean), we say "center of
the points", which in the PCA setting translates to "mean of rows".
But I think consensus is growing that we should always opt for
"column mean", or at least not mix the two, to prevent confusion.
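
A tiny NumPy sketch of why the two phrasings name the same vector (the
matrix below is made-up example data, and the variable names are
assumptions for this illustration only):

```python
import numpy as np

# Hypothetical data: 4 data points (rows) in 3 dimensions (columns).
A = np.array([[1.0, 2.0, 3.0],
              [3.0, 2.0, 1.0],
              [5.0, 6.0, 7.0],
              [7.0, 6.0, 5.0]])

# "Column mean": average the entries within each column.
xi_column_mean = A.mean(axis=0)

# "Mean of rows": add up the row vectors and divide by their count,
# i.e. the center of the cloud of points.
acc = np.zeros(A.shape[1])
for row in A:
    acc += row
xi_mean_of_rows = acc / A.shape[0]

# Both give one entry per column, and the entries agree.
print(np.allclose(xi_column_mean, xi_mean_of_rows))
```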


> - dimention -> dimension
>
> I haven't code dived into the new pca code to be familiar with it so the
> above comments are just picky notational stuff.  I did however, do some
> extensive analysis on the standard decomposition part (as of 0.6 SNAPSHOT)
> which can be found here


Yeah, I meant validation of the PCA approach. There seem to be somewhat
different ways to do it. Some people run an eigendecomposition on the
covariance matrix, which I guess would be scaled by 1/n. That
should be technically equivalent to running SVD on the mean-centered
data and then scaling the singular values by n^-0.5, but since nobody
really cares about singular values after PCA is done, it seems to be
moot. Also, it doesn't seem to affect the transformation equations in
any way.
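
The equivalence can be checked numerically with a small NumPy sketch
(random example data; the 1/n covariance scaling follows the paragraph
above, and all names are assumptions for this illustration, not the
Mahout implementation):

```python
import numpy as np

# Hypothetical data: n = 50 data points (rows) in 4 dimensions.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 4))
n = A.shape[0]

# Subtract the column mean so both routes work on centered data.
Ac = A - A.mean(axis=0)

# Route 1: eigendecomposition of the covariance matrix scaled by 1/n.
C = Ac.T @ Ac / n
eigvals, eigvecs = np.linalg.eigh(C)    # eigh returns ascending order
order = np.argsort(eigvals)[::-1]       # reorder to descending
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Route 2: SVD of the centered data.
U, s, Vt = np.linalg.svd(Ac, full_matrices=False)

# Square roots of the covariance eigenvalues equal the singular values
# scaled by n^-0.5.
scaled_s = s * n ** -0.5
print(np.allclose(np.sqrt(eigvals), scaled_s))

# The principal directions agree up to sign, so the projection itself
# is unaffected by the 1/n scaling.
print(np.allclose(np.abs(eigvecs.T @ Vt.T), np.eye(4), atol=1e-8))
```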

I was also not sure whether I could safely label the rows of U as the
original data points converted into PCA space (is there such a thing as
PCA space anyway? I think I saw this concept in some texts, but I am no
longer sure what was meant by it there).
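
One common reading, sketched below with NumPy under stated assumptions
(random example data, k = 2 retained components, hypothetical variable
names): projecting the centered data onto the leading right singular
vectors gives U \Sigma, so the rows of U are those same coordinates with
each component divided by its singular value.

```python
import numpy as np

# Hypothetical data: 20 data points (rows) in 5 dimensions.
rng = np.random.default_rng(1)
A = rng.normal(size=(20, 5))
Ac = A - A.mean(axis=0)              # subtract the column mean

U, s, Vt = np.linalg.svd(Ac, full_matrices=False)
k = 2                                # number of components kept (assumption)

# Coordinates of each data point along the first k principal directions.
projected = Ac @ Vt[:k].T            # shape (20, k)

# The same coordinates read off the SVD factors: U_k Sigma_k.
from_factors = U[:, :k] * s[:k]

# Rows of U_k are the projected points rescaled per component.
rescaled = projected / s[:k]

print(np.allclose(projected, from_factors), np.allclose(rescaled, U[:, :k]))
```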

>
> http://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf
> (starting page 139)

This is all cool stuff. I will read it as soon as I get a spare time
window. Great!

Once again, thank you for doing this.

-d
