Thank you, Nathan.

On Wed, Feb 22, 2012 at 7:01 PM, Nathan Halko <nat...@spotinfluence.com> wrote:
> Hi Dmitriy,
>
> Just a few comments:
>
> -- the computed factors are approximate: A \approx U \Sigma V^{T}
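As a quick numerical illustration of that point (plain numpy, not the Mahout SSVD code; matrix sizes and names here are made up), a rank-k truncation reconstructs the input only approximately, which is exactly why the factorization is written with \approx:

```python
import numpy as np

rng = np.random.default_rng(0)
# Low-rank signal plus a little noise: the truncated factors should
# reconstruct it well, but not exactly, hence A \approx U Sigma V^T.
A = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 50))
A += 0.01 * rng.standard_normal((200, 50))

k = 10  # decomposition rank
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

# Relative Frobenius reconstruction error: small but nonzero.
rel_err = np.linalg.norm(A - A_k) / np.linalg.norm(A)
print(rel_err)
```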
Thanks, agreed.

> -- the projection steps seemed transposed to me but they are consistent
> throughout, i.e.
> (2) \tilde{u} = \tilde{c}_{r} V \Sigma^{-1}

Yes, this is probably an earlier error, but I believe the fold-in expressions in section 3 are correct. I assume the convention is that all vectors in the equations have columnar orientation (i.e. x'x is an inner product, xx' is always an outer product). I will check it.

> p. 3: transpose \xi to emphasize row vector
>
> - 'mean of all rows' is a bit misleading, \xi entries are the mean of each
> column (column-wise mean as you state below)

Yeah, this keeps coming up. The mean of all the rows is the same thing as the column-wise means. "Column mean" seems to sound more familiar to people, but "mean of rows" is more visual: if we have a bunch of data points in multiple dimensions and compute their 'center' (mean), we say "center of the points", which in the PCA setting translates to "mean of the rows". But I think consensus is growing that we should always opt for "column mean", or at least not mix the two, to prevent confusion.

> - dimention -> dimension
>
> I haven't code dived into the new pca code to be familiar with it so the
> above comments are just picky notational stuff. I did however, do some
> extensive analysis on the standard decomposition part (as of 0.6 SNAPSHOT)
> which can be found here

Yeah, I meant validation of the PCA approach. There seem to be somewhat different ways to do it. Some people run an eigendecomposition on the covariance matrix, which I guess would be adjusted by 1/n. That should be technically equivalent to running an SVD and then adjusting the singular values by n^{-0.5}; but since nobody really cares about the singular values after PCA is done, the difference seems moot. Also, it doesn't seem to affect the transformational equations in any way. I was also not sure whether I could safely label the rows of U as the original data points converted into PCA space (is there such a thing as "PCA space" anyway?
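A small numpy sketch (illustrative only, not the Mahout code; all names are made up) of the two claims above: eigendecomposition of the 1/n-adjusted covariance matrix agrees with an SVD whose singular values are scaled by n^{-0.5}, and the fold-in expression u = c_r V \Sigma^{-1} recovers the corresponding row of U for a row already in the (mean-centered) input:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 6))   # rows are data points
xi = A.mean(axis=0)                 # column-wise mean (= "mean of rows")
B = A - xi                          # mean-centered data
n = B.shape[0]

# Route 1: SVD of the centered matrix.
U, s, Vt = np.linalg.svd(B, full_matrices=False)

# Route 2: eigendecomposition of the 1/n-adjusted covariance matrix.
C = B.T @ B / n
evals, evecs = np.linalg.eigh(C)
evals = evals[::-1]                 # eigh is ascending; flip to descending

# Covariance eigenvalues are the squares of the n^{-0.5}-scaled
# singular values, so the two routes agree up to that scaling.
assert np.allclose(np.sqrt(evals), s / np.sqrt(n))

# Fold-in: for a centered row c_r, u = c_r V Sigma^{-1} reproduces
# the corresponding row of U.
c_r = B[0]
u = c_r @ Vt.T @ np.diag(1.0 / s)
assert np.allclose(u, U[0])
```

Since the scaling only touches the singular values, the projection/fold-in equations are indeed unaffected, as the text says.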
I saw this concept in some texts, I think, but I am no longer sure what was meant by it back there).

> http://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf
> (starting page 139)

This is all cool stuff. I will read it as soon as I get a spare time window.

Great! Once again, thank you for doing this.

-d