On Wed, 2009-04-01 at 18:21 +0200, Edzer Pebesma wrote:
> Markus, a few notes:
>
> - if you do PCA on uncentered data, by computing the eigenvalues of the
> uncentered covariance matrix, this implies that bands with a larger mean
> will get more influence on the final PCs. I have so far not managed to
> find an argument why this would be desirable.
> - if you do PCA on (band-mean)/sd(band), it means that you first
> normalize (scale) each variable to mean zero and unit variance. This
> procedure is identical to doing PCA on the correlation matrix. It means
> that, unlike for unscaled variables, variables with larger variance will
> not get more influence on the PCA than others. For image analysis I can
> see a place for both; if bands with low variance indicate insignificant
> and perhaps noisy information, you may downweight them. Or not, if they
> contain (equally) important information. Scaling becomes urgent when you
> compute PCs from a mix of things with incomparable units, such as image
> bands and DTMs.
> - Only in the case of normalized variables, or equivalently PCA on
> correlations, does it make sense to select PCs with an eigenvalue larger
> than 1. The reasoning is fairly weak, but goes like this: if a PC has
> eigenvalue > 1, it explains more variance than any of the original
> variables, which all have variance 1.
>
> Maybe I should Cc: this to the wiki.
> --
> Edzer
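The three points above can be checked numerically. What follows is only a minimal NumPy sketch, not anything taken from GRASS: the three synthetic "bands", their means and variances, the random seed and all variable names are made up for illustration. It compares the leading axis of the uncentered and centered matrices, confirms that the covariance of the scaled bands equals the correlation matrix, and applies the eigenvalue > 1 rule to the correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # hypothetical number of pixels

# Synthetic bands (assumed values): band 0 has a huge mean but almost no
# variance; bands 1 and 2 share a common signal and vary much more.
base = rng.normal(size=n)
bands = np.column_stack([
    1000.0 + rng.normal(scale=1.0, size=n),
    10.0 + 20.0 * base + rng.normal(scale=2.0, size=n),
    5.0 + 10.0 * base + rng.normal(scale=2.0, size=n),
])

def leading_axis(matrix):
    """Eigenvector belonging to the largest eigenvalue of a symmetric matrix."""
    values, vectors = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    return vectors[:, -1]

# Point 1: uncentered second-moment matrix versus centered covariance matrix.
uncentered = bands.T @ bands / (n - 1)
centered = np.cov(bands, rowvar=False)
top_band_uncentered = int(np.argmax(np.abs(leading_axis(uncentered))))
top_band_centered = int(np.argmax(np.abs(leading_axis(centered))))

# Point 2: covariance of (band - mean) / sd is the correlation matrix.
z = (bands - bands.mean(axis=0)) / bands.std(axis=0, ddof=1)
cov_of_scaled = np.cov(z, rowvar=False)
corr = np.corrcoef(bands, rowvar=False)

# Point 3: eigenvalue > 1 rule, meaningful only on the correlation matrix,
# whose eigenvalues sum to the number of bands.
eig_corr = np.linalg.eigvalsh(corr)[::-1]  # descending order
keep = eig_corr > 1.0

print("band dominating first axis, uncentered:", top_band_uncentered)
print("band dominating first axis, centered:  ", top_band_centered)
print("correlation eigenvalues:", eig_corr)
print("components with eigenvalue > 1:", int(keep.sum()))
```

With data built this way, the uncentered analysis is dominated by the band with the largest mean even though that band carries almost no variance, which is the effect described in the first note.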

Nice to see these expert comments! They are really helpful for understanding the process better.

Thanks,
Nikos

_______________________________________________
grass-user mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/grass-user
