Edzer Pebesma wrote:
Markus, a few notes:

- if you do PCA on uncentered data, by computing the eigenvalues of the
uncentered covariance matrix, this implies that bands with a larger mean
will get more influence on the final PCs. I have so far not managed
to find an argument why this would be desirable.
Add it to wiki? E.g. bands entered in a PCA should have the same mean, but normalization is also an option.
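
To illustrate the point about uncentered data, here is a minimal sketch in Python/NumPy. The two "bands" are synthetic (equal variance, means of 100 and 10), and the helper name leading_axis is made up for this example; nothing here is GRASS code.

```python
import numpy as np


def leading_axis(matrix):
    """Return the eigenvector belonging to the largest eigenvalue."""
    # eigh returns eigenvalues in ascending order, eigenvectors as columns
    values, vectors = np.linalg.eigh(matrix)
    return vectors[:, -1]


rng = np.random.default_rng(0)
n = 5000

# Two synthetic bands with the same variance and a shared signal,
# but very different means (assumed values, for illustration only).
shared = rng.normal(size=n)
bands = np.column_stack([
    100.0 + shared + 0.5 * rng.normal(size=n),
    10.0 + shared + 0.5 * rng.normal(size=n),
])

# "Uncentered covariance": second-moment matrix without subtracting means
uncentered = bands.T @ bands / n
# Ordinary covariance matrix (means subtracted)
centered = np.cov(bands, rowvar=False)

pc1_uncentered = leading_axis(uncentered)
pc1_centered = leading_axis(centered)

mean_vector = bands.mean(axis=0)
mean_direction = mean_vector / np.linalg.norm(mean_vector)

print("PC1 loadings, uncentered:", np.abs(pc1_uncentered))
print("PC1 loadings, centered:  ", np.abs(pc1_centered))
print("direction of the mean:   ", mean_direction)
```

On data like this, the first uncentered component essentially points along the mean vector, so the high-mean band dominates it, while the centered PCA weights both bands about equally.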
- if you do PCA on (band-mean)/sd(band), it means that you first
normalize (scale)
I think scale and normalize are two different things.
each variable to mean zero and unit variance. This
procedure is identical to doing PCA on the correlation matrix. It means
that, unlike for unscaled variables, variables with larger variance will
not get more influence on the PCA than others. For image analysis I can
see a place for both; if bands with low variance indicate insignificant
and perhaps noisy information, you may downweight them.
Variance depends on the range of the values, so I would rather use something like the coefficient of variation (stddev/mean) to get a scale-independent indicator of the amount of information in a given band. A downscaled band (e.g. MODIS scale of 0.0001) still has the same information but lower variance.
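
Both statements above can be checked numerically. The sketch below uses three synthetic bands (the means, standard deviations and the 0.0001 scale factor are assumptions chosen for the example). It compares the eigenvalues from PCA on (band-mean)/sd(band) with those of the correlation matrix, and then shows what rescaling does to the variance and to the coefficient of variation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Three synthetic bands with different means and spreads (illustrative values)
bands = rng.normal(loc=[50.0, 200.0, 1000.0],
                   scale=[2.0, 15.0, 80.0],
                   size=(2000, 3))
bands[:, 1] += 3.0 * bands[:, 0]  # make two of the bands correlated

# PCA on (band - mean) / sd(band): eigenvalues of the covariance
# matrix of the standardized bands
standardized = (bands - bands.mean(axis=0)) / bands.std(axis=0, ddof=1)
eig_standardized = np.linalg.eigvalsh(np.cov(standardized, rowvar=False))

# PCA on the correlation matrix of the original bands
eig_correlation = np.linalg.eigvalsh(np.corrcoef(bands, rowvar=False))

print("eigenvalues, standardized bands:", eig_standardized)
print("eigenvalues, correlation matrix:", eig_correlation)

# Rescaling a band (e.g. by a factor of 0.0001) changes its variance
# by the square of the factor, but leaves stddev/mean unchanged.
rescaled = bands * 0.0001
variance_ratio = rescaled.var(axis=0, ddof=1) / bands.var(axis=0, ddof=1)
cv_original = bands.std(axis=0, ddof=1) / bands.mean(axis=0)
cv_rescaled = rescaled.std(axis=0, ddof=1) / rescaled.mean(axis=0)

print("variance ratio after rescaling:", variance_ratio)
print("CV before:", cv_original)
print("CV after: ", cv_rescaled)
```

Note that the coefficient of variation is only invariant to a multiplicative scale factor. An additive offset changes the mean and therefore the CV, and the measure is not meaningful for bands whose mean is close to zero.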
- Only in the case of normalized variables, or equivalently PCA on
correlations, does it make sense to select PCs with an eigenvalue larger
than 1. The reasoning is fairly weak, but goes like this: if a PC has
eigenvalue > 1, it explains more variance than any of the original
variables, which all have variance 1.
Sounds good to me: why should I use a component that explains less than any of the original bands? The whole purpose of a PCA is variable reduction, i.e. getting a new set of variables, each of which explains the whole dataset better than any single original variable/band. A PCA produces as many components as there are input variables, so some selection is usually necessary for further processing; the criterion could also be the percentage of explained variance. OTOH, sometimes only the first component is of interest. There may be exceptions in imagery processing, e.g. haze reduction (I would have to read up on imagery processing too before saying more about where components with eigenvalue < 1 could be useful).
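
The two selection rules mentioned here (eigenvalue > 1 on the correlation matrix, and a threshold on the percentage of explained variance) could be sketched as follows. The six synthetic bands, the two hidden signals driving them, and the 80% threshold are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Six synthetic bands driven by two independent hidden signals plus noise
signal_a = rng.normal(size=n)
signal_b = rng.normal(size=n)
bands = np.column_stack(
    [signal_a + 0.5 * rng.normal(size=n) for _ in range(3)]
    + [signal_b + 0.5 * rng.normal(size=n) for _ in range(3)]
)

# PCA on correlations: eigenvalues sorted from largest to smallest
eigenvalues = np.linalg.eigvalsh(np.corrcoef(bands, rowvar=False))[::-1]

# Rule 1: keep components with eigenvalue > 1, i.e. components that
# explain more variance than a single standardized band
n_eigenvalue_rule = int((eigenvalues > 1.0).sum())

# Rule 2: keep the smallest number of components whose cumulative
# share of the total variance reaches a chosen threshold
threshold = 0.80  # assumed value, pick to suit the application
cumulative_share = np.cumsum(eigenvalues) / eigenvalues.sum()
n_variance_rule = int(np.searchsorted(cumulative_share, threshold) + 1)

print("eigenvalues:", eigenvalues)
print("cumulative explained variance:", cumulative_share)
print("components kept, eigenvalue > 1:", n_eigenvalue_rule)
print("components kept, 80% variance:  ", n_variance_rule)
```

The eigenvalues of a correlation matrix sum to the number of bands, which is why 1 acts as the "average band" reference level. The two rules need not agree on real imagery, so it is worth printing both before deciding.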

_______________________________________________
grass-stats mailing list
grass-stats@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-stats
