As Brian Ripley said, the short answer is no, it depends on what you're doing. A slightly longer answer is that this is, yet again, an example of parsimonious model building, so it depends on the model and the use to which it is being put. My experience with PCA is that, with modest amounts of high-dimensional data (>10D, say), it's rare to get much beyond one or two components (eigenvectors) that are useful.

Also, beware: PCA based on the usual least-squares covariance matrix is quite sensitive to outliers. I almost always prefer a very resistant covariance calculation, such as that given by cov.rob (in the MASS package), at least to start with, to explore the data in the space of the first few projections. This "outlier" sensitivity (and note, it is very difficult to know what an outlier is in many dimensions) also affects most of the "automatic" tests for dimensionality.
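To make the point concrete, here is a minimal sketch (on simulated data, with a few planted outliers) comparing eigenvectors from the usual covariance matrix with those from the resistant estimate that cov.rob provides; the data and constants are illustrative, not a recipe.

```r
# Resistant vs. least-squares PCA: a sketch on simulated data.
library(MASS)   # for cov.rob

set.seed(1)
n <- 200; p <- 5
X <- matrix(rnorm(n * p), n, p)
X[1:5, 1] <- X[1:5, 1] + 20          # plant a few gross outliers

pc_ls <- eigen(cov(X))               # usual least-squares covariance
rob   <- cov.rob(X)                  # resistant location/scatter estimate
pc_rb <- eigen(rob$cov)

# Project onto the first two resistant eigenvectors to explore the
# data (and spot the outliers) in that space.
scores <- scale(X, center = rob$center, scale = FALSE) %*%
  pc_rb$vectors[, 1:2]
```

The outliers inflate the leading least-squares eigenvector toward themselves; the resistant projection leaves them visible as a separated cluster in the score plot rather than letting them steer the axes.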
Cross-validation and separate test sets may be a better way to determine model dimensionality, although these too are not without peril.
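One hedged sketch of what such a held-out check might look like: note that naively reconstructing whole held-out rows gives an error that only decreases as components are added, so this example predicts each held-out coordinate from the others (a Gabriel-style scheme). The data, split, and error measure are all illustrative assumptions.

```r
# Choosing the number of PCA components by held-out prediction error.
set.seed(2)
n <- 300; p <- 6
X <- matrix(rnorm(n * p), n, p)
X[, 2] <- X[, 1] + rnorm(n, sd = 0.1)   # build in one strong component

idx   <- sample(n, 0.7 * n)             # illustrative 70/30 split
train <- X[idx, ]; test <- X[-idx, ]

mu <- colMeans(train)
pc <- prcomp(train, center = TRUE, scale. = FALSE)
Xc <- scale(test, center = mu, scale = FALSE)

err <- sapply(1:(p - 1), function(k) {
  V  <- pc$rotation[, 1:k, drop = FALSE]
  se <- 0
  for (j in 1:p) {
    Vmj <- V[-j, , drop = FALSE]
    # least-squares scores from the remaining p-1 coordinates
    Z <- Xc[, -j, drop = FALSE] %*% Vmj %*% solve(crossprod(Vmj))
    # predict the held-out coordinate and accumulate its error
    se <- se + mean((Xc[, j] - Z %*% V[j, ])^2)
  }
  se / p
})

k_hat <- which.min(err)   # dimensionality suggested by held-out error
```

The "peril" mentioned above shows up here too: the answer depends on the split, the error measure, and how honestly the holdout is kept out of the fitting.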
Cheers,
Bert
--
Bert Gunter
Non-Clinical Biostatistics
Genentech
MS: 240B
Phone: 650-467-7374
"The business of the statistician is to catalyze the scientific learning process."
-- George E.P. Box
______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html