As Brian Ripley said, the short answer is: no, it depends on what you're doing. A slightly longer answer is that this is, yet again, an exercise in parsimonious model building, so it depends on the model and the use to which it is being put. My experience with PCA is that with modest amounts of high-dimensional data (>10D, say) it's rare to get much beyond one or two useful components (eigenvectors).

Also beware: PCA based on the usual least-squares covariance matrix is quite sensitive to outliers. I almost always prefer a very resistant covariance calculation, such as that given by cov.rob (in package MASS), at least to start with, to explore the data in the space of the first few projections. This "outlier" sensitivity (and note that it is very difficult to know what an outlier is in many dimensions) also affects most of the "automatic" tests for dimensionality. Cross-validation and separate test sets may be a better way to determine model dimensionality, although these, too, are not without peril.
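A minimal sketch of the point about resistance (the data here are simulated for illustration, not from any real problem): contaminate one variable with a few gross outliers, then compare PCA on the classical covariance matrix with PCA on the resistant estimate from MASS::cov.rob, which princomp() accepts via its covmat argument.

```r
library(MASS)  # for cov.rob

set.seed(1)
n <- 100
x <- matrix(rnorm(n * 5), n, 5)
x[1:5, 1] <- x[1:5, 1] + 10    # a few gross outliers in variable 1

# Classical PCA: least-squares covariance, pulled around by the outliers
pc.classical <- princomp(x)

# Resistant PCA: cov.rob returns a covariance list that princomp accepts
pc.robust <- princomp(covmat = cov.rob(x))

# The outliers inflate the first classical eigenvalue/sdev; the
# resistant fit is much less affected
pc.classical$sdev[1]
pc.robust$sdev[1]
```

Plotting the data in the space of the first few projections (e.g. via predict() on the fitted object) is then a reasonable way to hunt for the multivariate outliers themselves.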

Cheers,
Bert

--

Bert Gunter

Non-Clinical Biostatistics
Genentech
MS: 240B
Phone: 650-467-7374


"The business of the statistician is to catalyze the scientific learning process."

 -- George E.P. Box
