In article <[EMAIL PROTECTED]>, Elliot Cramer <[EMAIL PROTECTED]> wrote:
>In sci.stat.edu Eric Zivot <[EMAIL PROTECTED]> wrote:
>: In the finance literature, it is common to do pca in a situation in which
>: there are more variables than observations (e.g. returns on 1000 assets and
>: 500 observations).
>
>This says a lot about the finance literature
>
>: In this case, one uses what has been called "asymptotic
>: principal component analysis". Instead of eigenvalue analysis on the
>: non-invertible N x N covariance RR' (R is N x T, N >> T), do eigenvalue
>: analysis on the smaller T x T matrix R'R.
>
>There is nothing asymptotic about it; it is simple mathematics
>
>(A'A)x = kx       (x an eigenvector, k the value)
>
>(AA')(Ax) = k(Ax)
>
>the eigenvalues are the same and you solve for the original vectors.
>so what? it's still dumb to do anything with more variables than
>observations

You don't know what you are talking about. There are many, many
situations in which data is analysed when there are more variables
than observations. Two more examples are spectroscopic data and DNA
microarray data, both of which commonly involve at most a few hundred
observations, with each observation being a vector of thousands or
even tens of thousands of variables.

The absurdity of saying you can't do anything with more variables than
observations is well illustrated by the case of spectroscopic data,
where the number of variables is just the number of frequencies (or
masses, or whatever) at which the spectrum has been observed. If you
have a high resolution instrument, you'll end up with more variables
than if you have a low resolution instrument. (These variables tend
to be highly correlated, of course.) It would be ridiculous to say
that you have to throw away the extra data from the better instrument
before analysing it.

PCA isn't necessarily the best way of analysing such data, but it
isn't senseless.

   Radford Neal

----------------------------------------------------------------------------
Radford M. Neal                                       [EMAIL PROTECTED]
Dept. of Statistics and Dept.
of Computer Science                                   [EMAIL PROTECTED]
University of Toronto                     http://www.cs.utoronto.ca/~radford
----------------------------------------------------------------------------
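
For anyone who wants to check the quoted identity numerically: the sketch below takes a small N x T matrix A with more variables than observations (N = 4, T = 2), finds the leading eigenpair (k, x) of the small T x T matrix A'A, and confirms that Ax is an eigenvector of the large N x N matrix AA' with the same eigenvalue k. The matrix entries, the helper names, and the use of plain power iteration are all assumptions made for illustration; nothing here comes from the original posts, and a real analysis would more likely call a library SVD or eigensolver.

```python
import math

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def matvec(a, v):
    """Multiply a matrix by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def top_eigenpair(m, iters=200):
    """Power iteration on a symmetric positive semi-definite matrix.

    Returns the leading eigenvalue (as a Rayleigh quotient) and a
    unit-length eigenvector. Illustrative only: it assumes a clear gap
    between the two largest eigenvalues.
    """
    v = [1.0] * len(m)
    for _ in range(iters):
        w = matvec(m, v)
        n = math.sqrt(sum(x * x for x in w))
        v = [x / n for x in w]
    k = sum(x * y for x, y in zip(v, matvec(m, v)))
    return k, v

# Hypothetical data: N = 4 variables (rows), T = 2 observations (columns).
A = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0],
     [7.0, 8.0]]
At = transpose(A)

small = matmul(At, A)   # T x T matrix A'A
big = matmul(A, At)     # N x N matrix AA', rank at most T

k, x = top_eigenpair(small)      # (A'A)x = kx
Ax = matvec(A, x)                # candidate eigenvector of AA'

# Check (AA')(Ax) = k(Ax) component by component.
lhs = matvec(big, Ax)
rhs = [k * c for c in Ax]
residual = max(abs(p - q) for p, q in zip(lhs, rhs))

# Independent check: leading eigenvalue of the large matrix.
k_big, _ = top_eigenpair(big)

print("leading eigenvalue of A'A:", k)
print("leading eigenvalue of AA':", k_big)
print("max |(AA')(Ax) - k(Ax)|  :", residual)
```

Note that Ax is not unit length; dividing it by sqrt(k) rescales it to a unit eigenvector of AA', which is the "solve for the original vectors" step in the quoted text.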
