In article <[EMAIL PROTECTED]>, Elliot Cramer  <[EMAIL PROTECTED]> wrote:

>In sci.stat.edu Eric Zivot <[EMAIL PROTECTED]> wrote:
>: In the finance literature, it is common to do PCA in a situation in which
>: there are more variables than observations (e.g., returns on 1000 assets and
>: 500 observations).

>This says a lot about the finance literature.
>
>: In this case, one uses what has been called "asymptotic
>: principal component analysis". Instead of eigenvalue analysis on the
>: non-invertible N x N covariance matrix RR' (R is N x T, N >> T), do
>: eigenvalue analysis on the smaller T x T matrix R'R.
>
>There is nothing asymptotic about it; it is simple mathematics:
>
>(A'A)x = kx  (x an eigenvector, k the eigenvalue)
>
>(AA')(Ax) = k(Ax)
>
>The nonzero eigenvalues are the same, and you solve for the original vectors.
>So what? It's still dumb to do anything with more variables than
>observations.


You don't know what you are talking about.  There are many, many
situations in which data is analysed when there are more variables
than observations.  Two more examples are spectroscopic data and DNA
microarray data, both of which commonly involve at most a few hundred
observations, with each observation being a vector of thousands or
even tens of thousands of variables.

The absurdity of saying you can't do anything with more variables than
observations is well illustrated by the case of spectroscopic data,
where the number of variables is just the number of frequencies (or
masses, or whatever) at which the spectrum has been observed.  If you
have a high resolution instrument, you'll end up with more variables
than if you have a low resolution instrument.  (These variables tend
to be highly correlated, of course.)  It would be ridiculous to say
that you have to throw away the extra data from the better instrument
before analysing it.

PCA isn't necessarily the best way of analysing such data, but it
isn't senseless.
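
The identity quoted above, (A'A)x = kx implies (AA')(Ax) = k(Ax), can be
checked numerically.  What follows is only a minimal sketch, not anyone's
published method: it uses plain Python, a made-up one-factor model to
generate an N x T matrix R with N > T, and simple power iteration in place
of a full eigensolver.  The sizes, the seed, and every function name are
assumptions chosen for illustration.

```python
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def norm(v):
    return sum(x * x for x in v) ** 0.5

def top_eigenpair_small(R, iters=200):
    """Leading eigenpair of the T x T matrix R'R by power iteration.

    The N x N matrix RR' is never formed; only products with R and R'
    are used.
    """
    Rt = transpose(R)
    x = [1.0] * len(R[0])                 # arbitrary starting vector
    for _ in range(iters):
        y = matvec(Rt, matvec(R, x))      # y = (R'R) x
        n = norm(y)
        x = [v / n for v in y]
    k = norm(matvec(Rt, matvec(R, x)))    # x has unit length, so this is k
    return k, x

# Hypothetical data: N = 50 variables, T = 10 observations, one common
# factor plus noise.  Centering is omitted; the identity holds either way.
random.seed(0)
N, T = 50, 10
loadings = [random.gauss(1.0, 0.3) for _ in range(N)]
factor = [random.gauss(0.0, 1.0) for _ in range(T)]
R = [[loadings[i] * factor[j] + random.gauss(0.0, 0.3) for j in range(T)]
     for i in range(N)]

k, x = top_eigenpair_small(R)

# Map the small eigenvector back: u = Rx, rescaled to unit length.
u = matvec(R, x)
u_len = norm(u)
u = [v / u_len for v in u]

# Check that u is an eigenvector of RR' with the same eigenvalue k.
RRt_u = matvec(R, matvec(transpose(R), u))   # (RR') u
residual = norm([a - k * b for a, b in zip(RRt_u, u)])
print("eigenvalue:", k)
print("residual of (RR')u - k*u:", residual)
```

The vector u has length N, one entry per variable, even though the only
eigenproblem solved was T x T.  RR' has at most T nonzero eigenvalues, so
nothing is lost by working with the smaller matrix.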

   Radford Neal

----------------------------------------------------------------------------
Radford M. Neal                                       [EMAIL PROTECTED]
Dept. of Statistics and Dept. of Computer Science [EMAIL PROTECTED]
University of Toronto                     http://www.cs.utoronto.ca/~radford
----------------------------------------------------------------------------