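For this, I think a nice approach is the cross-validation method
discussed by Svante Wold in a Technometrics paper in the seventies
or early eighties (I believe it is Wold, 1978, "Cross-validatory
estimation of the number of components in factor and principal
components models", Technometrics 20, 397-405).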
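
To make the idea concrete, here is a minimal sketch of cross-validating
the number of components. It is not Wold's exact scheme (which, as far as
I recall, deletes individual elements in a pattern and refits with missing
data). Instead it holds out groups of rows and, for each held-out row,
predicts each variable from the remaining variables through loadings
fitted on the training rows. Leaving the predicted variable out matters:
if a held-out row is simply projected onto the loadings using all of its
own values, the error can only shrink as components are added. The sketch
assumes the raw data rows are available, not only a correlation matrix,
and the function name and parameters are made up for illustration.

```python
import numpy as np

def press_by_rank(X, max_rank, n_folds=5, seed=0):
    """Cross-validated prediction error (PRESS) for PCA ranks 1..max_rank.

    Hypothetical helper: rows are split into folds; each held-out entry
    x[i, j] is predicted from the other variables of row i, using
    loadings estimated without row i.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    press = np.zeros(max_rank)

    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        train = X[train_mask]

        # Standardize with training statistics (correlation-matrix PCA).
        mean = train.mean(axis=0)
        scale = train.std(axis=0, ddof=1)
        Z_train = (train - mean) / scale
        Z_test = (X[test_idx] - mean) / scale

        # Rows of Vt are the principal axes of the training rows.
        _, _, Vt = np.linalg.svd(Z_train, full_matrices=False)

        for k in range(1, max_rank + 1):
            P = Vt[:k].T  # p x k loadings
            for j in range(p):
                keep = np.arange(p) != j
                # Least-squares scores from every variable except j.
                scores, *_ = np.linalg.lstsq(P[keep], Z_test[:, keep].T,
                                             rcond=None)
                pred = P[j] @ scores  # predicted standardized x[:, j]
                press[k - 1] += np.sum((Z_test[:, j] - pred) ** 2)
    return press
```

One would then keep the rank with the smallest PRESS, for example
`int(np.argmin(press)) + 1`. Note that `max_rank` must be smaller than
the number of variables, since one variable is always left out when the
scores are estimated.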


At 16:15 -0400 05/04/2000, [EMAIL PROTECTED] wrote:
>Greetings,
>
>I'm hoping that you could help me resolve a problem that I have been
>working on for quite some time.
>
>We are using PCA, Principal Components Analysis, with arbitrarily large
>datasets. Thus, we are working only with correlation matrices, and
>naturally we hope that PCA will significantly reduce the
>dimensionality of our dataset.  Currently we employ proportion of
>trace explained to determine which pc's to keep.  However, after
>considerable reading on the subject, we discovered that this isn't the
>best route to take. Thus we have decided to look into using stopping
>rules based on how much residual variability one is willing to
>accept.  This is where confusion sets in.
>
>Now my primary source of information has been J. Edward Jackson's "A
>User's Guide to Principal Component Analysis."  In this book he
>describes a number of methods to determine if an outlier exists within
>the data, using residual analysis.  However, it is unclear to me how
>this factors into determining which pc's to keep, since most of the
>statistics in regard to residual analysis deal specifically with
>scores obtained from a data vector.  I suspect that we can determine
>this by continually testing the diagonal of the residual matrix until
>it meets our criteria, but I'm not entirely sure.  Is this a 'good'
>approach to take in determining which pc's to keep? Keep in mind that
>this needs to be determined without user intervention.   I also
>understand that appropriate significance tests must be performed
>before this operation.  
>
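
The "keep testing the diagonal of the residual matrix" idea in the
paragraph above can be written down directly from a correlation matrix.
What follows is only a sketch of that stopping rule as I read it, not a
procedure taken from Jackson's book: it adds components until every
variable's unexplained variance is at most a tolerance. The function name
and the `max_residual` parameter are hypothetical.

```python
import numpy as np

def components_for_residual(R, max_residual):
    """Smallest k whose residual matrix has every diagonal entry
    <= max_residual, together with that residual diagonal.

    R is a correlation (or covariance) matrix. The residual matrix after
    k components is R - V_k diag(lambda_k) V_k^T.
    """
    R = np.asarray(R, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(R)      # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # explained[j, k-1]: variance of variable j reproduced by the
    # first k components, sum over i <= k of lambda_i * v_ji**2.
    explained = np.cumsum(eigvecs ** 2 * eigvals, axis=1)
    residual_diag = np.diag(R)[:, None] - explained

    p = R.shape[0]
    for k in range(1, p + 1):
        if residual_diag[:, k - 1].max() <= max_residual:
            return k, residual_diag[:, k - 1]
    return p, residual_diag[:, -1]
```

Because the rule looks at each variable separately, a single variable
that is nearly uncorrelated with the rest can force an extra component
even when the proportion of trace explained is already high.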
>Also, just out of curiosity, if we added a sample in principal
>component space, reduced dimensionality, and applied an inversion on
>this data element, would we get a value in original space that closely
>matched, given our residual criteria, what the actual data element
>should be?  It would seem true given that we were willing to
>sacrifice so much variability performing a PCA.  Thus inverting it
>would give something close to the actual value, but off by the amount
>our criteria allow.
>
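
On the inversion question above, a small sketch may help. It projects a
sample onto the first k components and maps it back to the original
units, returning the squared residual in standardized units as well. The
function names are hypothetical, and the fit here uses raw data only to
obtain means and standard deviations. Over the rows used to fit the
model, the residual averages out to the sum of the discarded
eigenvalues, but that is an average: an individual sample can be off by
considerably more, which is what residual-based outlier statistics are
designed to flag.

```python
import numpy as np

def fit_pca(X, k):
    """Means, standard deviations and the top-k loadings of the
    correlation matrix of X (rows are samples)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    scale = X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    top = np.argsort(eigvals)[::-1][:k]       # indices of largest k
    return mean, scale, eigvecs[:, top]

def project_and_invert(x, mean, scale, loadings):
    """Reduce x to k scores, map back to original units, and report
    the squared residual of each sample in standardized units."""
    z = (np.asarray(x, dtype=float) - mean) / scale
    scores = z @ loadings                     # coordinates in PC space
    z_hat = scores @ loadings.T               # back to standardized space
    x_hat = mean + scale * z_hat              # back to original units
    q = np.sum((z - z_hat) ** 2, axis=-1)     # squared residual length
    return x_hat, q
```

With all components kept the inversion is exact; with fewer, `q` tells
how far a particular sample lies from the retained subspace.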
>Many thanks for your assistance,
>
>Jason Walter
>CSC Graduate Student
>[EMAIL PROTECTED]
>
>===========================================================================
>This list is open to everyone.  Occasionally, less thoughtful
>people send inappropriate messages.  Please DO NOT COMPLAIN TO
>THE POSTMASTER about these messages because the postmaster has no
>way of controlling them, and excessive complaints will result in
>termination of the list.
>
>For information about this list, including information about the
>problem of inappropriate messages and information about how to
>unsubscribe, please see the web page at
>http://jse.stat.ncsu.edu/
>===========================================================================

-- 
===
Jan de Leeuw; Professor and Chair, UCLA Department of Statistics;
US mail: 8142 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550;  fax (310)-206-5658;  email: [EMAIL PROTECTED]
    http://www.stat.ucla.edu/~deleeuw and http://home1.gte.net/datamine/
============================================================================
          No matter where you go, there you are. --- Buckaroo Banzai
                   http://webdev.stat.ucla.edu/sounds/nomatter.au
============================================================================

