Donald F. Burrill wrote: 
> For a completely different approach, the following references may be 
> useful.  Jason Walter's interest appears to lie mainly in reducing the 
> dimensionality of the data set, if I read his post correctly.  While it 
> is true that one can decide, more or less rationally, to use fewer than 
> the  p  principal components derived from  p  original variables, this 
> does not imply that one can work with fewer than  p  variables, since 
> every variable loads on every component.  (Sometimes one more or less 
> arbitrarily drops from a component those variables whose loadings are 
> less than some threshold value;  but the principle remains.)

Yes, we are interested in reducing the dimensionality of a
dataset, and we are aware that keeping n < p principal components
does not mean working with n < p variables.  Thus, when 'working' with
the lower-dimensional data, we convert the original data into principal
component space, where we know that we can represent s% of the
variability in n dimensions.  We have successfully applied this to
actual data, and it has worked for our desired task.
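
As a point of reference, the "s% of the variability in n dimensions"
rule can be sketched as follows.  This is only a minimal illustration
using NumPy; the function name components_for_variance and the choice
of an SVD of the centered data are assumptions made here, not part of
the procedure described above.

```python
import numpy as np

def components_for_variance(X, s):
    """Return the smallest n whose leading principal components explain
    at least a fraction s (0 < s <= 1) of the total variance, together
    with the cumulative explained-variance curve.
    (Hypothetical helper, for illustration only.)"""
    Xc = X - X.mean(axis=0)                      # center each variable
    # Squared singular values of the centered data are proportional to
    # the eigenvalues of the sample covariance matrix.
    sv = np.linalg.svd(Xc, compute_uv=False)
    explained = np.cumsum(sv**2) / np.sum(sv**2)
    # First index whose cumulative fraction reaches s, as a count.
    n = int(np.searchsorted(explained, s)) + 1
    # Guard against rounding when s is (numerically) equal to 1.
    return min(n, len(sv)), explained

if __name__ == "__main__":
    X = np.array([[3.0, 0.0], [-3.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    n, explained = components_for_variance(X, 0.8)
    print(n, explained)
```

Note that this rule only controls the fraction of total variance
retained; by itself it says nothing about how far any individual
observation lands from its original position after unprojecting.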

We seek better stopping rules for choosing 'n' because we want a
guarantee that if we apply an inversion in principal component space,
we come 'close' to what the value should be in the original space
with p dimensions.  For this reason, a stopping rule based on residual
variability sounds appealing.  Specifically, if x is what the value
should be, and x' is the result of inverting the principal component
model, then x - x' should be within a user-defined tolerance.
Naturally, we hope that x - x' = 0, but this is extremely unlikely
because we are unprojecting from a lower-dimensional space.  However,
a guarantee that x - x' is small (using 'n' principal components)
would be nice.  Such a property is very desirable because while
working in principal component space (PCS) we can introduce new data,
based on data in PCS, and ensure that it will map back, within a
tolerance, into the original space.
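
One way to make the tolerance idea concrete is sketched below.  It is
an assumption-laden illustration, not an established stopping rule:
the helper names (fit_pca, reconstruct, smallest_n_within_tolerance)
are hypothetical, and "within tolerance" is taken here to mean that
the largest Euclidean norm of x - x' over the fitted data is at most
tol.

```python
import numpy as np

def fit_pca(X):
    """Return the column means and the principal directions (rows of Vt),
    ordered by decreasing singular value."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt

def reconstruct(X, mean, Vt, n):
    """Project X onto the first n components, then map back to p dimensions."""
    V = Vt[:n].T                   # p x n matrix of kept directions
    scores = (X - mean) @ V        # coordinates in principal component space
    return scores @ V.T + mean     # x' in the original space

def smallest_n_within_tolerance(X, tol):
    """Smallest n such that max_i ||x_i - x_i'|| <= tol on the fitted data.
    Falls back to all components if tol is never met (e.g. tol = 0)."""
    mean, Vt = fit_pca(X)
    err = np.inf
    for n in range(1, Vt.shape[0] + 1):
        residual = X - reconstruct(X, mean, Vt, n)
        err = np.linalg.norm(residual, axis=1).max()
        if err <= tol:
            return n, err
    return Vt.shape[0], err

if __name__ == "__main__":
    X = np.array([[3.0, 0.0], [-3.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    print(smallest_n_within_tolerance(X, 1.5))
```

A check of this kind only certifies the observations it was run on;
whether the same bound holds for new data depends on how well those
data resemble the fitted sample.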

Given the above, I have two obstacles that I have yet to overcome.
First, it is still unclear to me how to 'look' at the principal
components and determine which ones to keep to ensure this property,
i.e. residual variability.  Second, I haven't entirely convinced
myself that the above logic will work.  Any insight on either would be
appreciated.
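
For reference on the first obstacle, a standard identity links the
residual to the eigenvalues of the discarded components.  The notation
below (m observations x_i, mean \bar{x}, sample covariance eigenpairs
\lambda_j, v_j) is assumed for this sketch and does not appear in the
discussion above.

```latex
% Assumed notation: m observations x_i in R^p with mean \bar{x};
% the sample covariance has eigenpairs (\lambda_j, v_j),
% ordered so that \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p.
\[
  x_i' = \bar{x} + \sum_{j=1}^{n} v_j v_j^{\top} (x_i - \bar{x}),
  \qquad
  \frac{1}{m-1} \sum_{i=1}^{m} \lVert x_i - x_i' \rVert^2
  = \sum_{j=n+1}^{p} \lambda_j .
\]
```

So, in the mean-squared sense, the residual variability is the sum of
the eigenvalues of the components left out, and it is smallest when
the components with the largest eigenvalues are the ones kept.  This
controls the average over the fitted data only; it does not by itself
bound x - x' for each individual observation.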

Regardless, your post was insightful, and I will be sure to look into
Prof. R.P. Bhargava's work.

Thanks, 

Jason Walter
NCSU CSC Graduate Student
[EMAIL PROTECTED] 






