In article <[EMAIL PROTECTED]>,
Rich Ulrich  <[EMAIL PROTECTED]> wrote:
>On 24 Nov 2000 15:50:11 -0800, [EMAIL PROTECTED] (Alex Yu)
>wrote:


>> I ran a SAS macro and output a tetrachoric correlation matrix of 236
>> variables successfully. However, when I ran a factor analysis using the
>> matrix as the infile, it failed. Although I have specified 'corr' for
>> _type_, SAS said that: 

                        ..............

>Okay, I can accept that someone might propose that tetrachoric
>correlations will do a better job of representing relations (than the
>ordinary Pearson).  However, if you want testing, I think I recall
>advice that you use the test on the Pearson, or have at least 4 times
>the N.  

I doubt that this is the important part.  What matters most is
that covariance does not depend in any manner on normality,
whereas tetrachoric correlation does.  It may be an acceptable
approximation for one pair of characteristics if the underlying
continuous variables are only approximately jointly normal, but
doing it for many pairs makes little sense.
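
To make the contrast concrete, here is a minimal sketch in Python.  It
compares the ordinary product-moment correlation of two 0/1 variables
(the phi coefficient) with the classical "cos-pi" approximation to the
tetrachoric correlation for the same 2x2 table.  The table counts, the
function names, and the choice of the cos-pi approximation are all
assumptions made for illustration; this is not what any particular SAS
macro computes.

```python
import math

def phi_coefficient(a, b, c, d):
    """Product-moment correlation of two 0/1 variables.

    The 2x2 table is laid out as rows (a, b) and (c, d).
    No distributional assumption is involved.
    """
    num = a * d - b * c
    den = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return num / den

def tetrachoric_cos_pi(a, b, c, d):
    """Cos-pi approximation to the tetrachoric correlation.

    Only meaningful if the dichotomies come from cutting a latent
    bivariate normal pair; requires b and c to be nonzero.
    """
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))

# Hypothetical counts, chosen only for illustration.
a, b, c, d = 40, 10, 10, 40
phi = phi_coefficient(a, b, c, d)
r_tet = tetrachoric_cos_pi(a, b, c, d)
print("phi =", phi, " tetrachoric (approx) =", r_tet)
```

The two numbers answer different questions.  The first describes the
observed 0/1 data; the second is an estimate of a latent correlation
and is only as good as the normality assumption behind it.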

It is not a matter of sample size, but of the accuracy of the
ASSUMPTIONS.  As to which assumptions are important, normality
is not particularly important for regression, but linearity is
of major importance.

>Here are related questions.  I don't care for the idea, much, of
>trying to factor 236 variables.  But if someone wants to do so, should
>the rule of thumb that says "use an N of 10 times the number of
>variables"  be modified to read "use 40 times ...?"  Some other
>number?

There are situations where no more observations than
variables are needed, and with a linear structure one can
even get reasonable estimates of a factor structure with
not too many factors from fewer observations than variables.
For dichotomous variables, a totally different approach is
needed.  Only a user who understands probability modeling
and who does not try to force statistical techniques should
even suggest such an approach.  Statistics must not be
allowed to do your thinking for you.

>And, will the eigenvalues all be non-negative?  
>(I suspect that the answer is No, and that you shouldn't worry about
>them.  But I wouldn't mind hearing for sure.)

Unless one uses product-moment covariances, or something
with similar mathematical properties, there is no good
reason to assume this.  Even if one does, but uses
different observations to estimate different covariances,
such as with missing values, the matrix will not
necessarily be positive semi-definite.  
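
A small constructed example of the missing-value case, as a sketch in
Python.  The nine cases below are made up for illustration, and
pairwise_corr is a hypothetical helper, not a library routine.  Each
correlation is computed from the cases where both variables are present,
so the three off-diagonal entries come from three different sets of
observations.

```python
import math

def pairwise_corr(rows, i, j):
    """Pearson correlation of columns i and j, using only the cases
    in which both values are present (pairwise deletion)."""
    pairs = [(r[i], r[j]) for r in rows
             if r[i] is not None and r[j] is not None]
    n = len(pairs)
    mi = sum(p[0] for p in pairs) / n
    mj = sum(p[1] for p in pairs) / n
    sij = sum((p[0] - mi) * (p[1] - mj) for p in pairs)
    sii = sum((p[0] - mi) ** 2 for p in pairs)
    sjj = sum((p[1] - mj) ** 2 for p in pairs)
    return sij / math.sqrt(sii * sjj)

# Made-up data: columns are x, y, z; None marks a missing value.
rows = [
    (1, 1, None), (2, 2, None), (3, 3, None),   # only x and y observed
    (None, 1, 1), (None, 2, 2), (None, 3, 3),   # only y and z observed
    (1, None, 3), (2, None, 2), (3, None, 1),   # only x and z observed
]

R = [[pairwise_corr(rows, i, j) for j in range(3)] for i in range(3)]

# A positive semi-definite matrix has v'Rv >= 0 for every vector v.
v = [1.0, -1.0, 1.0]
quad = sum(v[i] * R[i][j] * v[j] for i in range(3) for j in range(3))
print("R =", R)
print("v'Rv =", quad)
```

Here x and y correlate perfectly, y and z correlate perfectly, and yet
x and z correlate perfectly negatively.  No single complete data set
could produce that, and the quadratic form v'Rv is negative, so the
matrix has a negative eigenvalue.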
-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558


