Dear Edstat-listers,

I have 8 variables per observation, all count data (integers >= 0), and I want to 
run an R factor analysis to obtain factor scores.  The data have the following 
attributes:

(1) Hundreds of thousands of observations are at my disposal, from which I can 
sample if necessary.
(2) The variables are significantly non-normal and apparently not very amenable to 
transformations.
(3) A significant portion of the observations have zeros "across the board".

As I understand it, the assumption of normality becomes less important as the number 
of observations increases.  I am aware of recent theoretical work on explicitly 
factor analyzing count data (e.g. Chib and Winkelmann; GLLAMM), but I was wondering 
whether, in the meantime, simply using a program like Stata to obtain factor scores 
is defensible (or whether other considerations bear on this decision).
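
To make the question concrete, here is a minimal sketch of the kind of "simple" 
procedure I mean, written in plain Python rather than Stata and run on simulated 
data, not my own.  Everything about the simulation is an assumption made up for 
illustration: the 30% share of all-zero rows, the Poisson rates, and the single 
latent factor.  The extraction is principal-component factoring of the correlation 
matrix (first factor only, via power iteration), which may differ from what a given 
package does by default.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson variate (Knuth's multiplication method; fine for small rates)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_counts(n_obs, n_vars=8, seed=0):
    """Hypothetical data: counts driven by one latent factor, plus all-zero rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_obs):
        if rng.random() < 0.3:  # assumed share of "across the board" zeros
            rows.append([0] * n_vars)
            continue
        latent = rng.gauss(0.0, 1.0)
        rate = math.exp(0.5 + 0.6 * latent)  # assumed log-linear link
        rows.append([poisson(rng, rate) for _ in range(n_vars)])
    return rows


def standardize(rows):
    """Center each variable and scale it to unit sample variance."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(m)]
    return [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in rows]


def correlation(z):
    """Pearson correlation matrix of already standardized data."""
    n, m = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(m)]
            for i in range(m)]


def first_factor(corr, iters=500):
    """Leading eigenpair of the correlation matrix by power iteration."""
    m = len(corr)
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(m))
                     for i in range(m))
    loadings = [math.sqrt(eigenvalue) * x for x in v]
    return eigenvalue, loadings


def factor_scores(z, eigenvalue, loadings):
    """Scores for a principal-component factor: z . loadings / eigenvalue."""
    m = len(loadings)
    return [sum(r[j] * loadings[j] for j in range(m)) / eigenvalue for r in z]


data = simulate_counts(2000)
z = standardize(data)
eigenvalue, loadings = first_factor(correlation(z))
scores = factor_scores(z, eigenvalue, loadings)
print(round(eigenvalue, 3), [round(x, 2) for x in loadings])
```

Nothing in this sketch uses the fact that the variables are counts, or that many 
rows are entirely zero; it treats them as if Pearson correlations summarized them 
adequately.  That is exactly the step whose defensibility I am asking about.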

Thanks in advance,
Chihmao Hsieh

Olin School of Business
Washington University in St. Louis
