Dear Edstat-listers,

I have 8 variables per observation, all count data (non-negative integers), and I want to run an R factor analysis to obtain factor scores. The data have the following attributes:
(1) Hundreds of thousands of observations are at my disposal, from which I can sample if necessary.
(2) The variables are significantly non-normal and apparently not very amenable to transformations.
(3) A significant portion of the observations have zeros "across the board".

As I understand it, the assumption of normality becomes less important as the number of observations increases. I am aware that recent theoretical work has been done on explicitly factor analyzing count data (e.g. Chib and Winkelmann; GLLAMM), but I was wondering whether simply using a program like Stata to obtain factor scores in the meantime is defensible (or whether other parameters are necessary for this decision).

Thanks in advance,

Chihmao Hsieh
Olin School of Business
Washington University in St. Louis
