> >> I have 8 variables per observation, all count data
> >> (integers > 0), and I want to be able to run an R factor
> >> analysis to obtain factor scores. The data have the
> >> following attributes:
> >>
> >> (1) Hundreds of thousands of observations at my disposal,
> >>     from which I can sample if necessary.
> >> (2) Significantly non-normal, apparently not very amenable
> >>     to transformations
>
> Normality is essentially irrelevant for the validity of
> factor models. What matters is linearity, and it is this which
> essentially excludes count data.

You may want to try mixture modeling approaches based on the
multinomial distribution:

http://www.hiit.fi/u/buntine/ais03.html
http://www.hiit.fi/u/buntine/ecml02.html

Each distribution in the mixture can be interpreted as a factor.

Best regards,
Aleks

--
mag. Aleks Jakulin
http://ai.fri.uni-lj.si/aleks/
Artificial Intelligence Laboratory,
Faculty of Computer and Information Science, University of Ljubljana.
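
To make the suggestion above concrete, here is a minimal sketch of a plain finite mixture of multinomials fitted by EM. It is not the method from the linked papers, only the simplest model in that family. The function name fit_multinomial_mixture, the helper draw, the two synthetic profiles p_a and p_b, the choice of two components, and the small smoothing constants are all assumptions made for illustration. Each fitted row of probs is one component distribution (a "factor" in the sense above), and the responsibilities in resp could serve as soft memberships, loosely in place of factor scores.

```python
import math
import random


def fit_multinomial_mixture(rows, n_components, n_iter=100, seed=0):
    """Fit a finite mixture of multinomials by EM (illustrative sketch).

    rows: list of equal-length lists of non-negative integer counts.
    Returns (weights, probs, resp): mixing weights, per-component
    category probabilities, and the responsibilities computed in the
    last E-step.
    """
    rng = random.Random(seed)
    n_vars = len(rows[0])
    weights = [1.0 / n_components] * n_components
    # Random starting profiles, normalised so each sums to 1.
    probs = []
    for _ in range(n_components):
        raw = [rng.random() + 0.5 for _ in range(n_vars)]
        total = sum(raw)
        probs.append([r / total for r in raw])

    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each row.
        # The multinomial coefficient is identical across components,
        # so it cancels and is omitted.
        resp = []
        for x in rows:
            logp = [
                math.log(weights[k])
                + sum(c * math.log(probs[k][j]) for j, c in enumerate(x))
                for k in range(n_components)
            ]
            top = max(logp)  # subtract the max for numerical stability
            unnorm = [math.exp(v - top) for v in logp]
            z = sum(unnorm)
            resp.append([u / z for u in unnorm])

        # M-step: re-estimate weights and category probabilities.
        for k in range(n_components):
            r_k = [r[k] for r in resp]
            # Floor the weight so its logarithm stays finite.
            weights[k] = max(sum(r_k) / len(rows), 1e-12)
            # Tiny additive smoothing keeps every probability positive.
            counts = [
                sum(r * x[j] for r, x in zip(r_k, rows)) + 1e-9
                for j in range(n_vars)
            ]
            total = sum(counts)
            probs[k] = [c / total for c in counts]
        w_sum = sum(weights)
        weights = [w / w_sum for w in weights]
    return weights, probs, resp


def draw(p, n_trials, rng):
    """Draw one vector of multinomial counts (hypothetical test helper)."""
    counts = [0] * len(p)
    for j in rng.choices(range(len(p)), weights=p, k=n_trials):
        counts[j] += 1
    return counts


# Synthetic data: 8 count variables, two made-up underlying profiles.
rng = random.Random(1)
p_a = [0.30, 0.30, 0.10, 0.10, 0.05, 0.05, 0.05, 0.05]
p_b = p_a[::-1]
rows = [draw(p_a, 50, rng) for _ in range(200)]
rows += [draw(p_b, 50, rng) for _ in range(200)]

weights, probs, resp = fit_multinomial_mixture(rows, n_components=2)
```

Note that a multinomial mixture models the proportions across the 8 variables given each observation's total, so overall magnitude is not captured. EM can also stop at a local optimum, so in practice one would try several seeds and compare fits.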
