[EMAIL PROTECTED] (nothanks) wrote in message news:<[EMAIL PROTECTED]>...
> I ...am curious why SEM
> practitioners seem to assess model fit by only comparing the
> observed and predicted covariance matrices. As opposed to (also)
> using statistics based on observed vs. predicted outcomes.
A closely related question is why one would estimate an SEM from covariance matrices instead of raw data. The former would appear to use only some of the information (as in limited-information maximum likelihood, or LIML, estimation), whereas the latter would be a full-information approach (as in full-information maximum likelihood, or FIML, estimation). Concerning this, I recall a statement in the SEM literature to the effect that if the data are multivariate normal, the covariance matrix contains all the information in the data relevant to SEM; whether that is true I don't know. (Of course, much of the time data are not multivariate normal, so that is hardly a general rationale.)

AMOS and LISREL both have a feature called "FIML estimation," which seems to perform a type of imputation on missing data. Whether this also estimates the parameters of the SEM via full-information ML, I don't know. That is, I don't know whether they use the term in the sense outlined above.

In any case, it is possible to write the likelihood function for the observed data given the model and a set of parameter values, and this likelihood function can be used as a basis for iterative FIML estimation of the parameters. BUT that does not lead to a test of model fit, per se. Potentially one could compare two different models based on their difference in -2*loglikelihood, but a test of *absolute* fit for a given model cannot be made.

For me, the practical advantage of fitting covariance matrices is precisely that it provides a simple test of absolute model fit. Since the test of model fit is done at the level of covariance matrices, it seems logical that one should be trying to fit the covariance matrix, not the raw data.

Another consideration is that fitting covariance matrices is a lot faster computationally than performing true FIML on raw data. If you have 100,000 cases and 10 variables, I'd rather be fitting a 10 x 10 covariance matrix than a 100,000 x 10 data array!
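The statement recalled above, that under multivariate normality the covariance matrix carries all the relevant information, can be checked numerically in a small case. The following is only a sketch under stated assumptions: two variables, simulated data, an arbitrary candidate mean vector and covariance matrix standing in for model-implied values, and hypothetical function names (`loglik_raw`, `loglik_summary`) that do not come from AMOS, LISREL, or any other package. It evaluates the normal log-likelihood once from the raw cases and once from only the sample size, the sample means, and the sample covariance matrix (divisor n).

```python
import math
import random

def inv_and_logdet_2x2(M):
    """Inverse and log-determinant of a 2 x 2 matrix (closed form)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return inv, math.log(det)

def quad_form(v, M):
    """Quadratic form v' M v for a length-2 vector v."""
    return sum(v[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

def loglik_raw(rows, mu, Sigma):
    """Normal log-likelihood summed case by case over the raw data."""
    inv, logdet = inv_and_logdet_2x2(Sigma)
    n = len(rows)
    quad = sum(quad_form([x[0] - mu[0], x[1] - mu[1]], inv) for x in rows)
    return -0.5 * (n * 2 * math.log(2 * math.pi) + n * logdet + quad)

def summarize(rows):
    """Sample size, sample means, and sample covariance matrix (divisor n)."""
    n = len(rows)
    xbar = [sum(x[j] for x in rows) / n for j in range(2)]
    S = [[sum((x[i] - xbar[i]) * (x[j] - xbar[j]) for x in rows) / n
          for j in range(2)] for i in range(2)]
    return n, xbar, S

def loglik_summary(n, xbar, S, mu, Sigma):
    """The same log-likelihood, computed from the summary statistics only."""
    inv, logdet = inv_and_logdet_2x2(Sigma)
    trace = sum(inv[i][j] * S[j][i] for i in range(2) for j in range(2))
    d = [xbar[0] - mu[0], xbar[1] - mu[1]]
    quad = n * (trace + quad_form(d, inv))
    return -0.5 * (n * 2 * math.log(2 * math.pi) + n * logdet + quad)

# Simulated raw data: 500 cases on two correlated variables (assumption).
random.seed(0)
rows = []
for _ in range(500):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    rows.append((1.0 + z1, 2.0 + 0.6 * z1 + 0.8 * z2))

# Arbitrary candidate values standing in for a model-implied mean and covariance.
mu = [0.9, 2.1]
Sigma = [[1.2, 0.5], [0.5, 1.1]]

ll_raw = loglik_raw(rows, mu, Sigma)
ll_sum = loglik_summary(*summarize(rows), mu, Sigma)
print(ll_raw, ll_sum)
```

The two values agree up to rounding error for any candidate `mu` and `Sigma`, because the sum of the case-wise quadratic forms equals n times [trace(Sigma^-1 S) + (xbar - mu)' Sigma^-1 (xbar - mu)]. Note that the sample means are needed alongside the covariance matrix, and that the identity is specific to the normal likelihood; it says nothing about non-normal data.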
In short, I think that from a purely statistical standpoint, there are probably more accurate methods than fitting a covariance matrix. But from a practical standpoint it seems like a very reasonable thing to do. Note there are other areas in statistics where, for practical reasons, models are fit based on marginal frequencies rather than raw data.

--------------------------------------------------------------------------------
John Uebersax, PhD             (858) 597-5571
La Jolla, California           (858) 625-0155 (fax)
email: [EMAIL PROTECTED]
Statistics: http://ourworld.compuserve.com/homepages/jsuebersax/agree.htm
Psychology: http://members.aol.com/spiritualpsych
--------------------------------------------------------------------------------
