Kathryn Campbell-Kibler wrote:
> Hi Roger!
>
>> Good question. I think this is an empirical issue: including random
>> effects of recording choice could cost you a little inferential power
>> due to the requirement of estimating the variance and correlations of
>> the random effects, but it could also improve your model. At least for
>> your linear mixed-effects model, you can treat it as an empirical issue:
>> you can use a likelihood ratio test to compare models with and without
>> random effects of recording choice, and if including the random effect
>> doesn't lead to a significant improvement in the model, you could be
>> justified in dropping it.
>
> That makes sense, and thanks for the reference, I will look at it.
> Off-list, the issue was raised that since recording predicts speaker
> (i.e. if the recording is about volleyball, you know it's Bonnie) that
> there is a problem including both in the model. Are nested variables
> like this a problem? (Unless that's explained in the paper, at which
> point I will hopefully soon know the answer!)
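For concreteness, the likelihood ratio test mentioned in the quoted text can be sketched in a few lines. This is only an illustration in plain Python (standard library) rather than R, and it covers the simplest case: the larger model adds a single random-intercept variance for recording, so the reference distribution is chi-square with 1 degree of freedom. Random slopes and correlations would add more parameters and need more degrees of freedom. The function name and the two log-likelihood values are hypothetical; in practice the log-likelihoods would come from the two fitted models, both fit by maximum likelihood to the same data.

```python
import math

def lrt_one_variance(loglik_without, loglik_with):
    """Likelihood ratio test for adding ONE random-effect variance.

    Returns (statistic, naive_p, halved_p).
    """
    # Twice the gain in log-likelihood; clamp tiny negative values that
    # can arise from optimizer noise.
    stat = max(0.0, 2.0 * (loglik_with - loglik_without))
    # Upper tail of a chi-square with 1 degree of freedom:
    # P(X > stat) = erfc(sqrt(stat / 2)).
    naive_p = math.erfc(math.sqrt(stat / 2.0))
    # A variance cannot be negative, so the null value lies on the
    # boundary of the parameter space and the naive p-value tends to be
    # conservative; halving it is a common approximate correction.
    return stat, naive_p, naive_p / 2.0

# Hypothetical log-likelihoods, made up purely for illustration.
stat, naive_p, halved_p = lrt_one_variance(-1502.3, -1499.1)
print(f"LR statistic = {stat:.2f}, naive p = {naive_p:.4f}, "
      f"halved p = {halved_p:.4f}")
```

If the p-value is large, dropping the random effect of recording is defensible in the sense described in the quoted text; if it is small, the random effect is earning its keep.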
Hi Kathryn,

Good question! The answer is that for random effects, nesting variables
is not problematic -- in fact, this kind of nesting is one of the
reasons that the terms "mixed-effects model" and "multilevel model" are
often used interchangeably. The way to think about this is that you have
the following nesting structure:

  speaker -> recording -> expected_mean_observation

If recording were a fixed effect, then it would indeed be the case that
when you estimate parameters for recording, it would wipe out any role
of parameter estimates for speaker (whether speaker is a random or fixed
effect). However, since recording is a random effect, you are estimating
only its variance rather than a separate parameter for each recording,
and the combination of speaker+recording can be thought of as the
mean+variance of the normally distributed overall effect on your
observations.

>> One other point I neglected to mention. Technically it is not really
>> correct to treat data on a 6-point scale with a linear model, because
>> the error in your data cannot be normally distributed. This problem
>> will probably be worst in cases where the predicted response rate is
>> close to the extreme values, where the distribution is likely to be
>> skewed. Ordinal regression would probably be the most natural approach,
>> but the bad news is that I believe there is no current means within R to
>> include mixed effects in an ordinal regression model.
>
> You're right, of course. I've been avoiding this issue, on the
> grounds that everyone in the field treats these responses as linear.
> But I shouldn't. Do you have any ideas? My sense is that dealing
> with random effects is very important, likely more so than the ordinal
> issues, but that may be just because that's what I know more about.
> Do you (or anyone else) know of another platform that could cope with
> ordinal and mixed effects?

It could be that I'm wrong and R does have a facility for mixed-effects
modeling with ordinal regression -- anyone know?
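To make the skew near the ends of the scale concrete, here is a small simulation sketch in plain Python. It assumes, purely for illustration, that each rating is a normally distributed latent response rounded to the nearest integer and clipped to the 1-6 scale; the latent means, the standard deviation, and the function names are all made up and not taken from the data under discussion.

```python
import random
from collections import Counter

def simulate_ratings(latent_mean, n=20000, sd=1.0, seed=1):
    """Draw latent normal responses, then round and clip to the 1-6 scale."""
    rng = random.Random(seed)
    return [min(6, max(1, round(rng.gauss(latent_mean, sd))))
            for _ in range(n)]

def skewness(xs):
    """Sample skewness: third central moment divided by sd cubed."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Compare a mid-scale latent mean with one close to the top of the scale.
for latent_mean in (3.5, 5.5):
    ratings = simulate_ratings(latent_mean)
    counts = Counter(ratings)
    print(f"latent mean {latent_mean}: skewness = {skewness(ratings):+.2f}")
    for point in range(1, 7):
        share = counts[point] / len(ratings)
        print(f"  {point} {'#' * round(50 * share)} {share:.2f}")
```

Under these assumptions the mid-scale case comes out roughly symmetric, while the case near the top piles up on 6 and has a long tail toward lower ratings, which is the situation where a normal error model fits worst.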
From the following page it looks like SAS might be able to do it:

  http://tigger.uic.edu/~hedeker/long.html

Otherwise...within R, you might do the analysis both (1) with ordinal
regression and without random effects, and (2) with linear regression
and with random effects, and see what the differences are. Also, the
results from a linear regression are less likely to be horribly wrong if
the response means are never near the extremes of the scale. You might
plot a histogram of the responses for various combinations of your fixed
effects and see whether they're roughly normally distributed.

> Thanks for all the advice!

Sure -- hope it's useful!

Best

Roger

--

Roger Levy                      Email: [EMAIL PROTECTED]
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy
_______________________________________________
R-lang mailing list
[email protected]
https://ling.ucsd.edu/mailman/listinfo.cgi/r-lang
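A postscript on the nesting structure discussed earlier in this message (speaker -> recording -> expected_mean_observation): the simulation sketch below, in plain Python, draws one random offset per speaker and one per recording within that speaker, then checks that the two variances can be told apart even though recording fully determines speaker. The grand mean, standard deviations, group sizes, and function name are all hypothetical values chosen for illustration.

```python
import random
import statistics
from collections import defaultdict

def simulate_recording_means(n_speakers=2000, recs_per_speaker=3,
                             sd_speaker=1.0, sd_recording=0.5, seed=7):
    """Return (speaker_id, expected_mean_observation) for every recording."""
    rng = random.Random(seed)
    grand_mean = 3.5  # hypothetical overall mean rating
    rows = []
    for speaker in range(n_speakers):
        # One random offset per speaker ...
        speaker_effect = rng.gauss(0.0, sd_speaker)
        for _ in range(recs_per_speaker):
            # ... and one per recording, drawn within that speaker.
            recording_effect = rng.gauss(0.0, sd_recording)
            rows.append((speaker,
                         grand_mean + speaker_effect + recording_effect))
    return rows

rows = simulate_recording_means()

# Overall spread of the recording-level expected means.
total_var = statistics.pvariance([m for _, m in rows])

# Spread of recordings around their own speaker, pooled over speakers.
by_speaker = defaultdict(list)
for speaker, mean in rows:
    by_speaker[speaker].append(mean)
within_var = statistics.mean(
    [statistics.variance(v) for v in by_speaker.values()])

print(f"total variance        = {total_var:.3f}")
print(f"recording (within)    = {within_var:.3f}")
print(f"speaker (by subtract) = {total_var - within_var:.3f}")
```

With these settings the total variance should land near 1.0 + 0.25 and the within-speaker part near 0.25, so the recording variance does not absorb the speaker variance when both are treated as random.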
