Kathryn Campbell-Kibler wrote:
> Hi all,
>
> I've recently been exploring beyond my established comfort zone with
> mixed models, and am looking for some correction or reassurance. I am
> working with experimental data on social perceptions of linguistic
> variation. I've got two types of dependent variables: ratings on a 6
> point scale (e.g. not at all intelligent-very intelligent), which I've
> been treating as linear variables and binary variables, based on
> whether a given term was selected as a good description of a speaker
> (e.g. hardworking).
>
> The independent variables (well, some of them) were:
> speaker (8)
> recording (nested, 4 for each speaker) -- which recording was being
> responded to
> (ING) (3) -- crossed with recording, indicates which guise of the
> variable (ING) was used (e.g. working or workin')
> two measures of listener mood pleasant and arousal
>
> The structure of the experiment was such that every subject heard one
> recording (which represented also one (ING) guise) from each speaker.
>
> In the past with similar data, I have been using nlme for linear mixed
> models, and using subject id as a random effect. (ING) effects and the
> interaction of (ING) with the other variables, such as speaker, is the
> main point of interest. I have two questions.
>
> 1) Is it more appropriate to build in both subject id and the
> recording choice as random effects, rather than only including just
> the subject id?

Hi Kathryn,

Good question. I think this is an empirical issue: including random
effects of recording choice could cost you a little inferential power,
because the variance and correlations of the random effects have to be
estimated, but it could also improve your model. At least for your
linear mixed-effects model, you can use a likelihood ratio test to
compare models with and without random effects of recording choice. If
including the random effect doesn't lead to a significant improvement
in the model, you could be justified in dropping it.

There's a nice discussion of this issue in the following paper:

Baayen, R.H., Davidson, D.J. and Bates, D.M. (submitted). Mixed-effects
modeling with crossed random effects for subjects and items.

> I am treating speakers as fixed effects,
> deliberately-- I have no expectation that these particular speakers
> are representative of anyone except themselves. But the recordings
> within each speaker were randomly assigned to listeners.

Just a note on this -- choosing to treat speaker as a random effect
doesn't really commit you strongly to any particular speaker being
representative of the population at large. Rather, it says that the
effects individual speakers have on the response variables of interest
come from a normal distribution, and that the effects of your
individual speakers are a random sample from that distribution. But any
given speaker could well be an outlier within this distribution.

This might simply be an issue of how you worded your point. If you
chose your speakers with the idea that some of them might inherently be
perceived as more or less intelligent, that would certainly justify
treating speaker as a fixed effect.

> 2) When doing an analysis of the binary variables, how can I tell
> whether overdispersion and/or zero-inflation is an issue for me?

I'm not so familiar with the issue of zero-inflation but I thought that
was a concern for count data rather than binomial data, no?

With respect to overdispersion: you can only talk about overdispersion
in binomial data with respect to a potential variable that you think
might have some effect on the outcomes. When you have such a variable
in mind, you can look at whether there are large differences in the
proportion of positive outcomes for different values of that variable.
Finding that there are indeed large differences could justify adding
the variable to your model as a random (or fixed) effect.

>
> Bringing these two questions together, I have been looking at using
> lmer for both the "linear" and the binary variables, with something
> like these:
>
> lmer(intellect~speaker*ining*(pleasant_mood+mood_arousal)+(1|subject_id)+(1|recording),
> data=whitenoise)
>
> lmer(hardworking~speaker*ining*(pleasant_mood+mood_arousal)+(1|subject_id)+(1|recording),
> family = binomial, data=whitenoise, method="AGQ")
>
> Does this make sense, do I need the "recording" term?

This looks right -- you also might consider adding in random
interaction terms between subject/recording with your fixed-effect
variables.

Out of curiosity, why are you looking at all possible interactions
between fixed-effect variables except for those between pleasant_mood
and mood_arousal?

> And how can I
> determine if I need to be concerned about zero-inflation and if so, is
> glmmADMB my only option for the binary variables (a pain, since I
> mostly use Macs)?

Not sure -- see my above question about zero-inflation!

Best

Roger

--

Roger Levy                      Email: [EMAIL PROTECTED]
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy

_______________________________________________
R-lang mailing list
[email protected]
https://ling.ucsd.edu/mailman/listinfo.cgi/r-lang
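
For anyone following the thread, here is a minimal sketch of the two checks discussed above: the likelihood ratio test for the recording random effect, and the per-recording proportions for a binary response. It is only an illustration under stated assumptions. The data frame `sim` is simulated and merely imitates the design described for `whitenoise`; the guise labels, effect sizes and number of listeners are made up; and the fixed-effects formula is simplified to speaker*ining. It assumes a current lme4 release, in which binary responses are fitted with glmer(..., family = binomial) rather than lmer(..., family = binomial).

```r
library(lme4)

set.seed(42)

# Simulated stand-in for the `whitenoise` data (all values are made up):
# 60 listeners, 8 speakers, 4 recordings per speaker, 3 (ING) guises.
n_subj <- 60
sim <- expand.grid(subject_id = factor(seq_len(n_subj)),
                   speaker    = factor(seq_len(8)))

# Each listener hears one randomly assigned recording and guise per speaker.
sim$recording <- interaction(sim$speaker,
                             sample(1:4, nrow(sim), replace = TRUE),
                             drop = TRUE)
# Hypothetical guise labels; the original post does not name all three.
sim$ining <- factor(sample(c("ing", "in", "other"), nrow(sim), replace = TRUE))

# Random intercepts for listeners and recordings (assumed magnitudes).
subj_eff <- rnorm(n_subj, sd = 0.6)
rec_eff  <- rnorm(nlevels(sim$recording), sd = 0.4)
re_sum   <- subj_eff[as.integer(sim$subject_id)] +
            rec_eff[as.integer(sim$recording)]

# Rating treated as linear, and a binary "term selected" outcome.
sim$intellect   <- 3.5 + re_sum + rnorm(nrow(sim), sd = 1)
sim$hardworking <- rbinom(nrow(sim), 1, plogis(re_sum))

# Question 1: fit with and without the recording random effect (ML fits,
# so the two models can be compared directly by likelihood ratio test).
m_subj <- lmer(intellect ~ speaker * ining + (1 | subject_id),
               data = sim, REML = FALSE)
m_both <- lmer(intellect ~ speaker * ining + (1 | subject_id) +
                 (1 | recording),
               data = sim, REML = FALSE)
lrt <- anova(m_subj, m_both)
print(lrt)

# Question 2: proportion of positive responses per recording. Large spread
# across recordings would suggest keeping recording in the binary model.
prop_by_recording <- tapply(sim$hardworking, sim$recording, mean)
print(round(sort(prop_by_recording), 2))
```

One caveat on reading the output: the null hypothesis of a zero variance sits on the boundary of the parameter space, so the p-value from this likelihood ratio test tends to be conservative. A borderline result is therefore weak grounds for dropping the recording term.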
