Re: [R] researcher with highly skewed data set seeks help finding practical GLMM tutorial

Achim Zeileis Tue, 30 Nov 2010 02:02:38 -0800

On Tue, 30 Nov 2010, Ben Kenward wrote:

Hi!


I am a psychologist who suspects that the only sensible way to analyse
a particular data set is to use generalised linear mixed models. I am
hoping that someone might be able to point me in the right direction
to find some very practical hands on documentation that might be able
to talk me through actually doing such an analysis?

So far in my searches the most useful document I have turned up is
Bolker et al. (2008, TREE) Generalized linear mixed models: a
practical guide for ecology and evolution. As a general guide it
doesn't give enough practical information about how to get the job
done. The R documentation is obviously practical, but doesn't help to
decide what kind of analysis is appropriate. Apart from those sources
I am mainly finding quite theoretical treatments going over my head,
for example: 
http://www.cmm.bristol.ac.uk/learning-training/multilevel-m-software/reviewr.pdf.

I am moderately competent programming in R, having coded custom
permutation tests before (which in contrast to GLMM I find intuitive).

In case anyone is kind enough to give me any specific pointers, here
is the nature of my data set. With an N of 42 subjects, I have a
highly right skewed (about half the data points are zero) frequency
variable as dependent variable. This variable is measured in each
subject in three different task types. There is furthermore a context
variable with two levels. Each task was administered in each context,
but not for every single subject.

So the design is quite simple - two fixed factors (task and context),
one random factor (subject), and an untransformably skewed dependent
variable. I might want to add some additional fixed factors (age
group) in future but for now I would like to keep it simple. I guess
this is straightforward for those in the know. Any help at all much
appreciated!

Given that you have frequency data with many zeros, some zero-augmented
count data model might be useful. For example a hurdle model or a
zero-inflated Poisson or negative binomial model. Both often lead to
similar fits but the hurdle model is typically easier to interpret. An
overview using the "pscl" package is given in
http://www.jstatsoft.org/v27/i08/

This implementation currently does not support random effects though.
But for a start a hurdle() model with sandwich standard errors should be
useful to find out whether this type of model is useful for your data.
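A minimal sketch of that first step, assuming the packages "pscl",
"sandwich" and "lmtest" are installed. The data are simulated stand-ins
(42 subjects, three tasks, two contexts, roughly half zeros); the column
names subject, task, context and count are hypothetical, not taken from
the real data set. The subject-clustered covariance via vcovCL() goes
beyond the plain sandwich estimator mentioned above and is only one
possible way to acknowledge the repeated measurements.

```r
# Sketch only: simulated data standing in for the real study.
library(pscl)      # hurdle()
library(sandwich)  # sandwich(), vcovCL()
library(lmtest)    # coeftest()

set.seed(1)
n_subj <- 42
dat <- expand.grid(subject = factor(seq_len(n_subj)),
                   task    = factor(c("A", "B", "C")),
                   context = factor(c("x", "y")))

# Hypothetical data-generating process: subject-specific intercepts,
# about half the observations zero, positives from a zero-truncated Poisson.
b      <- rnorm(n_subj, sd = 0.3)
lambda <- exp(0.8 + b[as.integer(dat$subject)] + 0.4 * (dat$task == "C"))
is_pos <- rbinom(nrow(dat), size = 1, prob = 0.5)
# Inverse-CDF draw restricted to values >= 1 (zero-truncated Poisson).
pos    <- qpois(runif(nrow(dat), min = dpois(0, lambda), max = 1), lambda)
dat$count <- as.integer(is_pos * pos)

# Hurdle model: binomial part for zero vs. positive,
# truncated Poisson part for the positive counts.
fit <- hurdle(count ~ task + context, data = dat, dist = "poisson")

# Coefficient tests with sandwich (robust) standard errors ...
robust <- coeftest(fit, vcov. = sandwich)
# ... and, as an assumption of this sketch, clustered by subject.
clustered <- coeftest(fit, vcov. = vcovCL(fit, cluster = dat$subject))

print(robust)
print(clustered)
```

Replacing dist = "poisson" with dist = "negbin" gives the negative
binomial variant; comparing the two fits (e.g. with AIC()) is one way to
judge whether overdispersion in the positive counts matters.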

If so, you might also want to have a look at the "gamlss" package that
supports a somewhat different implementation of ZIP models but has
random effects. See http://www.jstatsoft.org/v23/i07/
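A rough sketch of what such a fit could look like, assuming the "gamlss"
package is installed. The data are again simulated and the column names
(subject, task, context, count) are hypothetical. The random intercept is
specified with random(), which is one of several ways gamlss offers for
random effects; whether it is the best choice for the real data is not
something this sketch can settle.

```r
# Sketch only: zero-inflated Poisson with a random subject intercept.
library(gamlss)  # attaches gamlss.dist, which provides the ZIP family

set.seed(2)
n_subj <- 42
dat <- expand.grid(subject = factor(seq_len(n_subj)),
                   task    = factor(c("A", "B", "C")),
                   context = factor(c("x", "y")))

# Hypothetical data-generating process: Poisson counts with
# subject-specific intercepts plus about 40% extra zeros.
b          <- rnorm(n_subj, sd = 0.4)
lambda     <- exp(1 + b[as.integer(dat$subject)] + 0.4 * (dat$task == "C"))
extra_zero <- rbinom(nrow(dat), size = 1, prob = 0.4)
dat$count  <- as.integer((1 - extra_zero) * rpois(nrow(dat), lambda))

# In the ZIP family, mu is the Poisson mean and sigma is the
# probability of an extra zero (kept constant here).
fit_zip <- gamlss(count ~ task + context + random(subject),
                  sigma.formula = ~ 1,
                  family = ZIP, data = dat,
                  control = gamlss.control(trace = FALSE))

print(fit_zip)
```

Unlike the hurdle model, the zeros here come from two sources (the extra
zero component and the Poisson part itself), which is one reason the
interpretation differs between the two model types.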


hth,
Z

Cheers,

Ben

--
Dr. Ben Kenward
Department of Psychology, Uppsala University, Sweden
http://www.benkenward.com

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


