Here is my suggestion.
Let P_i denote the true proportion in the ith study and p_i the corresponding
observed proportion based on a sample of size n_i. Then we know that p_i is an
unbiased estimate of P_i and if n_i is sufficiently large, we know that p_i is
approximately normally distributed as long as P_i is not too close to 0 or 1.
Moreover, we can estimate the sampling variance of p_i with p_i(1-p_i)/n_i.
Alternatively, we can use the logit transformation, given by ln[p_i/(1-p_i)],
whose distribution is approximately normal and whose sampling variance is
closely approximated by 1/( n_i p_i (1-p_i) ).
So, let
y_i = p_i with the corresponding sampling variance v_i = p_i(1-p_i)/n_i
or let
y_i = ln[p_i/(1-p_i)] with the corresponding sampling variance
v_i = 1/( n_i p_i (1-p_i) ).
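As a minimal sketch, the two choices could be computed in R roughly as follows. It uses the four-study smoking data from the original post quoted below; the object names (x, n, p, y.raw, and so on) are only illustrative and do not come from any package.

```r
# Events and sample sizes for the four studies in the original post
x <- c(83, 90, 129, 70)
n <- c(86, 93, 136, 82)
p <- x / n                        # observed proportions p_i

# Option 1: raw proportions
y.raw <- p
v.raw <- p * (1 - p) / n          # estimated sampling variances

# Option 2: logit-transformed proportions
y.logit <- log(p / (1 - p))
v.logit <- 1 / (n * p * (1 - p))  # approximate sampling variances

print(data.frame(study = 1:4, y.raw, v.raw, y.logit, v.logit))
```

Note that neither variance formula is usable when p_i is exactly 0 or 1; none of the four studies here has that problem.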
With y_i and v_i, you can use standard meta-analytic methodology (if the
observed proportions are close to 0 or 1, I would use the logit transformed
proportions). You can fit the random-effects model, if you want to assume that
the variability among the P_i values is entirely random (and normally
distributed) and you are interested in making inferences about the expected
value of P_i. Or you can try to account for the heterogeneity among the P_i
values by examining the influence of moderators.
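To make the random-effects step concrete, here is a rough base-R sketch on the logit scale. The text above does not name an estimator for the between-study variance; the DerSimonian-Laird moment estimator used here is simply one common choice and is an assumption of this example, as are all object names.

```r
# Same four studies as in the original post
x <- c(83, 90, 129, 70)
n <- c(86, 93, 136, 82)
p <- x / n
y <- log(p / (1 - p))             # logit-transformed proportions
v <- 1 / (n * p * (1 - p))        # approximate sampling variances

k  <- length(y)
w  <- 1 / v                       # inverse-variance weights
mu.fe <- sum(w * y) / sum(w)      # fixed-effect pooled estimate
Q  <- sum(w * (y - mu.fe)^2)      # Cochran's Q statistic

# DerSimonian-Laird estimate of the between-study variance (truncated at 0)
tau2 <- max(0, (Q - (k - 1)) / (sum(w) - sum(w^2) / sum(w)))

# Random-effects pooled estimate and a Wald-type 95% interval
w.re  <- 1 / (v + tau2)
mu.re <- sum(w.re * y) / sum(w.re)
se.re <- sqrt(1 / sum(w.re))
ci    <- mu.re + c(-1, 1) * qnorm(0.975) * se.re

# Back-transform from the logit scale to a proportion
est <- plogis(c(estimate = mu.re, lower = ci[1], upper = ci[2]))
print(est)
```

A moderator could be examined along the same lines with a weighted regression of y on the covariate, using weights 1/(v + tau2), although four studies are far too few for that to be informative.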
You might find a function that I have written useful for this purpose. See:
http://www.wvbauer.com/downloads.html
Alternatively, you could fit a logistic regression model with a random
intercept to these data (i.e., a generalized linear mixed-effects model). In
other words, knowing p_i and n_i for each study, you actually have access to
the raw data (consisting of 0's and 1's). This approach is essentially an
individual patient data meta-analysis. Such a model may or may not contain
any moderators. You can find a discussion of this approach, for example, in:
Whitehead (2002). Meta-analysis of controlled clinical trials. Wiley.
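A sketch of this approach, assuming the lme4 package is installed (the message above does not name a package, so that choice is mine, and the object names are illustrative). The first part only rebuilds the 0/1 data from the counts; the model fit is skipped if lme4 is not available.

```r
# Study-level counts from the original post
dat <- data.frame(study = factor(1:4),
                  x = c(83, 90, 129, 70),    # events
                  n = c(86, 93, 136, 82))    # sample sizes

# Rebuild the raw 0/1 outcomes: x ones and (n - x) zeros per study
raw <- data.frame(
  study = rep(dat$study, times = dat$n),
  event = unlist(Map(function(x, n) rep(c(1, 0), times = c(x, n - x)),
                     dat$x, dat$n))
)

# Logistic regression with a random intercept for study
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(event ~ 1 + (1 | study), data = raw, family = binomial)
  print(summary(fit))
  # The aggregated form should give the same fit:
  # lme4::glmer(cbind(x, n - x) ~ 1 + (1 | study), data = dat, family = binomial)
}
```

A study-level moderator would enter as a fixed effect on the right-hand side of the formula.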
Hope this helps,
--
Wolfgang Viechtbauer
Department of Methodology and Statistics
University of Maastricht, The Netherlands
http://www.wvbauer.com/
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Inman, Brant A.
M.D.
Sent: Tuesday, March 06, 2007 00:56
To: r-help@stat.math.ethz.ch
Cc: Weigand, Stephen D.
Subject: [R] Mixed effects multinomial regression and meta-analysis
R Experts:
I am conducting a meta-analysis where the effect measures to be pooled are
simple proportions. For example, consider these data on smokers from
Fleiss/Levin/Paik's Statistical methods for rates and proportions (2003,
p. 189):
Study    N   Event   P(Event)
1       86      83      0.965
2       93      90      0.968
3      136     129      0.949
4       82      70      0.854
Total  397     372
A test of heterogeneity for a table like this could simply be Pearson's
chi-square test.
--
# Column 1: events, column 2: non-events; one row per study
smoke.data <- matrix(c(83,90,129,70,3,3,7,12), ncol=2, byrow=FALSE)
chisq.test(smoke.data, correct=TRUE)
X-squared = 12.6004, df = 3, p-value = 0.005585
--
Now this test implies that the data are heterogeneous and that pooling might be
inappropriate. This type of analysis could be considered a fixed effects
analysis because it assumes that the 4 studies are all coming from one
underlying population. But what if I wanted to do a mixed effects (fixed +
random) analysis of data like this, possibly adjusting for an important
covariate or two (assuming I had more studies, of course)...how would I go
about doing it? One thought that I had would be to use a mixed effects
multinomial logistic regression model, such as that reported by Hedeker (Stat
Med 2003, 22: 1433), though I don't know if (or where) it is implemented in R.
I am certain there are also other ways...
So, my questions to the R experts are:
1) What method would you use to estimate or account for the between study
variance in a dataset like the one above that would also allow you to adjust
for a variable that might explain the heterogeneity?
2) Is it implemented in R?
Brant Inman
Mayo Clinic
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.