Re: Use of multiple imputations in hypothesis tests other than t-tests

David Crow Wed, 11 Nov 2009 11:37:49 -0800


Dear David-

Although the multiple imputation estimator for coefficients is simply the average of coefficients across imputed data sets, standard errors cannot be treated that way. This is so because, beyond the variability associated with the coefficient estimates in each individual imputed data set (the "within-imputation" component of variance), additional variability arises when estimating *across* the different data sets--the "between-imputation" component of variance.



The formula for calculating standard errors is:


se-MI = sqrt(u-bar-M + ((M+1)/M)*b-M),


where:


se-MI is the standard error of a multiply imputed coefficient;


M is the number of imputed data sets;

u-bar-M is the mean variance across imputed data sets (i.e., (1/M) * sigma(s^^2-i), where "sigma" is a summation, "s" is the standard error, "^^" means squared, and "i" indexes imputed data sets);

and b-M is the variance of coefficient estimates around their mean (i.e., b-M = (1/(M-1)) * sigma((e-i - e-bar-MI)^^2), where e-i is the coefficient estimate for imputed data set i and e-bar-MI is the mean of coefficients across imputed data sets). The factor (M+1)/M that multiplies b-M in the formula is an adjustment for the finite number of imputed data sets; it shrinks toward 1 as M grows.
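
For anyone who would rather script this than work it out by hand, here is a minimal Python sketch of the calculation described above, for a single coefficient. The function name pool_estimates and the example numbers are made up for illustration. It assumes se-MI is the square root of the total variance u-bar-M + ((M+1)/M)*b-M, and it does not cover tests involving several parameters at once.

```python
import math
from statistics import mean, variance


def pool_estimates(estimates, std_errors):
    """Pool one coefficient across M imputed data sets.

    estimates  -- coefficient estimate from each imputed data set (e-i)
    std_errors -- standard error from each imputed data set (s-i)
    Returns (e-bar-MI, se-MI).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations and matching lengths")

    e_bar = mean(estimates)                       # pooled coefficient
    u_bar = mean([se ** 2 for se in std_errors])  # within-imputation variance
    b = variance(estimates)                       # between-imputation variance, divides by M-1
    total_var = u_bar + (m + 1) / m * b           # total variance
    return e_bar, math.sqrt(total_var)


if __name__ == "__main__":
    # Illustrative numbers only, not from any real analysis.
    coef, se = pool_estimates([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(coef, se)
```

Using the sample variance (dividing by M-1) for b-M matches the definition given above; with only a handful of imputations, the between-imputation term can easily dominate the pooled standard error.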

Sorry for the awkward notation. You can get a much prettier version of these equations on pp. 108-109 of:

Raghunathan T.E. (2004). "What do we do with missing data? Some options for analysis of incomplete data", Annual Review of Public Health, 25, 99-117.

I see that others have already suggested implementations in software packages. That's great news: I've always done it manually in a spreadsheet program!



Hope this helps,

David



At 09:38 AM 11/11/2009, David Judkins wrote:

Well, I think this is the first question to the group since list ownership changed. I wonder how many people are signed up now? It hasn't been a very active list for a long time. Anyway, here is my question.
I have a dataset with multiple imputations. It is from a five-arm GRT. One arm is a control and the other four are active. I want to test for variation in mean responses across the four active arms. Proc Mixed will give me a test statistic based on each multiple imputation. But how do I combine these?
One of my colleagues found something in the HLM manual that would suggest that the replicates of test statistics other than t-statistics are averaged with no attention paid to the variability among them. Sound accurate about HLM? Is that the best we can do?
David Judkins
Senior Statistician
Westat
1650 Research Boulevard
Rockville, MD 20850
(301) 315-5970
[email protected]



==============================
David Crow
Associate Director
Survey Research Center
University of California, Riverside
900 University Avenue
1419 Spieth Hall
Riverside, CA  92521
Tel.:  (951) 827-4028
Fax:  (951) 827-4035
Web: survey.ucr.edu
==============================

"It is the mark of an educated mind to rest satisfied with the degree of precision which the nature of the subject admits and not to seek exactness where only an approximation is possible." Aristotle
