>>> Peter Hannan <[email protected]> 09/23/05 2:05 PM >>>
Paul, try a null model in GLM.
Peter

Does anyone have a neat recipe for estimating descriptive statistics (means and standard deviations) from multiply imputed data using SAS? I've done this a number of different ways, but they all seem like more trouble than they should be. It appears that getting means and standard deviations is substantially harder than getting regression estimates!

Best,
Paul

-----------------------------
Peter J Hannan
Senior Research Fellow
Division of Epidemiology and Community Health, SPH
University of Minnesota
1300 South 2nd St. #300
Minneapolis, MN. 55454-1015
email: [email protected]
voice: 612-624-6542
FAX : 612-624-0315

From Howells_W <@t> bmc.wustl.edu Tue Sep 27 10:35:38 2005
From: Howells_W <@t> bmc.wustl.edu (Howells, William)
Date: Tue Sep 27 10:36:06 2005
Subject: [Impute] the PE statistic with imputed data
Message-ID: <2ada428b6944da4b8f8a2fdf4e60e52a197...@exchange.wusm-pcf.wustl.edu>

I'm interested in calculating what some have referred to as the "proportion explained" statistic from two regression models with imputed data. This statistic comes up in the analysis of indirect effects (or surrogate variables, or mediation effects, depending on the literature, e.g. Freedman and Schatzkin, Am J Epi 1992).

PE = (C - C')/C, where C = the unadjusted effect of some independent variable and C' = the same effect adjusted for the putative mediator. PE quantifies the proportional reduction in the effect of the independent variable due to mediation. If PE = 1, there is complete mediation. If PE = 0, there is no mediation.

Calculation of standard errors for PE is controversial and depends on the outcome, in my case a time-to-event outcome. I'm using the method due to Lin, Fleming, and DeGruttola (Stats in Medicine 1997).
But with 50 imputed datasets. My question is whether I am combining the imputations correctly.

I first impute my 50 datasets, n=600 each. I run the two regression models within each imputed dataset to obtain C and C', and apply the Lin et al. formula to obtain both PE and se(PE). Then I use the usual formulas as implemented in SAS PROC MIANALYZE to obtain the combined PE over the m=50 imputations and the combined variance from the within- and between-imputation variances. All seems well. I think this is the right approach.

The other way of doing it is to first calculate C and C' separately by averaging each over the imputations, and then find PE from these averaged C and C'. Note that mathematically this produces a different result than the above method. For example, suppose m=2 imputations produce (C, C') = (6, 4) and (3, 1). Then PE_1 = (6-4)/6 = 1/3 and PE_2 = (3-1)/3 = 2/3. The first method produces (1/3 + 2/3)/2 = 1/2. The second method produces [(6+3)/2 - (4+1)/2] / [(6+3)/2] = (9/2 - 5/2) / (9/2) = 4/9.

I'm just looking for confirmation that the second method is incorrect. Thanks.

Bill Howells, MS
Wash U Med School, St Louis
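
For readers who want to check the arithmetic in the m=2 example, here is a minimal Python sketch (not SAS, and not PROC MIANALYZE itself). It assumes the "usual formulas" are Rubin's rules, with total variance = within + (1 + 1/m) * between. The function names and the per-imputation variances of 1/100 are hypothetical, chosen only for illustration; the message gives no se(PE) values. The sketch only shows that the two orders of averaging give different numbers; it does not settle which one is appropriate.

```python
from fractions import Fraction


def proportion_explained(c, c_adj):
    """PE = (C - C') / C for one pair of unadjusted and adjusted effects."""
    return (c - c_adj) / c


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and variances with Rubin's rules."""
    m = len(estimates)
    q_bar = sum(estimates) / m                    # pooled point estimate
    within = sum(variances) / m                   # mean within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + Fraction(1, m)) * between
    return q_bar, total


# (C, C') pairs from the m=2 example in the message.
effects = [(Fraction(6), Fraction(4)), (Fraction(3), Fraction(1))]

# Method 1: compute PE within each imputation, then pool the PEs.
pe_per_imputation = [proportion_explained(c, c_adj) for c, c_adj in effects]
# Hypothetical se(PE)^2 values; the message does not report any.
hypothetical_variances = [Fraction(1, 100), Fraction(1, 100)]
pe_method_1, pe_total_variance = pool_rubin(pe_per_imputation,
                                            hypothetical_variances)

# Method 2: average C and C' over imputations first, then compute PE once.
m = len(effects)
c_bar = sum(c for c, _ in effects) / m
c_adj_bar = sum(c_adj for _, c_adj in effects) / m
pe_method_2 = proportion_explained(c_bar, c_adj_bar)

print(pe_method_1, pe_method_2, pe_total_variance)  # prints: 1/2 4/9 7/75
```

Exact fractions are used so the results match the hand calculation (1/2 versus 4/9) without rounding. The two answers differ because PE is a ratio, and the mean of ratios is generally not the ratio of means.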
