Jean-Philippe:
I had a similar question in an off-list message exchange with Rod. Here is the 
set of messages:

>>> Rod Little <[email protected]> 11/28/03 12:56PM >>>
Jonathan: MI methods are based on asymptotic theory, and transforming to
something more normal and then back-transforming is a good idea, since it
improves the validity of the asymptotic theory in moderate samples. The
method is still valid asymptotically without the transformation, which is
more a small-sample refinement. Rod
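[Editor's note: the transform-then-pool procedure Rod describes can be sketched in a few lines. This is an illustrative sketch, not code from the thread or from PROC MIANALYZE; the function name is mine, and it assumes m imputed-data correlation estimates from datasets with a common sample size n, using the standard z-scale variance approximation 1/(n - 3) and Rubin's rules on the z scale.]

```python
import math

def pool_correlation_fisher_z(rs, n):
    """Pool m multiply imputed correlation estimates: transform each
    to Fisher's z, apply Rubin's rules on the z scale, then
    back-transform the point estimate and interval endpoints."""
    m = len(rs)
    # Fisher r-to-z transform of each imputed-data correlation.
    zs = [0.5 * math.log((1 + r) / (1 - r)) for r in rs]
    # On the z scale the sampling variance is approximately 1/(n - 3).
    u = 1.0 / (n - 3)                                 # within-imputation variance
    z_bar = sum(zs) / m                               # pooled estimate (z scale)
    b = sum((z - z_bar) ** 2 for z in zs) / (m - 1)   # between-imputation variance
    t = u + (1 + 1 / m) * b                           # total variance (Rubin's rules)
    se = math.sqrt(t)
    lo, hi = z_bar - 1.96 * se, z_bar + 1.96 * se
    # Back-transform the point estimate and CI endpoints to the r scale.
    return math.tanh(z_bar), (math.tanh(lo), math.tanh(hi))
```

As Rod notes, pooling directly on the r scale is still valid asymptotically; the z-scale detour is a small-sample refinement that makes the normal approximation behind Rubin's rules more accurate.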

On Fri, 28 Nov 2003, Jonathan Mohr
wrote:

> Rod: Thanks much for your response; I never thought I'd get advice from
> such an authoritative source!
>
> My confusion regarding this issue began after reading the PROC MIANALYZE
> documentation, which begins, "For some parameters of interest, it is not
> straightforward to compute estimates and associated covariance matrices
> with standard statistical SAS procedures. Examples include correlation
> coefficients between two variables and ratios of variable means. Special
> cases such as these are described in the "Examples of the Complete-Data
> Inferences" section." The example for the correlation coefficient
> suggests that the proper procedure for combining multiply imputed
> bivariate correlations is to use the Fisher r-to-z transformation prior
> to averaging (and then back-transforming). Similarly, I've seen a few
> articles by Don Rubin and his colleagues where a number of procedures
> were proposed for obtaining an accurate p-value for tests on an overall
> model (e.g., F tests for regression models).
>
> Perhaps such strategies are only necessary when one wants to obtain
> p-values or confidence intervals for the statistic. For example, if, for
> a regression analysis, one wants R-squared only for a measure of
> explained variance, then (as you suggest) it is fine to average the
> multiply imputed values of R-squared. Similarly, if, for a structural
> equation model, one wants the model chi squared value only to calculate
> fit indices, then it may be fine to average the multiply imputed chi
> squared values.
>
> I fear that my confusion about this may reveal my lack of statistical
> sophistication. However, I've been in touch with a number of similarly
> naive "users" of multiple imputation who are also confused about this
> issue. I know that I and others would be grateful for any clarification
> on this issue that you or anyone could provide on this topic.
>
> Thanks again for your kind attention!
> Best,
> Jon


>>> "Laurenceau, Jean-Philippe" <[email protected]> 11/28/03 02:35PM >>>
Rod--Would that also be the case even with a simple correlation coefficient?  
If so, why wouldn't something like an r-to-z transformation be involved in 
Rubin's-rules aggregation?  Thanks for your thoughts, J-P

    -----Original Message----- 
    From: [email protected] on behalf of Rod Little 
    Sent: Thu 11/27/2003 9:19 PM 
    To: Jonathan Mohr 
    Cc: [email protected] 
    Subject: IMPUTE: Re: combining multiply imputed estimates of R-squared
    
    

    Jonathan: R-squared is just another estimand, and the correct MI procedure
    is to simply average the values from each MI data set. Rod Little
    
    On Mon, 24 Nov 2003, Jonathan Mohr wrote:
    
    > I am in the midst of using multiple imputation with multiple
    > regression. The literature I've seen focuses on combining regression
    > coefficients and corresponding standard errors. However, I've seen
    > nothing on combining the estimates of R-squared. I would appreciate
    > any guidance or leads that list members can offer. Best, Jon
    >
    > __________________________________
    >
    > Jonathan Mohr, Ph.D.
    > Assistant Professor
    > Department of Psychology
    > Loyola College
    > 4501 North Charles Street
    > Baltimore, MD  21210-2699
    >
    > E-mail: [email protected]
    > Phone: 410-617-2452
    > Fax: 410-617-5341
    > __________________________________
    >
    >
    
    
___________________________________________________________________________________
    Roderick Little
    Richard D. Remington Collegiate Professor                  (734) 936-1003
    Department of Biostatistics                          Fax:  (734) 763-2215
    U-M School of Public Health                        
    M4045 SPH II                            [email protected]
    1420 Washington Hgts                    http://www.sph.umich.edu/~rlittle/
    Ann Arbor, MI 48109-2029
    
    
    
From wv <@t> isd.sdu.dk  Sat Nov 29 14:18:56 2003
From: wv <@t> isd.sdu.dk (Werner Vach)
Date: Sun Jun 26 08:25:01 2005
Subject: IMPUTE: Re: combining multiply imputed estimates of R-squared
References: <pine.wnt.4.21.0311272118090.1112-100...@little-home>
Message-ID: <[email protected]>

Dear Jonathan and Rod,

in principle I agree with Rod.  

However, I think that in using MI one should be aware that measures of
predictive accuracy like R-squared need to be handled with greater care
than regression parameter estimates.

The reason is that measures of predictive accuracy are more sensitive
than regression parameters to the choice of the model we (implicitly)
use in generating the imputations. When applying MI to regression
models (with missing values in the covariates), many people use
procedures which assume that the regression model is correctly
specified (neglecting the general advice that the model used to
generate the MIs should be more general than the model we would like
to analyse). So it will frequently happen that, although the true model
is for example quadratic, we still draw imputations under a linear
model. This is no big problem as long as we look at the regression
parameters, as it does not introduce bias in the estimation (although
confidence intervals will be too optimistic). With respect to measures
of predictive accuracy, however, it does introduce a bias, because the
imputations make the data look like the assumed model.

So whenever one would like to use MI to measure predictive accuracy, I
recommend basing the generation of the MIs on models which are much
more general than the regression model to be analysed, e.g. including
quadratic terms, interactions, and perhaps heterogeneous variances.
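[Editor's note: Werner's suggestion of a deliberately over-general imputation model can be sketched as a toy example. The function and setup below are mine, not from any MI package: a covariate x with missing values is imputed from an outcome y using a regression that includes a quadratic term, even if the eventual analysis model is linear.]

```python
import numpy as np

def impute_x_given_y(x, y, m=5, seed=0):
    """Draw m completed versions of x, imputing its missing entries
    from a regression of x on [1, y, y^2] fit to the complete cases.
    The quadratic term makes the imputation model more general than
    a linear analysis model.  (For brevity this ignores parameter
    uncertainty, so it is 'improper' imputation in Rubin's sense.)"""
    rng = np.random.default_rng(seed)
    obs = ~np.isnan(x)
    design = lambda v: np.column_stack([np.ones_like(v), v, v ** 2])
    X_obs, X_mis = design(y[obs]), design(y[~obs])
    beta, *_ = np.linalg.lstsq(X_obs, x[obs], rcond=None)
    sigma = (x[obs] - X_obs @ beta).std(ddof=3)  # residual scale
    completed = []
    for _ in range(m):
        x_fill = x.copy()
        # Predicted mean for missing cases plus residual noise.
        x_fill[~obs] = X_mis @ beta + rng.normal(0.0, sigma, X_mis.shape[0])
        completed.append(x_fill)
    return completed
```

In practice one would also draw beta and sigma from their posterior (proper imputation) and include interactions and other covariates, in line with Werner's recommendation that the imputation model be strictly more general than the analysis model.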

Best

   Werner


Rod Little wrote:

>Jonathan: R-squared is just another estimand, and the correct MI procedure
>is to simply average the values from each MI data set. Rod Little

