--- A Rangel wrote:
> I am hoping to get some expert opinions on a
> relatively simple problem. Suppose that you want to
> combine Pearson r and r-squared values from 20 imputed
> data sets. It seems that standard advice (in small to
> moderate samples) is to first transform r using
> Fisher's (1915) r-to-z transformation. Similarly,
> ln(r-squared) seems to be an appropriate
> transformation.
>
> After combining and back-transforming, it is quite
> possible -- perhaps likely -- that you get
> inconsistent estimates. What I mean by that is that
> squaring the combined Pearson r value can be quite
> different from the combined R-square value that you
> get from back-transforming ln(R-sq).
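To make the comparison concrete, the two pooling rules from the question can be written out as follows. This is only a sketch in my own notation (m imputed datasets, r_j the correlation in dataset j). It assumes a single predictor, so that R-squared in each dataset is just r_j squared, and that all r_j are positive.

```latex
% Fisher-z pooling of r
\hat{r}_{z} = \tanh\!\left( \frac{1}{m} \sum_{j=1}^{m} \operatorname{atanh}(r_j) \right)

% log pooling of R^2, using R_j^2 = r_j^2 (single predictor)
\hat{R}^2_{\ln} = \exp\!\left( \frac{1}{m} \sum_{j=1}^{m} \ln r_j^2 \right)
                = \left( \prod_{j=1}^{m} r_j^2 \right)^{1/m}
\quad\Longrightarrow\quad
\sqrt{\hat{R}^2_{\ln}} = \left( \prod_{j=1}^{m} r_j \right)^{1/m}

% atanh is convex on (0,1): Jensen's inequality, then AM >= GM
\hat{r}_{z} \;\ge\; \frac{1}{m} \sum_{j=1}^{m} r_j
            \;\ge\; \left( \prod_{j=1}^{m} r_j \right)^{1/m}
            = \sqrt{\hat{R}^2_{\ln}}
```

Under these assumptions the two answers coincide only when every r_j is the same, and otherwise the Fisher-z result is the larger one. How far apart they are depends on how much r_j varies across the imputed datasets.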

I would be surprised if the difference were big. The two statistics
are closely related and are computed from the same imputed datasets.
Some difference is to be expected, because you are averaging after
applying two different non-linear transformations, but if both
methods are reasonable then both should give you similar results.

One way to get a feel for this is to run a number of simulations.
Below is some Stata code for one such simulation. In this simulation
there is a systematic difference between the two methods, but only in
the third digit. I would consider that sufficiently small to be
ignorable.

This simulation uses the Stata port of MICE, called ice, by Patrick
Royston, which can be downloaded by typing: ssc install ice . Notice
that Fisher's Z transformation is the arc-hyperbolic tangent (Stata
function atanh) and its inverse is the hyperbolic tangent (Stata
function tanh). Some of the other tricks used are explained in:
http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html and
http://home.fsw.vu.nl/m.buis/wp/discrete.html

Hope this helps,
Maarten

*------------ begin simulation ---------------------
set more off
capture program drop sim
program define sim, rclass
    drop _all

    // draw x and y from a bivariate normal with correlation .5
    matrix C = ( 1, .5 \ ///
                .5,  1 )
    drawnorm x y, n(1000) corr(C)

    // make x missing with a probability that depends on y
    replace x = . if uniform() < invlogit(-1 + y)

    // create 5 imputed datasets, stacked in imp.dta
    // (change the directory to a writable folder on your machine)
    cd h:\temp
    ice x y using imp, m(5) replace
    use imp.dta, clear

    // accumulate the transformed statistics over the imputations
    scalar r = 0
    scalar rsq = 0
    forvalues i = 1/5 {
        corr y x if _mj == `i'
        scalar r = r + atanh(`r(rho)')
        // restrict the regression to the same imputed dataset
        reg y x if _mj == `i'
        scalar rsq = rsq + ln(`e(r2)')
    }

    // back-transform the means and compare on the scale of r
    return scalar diff = sqrt(exp(rsq/5)) - tanh(r/5)
end

simulate diff=r(diff), reps(1000): sim
hist diff
*---------------- end simulation ----------------------

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

_______________________________________________
Impute mailing list
Impute@lists.utsouthwestern.edu
http://lists.utsouthwestern.edu/mailman/listinfo/impute