Yes. This makes sense. If the variable used to impute missing data is already in the final model, using it to improve imputation of another covariate in the final model will probably add little (as also suggested by Alan Zazlavsky). (It does keep the case with the missing data in the final analysis and may help simply by increasing the effective N.) But if the accessory variable isn't in the final model, either because of collinearity and a need to limit design df (as in my straw model) or because of known biases in the variable as in your example below, then including this information in the imputation may have an important impact on the final analysis.
If you still have the data set that you refer to below, it might be instructive to present it as a demonstration of the potential benefits of using auxiliary covariates in imputation, even if there is no market for the underlying analysis itself.

Larry Hunsicker

From: David Judkins [mailto:david_judk...@abtassoc.com]
Sent: Tuesday, April 16, 2013 8:36 AM
To: Hunsicker, Lawrence
Cc: IMPUTE@LISTSERV.IT.NORTHWESTERN.EDU; paulvonhippel.utaus...@gmail.com
Subject: Re: "Accessory" variables in imputation

I think that the impact on the variance of target parameters of using a class of variables in the imputation will be stronger for the class of adjunct variables than for the class of causally prior covariates in the target model. Parallel or alternate outcomes are particularly good examples of this. People who favor nesting variables with nonresponse within flags for missingness as an alternative to imputation fail to realize these gains in precision and possibly in bias reduction. (Obviously, they cannot include parallel outcomes in their analytic models.) It harkens back to one of the central themes in the debate between imputation and ANCOVA. The imputer frequently has access to a richer set of auxiliary information than does the downstream analyst. If we are shy about using that information in the imputation, then we have surrendered most of the advantage of imputation over the alternatives.

To give an example from my own work, I had a longitudinal sample of 8th graders with parent interviews for the fall of the normative freshman year of college. Parent nonresponse was high with the 4.5-year gap. The primary outcome of interest was college admission. We matched students to administrative datasets about college going. We could not report the administrative data directly because of known biases (e.g., no coverage of children from families who do not require financial aid).
Using the match status as an adjunct variable in imputation of the parent responses, however, had a huge impact on the final estimates. In addition to strong variance reduction, we also discovered that parent nonresponse was strongly nonignorable. Those whose children did not go to college were far less likely to respond to the survey. I would send you a reference but unfortunately the evaluation was cancelled without a report.

--Dave Judkins
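The mechanism described in this thread can be illustrated with a small simulation. This is only a sketch under invented assumptions, not the analysis from the cancelled evaluation: the attendance rate, match rates, and response rates are made up, the helper names `simulate` and `imputed_mean` are hypothetical, and response is assumed to depend only on the match status (so the data are missing at random given the auxiliary variable, but not without it). The auxiliary variable is deliberately a biased proxy that could not be reported directly.

```python
import numpy as np

def simulate(n, rng):
    """Draw a toy population; every rate below is an illustrative assumption."""
    y = rng.random(n) < 0.6                      # true outcome (e.g. attended college)
    # Biased administrative match: covers many attenders, almost no non-attenders.
    a = rng.random(n) < np.where(y, 0.70, 0.05)
    # Assumed response mechanism: depends on match status only.
    r = rng.random(n) < np.where(a, 0.85, 0.45)
    return y.astype(float), a.astype(int), r

def imputed_mean(y, r, cells, rng):
    """Impute missing y within each cell from that cell's respondent rate."""
    completed = y.copy()
    for c in np.unique(cells):
        donors = r & (cells == c)                # respondents in this cell
        missing = ~r & (cells == c)              # nonrespondents in this cell
        rate = y[donors].mean()                  # assumes each cell has donors
        completed[missing] = rng.random(missing.sum()) < rate
    return completed.mean()

rng = np.random.default_rng(0)
y, a, r = simulate(50_000, rng)

truth = y.mean()                                        # known only in simulation
naive = imputed_mean(y, r, np.zeros_like(a), rng)       # single cell: ignores match status
with_aux = imputed_mean(y, r, a, rng)                   # cells defined by match status

print(f"truth={truth:.3f}  naive={naive:.3f}  with_aux={with_aux:.3f}")
```

Under these assumptions, imputing without the match status reproduces the respondents' inflated rate, while imputing within match-status cells recovers the population rate even though the match variable itself never appears in the final estimate. A real application would use multiple imputation with proper draws rather than this single stochastic pass.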