> Subject: IMPUTE: The art of imputation: thinking about MAR
> Date: Wed, 14 Jan 2004 10:09:16 -0600
> From: "Howells, William" <[EMAIL PROTECTED]>
>
> We put a lot of thought into
> building the imputation model and were careful to include other
> covariates that were highly correlated with X2 and all those that we
> want in the analysis model (note: did not include time to death because
> of censoring and not MVN).
This is the feature of the analysis that most concerns me. Essentially, leaving time to death out of the imputation model assumes conditional independence of X2 and time to death given the other variables, which of course attenuates the relationship when you analyze a dataset including both observed and imputed values of X2. (The impact of this on the estimated effect of the variable of interest, X1, is not at all obvious, although you might be able to figure it out from looking at the relationships of X1 and X2, etc.)

I appreciate that modeling missing data with censored survival data is nonstandard and therefore messy (perhaps impossible to do "correctly" with any available standard software), but you are better off including this crucial relationship with some kind of approximate model than leaving it out altogether.

To do this using PROC MI, one idea would be to create a few indicators for survival to 3 months, 6 months, 9 months, etc. (or whatever is appropriate to the time scale of your disease process). Censored observations have missing indicators for the time points later than the time of censoring. Then throw these indicators into PROC MI. You will not use the imputed values of the missing indicators, but this is a mechanism for using the censored survival data within an MVN imputation framework. (There might be some obvious reason why this won't work, but try it and see.) Of course this model is "wrong", but if higher X2 is actually associated with better prognosis, then using the outcomes in predicting X2 should help the imputations reflect this.

> 2. We are confused by the fact that the patients missing X2 have
> characteristics that are associated with better prognosis yet the
> imputed values of X2 are lower than the observed data, which implies a
> worse prognosis. Is this something to worry about?

What the model (as you fit it) is using is the relationship between X2 and the other characteristics, not the relationship between X2 and outcomes.
So it is possible that X2 predicts better prognosis (conditional on everything else) yet is associated with other characteristics that predict worse prognosis.

> 3. Is there any justification for reversing our initial thoughts about
> the MAR assumption and now argue for MNAR, eg., because the patients in
> the missing X2 group have important differences from the fully observed
> patients, there might be unmeasured covariates that independently depend
> on X2mis?

MNAR can never be demonstrated statistically (by definition) unless the model is identified by some other unverifiable assumptions, but if the results under a good MAR model (including survival outcomes) are scientifically implausible then it is reasonable to want to think hard about whether the process is MNAR.

What is actually going on when the patient is discharged without measurement of X2? Is this because the recovery was unusually quick, or because the clinician didn't want to subject a very sick patient to the test? If you put missingness of X2 into the model as a predictor (instead of X2 itself), is it associated with survival? What can the clinicians tell you about what is going on?

These are challenging problems ... good luck with your analysis.
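
For what it is worth, the survival-indicator idea above can be sketched in a few lines. This is only an illustration, written in Python rather than SAS; the function names, the 3/6/9-month cutoffs, and the assumption that follow-up time is recorded in months with a died/censored flag are all hypothetical and would need adapting to your data. Each patient gets one indicator per cutoff: 1 if known alive at that time, 0 if death was observed by then, and missing if the patient was censored before the cutoff.

```python
def survival_indicators(time, died, cutoffs=(3, 6, 9)):
    """Return one indicator per cutoff for a single patient.

    time    : follow-up time (assumed to be in the same units as cutoffs)
    died    : True if death was observed at `time`, False if censored
    cutoffs : hypothetical landmark times; choose ones that suit the disease
    """
    row = []
    for c in cutoffs:
        if time > c or (time == c and not died):
            row.append(1)      # known to be alive at the cutoff
        elif died:
            row.append(0)      # death observed at or before the cutoff
        else:
            row.append(None)   # censored before the cutoff: status unknown
    return row


def to_sas_field(value):
    """Format an indicator for a text file; '.' is SAS's missing numeric."""
    return "." if value is None else str(value)


if __name__ == "__main__":
    # Made-up (time, died) pairs purely for illustration.
    patients = [(7.5, True), (4.0, False), (12.0, False), (2.0, True)]
    for time, died in patients:
        fields = [to_sas_field(v) for v in survival_indicators(time, died)]
        print(time, died, " ".join(fields))
```

The missing indicators are then left for the imputation procedure to fill in alongside X2; as noted above, their imputed values are not themselves used in the analysis.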
