IMPUTE: thinking about MAR

Alan Zaslavsky Thu, 15 Jan 2004 06:40:23 -0800

> Subject: IMPUTE: The art of imputation: thinking about MAR
> Date: Wed, 14 Jan 2004 10:09:16 -0600
> From: "Howells, William" <[EMAIL PROTECTED]>
>
> We put a lot of thought into
> building the imputation model and were careful to include other
> covariates that were highly correlated with X2 and all those that we
> want in the analysis model (note: did not include time to death because
> of censoring and not MVN).


This is the feature of the analysis that most concerns me.  Essentially,
leaving time to death out of the imputation model assumes conditional
independence of X2 and time to death given the other variables, which of
course attenuates the estimated relationship between X2 and survival when
you analyze a dataset including both observed and imputed values of X2.
(The impact of this on the estimated effect of the variable of interest,
X1, is not at all obvious, although you might be able to figure it out
from looking at the relationships of X1 and X2, etc.)  I
appreciate that modeling missing data with censored survival data is
nonstandard and therefore messy (perhaps impossible to do "correctly"
with any available standard software), but you are better off including
this crucial relationship with some kind of approximate model than
leaving it out altogether.

To do this using PROC MI, one idea would be to create a few indicators
for survival for 3 months, 6 months, 9 months, etc.  (or whatever is
appropriate to the time scale of your disease process).  Censored
observations have missing indicators for the time points later than time
of censoring.  Then throw these indicators into PROC MI along with your
other variables.  You will not use the imputed values of the missing
indicators, but this is a mechanism for using the censored survival data
within an MVN imputation framework.  (There might be some obvious reason
why this won't work, but try it and see.)  Of course this model is
"wrong", but if higher X2 is actually associated with better prognosis,
then using the outcomes in predicting X2 should help the imputed values
of X2 reflect that association.
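
As a rough sketch of how the indicators might be constructed (written in
Python purely for illustration, not SAS; the function name, the cutoffs
of 3, 6 and 9 months, and the coding of the inputs are all assumptions
on my part), the rule is: 1 if the patient is known to be alive at the
cutoff, 0 if known to have died by then, and missing if censored earlier.

```python
import math

def survival_indicators(time, died, cutoffs=(3, 6, 9)):
    """Build one survival indicator per cutoff for a single patient.

    time    : follow-up time in months (time of death or of censoring)
    died    : True if follow-up ended in death, False if censored
    cutoffs : hypothetical time points; choose ones that suit the disease

    Returns a list with 1.0 (known alive at the cutoff), 0.0 (known dead
    by the cutoff) or NaN (censored before the cutoff, status unknown).
    """
    indicators = []
    for c in cutoffs:
        if time > c or (time == c and not died):
            # Followed past the cutoff, or censored exactly at it: alive.
            indicators.append(1.0)
        elif died:
            # Death occurred at or before the cutoff.
            indicators.append(0.0)
        else:
            # Censored before the cutoff: leave the indicator missing.
            indicators.append(math.nan)
    return indicators

# A patient who died at 7.5 months, and one censored at 4 months.
died_at_7_5 = survival_indicators(7.5, True)
censored_at_4 = survival_indicators(4, False)
```

The resulting columns would then go into the imputation model alongside
X2 and the other covariates; the NaN entries are the "missing
indicators" whose imputed values are not used afterwards.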

> 2.  We are confused by the fact that the patients missing X2 have
> characteristics that are associated with better prognosis yet the
> imputed values of X2 are lower than the observed data, which implies a
> worse prognosis.  Is this something to worry about?  

What the model (as you fit it) is using is the relationship between X2
and the other characteristics, not the relationship between X2 and outcomes.
So it is possible that X2 predicts better prognosis (conditional on
everything else) yet is associated with other characteristics that predict
worse prognosis.

> 3.  Is there any justification for reversing our initial thoughts about
> the MAR assumption and now argue for MNAR, eg., because the patients in
> the missing X2 group have important differences from the fully observed
> patients, there might be unmeasured covariates that independently depend
> on X2mis?  

MNAR can never be demonstrated statistically (by definition) unless the
model is identified by some other unverifiable assumptions, but if the
results under a good MAR model (including survival outcomes) are
scientifically implausible then it is reasonable to want to think hard
about whether the process is MNAR.  What is actually going on when the
patient is discharged without measurement of X2?  Is this because the
recovery was unusually quick, or because the clinician didn't want to
subject a very sick patient to the test?  If you put missingness of X2
into the model as a predictor (instead of X2 itself), is it associated
with survival?  What can the clinicians tell you about what is going on?
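
Before fitting a proper survival model with a missingness indicator as a
covariate, a very crude first look at that question is to compare death
rates per unit of follow-up time between patients with and without X2.
The sketch below (Python, for illustration only; the record layout and
the function name are hypothetical) does just that.  It is an unadjusted
comparison, not a substitute for the model described above.

```python
def crude_rate_ratio(records):
    """Compare crude death rates by missingness of X2.

    records : iterable of (x2, time, died) tuples, where x2 is None when
              X2 was not measured, time is follow-up time, and died is
              True for death and False for censoring.

    Returns (rate_missing, rate_observed, ratio), with rates expressed
    as deaths per unit of follow-up time.  Assumes both groups have some
    follow-up and the observed group has at least one death.
    """
    deaths = {True: 0, False: 0}
    follow_up = {True: 0.0, False: 0.0}
    for x2, time, died in records:
        missing = x2 is None          # the missingness indicator
        deaths[missing] += int(died)
        follow_up[missing] += time
    rate_missing = deaths[True] / follow_up[True]
    rate_observed = deaths[False] / follow_up[False]
    return rate_missing, rate_observed, rate_missing / rate_observed

# Made-up records, only to show the call: (x2, months of follow-up, died)
example = [(None, 10, True), (None, 10, False), (1.2, 5, True), (0.8, 5, True)]
rate_missing, rate_observed, ratio = crude_rate_ratio(example)
```

A ratio far from 1 would suggest that missingness of X2 carries
information about prognosis, which is worth raising with the clinicians.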

These are challenging problems ... good luck with your analysis.
