> Does anyone know whether interaction terms (for categorical variables)
> should be included in the multiple imputation process, versus 
> just being created after the dataset has been imputed?  I have been 
> taking the latter approach out of concern about potential internal 
> inconsistencies in the data (e.g. separately imputed interaction term 
> is imputed as a "1" when one of the individual effect terms is imputed
> as a "0" for the same observation).

Craig, if your complete-data model includes interactions, then your
imputation model should include them as well. Omitting them during
imputation will bias the estimated interaction effect towards zero. The
size of that bias depends on the amount of missing data.

As you noticed, consistency problems can occur, especially if you use a
variable-by-variable imputation approach. There are several routes you could
follow. One is loglinear modelling of the entire joint distribution,
including any interactions you like, as in Schafer's cat approach. Another
is to add the interaction term to the predictor set, and make sure that it
is recomputed as soon as either of its component variables is imputed. The
latter is possible with the 'passive' option in MICE. See pages 12-13 of
http://web.inter.nl.net/users/S.van.Buuren/mi/docs/Manual.pdf for examples.
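
To make the passive idea concrete, here is a rough sketch in plain Python.
It is not MICE syntax and not the MICE algorithm: a simple hot-deck draw
stands in for the real imputation models, and the variable names (a, b, y,
ab), the PREDICTORS layout and the toy data are all made up for
illustration. The only point it shows is that the interaction ab is never
imputed on its own; it is recomputed from a and b every time one of them
changes, so the two can never disagree.

```python
import random

# Predictor sets, loosely analogous to a predictor matrix (hypothetical layout).
# The interaction "ab" is used as a predictor for y but is never a target.
PREDICTORS = {"a": ["b", "y"], "b": ["a", "y"], "y": ["a", "b", "ab"]}


def refresh_interaction(row):
    """Passive step: derive the interaction from the current main effects."""
    row["ab"] = row["a"] * row["b"]


def impute_passive(rows, n_iter=5, seed=0):
    """Chained hot-deck imputation of binary a, b, y (None marks missing)."""
    rng = random.Random(seed)
    data = [dict(r) for r in rows]
    missing = {v: [i for i, r in enumerate(rows) if r[v] is None]
               for v in PREDICTORS}
    observed = {v: [r[v] for r in rows if r[v] is not None]
                for v in PREDICTORS}

    # Starting values: random draws from the observed values of each variable.
    for v, idx in missing.items():
        for i in idx:
            data[i][v] = rng.choice(observed[v])
    for r in data:
        refresh_interaction(r)

    for _ in range(n_iter):
        for target, preds in PREDICTORS.items():
            for i in missing[target]:
                # Donors: rows with the target observed that currently
                # match row i on every predictor of the target.
                donors = [
                    rows[j][target]
                    for j in range(len(rows))
                    if rows[j][target] is not None
                    and all(data[j][p] == data[i][p] for p in preds)
                ]
                # Fall back to all observed values if no donor matches.
                data[i][target] = rng.choice(donors or observed[target])
                # Recompute the interaction right after the new draw.
                refresh_interaction(data[i])
    return data


# Toy data, invented for this sketch.
rows = [
    {"a": 1, "b": 1, "y": 1},
    {"a": 1, "b": 0, "y": 0},
    {"a": 0, "b": 1, "y": 0},
    {"a": 0, "b": 0, "y": 0},
    {"a": None, "b": 1, "y": 1},
    {"a": 1, "b": None, "y": 0},
    {"a": None, "b": None, "y": 1},
    {"a": 0, "b": 1, "y": None},
]
completed = impute_passive(rows)
```

For proper multiple imputation you would repeat this with different seeds
and with real imputation models; the sketch only covers the bookkeeping
that keeps the interaction consistent with its components.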
Best,
Stef van Buuren.

