Hi all,

Here is a (maybe) simple question, as I am new to imputation methods. My question does not deal with very sharp statistical techniques; rather, it concerns the core sense of the method.

I do studies with families, and I usually get data from fathers, mothers and children. Most often, data from fathers are missing because of divorced families or fathers who don't want to complete questionnaires. In other words, when data from a father are missing, all of his answers are missing, not only a few items. Now imagine a simple design with only one wave of data collected. If I want to use data from fathers, mothers and children together, I have to limit my sample to complete families, which amounts to listwise deletion. But if I use an imputation method in order to enhance the sample size, I am likely to generate data for fathers from nothing, because most often they don't answer the questionnaire AT ALL.

How can that kind of data be described? Are they MAR or MNAR? In this case, is imputation required, and does it work reasonably well? Are there special imputation methods for this kind of situation that are easily available? Does anyone have a reference on this topic?

Any help is appreciated.

Julien

BOIS Julien
Laboratoire Sport et Environnement Social
Université Joseph Fourier
38400 Saint Martin d'Hères
FRANCE
00 (33) (0)4 76 63 50 97
mailto : [email protected]
Web site : http://www.ujf-grenoble.fr/ufraps/Recherche/SENS/Membres/Page_Bois.htm

From newgardc <@t> ohsu.edu Thu Aug 21 13:23:53 2003
From: newgardc <@t> ohsu.edu (Craig Newgard)
Date: Sun Jun 26 08:25:00 2005
Subject: IMPUTE: Re: MNAR or MAR ?
Message-ID: <[email protected]>

Julien,

What you describe does not sound like it would fit the required MAR or MCAR assumptions.
When you have reason to suspect that one group within a sample (e.g., fathers) is different from other groups within the sample (e.g., mothers, children), you may think about performing MI separately on the groups and then recombining the data. The ability to do this is at least partially contingent on having sufficient data within each group to build an MI model.

Craig

Craig D. Newgard, MD, MPH
Assistant Professor
Department of Emergency Medicine
Department of Public Health & Preventative Medicine
Oregon Health & Science University
3181 Sam Jackson Park Road
Mail Code CR-114
Portland, OR 97201-3098
(503) 494-1668 (Office)
(503) 494-4640 (Fax)
[email protected]

From DMcLaughlin <@t> air.org Thu Aug 21 15:16:16 2003
From: DMcLaughlin <@t> air.org (McLaughlin, Don)
Date: Sun Jun 26 08:25:00 2005
Subject: IMPUTE: Re: MNAR or MAR ?
Message-ID: <[email protected]>

It would seem appropriate to consider the family as the responding unit, with subvectors of father, mother, and child responses. If there are relations between the responses of the different members of the same family, then mother and child responses should be useful in imputing father variables.

Don McLaughlin
Chief Scientist
American Institutes for Research
(650) 493-3550

From zaslavsk <@t> hcp.med.harvard.edu Fri Aug 22 16:13:20 2003
From: zaslavsk <@t> hcp.med.harvard.edu (Alan Zaslavsky)
Date: Sun Jun 26 08:25:00 2005
Subject: IMPUTE: Re: MNAR or MAR ?
Message-ID: <[email protected]>

> But if I use an imputation method in order to enhance the sample size, I am
> likely to generate data for fathers from nothing because most often they don't
> answer AT ALL to the questionnaire.
> How that kind of data can be described ? Are they MAR or MNAR ?

Just one point that should be made clear: the MAR assumption in itself is never verifiable or falsifiable from the data alone without some additional identifying assumptions. For example, if you assume that the complete data are normally distributed, or have linear relationships, or are Markovian, you might attribute deviations from those assumptions in the observed data to an MNAR missing-data process. However, you would have to be willing to act as if you were pretty sure of the complete-data modeling assumptions to proceed in this way. Otherwise, the MAR assumption is by definition only about aspects of the process that cannot be observed.

In your example, as another correspondent pointed out, the unit is the family, and you have much information in the spouse's and children's responses to help with the imputation. Also, if you can modify the complete-data analysis to accommodate this type of missingness, you might save yourself from having to do the imputation.
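
The suggestion in this thread to treat the family as the unit, and to use mother and child responses when imputing father variables, can be sketched roughly as follows. This is only an illustration in Python on simulated data, not the method of any correspondent. The variable names, the simulated scores, the missingness rule (MAR by construction, because it depends only on the observed mother score), the use of the mother/child average as the single predictor, and the choice of bootstrap-based regression imputation pooled with Rubin's rules are all assumptions made for the example.

```python
import random
import statistics


def simulate_families(n, rng):
    """Hypothetical wide-format data: one record per family."""
    families = []
    for _ in range(n):
        shared = rng.gauss(0, 1)  # family-level factor shared by all members
        mother = shared + rng.gauss(0, 1)
        child = shared + rng.gauss(0, 1)
        father = shared + rng.gauss(0, 1)
        # Assumed mechanism: missingness depends only on the observed
        # mother score, so the data are MAR by construction.
        p_missing = 0.6 if mother < 0 else 0.2
        if rng.random() < p_missing:
            father = None  # the whole father block is missing
        families.append({"mother": mother, "child": child, "father": father})
    return families


def fit_line(xs, ys):
    """Least-squares intercept, slope and residual SD for y = a + b*x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return intercept, slope, sd


def impute_once(families, rng):
    """One imputed copy: bootstrap the complete families, refit, draw with noise."""
    complete = [f for f in families if f["father"] is not None]
    boot = [rng.choice(complete) for _ in complete]
    xs = [(f["mother"] + f["child"]) / 2 for f in boot]
    ys = [f["father"] for f in boot]
    a, b, sd = fit_line(xs, ys)
    imputed = []
    for f in families:
        if f["father"] is None:
            x = (f["mother"] + f["child"]) / 2
            imputed.append(a + b * x + rng.gauss(0, sd))
        else:
            imputed.append(f["father"])
    return imputed


def pooled_father_mean(families, m, rng):
    """Rubin's rules for the mean father score over m imputed data sets."""
    estimates, variances = [], []
    for _ in range(m):
        values = impute_once(families, rng)
        estimates.append(statistics.fmean(values))
        variances.append(statistics.variance(values) / len(values))
    q_bar = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)
    total = within + (1 + 1 / m) * between
    return q_bar, total


rng = random.Random(2003)
families = simulate_families(5000, rng)
observed = [f["father"] for f in families if f["father"] is not None]
complete_case_mean = statistics.fmean(observed)
mi_mean, mi_var = pooled_father_mean(families, m=20, rng=rng)
print(f"complete-case mean: {complete_case_mean:.3f}")
print(f"MI pooled mean:     {mi_mean:.3f} (SE {mi_var ** 0.5:.3f})")
```

In this simulated setting the true mean father score is 0, and listwise deletion over-represents families with high mother scores, so the two printed estimates can be compared directly. Real family data would need a richer imputation model, and the comparison says nothing about whether MAR holds in practice, since that cannot be checked from the observed data alone.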
