[I sent a similar message to the SRMS listserv. Apologies to those with dual subscriptions.]
I think that I have an application for the methods that Bob published in 1986, and I am curious to see how much use the methodology has had. If you have used it, please write back and say a little about the experience. Even if you have not, I am interested in other reactions to the potential application.

My application concerns high school graduation status for a sample of youth drawn from old middle school rosters. We tried contacting their parents in the fall following the spring in which these kids would have graduated had they followed the standard track. Parent nonresponse was high, often because we could not find the parents based on the 5-year-old school records. In this day and age of caller ID, we also suspect that many parents appeared to be unlocated simply because they never answered the phone. But the data collection is now complete, so we have to figure out how to use what we have.

Partly in reaction to the response rate, we decided to contact states and districts to get high school graduation status from official records. However, this was no panacea. State record systems often contained no data about the child - at least not under the identifying information given to us by the middle schools. Call these disavowals. Worse, even when they were able to recognize the child, state record systems appear unable to reliably distinguish between a failure to graduate from high school on a normal schedule and a transfer between school systems (interstate moves, public-private transfers, even public-public transfers). So the high school graduation status from state records is best thought of as having two values, yes and maybe, while the status from parent interviews has the values yes and no.

So we cannot use the administrative data to directly fill in high school graduation status. Some sort of imputation appears appropriate. However, the auxiliary data for nonrespondents and nonmatches is very weak.
We got little more than names from the schools. Given this paucity of auxiliary data, an assumption of MAR is little different from an assumption of MCAR. Neither seems reasonable.

We observed that the yes rate in the administrative data is far lower among the parent survey nonrespondents than among the parent survey respondents. It appears that parents were easier to find and more willing to respond to a survey about education if their children persisted in high school. It is also true that the state disavowal rate was higher among youth reported by their parents not yet to have attained high school graduation than among youth reported to have graduated. Perhaps mobility and name changes are more common among families whose children drop out of high school.

So it seems to fit the Fay 1986 framework well. There are two binary substantive variables, v1 and v2. There are two binary response indicators, r1 and r2. Fay's graphic model M3 in his figure 5 might be a fairly reasonable model.

After some more research, I see that several statisticians have extended Fay's ideas since 1986. There are Baker and Laird in 1988, Baker, Rosenberger, and DerSimonian in 1992, Brown and Taesung Park in 1994, Taesung Park in 1998, Paul Green and Taesung Park in 2003, and Boseung Choi and Yousung Park in 2005. What about these? Has anyone applied these methods in a production environment for a federal study?

David Judkins
Senior Statistician
Westat
1650 Research Boulevard
Rockville, MD 20850
(301) 315-5970
[email protected]
