From allison <@t> soc.upenn.edu  Sat Jul  4 10:04:19 2009
From: allison <@t> soc.upenn.edu (Paul Allison)
Date: Sat Jul  4 10:04:28 2009
Subject: [Impute] weird question
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

I completely agree with Frank Harrell that this is not, in general, a good
method. I haven't checked out all his references but, for me, the definitive
refutation was Jones' 1996 paper in the Journal of the American Statistical
Association. 

Nevertheless, I still believe that this method may be useful in two
situations:

1. Data are "missing" because a variable doesn't apply or is undefined for
some fraction of cases.  For example, suppose you have a measure of marital
happiness, dichotomized as high or low, but your sample contains some
unmarried people. Then it is entirely appropriate to have a 3-category
variable with values high, low, and unmarried.

2. The goal is to build a forecasting model, and it is anticipated that a
substantial fraction of the new cases to be forecast will have missing data
on one or more variables. Here, the goal is not to get unbiased estimates of
population parameters but to minimize some function of prediction errors. A
workable forecasting model must have some way of dealing with the cases that
have missing data. Maybe there are better ways, but I've found almost no
literature on this topic (with the exception of Warren Sarle's unpublished
paper). 
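
To make the coding concrete, here is a minimal sketch in Python of the
extra-category approach for a single categorical predictor. The variable
name, the level labels, and the data values are all hypothetical and chosen
only for illustration. It shows the coding step alone and says nothing
about whether estimates from the resulting model are unbiased.

```python
# Illustrative sketch only: names, labels, and values are hypothetical.
EXTRA = "missing"  # label for the added category (could be "unmarried" in case 1)


def add_extra_category(values, label=EXTRA):
    """Replace None with an explicit category label; other values are unchanged."""
    return [label if v is None else v for v in values]


def dummy_code(values, levels):
    """Build one 0/1 indicator per level; each row contains exactly one 1."""
    return [[1 if v == level else 0 for level in levels] for v in values]


# A dichotomized predictor in which None marks cases with no recorded value.
happiness = ["high", None, "low", "high", None]

coded = add_extra_category(happiness)
levels = ["high", "low", EXTRA]
design = dummy_code(coded, levels)

# Every case, including those with no recorded value, gets a usable row.
for raw, row in zip(happiness, design):
    print(raw, row)
```

In the forecasting setting, the point is simply that a new case with no
recorded value still maps to a valid row of indicators, so the fitted model
can produce a prediction for it.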

-----------------------------------------------------------------
Paul D. Allison, Professor
Department of Sociology
University of Pennsylvania
581 McNeil Building
3718 Locust Walk
Philadelphia, PA  19104-6299
215-898-6717
215-573-2081 (fax)
http://www.ssc.upenn.edu/~allison
 

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Maria da
Conceicao-Saraiva
Sent: Saturday, July 04, 2009 9:19 AM
To: [email protected]
Subject: [Impute] weird question




Sorry about this question,

I have been discussing the need for imputation in some of our data with
some of the people I work with. What some of the analysts are doing is just
creating a category for missing values within some variables; they argue
this is enough. It has been hard to convince them that this is not the
best approach. Especially in our income variable, we have about 30%
missing values.
Does anybody know of references discussing this approach of just
creating a category for missing values within a variable?

Maria




~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~
Maria da Conceicao P. Saraiva DDS, MSc, Ph.D
Departamento de Clinica Infantil e Odontologia Social e Preventiva
Faculdade de Odontologia de Ribeirao Preto-Universidade de Sao Paulo

Warning: This message is intended exclusively for its addressee and
contains confidential information. If you are not the addressee, you are
hereby notified that any dissemination, distribution or copying of this
communication is strictly prohibited. If you have received this
communication by mistake, please immediately notify the sender by reply
transmission. Thank you.



_______________________________________________
Impute mailing list
[email protected]
http://lists.utsouthwestern.edu/mailman/listinfo/impute

