This is a very complex problem.  Years ago, I worked on a similar problem of 
imputing cost and payment share data about medical events. In the United 
States, the cost for medical care is often shared by several parties.  In the 
problem I worked on, there were cases in which respondents were able to report 
how much they personally paid but could not report the amounts paid by other 
parties, the total cost, or both.  There were also 
cases in which they knew the total cost but had no payment share data, perhaps 
because negotiations were continuing.  Working with others here, I developed 
algorithms that would impute partial compositional data.  Here are some 
references.  

Marker, D. A., Judkins, D. R., and Winglee, M. (2001).  Large-scale imputation 
for complex surveys, in Survey Nonresponse, Eds. R. M. Groves, D. A. Dillman, 
J. L. Eltinge, and R. J. A. Little.  New York: Wiley.  

England, A., Hubbell, K., Judkins, D., and Ryaboy, S. (1994).  Imputation of 
medical cost and payment data.  Proceedings of the Section on Survey Research 
Methods of the American Statistical Association, 406-411. 

Judkins, D. R., Hubbell, K.A., and England, A.M.  (1993).  The imputation of 
compositional data.  Proceedings of the Section on Survey Research Methods of 
the American Statistical Association, 458-462. 
 

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Drechsler Jörg
Sent: Tuesday, February 05, 2008 11:37 AM
To: [email protected]
Subject: [Impute] Imputation under logical constraints

Hi all,

I have some questions for imputation under logical constraints:

I am multiply imputing missing values for variables from an establishment 
survey using sequential regression.

Now let's start with an easy case: I have to make sure that the condition 
y1<=y2 always holds for my imputed values. In this case, I compute y1 as a 
fraction of y2 for the observed part of the data and impute these fractions 
instead of the raw values. Whenever an imputed fraction falls outside the 
bounds [0%;100%], I simply redraw the value for this observation until the 
condition is fulfilled.
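
A minimal sketch of this redraw step, assuming for illustration only that the 
fractions are drawn from a normal model with a given mean and standard 
deviation (the names draw_fraction and impute_y1, the normal model, and the 
example numbers are hypothetical and not part of the actual imputation setup):

```python
import random

def draw_fraction(mean, sd, rng, max_tries=1000):
    """Redraw from an assumed normal model until the fraction lies in [0, 1]."""
    for _ in range(max_tries):
        f = rng.gauss(mean, sd)
        if 0.0 <= f <= 1.0:
            return f
    raise RuntimeError("no admissible draw; check the imputation model")

def impute_y1(y2, mean, sd, rng):
    """Impute y1 as a fraction of the known y2, so that y1 <= y2 holds."""
    return draw_fraction(mean, sd, rng) * y2

# Illustrative values only.
rng = random.Random(0)
y1 = impute_y1(y2=120.0, mean=0.8, sd=0.3, rng=rng)
```

Redrawing until the value is in bounds amounts to sampling from the model 
truncated to [0, 1].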

Any other ideas how to do that?


It gets more difficult if I have to make sure that the condition 
y.total=y1+y2+y3 is fulfilled. If I just impute y1,y2, and y3 and then simply 
define y.total=y1+y2+y3 I expect that I will overestimate the total number. 
Another idea would be to impute all the variables independently and then 
downweight y1, y2 and y3 to make sure that the above condition is fulfilled. 
But I find neither of the two ideas to be satisfying. 
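
For concreteness, the second idea (adjusting the components to the imputed 
total) might look like the sketch below. The function name rescale_to_total 
and the numbers are made up for illustration, and the sketch ignores that 
employee counts would have to be integers:

```python
def rescale_to_total(parts, total):
    """Scale nonnegative parts proportionally so that they sum to total."""
    s = sum(parts)
    if s <= 0:
        raise ValueError("parts must have a positive sum")
    return [p * total / s for p in parts]

# Independently imputed y1, y2, y3 sum to 120, but the imputed total is 100.
scaled = rescale_to_total([30.0, 50.0, 40.0], 100.0)
```

The rescaling keeps the relative shares of y1, y2 and y3 and changes only 
their levels.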

Are there other ways to do it?


Things start to get really tricky if the above conditions also have to be 
fulfilled for subpopulations. Say y.total is the total number of employees and 
y1, y2, and y3 are the numbers of employees at different levels of 
qualification. What if the question is: How many of these employees are females?

Then I have to make sure that   y.total=y1+y2+y3 
                                y.total.f=y1.f+y2.f+y3.f
                                y.total.f<=y.total
                                y1.f<=y1
                                y2.f<=y2
                                y3.f<=y3
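
A small sketch of a check that a single imputed record satisfies all of these 
conditions at once. The function name constraints_hold and the dictionary 
layout are hypothetical, and the nonnegativity check is an added assumption 
that is not in the list above:

```python
def constraints_hold(y, y_f, tol=1e-9):
    """Check the sum and bound conditions for one imputed record.

    y and y_f map 'total', 'y1', 'y2', 'y3' to the values for all
    employees and for female employees, respectively.
    """
    parts = ("y1", "y2", "y3")
    # y.total = y1+y2+y3 and y.total.f = y1.f+y2.f+y3.f
    sums_ok = (abs(y["total"] - sum(y[k] for k in parts)) <= tol
               and abs(y_f["total"] - sum(y_f[k] for k in parts)) <= tol)
    # y.total.f <= y.total and yk.f <= yk (plus assumed nonnegativity)
    bounds_ok = all(0 <= y_f[k] <= y[k] for k in ("total",) + parts)
    return sums_ok and bounds_ok

# Illustrative values only.
ok = constraints_hold(
    {"total": 100, "y1": 20, "y2": 50, "y3": 30},
    {"total": 45, "y1": 10, "y2": 25, "y3": 10},
)
```

A check like this could be run after every imputation round to flag records 
that need to be redrawn.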


I am in real trouble here and any ideas or comments are highly appreciated.


Joerg

Institute for Employment Research
Nuremberg, Germany


