Hello! If I understand this listserv correctly, I can email this address
to get help when I am struggling with code. If this is inaccurate, please
let me know, and I will unsubscribe.
I have been struggling with the same error message for a while, and I can't
seem to get past it.
Here is the issue:
I am using a data set that uses the codes -1 through -9 to indicate various
kinds of missing data. I changed all of these to NA, regardless of the cause
of the missing data. I am trying to do propensity score matching with this
data, but it will not calculate the propensity scores, whichever method I
try. I have tried the following methods:
1. Optimal propensity score matching, using the MatchIt library:
m.out <- matchit(assignment ~ totalexp + yrschool + new + cert + age +
                   STratio + percminority + urbanicity + povproblem +
                   numthreats + numbattack + weight,
                 data = data, distance = "logit", method = "optimal",
                 ratio = 1)
2. Nearest neighbor propensity score matching, using the MatchIt library:
mout <- matchit(assignment ~ totalexp + yrschool + new + cert + age +
                  STratio + percminority + urbanicity + povproblem +
                  numthreats + numbattack,
                distance = "logit", replace = TRUE, data = data,
                method = "nearest", m.order = "largest", caliper = 0.10)
3. Just calculating the propensity scores using the glm function:
ps.model <- glm(assignment ~ totalexp + yrschool + new + cert + age +
                  STratio + percminority + urbanicity + povproblem +
                  numthreats + numbattack,
                family = "binomial", data = data)
data$propensityscores <- fitted(ps.model)
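For reference, the recoding of the -1 through -9 codes described above, together with a per-column count of the resulting NAs, can be sketched in base R as follows. This is only an illustration on a made-up data frame (`toy` and its values are hypothetical), and it assumes that none of the variables has legitimate values between -9 and -1.

```r
# Hypothetical toy data; -1 to -9 stand for different kinds of missingness.
toy <- data.frame(
  assignment = c(1, 0, 1, 0),
  age        = c(34, -9, 51, -1),
  totalexp   = c(10, 3, -8, 22)
)

# Assumption: every value from -9 to -1 is a missing-data code.
missing_codes <- -1:-9

# Replace the codes with NA in every column, keeping the data frame shape.
toy[] <- lapply(toy, function(col) {
  col[col %in% missing_codes] <- NA
  col
})

# Count the NAs per column to see which variables are affected.
na_counts <- colSums(is.na(toy))
print(na_counts)
```

Running the same `colSums(is.na(...))` check on the real data before and after imputation shows which variables still contain NAs at each stage.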
In each case, I have tried running the code after having performed zero
imputations, 1 imputation, and 5 imputations. A colleague looked at my
code and assured me that I was doing the imputations correctly. However,
even after performing the imputation, one of the continuous variables still
has NAs. This is the code that I am using for 5 imputations:
library(mice)
# remove the weights
data$weight <- NULL
# perform the imputation
imputed.data <- mice(data, m = 5, diagnostics = FALSE)
# extract a completed data set
imputed.data.final <- complete(imputed.data)
# reinsert the weights
imputed.data.final$weight <- lbdata$weight
# rename the imputed data set "data"
data <- imputed.data.final
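A possible way to see why a variable is left unimputed is to inspect the object returned by `mice()`. The sketch below uses the `nhanes` example data that ships with mice rather than the data from this post, so it only shows where to look. The `loggedEvents` component records variables that mice dropped or altered (for instance constant or collinear ones), and `method` shows which imputation method was assigned to each column. Note also that `complete()` with its default arguments returns only the first of the `m` imputed data sets.

```r
library(mice)

# Example data shipped with mice (not the data from this post).
imp <- mice(nhanes, m = 5, printFlag = FALSE, seed = 123)

# NULL if nothing was flagged; otherwise a data frame describing the
# variables that mice removed or changed before imputing.
print(imp$loggedEvents)

# Imputation method per column; "" means the column was not imputed.
print(imp$method)

# First completed data set (the default for complete()).
completed <- complete(imp, 1)

# Any column still showing NAs here was not (fully) imputed.
remaining_na <- colSums(is.na(completed))
print(remaining_na)
```

If the problem variable appears in `loggedEvents` or has an empty entry in `method`, that would point to the reason it still contains NAs.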
When I perform optimal propensity score matching or nearest neighbor
matching (regardless of how many imputations I perform), I get the
following error:
Error in matchit(assignment ~ totalexp + yrschool + new + cert + age + :
Missing values exist in the data
I tried running these with just two of the categorical covariates, but I
still got this error, even though there is no missing data for those
variables.
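One possible explanation, stated here as an assumption rather than something verified for the MatchIt version in use, is that `matchit()` checks the whole data frame passed as `data` for NAs, not just the variables in the formula. Under that assumption, a workaround can be sketched by keeping only the model variables and the complete rows before matching. The data frame `toy` and the column `unused` are hypothetical.

```r
# Hypothetical toy data: `unused` is not in the model but contains an NA.
toy <- data.frame(
  assignment = c(1, 0, 1, 0, 1),
  totalexp   = c(10, 3, 8, 22, 5),
  age        = c(34, NA, 51, 29, 40),
  unused     = c(NA, 1, 2, 3, 4)
)

ps_formula <- assignment ~ totalexp + age

# Keep only the variables named in the formula.
model_vars <- all.vars(ps_formula)
model_data <- toy[, model_vars]

# Drop rows with an NA in any of those variables.
model_data <- model_data[complete.cases(model_data), ]

print(nrow(model_data))
print(anyNA(model_data))

# The matching call would then use the reduced data, for example:
# m.out <- MatchIt::matchit(ps_formula, data = model_data, method = "nearest")
```

Dropping rows this way is list-wise deletion, so it trades the error for a smaller sample.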
When I perform the glm function to get the propensity scores, I get this
error, indicating that, for some reason, it is reducing the number of rows
in my data set, which makes me think that it is doing list-wise deletion:
Error in `$<-.data.frame`(`*tmp*`, "propensityscores", value =
c(0.116801691392172, :
replacement has 15934 rows, data has 16844
However, this method works if I remove the covariate that has missing data.
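The row mismatch is consistent with `glm()` dropping incomplete rows under the default `na.action` (`na.omit`), so `fitted()` returns fewer values than the data frame has rows. A minimal sketch on simulated data (the data frame `toy` is hypothetical) shows that fitting with `na.action = na.exclude` makes `fitted()` pad the dropped rows with NA, so the assignment back into the data frame succeeds.

```r
set.seed(42)

# Hypothetical toy data with two missing values in one covariate.
toy <- data.frame(
  assignment = rbinom(50, 1, 0.5),
  totalexp   = rnorm(50),
  age        = rnorm(50, mean = 40, sd = 10)
)
toy$age[c(3, 17)] <- NA

# Default behaviour: incomplete rows are dropped, fitted() is shorter.
fit_omit <- glm(assignment ~ totalexp + age, family = binomial,
                data = toy, na.action = na.omit)
n_omit <- length(fitted(fit_omit))

# na.exclude: same model fit, but fitted() is padded with NA.
fit_excl <- glm(assignment ~ totalexp + age, family = binomial,
                data = toy, na.action = na.exclude)
toy$propensityscores <- fitted(fit_excl)

print(n_omit)
print(length(toy$propensityscores))
print(which(is.na(toy$propensityscores)))
```

The rows with a missing covariate still have no propensity score, so this removes the error without solving the underlying missing-data problem.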
So, I guess my question is, how do I get the code to impute for the
variable that it is not imputing? Or, do I just need to chuck this
variable? And, if I just need to chuck this variable, how do I get the
optimal propensity score method to work? Currently it doesn't work even
when I chuck this variable.
Thank you for any help or advice!
Liz
______________________________________________
R-help mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.