Re: [R] Re: Re: Find Closest 5 Cases?

Chuck Cleland Fri, 13 Feb 2004 14:04:18 -0800

[EMAIL PROTECTED] wrote:

I'm doing this as a form of missing value analysis. Approximately 30% of the cases are missing data for one variable. To impute values for those cases, I'd like to match each case that is missing the variable against all other cases, pick the closest matches, and then take an average of their values to infill.

I realize there are many methods for imputing data. I'm not well versed in any in particular (except regression and cluster analysis). That said, given that I have an extensive data set already with most variables populated, I can find the closest observations in N-dimensional space and impute the value that way - by focusing on the best matches.

If there are any other thoughts on how to do this (relatively easily), I'm open to suggestions and being educated.

You might have a look at impute.knn() in the impute package on CRAN.

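library(impute)  # provides impute.knn()
# simulate 50000 cases on 20 binary variables
mymat <- matrix(rbinom(50000*20, 1, .5), ncol=20)
# set 30% of the cases to missing on variable 5
mymat[sample(50000, 50000*.30),5] <- NA
summary(mymat)
# impute from the 5 nearest neighbours and summarise the completed data
summary(impute.knn(mymat, k=5)$data)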
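
For intuition, the "closest 5 cases" idea from the question can also be sketched in base R without any package. This is only a rough illustration, not how impute.knn() works internally: knn_mean_impute() is a made-up helper name, it assumes all columns other than the target are numeric and complete, and it uses plain Euclidean distance on unscaled variables. It loops over the incomplete cases, so it is meant for small data.

```r
# Hypothetical helper (not part of any package): fill NAs in column
# `target` with the mean of that column over the k complete cases that
# are closest, in Euclidean distance, on the remaining columns.
# Assumes the remaining columns contain no missing values.
knn_mean_impute <- function(x, target, k = 5) {
  miss   <- which(is.na(x[, target]))    # cases to fill in
  donors <- which(!is.na(x[, target]))   # cases with an observed value
  others <- x[, -target, drop = FALSE]   # variables used for matching
  for (i in miss) {
    # squared Euclidean distance from case i to every donor
    d <- rowSums(sweep(others[donors, , drop = FALSE], 2, others[i, ])^2)
    nearest <- donors[order(d)[seq_len(min(k, length(donors)))]]
    x[i, target] <- mean(x[nearest, target])
  }
  x
}

# Small simulated example: 200 cases, 4 variables, 30% missing on variable 2
set.seed(1)
small <- matrix(rnorm(200 * 4), ncol = 4)
small[sample(200, 60), 2] <- NA
filled <- knn_mean_impute(small, target = 2, k = 5)
summary(filled[, 2])
```

If the variables are on very different scales, standardising them first (for example with scale()) is probably sensible, since otherwise the variable with the largest spread dominates the distance.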

hope this helps,

Chuck Cleland

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
