Re: [R] EM for missing data

Tal Galili Sat, 21 Jul 2012 08:47:49 -0700

Hello Ya.

I am no expert, so I am eager to read suggestions from other people on the
mailing list.  But here are a few pointers I am (somewhat) sure of:

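You can try using this package:
http://cran.r-project.org/web/packages/imputation/imputation.pdf
and use something like kNNImpute.  Strictly speaking, kNN imputation is not
an EM algorithm, but it also fills in the missing values and gives you a
single completed data set.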
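
To make the kNN idea concrete, here is a minimal base R sketch.  It is not
the code of the imputation package: the function name knn_impute, the
distance (root mean squared difference over the columns two rows share) and
the plain average of the neighbours are all assumptions made for
illustration.

```r
# Hypothetical helper, not part of any package: fill each missing cell with
# the mean of that column over the k nearest rows that have it observed.
knn_impute <- function(x, k = 3) {
  x <- as.matrix(x)
  out <- x
  for (i in which(rowSums(is.na(x)) > 0)) {
    miss <- is.na(x[i, ])
    # Distance from row i to every row, using only columns observed in both.
    d <- apply(x, 1, function(r) {
      shared <- !miss & !is.na(r)
      if (!any(shared)) return(Inf)
      sqrt(mean((x[i, shared] - r[shared])^2))
    })
    d[i] <- Inf  # a row is never its own neighbour
    for (j in which(miss)) {
      cand <- which(!is.na(x[, j]) & is.finite(d))
      nn <- cand[order(d[cand])][seq_len(min(k, length(cand)))]
      # If no usable neighbour exists, the cell is left as NA.
      if (length(nn) > 0) out[i, j] <- mean(x[nn, j])
    }
  }
  out
}

# Toy data: row 2 is missing its third value.
m <- rbind(c(1.0, 2.0, 3.0),
           c(1.1, 2.1, NA),
           c(5.0, 6.0, 7.0),
           c(0.9, 1.9, 2.9))
m_filled <- knn_impute(m, k = 2)
```

Here the two nearest rows to row 2 are rows 1 and 4, so the missing cell is
filled with the average of their third values.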

In any event, an imputation based on EM also rests on some assumption about
the underlying distribution of the data (observed and missing).  From
what I see here:
http://www.youtube.com/watch?v=xEkJxl6mmQ0
it seems that the EM in SPSS assumes a (multivariate?!) normal distribution
of the data, which is a stronger assumption than what kNN uses.  Also,
the package I linked to has a cross-validation (CV) option to check how
stable the imputation process is.
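
For reference, EM imputation under a multivariate normal model can be
sketched in base R roughly as follows.  This is an illustration only, not
the SPSS implementation or any package's code: the function name em_impute,
the mean-imputed starting values and the convergence rule are assumptions,
and the sketch requires the covariance of the observed columns to be
invertible in every row.

```r
# Hypothetical helper: EM for a multivariate normal with missing values.
# Returns the data with missing cells replaced by their conditional means.
em_impute <- function(x, max_iter = 100, tol = 1e-6) {
  x <- as.matrix(x)
  n <- nrow(x); p <- ncol(x)
  miss <- is.na(x)
  mu <- colMeans(x, na.rm = TRUE)
  filled <- x
  for (j in seq_len(p)) filled[miss[, j], j] <- mu[j]  # start: mean imputation
  sigma <- cov(filled)
  for (iter in seq_len(max_iter)) {
    s_extra <- matrix(0, p, p)  # summed conditional covariances (E-step)
    for (i in seq_len(n)) {
      m <- miss[i, ]
      if (!any(m)) next
      if (all(m)) {             # nothing observed: fall back to the mean
        filled[i, ] <- mu
        s_extra <- s_extra + sigma
        next
      }
      o <- !m
      # Regression of the missing columns on the observed ones.
      b <- sigma[m, o, drop = FALSE] %*% solve(sigma[o, o, drop = FALSE])
      filled[i, m] <- mu[m] + b %*% (filled[i, o] - mu[o])
      s_extra[m, m] <- s_extra[m, m] +
        sigma[m, m, drop = FALSE] - b %*% sigma[o, m, drop = FALSE]
    }
    # M-step: update the mean and the (maximum likelihood) covariance.
    mu_new <- colMeans(filled)
    centered <- sweep(filled, 2, mu_new)
    sigma_new <- (crossprod(centered) + s_extra) / n
    delta <- max(abs(mu_new - mu), abs(sigma_new - sigma))
    mu <- mu_new
    sigma <- sigma_new
    if (delta < tol) break
  }
  list(completed = filled, mu = mu, sigma = sigma, iterations = iter)
}

# Simulated example: two correlated columns, ten values of x2 removed.
set.seed(1)
x1 <- rnorm(50)
dat <- cbind(x1 = x1, x2 = 0.8 * x1 + rnorm(50, sd = 0.5))
dat[sample(50, 10), "x2"] <- NA
res <- em_impute(dat)
```

res$completed is then a single filled-in data set.  Note that the filled-in
values are conditional means, so an analysis run on this data set as if it
were fully observed will tend to understate the uncertainty.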

If you are looking for more options, just google R+imputation.  There are
numerous packages and functions for this.

Good luck,
Tal

----------------Contact Details:-------------------------------------------------------
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
----------------------------------------------------------------------------------------------

On Sat, Jul 21, 2012 at 2:55 PM, ya <xinxi...@163.com> wrote:

> Hi list,
>
> I am wondering if there is a way to use the EM algorithm to handle missing
> data and get a completed data set in R?
>
> I usually do this in SPSS because EM in SPSS "fills in" the estimated
> values for the missing data, and the completed data set can then be saved
> and used for further analysis. But I have not found a way to get a
> completed data set like this in R or SAS. With Amelia or MICE, the data
> set with missing values is imputed several times, and the resulting
> imputed data sets are not combined. I understand that parameter estimation
> can still be done by combining the estimates from each imputed data set,
> but it would be more convenient to have a single combined data set for
> some analyses, for example ANOVA with IVs that have more than two
> categories. In this case, the only way to get the main effect of the whole
> IV is to estimate parameters in a single data set (as far as I know). If
> the separate imputed data sets are used, the main effects shown in the
> results are for each category of the IV, respectively. I figure that
> readers and reviewers would sometimes like to see how big the effect is
> for the whole IV, rather than the effect of each category of that IV.
>
> This is one of the reasons I cannot fully move to R from SPSS. Any
> suggestions?
>
> Thank you very much.
>
>
>
>
> ya
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
