(Ted Harding) <[EMAIL PROTECTED]> writes:

> On 12-May-04 Rolf Turner wrote:
>> Anne Piotet wrote:
>>
>>> What R functionalities are there to do missing values imputation
>>> (substantial proportion of missing data)? I would prefer to use
>>> maximum likelihood methods; is the EM algorithm implemented? In
>>> which package?
>>
>> The so-called ``EM algorithm'' is ***NOT*** an
>> algorithm. It is a methodology or a unifying concept.
>> It would be impossible to ``implement'' it. (Except
>> possibly by means of some extremely advanced and
>> sophisticated Artificial Intelligence software.)
>
> Do we understand the same thing by "EM Algorithm"?
>
> The one I'm thinking of -- formulated under that name by Dempster,
> Laird and Rubin in 1977 ("Maximum likelihood estimation from incomplete
> data via the EM algorithm", JRSS(B) 39, 1-38) -- is indeed an algorithm
> in exactly the same sense as any iterative search for the maximum of a
> function.
>
> Essentially, in the context of data modelled by an underlying exponential
> family distribution where there is incomplete information about the
> values which have this distribution, it proceeds by
>
>   Start: Choose starting estimates for the parameters of the distribution
>   E: Using the current parameter values, compute the expected values
>      of the sufficient statistics conditional on the observed information
>   M: Solve the maximum-likelihood equations (which are functions of the
>      sufficient statistics) using the expected values computed in (E)
>   If sufficiently converged, stop. Otherwise, make the current parameter
>   values equal to the values estimated in (M) and return to (E).
>
> Algorithm, this, or not????
>
> And where does "extremely advanced and sophisticated Artificial
> Intelligence software" come into it? You can, in some cases, perform
> the above EM algorithm by hand.
>
> Which "EM Algorithm" are you thinking of?
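
For concreteness, here is a minimal sketch of the Start/E/M loop quoted above, written in Python rather than R purely for illustration. The model is an assumption chosen because it is small enough to check by hand and is not taken from the thread: exponentially distributed values, some of which are right-censored (only known to exceed the recorded time). The function name `em_censored_exponential` and the data are hypothetical.

```python
def em_censored_exponential(times, censored, rate=1.0, tol=1e-10, max_iter=1000):
    """EM estimate of an exponential rate from right-censored data.

    times[i] is the recorded value; if censored[i] is True, the true
    value is only known to be larger than times[i].
    Returns (estimated rate, number of iterations used).
    """
    n = len(times)
    for iteration in range(1, max_iter + 1):
        # E step: expected value of the sufficient statistic sum(x) given
        # the observed information. By memorylessness of the exponential,
        # E[X | X > c] = c + 1/rate for a censored value.
        expected_sum = sum(
            t + 1.0 / rate if c else t for t, c in zip(times, censored)
        )
        # M step: the complete-data ML equation gives rate = n / sum(x).
        new_rate = n / expected_sum
        # Stop when the parameter has sufficiently converged.
        if abs(new_rate - rate) < tol:
            return new_rate, iteration
        rate = new_rate
    return rate, max_iter


# Illustrative data: three fully observed values, three censored at 2.0.
times = [0.5, 1.2, 2.0, 2.0, 0.3, 2.0]
censored = [False, False, True, True, False, True]

rate_hat, n_iter = em_censored_exponential(times, censored)

# For this particular model the MLE is also available in closed form:
# (number of uncensored values) / (sum of all recorded times).
closed_form = censored.count(False) / sum(times)
print(rate_hat, closed_form, n_iter)
```

In this toy case the closed-form answer makes EM unnecessary; it is used here only because it lets the iteration be checked against a known value.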

Thanks, Ted :-)

To extend it a bit: one can imagine using approximate solutions for the
two steps (simulation methods to get the expected values, and a similar
range of approaches for the maximization) and so get a general, though
possibly not robust, computational solution for the parametric problem.
Just plug in a formula for the likelihood and the sufficient statistics...

Of course, thousands of papers have been written on these variations
(choice of likelihood, specific implementations of the E and M steps).

best,
-tony

-- 
[EMAIL PROTECTED]
http://www.analytics.washington.edu/
Biomedical and Health Informatics, University of Washington
Biostatistics, SCHARP/HVTN, Fred Hutchinson Cancer Research Center

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
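
The simulation-based E step mentioned in the reply can be sketched as follows, again in Python and again under assumptions that are not from the thread: the same toy model of right-censored exponential data, a hypothetical function name `mc_em_censored_exponential`, and a fixed number of iterations with averaging of the later iterates in place of a convergence test (Monte Carlo noise makes a simple tolerance check unreliable). The M step here is still exact; only the expectation is approximated.

```python
import random


def mc_em_censored_exponential(times, censored, rate=1.0,
                               n_draws=2000, n_iter=60, seed=0):
    """Monte Carlo EM for an exponential rate with right-censored data.

    The E step replaces the exact conditional expectation with an
    average over simulated values of X given X > c.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    n = len(times)
    history = []
    for _ in range(n_iter):
        # Monte Carlo E step: approximate the expected sum of the data.
        expected_sum = 0.0
        for t, c in zip(times, censored):
            if c:
                # Given X > t, the excess X - t is again exponential
                # with the current rate (memorylessness).
                draws = [t + rng.expovariate(rate) for _ in range(n_draws)]
                expected_sum += sum(draws) / n_draws
            else:
                expected_sum += t
        # M step: complete-data ML estimate, rate = n / sum(x).
        rate = n / expected_sum
        history.append(rate)
    # Average the second half of the iterates to damp simulation noise.
    tail = history[n_iter // 2:]
    return sum(tail) / len(tail)


# Illustrative data: three fully observed values, three censored at 2.0.
times = [0.5, 1.2, 2.0, 2.0, 0.3, 2.0]
censored = [False, False, True, True, False, True]

mc_rate = mc_em_censored_exponential(times, censored)
exact_mle = censored.count(False) / sum(times)  # closed form for this model
print(mc_rate, exact_mle)
```

The estimate only agrees with the exact MLE up to simulation error, which is the "possibly not robust" trade-off: more draws per E step reduce the noise at the cost of more computation.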