Paul: the rate of convergence is directly tied to the largest eigenvalue of the fraction-of-missing-information matrix, as discussed in the original EM paper (Dempster, Laird and Rubin 1977), or see Chapter 8 of my book with Rubin. Slow linear convergence when the fraction of missing information is large is a known problem with EM, and it can then be hard to determine whether the algorithm has actually converged. When feasible, switching to a faster algorithm like Fisher scoring or Newton-Raphson once close to convergence seems sensible. Ways of speeding up EM include ECM, useful when the M step is iterative, and PX-EM, which is clever but seems available only in special problems.
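[Editor's note: the sketch below is an illustration added here, not part of the original exchange. It implements EM for the classic multinomial example from Dempster, Laird and Rubin (1977): observed counts (125, 18, 20, 34) with cell probabilities (1/2 + t/4, (1-t)/4, (1-t)/4, t/4), where the first cell is split into two latent cells. The ratio of successive errors settles near the largest eigenvalue of the fraction of missing information, about 0.13 in this example, showing the linear rate Rod describes.]

    # EM for the multinomial example in Dempster, Laird & Rubin (1977).
    y1, y2, y3, y4 = 125, 18, 20, 34

    def em_step(t):
        # E step: expected count in the latent "t/4" part of the first cell
        x2 = y1 * (t / 4) / (1 / 2 + t / 4)
        # M step: complete-data MLE of t
        return (x2 + y4) / (x2 + y2 + y3 + y4)

    # Run long enough to treat the result as the fixed point t*
    t_star = 0.5
    for _ in range(200):
        t_star = em_step(t_star)

    # Track the error ratio |t_{k+1} - t*| / |t_k - t*|, which converges
    # to the largest eigenvalue of the fraction of missing information
    t = 0.5
    for k in range(8):
        t_new = em_step(t)
        rate = abs(t_new - t_star) / abs(t - t_star)
        print(f"iter {k + 1}: theta = {t_new:.6f}, error ratio = {rate:.4f}")
        t = t_new

[With a linear rate L, each iteration shrinks the error by a factor of L, so reducing it by 10^-6 takes roughly -6/log10(L) iterations: about 7 here (L near 0.13), but around 270 when L = 0.95. That is why EM crawls when the fraction of missing information is large.]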
Ken Lange's books on numerical analysis and optimization are useful sources on this topic.

Best, Rod

Rod Little, [email protected]
Richard D. Remington Distinguished University Professor of Biostatistics
University of Michigan
Department of Biostatistics, M4071 SPH II, 1415 Washington Heights, Ann Arbor MI 48109
Survey Research Center, Room 4064 ISR, 426 Thompson St, Ann Arbor MI 48106

On Mon, Mar 9, 2020 at 4:37 PM Paul von Hippel <[email protected]> wrote:

> I am thinking about ways to make MI faster. One bottleneck is the EM
> algorithm commonly used to initialize the MCMC chain. I have a few
> questions:
>
> 1. Is there guidance regarding how many iterations are typically needed
> for convergence? I have heard that the number of iterations increases
> with the fraction of missing information, but I haven't seen the
> relationship quantified.
> 2. Are the EM algorithms implemented in popular MI software the fastest
> versions?
> 3. Is there any justification for the convergence criteria used by the
> EM algorithms implemented in popular software? Are they unnecessarily
> strict?
>
> Best wishes,
> Paul von Hippel
