Hi,

I decided to go with the mixture model for EM. I have modified my proposal and submitted it to both the GSoC website and the Apache wiki.
Best Regards,
Yifan

2009/4/1 Yifan Wang <heavens...@gmail.com>:
> I will choose Mixture Model for the EM implementation.
>
> Yifan
>
> 2009/4/1 Ted Dunning <ted.dunn...@gmail.com>:
>> Yifan,
>>
>> EM is a highly non-specific term and covers a huge range of very different
>> algorithms. For example, pLSI, HMM's, and mixture models can all be
>> estimated using EM.
>>
>> What exactly did you mean to address with an EM implementation?
>>
>> On Wed, Apr 1, 2009 at 1:05 PM, Grant Ingersoll <gsing...@apache.org> wrote:
>>
>>> Hi Yifan,
>>>
>>> I think both are good candidates, although AIUI, SVM is a bit harder to
>>> parallelize, so maybe it would make sense to focus on EM. Of course, we
>>> don't have to be distributed, so you could propose a non-distributed SVM
>>> implementation as a first cut and then work on the distributed part as the
>>> project develops.
>>>
>>> ...
>>>>
>>>> For EM, it is a generalization of the k-means algorithm, and we already have
>>>> k-means in the Mahout library.