Hi everybody.

I am trying to understand the purpose of using the EM algorithm for training IBM Model 1 (the MT system described in "The Mathematics of Statistical Machine Translation: Parameter Estimation" by Brown et al.). To me it seems that the result will be overfitted.

If we collect the number of occurrences of word pairs (i.e. (e, f)) from the two sentences and normalize just once (that is, run just one step of the EM algorithm), why can't that result be used as it stands? Running the EM algorithm further means that word pairs with a higher number of occurrences (and thus higher resulting t-values) will slowly increase their t-scores, taking probability mass away from the lower-scoring pairs. But why is this a good thing to do, and doesn't it mean that the values will eventually converge to 1 for the single pair with the highest number of occurrences (for each word in the source text, that is)?
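To make the question concrete, here is roughly what I have in mind, as a minimal Python sketch (the toy corpus and variable names are my own invention, not from the paper):

    from collections import defaultdict

    # Hypothetical toy parallel corpus of (English, foreign) sentence pairs.
    corpus = [
        (["the", "house"], ["das", "haus"]),
        (["the", "book"], ["das", "buch"]),
        (["a", "book"], ["ein", "buch"]),
    ]

    # Uniform (unnormalized) initialization of t(e|f); the E-step
    # normalization below makes the initial scale irrelevant.
    t = defaultdict(lambda: 1.0)

    for iteration in range(10):
        count = defaultdict(float)   # expected counts c(e, f)
        total = defaultdict(float)   # normalizer per source word f
        for e_sent, f_sent in corpus:
            for e in e_sent:
                # Each English word spreads one unit of count mass over
                # the source words in its sentence, in proportion to the
                # current t-values.
                z = sum(t[(e, f)] for f in f_sent)
                for f in f_sent:
                    delta = t[(e, f)] / z
                    count[(e, f)] += delta
                    total[f] += delta
        # M-step: renormalize the expected counts to get the new t(e|f).
        for (e, f) in count:
            t[(e, f)] = count[(e, f)] / total[f]

After the first pass, the t-values are exactly the normalized co-occurrence counts I described above; it is the effect of the subsequent passes that I don't understand.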

Any help will be appreciated.

Regards,

Michael.

--
Which is more dangerous? TV guided missiles or TV guided families?
Visit my home page at http://michael.zedeler.dk/
Get my vcard at http://michael.zedeler.dk/vcard.vcf
