Hi,

total_s(e) is a per-sentence normalization factor, so that
count accumulates "probability-like statistics" over all
possible word pairs in each sentence.
Note that total_s() is an array indexed over
the full target dictionary.
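
A minimal Python sketch of the count-collection loop from the quoted
pseudocode may make this concrete. It is an illustration under stated
assumptions, not code from Moses or from the lecture notes: the function
name collect_counts, the toy sentence pair, and the uniform starting value
0.25 for t(e|f) are all made up for the example. Because total_s is keyed
by the word itself, resetting it for a repeated word simply recomputes the
same sum over f_s.

```python
from collections import defaultdict

def collect_counts(sentence_pairs, t):
    """Collect fractional counts as in the quoted pseudocode.

    t maps (e, f) to the current estimate of t(e|f).
    The name and signature are hypothetical, chosen for this sketch.
    """
    count = defaultdict(float)  # count[(e, f)]
    total = defaultdict(float)  # total[f]
    for e_s, f_s in sentence_pairs:
        # Per-sentence normalization, keyed by the word e.
        # A repeated e is reset and then summed again to the same value.
        total_s = {}
        for e in e_s:
            total_s[e] = 0.0
            for f in f_s:
                total_s[e] += t[(e, f)]
        # Every occurrence of e contributes its normalized share.
        for e in e_s:
            for f in f_s:
                share = t[(e, f)] / total_s[e]
                count[(e, f)] += share
                total[f] += share
    return count, total

# Toy data (assumed): "the" occurs twice in e_s.
pairs = [(["the", "the", "house"], ["das", "haus"])]
t = defaultdict(lambda: 0.25)  # assumed uniform initial t(e|f)
count, total = collect_counts(pairs, t)
```

With this toy input, each occurrence of "the" adds 0.5 to
count("the"|"das"), so the repeated word ends up with 1.0 while "house"
gets 0.5. The reset in the first loop changes nothing, and the second loop
counts every occurrence.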
Greetings,
Marcello
Marcello Federico
FBK-irst Trento, Italy
+39 0461 314 552 (office)
+39 331 6222673 (mobile)


----- Original Message -----
From: [email protected] <[email protected]>
To: [email protected] <[email protected]>
Sent: Fri Jul 24 01:23:56 2009
Subject: [Moses-support] EM Model 1 Pseudocode

Hi,

I've been looking at the pseudocode for Model 1 as provided in Koehn's
lecture notes online. I can't help noticing what seems to be a bug in
the pseudocode:

for all sentence pairs (e_s,f_s)
   for all words e in e_s
     total_s(e) = 0
     for all words f in f_s
       total_s(e) += t(e|f)
   for all words e in e_s
     for all words f in f_s
       count(e|f)+= t(e|f) / total_s(e)
       total(f) += t(e|f) / total_s(e)

If I am to understand this literally, then total_s(e) gets reset to
zero each time the word e occurs in a sentence. This makes all of the
total_s(e) += t(e|f) calls irrelevant except for the last time the
word occurs in a sentence.

However, when taken literally, the later calls to count(e|f) += t(e|f)
/ total_s(e) and total(f) += t(e|f) / total_s(e) will take effect for
every occurrence of a word in the sentence.

Is this intentional? Or am I reading this wrong?

James

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
