Hi.

I have a question which is not directly about Moses but more generally about 
phrase extraction in phrase-based statistical machine translation. I hope it is 
not considered off-topic! I haven't been able to easily locate a satisfactory 
answer.

In state-of-the-art phrase-based machine translation, once a sentence pair has 
been aligned, all possible phrase pairs are extracted and each of them is 
assumed to have been seen exactly once. Counts are collected over all sentence 
pairs in the training corpus and then used to compute a crude 
relative-frequency estimate of the translation probability Phi(f|e) (Philipp 
Koehn's book 'Statistical Machine Translation', p. 136, eq. (5.4)).

I was thinking about the possibility that Philipp himself hints at after this 
equation: consider each possible segmentation that completely (perfectly) 
covers *both* the source sentence and the target sentence, count how many such 
complete coverings there are for that sentence pair, consider all of them 
equally likely, and assign the corresponding "fractional counts" to the phrase 
pairs used in each covering. The fractional counts would then be used to obtain 
a better estimate of Phi(f|e). That estimate could be refined iteratively by 
using it to estimate the likelihood of each covering, in a sort of "poor man's" 
expectation maximization, cruder than the alignment-less "rich man's" EM phrase 
extraction by Marcu and Wong (2002) or the alignment-constrained EM phrase 
extraction by Birch, Callison-Burch and Koehn (2006).
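
To make the idea concrete, here is a minimal sketch of the fractional-count 
step for a single sentence pair. It is only an illustration under my own 
assumptions: the function names (coverings, fractional_counts), the half-open 
(start, end) span representation and the toy sentence pair are hypothetical 
and do not come from Moses or from Koehn's book. It assumes the phrase pairs 
have already been extracted and simply enumerates every subset of them that 
tiles both sentences exactly, weighting all coverings uniformly.

```python
from collections import defaultdict
from fractions import Fraction


def coverings(src_len, tgt_len, pairs):
    """Yield every tuple of phrase pairs that tiles both sentences exactly.

    Each phrase pair is ((s0, s1), (t0, t1)) with half-open word spans.
    """
    full = (1 << tgt_len) - 1  # bitmask with every target word covered

    def extend(i, used, chosen):
        if i == src_len:
            # Source side is fully tiled; accept only if target is too.
            if used == full:
                yield tuple(chosen)
            return
        for (s0, s1), (t0, t1) in pairs:
            if s0 != i:
                continue  # source spans are laid down left to right
            mask = ((1 << (t1 - t0)) - 1) << t0
            if used & mask:
                continue  # target span overlaps one already used
            chosen.append(((s0, s1), (t0, t1)))
            yield from extend(s1, used | mask, chosen)
            chosen.pop()

    yield from extend(0, 0, [])


def fractional_counts(src, tgt, pairs):
    """Give each phrase pair the fraction of coverings in which it occurs."""
    all_cov = list(coverings(len(src), len(tgt), pairs))
    counts = defaultdict(Fraction)
    for cov in all_cov:
        for (s0, s1), (t0, t1) in cov:
            key = (" ".join(src[s0:s1]), " ".join(tgt[t0:t1]))
            counts[key] += Fraction(1, len(all_cov))  # uniform weighting
    return dict(counts), len(all_cov)


# Toy example: three extracted phrase pairs for a two-word sentence pair.
src, tgt = ["a", "b"], ["x", "y"]
pairs = [((0, 1), (0, 1)), ((1, 2), (1, 2)), ((0, 2), (0, 2))]
counts, n_coverings = fractional_counts(src, tgt, pairs)
```

In this toy case there are two complete coverings (the two one-word pairs 
together, or the single two-word pair), so each of the three phrase pairs 
receives a fractional count of 1/2 instead of 1. Summing such counts over the 
corpus and normalizing would give the re-estimated Phi(f|e); the iterative 
refinement would replace the uniform weight with a weight derived from the 
current estimate. The enumeration is exponential in the worst case, so this 
is meant only to pin down the definition.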

The "fractional counts" idea looks like something that could be easily done, 
but before I explore it further I would appreciate it very much if someone on 
this list could tell me whether it has been done.

Thanks a million!

Mikel

Mikel L. Forcada <[email protected]>
Dept. Llenguatges i Sistemes Informàtics
Universitat d'Alacant, E-03071 Alacant (Spain)
Tel.: +34 96 590 9776    Fax: +34 96 590 9326