[Moses-support] Penalizing unknown words during bilingual scoring

André Lynum Mon, 25 Feb 2008 01:57:54 -0800

Hi, I'm working on modifying Moses to provide translation model scores
for a given source/translation sentence pair.



I'm using the decoder, constraining the hypotheses it generates, and
then I examine the hypothesis stack that covers the most of the
source input (I'm wondering if I should look at all generated
hypotheses instead). There I look for the highest-scoring hypothesis,
but I will need to account for the parts of the source and translation
that are not covered by that hypothesis.
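
To make the selection step concrete, here is a minimal sketch in Python
rather than in the Moses C++ code. The Hyp class and both function names
are hypothetical stand-ins I made up for illustration; they are not the
Moses Hypothesis API, and I am assuming a hypothesis can be reduced to the
set of covered source positions plus a total log score.

```python
from dataclasses import dataclass


@dataclass
class Hyp:
    """Hypothetical stand-in for a decoder hypothesis (not the Moses class)."""
    covered: frozenset  # indices of the source words covered so far
    score: float        # total log-linear model score (higher is better)


def best_partial_hypothesis(hyps):
    """Return the highest-scoring hypothesis among those covering the most source words."""
    if not hyps:
        return None
    max_cov = max(len(h.covered) for h in hyps)
    candidates = [h for h in hyps if len(h.covered) == max_cov]
    return max(candidates, key=lambda h: h.score)


def uncovered_positions(hyp, source_len):
    """Source positions that the hypothesis leaves untranslated."""
    return sorted(set(range(source_len)) - hyp.covered)
```

The uncovered positions returned by the second function are the words
that would still have to be accounted for in the final score.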

I was thinking of adding a penalty factor to the total score of the
hypothesis for each unknown word in the input pair. This is
motivated by the notion that each unknown word may be generated from a
"null" phrase pair. But I'm unsure which factor to use, and the
scoring part of the Moses code is a bit complicated. I would
appreciate any insight into what factor to use and how to apply it to
the hypothesis score.
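
As a rough illustration of what I have in mind, the following Python
sketch adds a fixed log-domain penalty per unknown word to the hypothesis
score. The function name, its parameters, and the default penalty value
are assumptions made up for this example; the value is an arbitrary
placeholder and not something taken from the Moses code.

```python
def penalized_score(hyp_score, n_unknown_source, n_unknown_target, penalty=-10.0):
    """Add a fixed log-domain penalty for every word the hypothesis does not cover.

    hyp_score        -- total log score of the chosen hypothesis
    n_unknown_source -- number of uncovered source words
    n_unknown_target -- number of uncovered target words
    penalty          -- log score charged per uncovered word (placeholder value,
                        an assumption for illustration only)
    """
    if penalty > 0:
        raise ValueError("penalty should be a non-positive log score")
    return hyp_score + penalty * (n_unknown_source + n_unknown_target)
```

Treating each unknown word as a "null" phrase pair with a constant score
keeps the penalty additive in the log domain, so it combines with the
other model scores in the same way a regular phrase pair score would.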

My initial idea was to use the same score that is used for translation
options generated for unknown words, but I can't quite see where this is
set (as the negative word weight in TargetPhrase::SetScore(), or as
-inf in the TranslationOption constructor?). Any help in understanding
this part of the code would be greatly appreciated.


Regards


-andré lynum
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
