Hi there,

I have a question about how the lexical weighting model is calculated.
For phrase pairs that have several different alignments, how does Moses
compute the lexical weighting score?

For example, in a fr-en corpus there is the phrase pair (le ||| it
the), for which I can find two alignments given by GIZA++:
* 0-0 0-1
* 0-1

The strategy described in Philipp Koehn's book (2010) is to calculate
the lexical weighting score for each possible alignment and to take
the one with maximal score.
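
For reference, this is the definition I am working from, as I
understand it from the book. The notation is mine: i indexes the source
words of the phrase, j the target words, and a is one word alignment of
the phrase pair (indices are 1-based here, while the alignment points
listed above are 0-based). Unaligned source words are treated as
aligned to NULL.

```latex
\mathrm{lex}(\bar{f} \mid \bar{e}, a)
  = \prod_{i=1}^{|\bar{f}|}
    \frac{1}{|\{\, j : (i,j) \in a \,\}|}
    \sum_{(i,j) \in a} w(f_i \mid e_j),
\qquad
\mathrm{lex}(\bar{f} \mid \bar{e})
  = \max_{a} \, \mathrm{lex}(\bar{f} \mid \bar{e}, a)
```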

For the first alignment (0-0 0-1), the lexical weighting score is:

lex(f|e) = (w(le|it)+w(le|the))/2 = (0.0330916+0.1952182)/2=0.114155

For the second alignment (0-1), it is: lex(f|e) = w(le|the) = 0.1952182
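
The two scores above can be reproduced with a short script. This is
only a sketch of the book's strategy as I understand it, not what Moses
actually does: the function name lexical_weight and the dictionary
layout are my own, and the only inputs are the two w(f|e) values quoted
in this mail. A NULL entry would be needed in the table if some source
word were unaligned, which does not happen in this example.

```python
def lexical_weight(f_words, e_words, alignment, w):
    """Lexical weight lex(f|e, a) for one word alignment.

    alignment is a list of 0-based (source_index, target_index) pairs;
    w maps (f_word, e_word) to the lexical translation probability w(f|e).
    """
    score = 1.0
    for i, f in enumerate(f_words):
        # Target words linked to source word i under this alignment.
        linked = [e_words[j] for (src, j) in alignment if src == i]
        if not linked:
            # Assumption: unaligned source words are scored against NULL.
            linked = ["NULL"]
        # Average w(f|e) over all target words aligned to this source word.
        score *= sum(w[(f, e)] for e in linked) / len(linked)
    return score


# Lexical probabilities quoted in this mail.
w = {("le", "it"): 0.0330916, ("le", "the"): 0.1952182}

f_words = ["le"]
e_words = ["it", "the"]

# The two alignments found for (le ||| it the).
alignments = {
    "0-0 0-1": [(0, 0), (0, 1)],
    "0-1": [(0, 1)],
}

scores = {
    name: lexical_weight(f_words, e_words, points, w)
    for name, points in alignments.items()
}

# Strategy from the book: keep the alignment with the highest score.
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.7f}")
print("best alignment:", best)
```

Under this strategy the script selects the alignment 0-1, which is the
opposite of what I observe in the Moses phrase table.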

So according to the book, we should take the second alignment (0-1) as
the alignment for this phrase pair (le ||| it the). However, Moses
took the first one (0-0 0-1).

Does Moses consider the different alignments of a phrase pair? If
yes, how does Moses choose the best alignment? If no, which
alignment does Moses take (the first one, the most frequent one, or
something chosen by another strategy)?

Also, I'd be interested to hear about any experience with the potential
impact of each strategy.

Sincerely,

-- 
Gong Li
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
