Hi there,

I have a question about how the lexical weighting score is calculated. If a phrase pair has several different alignments, how does Moses compute its lexical weighting score?

For example, in my fr-en corpus there is a phrase pair (le -- it the), for which I can find two alignments given by GIZA++:

* 0-0 0-1 (at sentence No. 53493)
* 0-1 (at sentence No. 39167)
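To make the comparison concrete, here is a minimal Python sketch that scores both candidate alignments. It assumes the usual definition of lexical weighting (for each source word, average w(f|e) over the target words it is aligned to, then multiply over the source words) and uses the two word translation probabilities quoted in this message. The function name `lexical_weight` and the dictionary `w` are illustrative only; this is not Moses code and says nothing about which alignment Moses actually picks.

```python
from collections import defaultdict


def lexical_weight(src, tgt, alignment, w):
    """Score one alignment: average w(f|e) over each source word's
    aligned target words, then multiply across source words."""
    links = defaultdict(list)
    for pair in alignment.split():
        i, j = map(int, pair.split("-"))  # "source-target" word indices
        links[i].append(j)

    score = 1.0
    for i, f in enumerate(src):
        # Assumption: an unaligned source word is scored against NULL.
        targets = [tgt[j] for j in links[i]] or ["NULL"]
        score *= sum(w[(f, e)] for e in targets) / len(targets)
    return score


# Word translation probabilities w(f|e) quoted in this message.
w = {("le", "it"): 0.0330916, ("le", "the"): 0.1952182}
src, tgt = ["le"], ["it", "the"]

scores = {a: lexical_weight(src, tgt, a, w) for a in ("0-0 0-1", "0-1")}
best = max(scores, key=scores.get)
print(scores, best)
```

Under these assumptions the alignment 0-1 scores 0.1952182, against about 0.114155 for 0-0 0-1, so a "take the maximum" strategy would select 0-1.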
The strategy of Philipp Koehn is to calculate the lexical weighting score for each possible alignment and take the one with the maximal score.

For the first alignment (0-0 0-1), the lexical weighting score is:

lex(f|e) = (w(le|it) + w(le|the)) / 2 = (0.0330916 + 0.1952182) / 2 = 0.114155

For the second alignment (0-1), it is:

lex(f|e) = w(le|the) = 0.1952182

So we should take the second alignment as the alignment for this phrase pair (le -- it the). However, Moses takes the first one (0-0 0-1).

Does Moses consider different alignments for a phrase pair or not? If yes, how does Moses choose the best alignment? If no, which alignment will Moses take (the first one, the most frequent one, or some other strategy)?

Sincerely,
-- Gong Li
