Hi Marco,
Your calculation certainly seems correct.
The only thing I can add is that the lexical translation probability is
calculated from the most common alignment for a particular source+target
phrase pair. However, if a translation has 2 different alignments that are
equally common, then it's a toss-up which one is used. The lexical
probability can differ in each case.
For example, the following might be the cause of the difference you see:
partido de ||| of ||| (0) (0) ||| (0,1) - 1000 training examples
partido de ||| of ||| () (0) ||| (1) - 1000 training examples
lex0(of|partido) = 0.75
lex0(of|de) = 0.3623188
lex0(NULL|partido) = 0.75
Lexical Score E2F => (0.75+0.3623188)/2 = 0.5561594
OR
Lexical Score E2F => (0.75+0.75)/2 = 0.75
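To make the arithmetic concrete, here is a minimal Python sketch. It assumes the lexical score of a target word is the average of the lexical probabilities of the source words it is aligned to, multiplied over all target words. The name lexical_score and the lex dictionary are illustrative only and are not taken from score.cpp; the probabilities are the ones quoted in this thread.

```python
def lexical_score(target_words, aligned_sources, lex):
    """Multiply, over target words, the mean lex probability of aligned source words.

    aligned_sources[i] lists the source words aligned to target_words[i].
    An empty list is treated as an alignment to NULL (an assumption of this sketch).
    """
    score = 1.0
    for tgt, sources in zip(target_words, aligned_sources):
        sources = sources or ["NULL"]
        score *= sum(lex[(tgt, src)] for src in sources) / len(sources)
    return score


# Values quoted in the thread; keys are (target word, source word).
lex = {("of", "partido"): 0.75, ("of", "de"): 0.3623188}

# "of" aligned to both "partido" and "de".
both = lexical_score(["of"], [["partido", "de"]], lex)
# "of" aligned to "partido" only.
only_partido = lexical_score(["of"], [["partido"]], lex)

print(round(both, 7), only_partido)
```

With two aligned source words the sketch gives 0.5561594. With a single aligned word it gives 0.75, which matches the 0.75/1 computation in the quoted message below.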
In my own re-implementation, this was the case for a small number of
phrase pairs, but I didn't worry too much because it happened rarely.
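If you want to check how often this happens in your own data, something like the following could list the affected phrase pairs. This is only a sketch: the function tied_phrase_pairs and the (source, target, alignment) tuple layout are hypothetical, not part of Moses, and the "de ||| of" examples are invented just to show a pair without a tie.

```python
from collections import Counter, defaultdict


def tied_phrase_pairs(examples):
    """Return phrase pairs whose most frequent alignment is not unique.

    examples: iterable of (source, target, alignment) tuples, one per
    extracted training example (hypothetical layout for this sketch).
    """
    counts = defaultdict(Counter)
    for source, target, alignment in examples:
        counts[(source, target)][alignment] += 1

    tied = {}
    for pair, alignments in counts.items():
        best = max(alignments.values())
        winners = sorted(a for a, c in alignments.items() if c == best)
        if len(winners) > 1:
            tied[pair] = winners
    return tied


examples = (
    [("partido de", "of", "(0) (0) ||| (0,1)")] * 1000
    + [("partido de", "of", "() (0) ||| (1)")] * 1000
    + [("de", "of", "(0) ||| (0)")] * 5  # invented, no tie here
)
ties = tied_phrase_pairs(examples)
print(ties)
```

Only "partido de ||| of" is reported, because its two alignments each occur 1000 times. For such pairs the lexical score depends on which alignment the scorer happens to pick.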
Hieu Hoang
www.hoang.co.uk/hieu
________________________________
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of marco turchi
Sent: 28 October 2008 16:30
To: moses-support
Subject: [Moses-support] lexical translation probability
Dear all,
I'm trying to rewrite the lexical translation probability code in Python,
following the code inside score.cpp.
In one particular case, my results differ from the score.cpp output:
# output from score.cpp
partido de ||| of ||| (0) (0) ||| (0,1) ||| 0.666667 0.75
lex0(of|partido) = 0.75
# words aligned to "of" = 1 =>
Lexical Score E2F = 0.75/1 = 0.75
# output from my code
partido de ||| of ||| (0) (0) ||| (0,1) ||| 0.666667 0.5561594
lex0(of|partido) = 0.7500000
lex0(of|de) = 0.3623188
# aligned word to "of" = 2
Lexical Score E2F => (0.75+0.3623188)/2 = 0.5561594
I'm using the same lexical files.
I do not understand why the output of score.cpp does not include
lex0(of|de). Could this be a bug?
Thanks
Marco