Hi Hieu,
thanks a lot...

I got the point... the only thing is that I recompute the lexical scores from
the phrase table itself. For each line, I use the alignment printed on that
line to recompute both lexical scores.
What is possible, though I'm not sure, is that in some particular situations
score.cpp computes the lexical score on an alignment that is different from
the alignment reported in the phrase table.
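
To make the per-line computation concrete, here is a minimal Python sketch
using the numbers quoted further down in this thread. It is not the score.cpp
code: the function name, the dictionary layout, and the use of a "NULL" entry
for unaligned target words are all assumptions made for illustration. The
second alignment, where "of" is linked only to "partido", is also hypothetical;
it is just one alignment that would reproduce the 0.75 printed by score.cpp.

```python
# Sketch only: lexical_score and the lex dictionary layout are hypothetical,
# not taken from score.cpp.

def lexical_score(target_words, source_words, alignment, lex):
    """Lexical score of a phrase pair under one word alignment.

    alignment[i] lists the source positions aligned to target word i.
    lex maps (target_word, source_word) to a lexical probability.
    For each target word the probabilities of its aligned source words
    are averaged; the per-word averages are then multiplied together.
    """
    score = 1.0
    for t_idx, t_word in enumerate(target_words):
        positions = alignment[t_idx]
        if positions:
            probs = [lex[(t_word, source_words[s])] for s in positions]
        else:
            # Assumption: an unaligned target word is scored against "NULL".
            probs = [lex[(t_word, "NULL")]]
        score *= sum(probs) / len(probs)
    return score


# Values quoted in this thread.
lex = {("of", "partido"): 0.75, ("of", "de"): 0.3623188}
source = ["partido", "de"]
target = ["of"]

# "of" aligned to both source words, as printed in the phrase table line.
score_both = lexical_score(target, source, [[0, 1]], lex)

# Hypothetical alternative: "of" aligned to "partido" only.
score_partido_only = lexical_score(target, source, [[0]], lex)

print(score_both, score_partido_only)
```

With the alignment printed in the phrase table the sketch gives the average of
0.75 and 0.3623188; with the hypothetical one-link alignment it gives 0.75. So
the same phrase pair can legitimately get two different scores depending on
which alignment was used.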

Anyway, you are right: I have checked on a phrase table of 50 million
associations, and my computation differs from score.cpp in only 200,000
associations (< 0.4 %). I can survive :-).
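
The tie case Hieu describes in the quoted message below can be sketched in
Python as follows. This is only an illustration of why a tie is a toss-up, not
a description of how score.cpp actually breaks ties: the function name and the
second alignment string are hypothetical, and "first seen wins" is simply what
Python's max() does.

```python
from collections import Counter

# Sketch only: most_common_alignment is a hypothetical helper.

def most_common_alignment(observed_alignments):
    """Return the alignment seen most often for one phrase pair.

    Counter keeps keys in first-seen order and max() returns the first
    maximal item, so when two alignments are equally common the winner
    depends purely on the order in which they were encountered.
    """
    counts = Counter(observed_alignments)
    return max(counts.items(), key=lambda item: item[1])[0]


align_a = "(0) (0) ||| (0,1)"   # alignment printed in the phrase table
align_b = "(0) () ||| (0)"      # hypothetical competing alignment

# Both alignments occur 1000 times; only the encounter order differs.
winner_a_first = most_common_alignment([align_a] * 1000 + [align_b] * 1000)
winner_b_first = most_common_alignment([align_b] * 1000 + [align_a] * 1000)

print(winner_a_first)
print(winner_b_first)
```

If the extraction order differs between two implementations, a tie like this
would select different alignments and therefore give different lexical scores
for the same phrase pair, which would fit the small fraction of mismatches I
see.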

Thanks a lot
Marco



On Wed, Oct 29, 2008 at 10:01 AM, Hieu Hoang <[EMAIL PROTECTED]> wrote:

> hi marco
>
> Your calculation certainly seems correct.
>
> The only thing I can add is that the lexical translation probability is
> calculated from the most common alignment for a particular source+target
> phrase. However, if a translation has 2 different alignments that are
> equally common, then it's a toss-up which one is used. The lexical
> probability can differ in each case.
>
> For example, the following might be the cause of the difference you see:
>    partido de ||| of ||| (0) (0) ||| (0,1)      - 1000 training examples
>    partido de ||| of ||| () (0) ||| (1)         - 1000 training examples
>
>    lex0(of|partido) = 0.75
>     lex0(of|de) = 0.3623188
>     lex0(NULL|partido) = 0.75
>
>        Lexical Score E2F => (0.75+0.3623188)/2 =  0.5561594
>                 OR
>        Lexical Score E2F => (0.75+0.75)/2 =  0.75
>
> In my own re-implementation, this was the case for a small number of
> phrase pairs, but I didn't worry too much because it happened rarely.
>
> Hieu Hoang
> www.hoang.co.uk/hieu
>
>
> ________________________________
>
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On Behalf Of marco turchi
> Sent: 28 October 2008 16:30
> To: moses-support
> Subject: [Moses-support] lexical translation probability
>
>
> Dear all,
> I'm trying to rewrite the lexical translation probability code in Python,
> following the code inside score.cpp.
> In one particular case, I get different results from the score.cpp output:
>
> #output From Score.cpp
> partido de ||| of ||| (0) (0) ||| (0,1) ||| 0.666667  0.75
> lex0(of|partido) = 0.75
> # aligned word to "of"  =  1  =>
> LexicalScore E2F = 0.75/1 =0.75
>
> #output my code
> partido de ||| of ||| (0) (0) ||| (0,1) ||| 0.666667 0.5561594
> lex0(of|partido) = 0.7500000
> lex0(of|de) = 0.3623188
> # aligned word to "of"  = 2
> Lexical Score E2F => (0.75+0.3623188)/2 =  0.5561594
>
> I'm using the same lexical files.
> I do not understand why the output of score.cpp does not contain
> lex0(of|de)... could it be a bug?
>
> Thanks
> Marco
>
>
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
