Hi Felipe,
Please run $recent_moses_build/kenlm/query langmodel.lm < text and post
the output (you don't need the statistics, just the line containing
"Total:"). That will tell you the score and n-gram length at each word.
Kenneth
On 10/28/10 12:42, Felipe Sánchez Martínez wrote:
> Hello all,
>
> My question is about SRILM and IRSTLM; it is not directly related to
> Moses, but I did not know where else to ask.
>
> I am scoring individual sentences with a 5-gram language model and I get
> different scores with SRILM and IRSTLM.
>
> The language model was trained with SRILM through the following command
> line:
>
> $ srilm/bin/i686-m64/ngram-count -order $(LM_ORDER) -interpolate
> -kndiscount -text text.txt -lm langmodel.lm
>
> I do not know why I get different scores when scoring the same sentence
> with both. In this regard I have a few questions:
> * Does SRILM introduce begin-of-sentence and end-of-sentence tokens
> during training?
> * And during scoring (or decoding)?
> * Does IRSTLM introduce begin-of-sentence and end-of-sentence tokens
> during scoring (or decoding)?
> * I know SRILM uses log base 10. Does IRSTLM also use log base 10? (It
> seems so.)
>
> When I score the English sentence "the fifth committee resumed its
> consideration of the item at its 64th and 74th meetings , on 15 may and
> 2 june 2000 .", the scores (log prob) I get are:
> SRILM: -54.4623
> IRSTLM: -49.9141
>
> If I introduce <s> and </s> when scoring with IRSTLM, I get a log prob of
> -55.3099 (very similar to that of SRILM).
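>
> (For comparison, I believe SRILM's ngram tool can print a per-word
> breakdown with -debug 2, showing the probability and the n-gram order
> used at each word, e.g.:
>
> $ srilm/bin/i686-m64/ngram -lm langmodel.lm -order 5 -ppl sentence.txt -debug 2
>
> where sentence.txt would hold the sentence above.)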
>
> The code to score with IRSTLM was borrowed from Moses.
>
> Thank you very much for your help.
>
> Regards.