Hello all,

My question is about SRILM and IRSTLM. It is not directly related to 
Moses, but I did not know where else to ask.

I am scoring individual sentences with a 5-gram language model and I get 
different scores with SRILM and IRSTLM.

The language model was trained with SRILM through the following command 
line:

$ srilm/bin/i686-m64/ngram-count -order $(LM_ORDER) -interpolate \
    -kndiscount -text text.txt -lm langmodel.lm

I do not know why I get different scores when scoring the same sentence. 
In this regard, I have a few questions:
* Does SRILM introduce begin-of-sentence and end-of-sentence tokens 
during training?
* Does SRILM also introduce them during scoring (or decoding)?
* Does IRSTLM introduce begin-of-sentence and end-of-sentence tokens 
during scoring (or decoding)?
* I know SRILM uses log base 10. Does IRSTLM also use log base 10? (It 
seems so.)

When I score the English sentence "the fifth committee resumed its 
consideration of the item at its 64th and 74th meetings , on 15 may and 
2 june 2000 ." the scores (log probs) I get are:
SRILM: -54.4623
IRSTLM: -49.9141

If I introduce <s> and </s> when scoring with IRSTLM, I get a log prob of 
-55.3099 (much closer to that of SRILM).
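
As a rough sanity check on the numbers above, here is a minimal Python sketch. It does not call SRILM or IRSTLM; the helper names (add_boundaries, log10_to_ln) are hypothetical and exist only for illustration. It wraps a tokenized sentence in explicit boundary tokens, computes how far each reported score is from the SRILM one, and shows what the SRILM score would look like as a natural log, which helps judge whether a difference in log base could explain the gap.

```python
import math

# Log10 probabilities reported in this message for the same sentence.
scores = {
    "SRILM": -54.4623,
    "IRSTLM": -49.9141,
    "IRSTLM with <s> and </s>": -55.3099,
}


def add_boundaries(sentence, bos="<s>", eos="</s>"):
    """Wrap a whitespace-tokenized sentence in explicit boundary tokens."""
    return f"{bos} {sentence.strip()} {eos}"


def log10_to_ln(score):
    """Convert a base-10 log probability to a natural-log probability."""
    return score * math.log(10)


# Difference of each score from the SRILM score (positive = less negative).
reference = scores["SRILM"]
gaps = {name: round(value - reference, 4) for name, value in scores.items()}

# If a toolkit reported natural logs, the SRILM score would be near this.
srilm_as_ln = log10_to_ln(reference)

for name, gap in gaps.items():
    print(f"{name}: gap to SRILM = {gap:+.4f}")
print(f"SRILM score as natural log: {srilm_as_ln:.1f}")
print(add_boundaries("the fifth committee resumed its consideration"))
```

The natural-log value comes out at roughly -125.4, far from both IRSTLM scores, so a difference in log base does not look like the cause. Adding the boundary tokens shrinks the gap from about 4.55 to about 0.85, which suggests (but does not prove) that the handling of sentence boundaries accounts for most of the difference.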

The code to score with IRSTLM was borrowed from Moses.

Thank you very much for your help.

Regards.
-- 
Felipe Sánchez Martínez
Departamento de Lenguajes y Sistemas Informáticos
Universidad de Alicante, E-03071 Alicante (Spain)
Tel.: +34 965 903 400, ext: 2966 Fax: +34 965 909 326
http://www.dlsi.ua.es/~fsanchez
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
