I think there's a difference between how SRI and IRST do backoff; Ken just follows SRI to the letter. I don't think there's a canonical way of doing it, so they both implement it differently. As you saw from Tom's email, the results (in terms of BLEU) are pretty much the same.
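For what it's worth, the SRILM setup that makes the numbers line up (the configuration Moses' LanguageModelSRI.cpp uses, and that Felipe arrives at below) looks roughly like the following. This is a minimal, untested sketch: it assumes SRILM's C++ headers are on the include path, that the model was trained with ngram-count -unk, and LoadSRI/path/order are placeholder names.

#include "File.h"
#include "Vocab.h"
#include "Ngram.h"

Ngram *LoadSRI(const char *path, unsigned order) {
  Vocab *vocab = new Vocab();
  vocab->unkIsWord() = true;   // map OOV tokens to <unk> (needs -unk at training time)
  Ngram *lm = new Ngram(*vocab, order);
  File file(path, "r");
  lm->read(file);              // ARPA-format model
  lm->skipOOVs() = false;      // score OOVs via <unk> instead of skipping them
  return lm;
}

As I understand it, unkIsWord() has to be set before read() so that OOV tokens map to <unk>, and skipOOVs() = false then makes wordProb() score unknown words (including any backoff) instead of skipping them; if I read the thread right, that's the source of the -1.13665 difference Kenneth works out below.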
On 29/10/2010 18:17, Felipe Sánchez Martínez wrote:
> Hi Kenneth,
>
> Just to tell you that after training SRILM with -unk and adding the
> following code to my SRILM load function
>
> _sri_ngramLM->skipOOVs() = false;
>
> I get the same score with SRILM and kenlm. Unfortunately this is not
> the case for IRSTLM. I'll look at my code because I think that there
> might be something wrong.
>
> Thanks again for your help.
> Regards
> --
> Felipe
>
> On 29/10/10 16:09, Kenneth Heafield wrote:
>> kenlm's query tool implicitly places <s> at the beginning. It doesn't
>> appear in the output, but you can see the effect because the n-gram
>> length after the first "the" is 2, not 1.
>>
>> The difference between the kenlm result and SRILM is the unknown word
>> "74th": -55.599 + 1.13665 = -54.46235. The term -1.13665 appears to be
>> the LM's backoff weight for the unigram "and". I think including the
>> backoff is the right thing to do here, and it's how Moses configures
>> SRILM to operate (so you may want to look at LanguageModelSRI.cpp and
>> copy how it initializes SRI).
>>
>> As to IRST, I hope they find the n-gram lengths and probabilities
>> after each word useful in explaining that difference.
>>
>> Kenneth
>>
>> On 10/29/10 08:55, Felipe Sánchez Martínez wrote:
>>> Hi Kenneth,
>>>
>>> The output of kenlm/query is:
>>>
>>> Loading the LM will be faster if you build a binary file.
>>> Reading english.5gram.lm
>>> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
>>> ****************************************************************************************
>>> Language model is missing <unk>. Substituting probability 0.
>>> ************
>>>
>>> Loading statistics:
>>> user 18.0001
>>> sys 0.00047
>>> rss 316632 kB
>>> the 2 -0.894835 fifth 3 -3.34651 committee 2 -3.04771 resumed 1 -5.3955
>>> its 2 -1.99768 consideration 2 -3.4901 of 3 -0.281781 the 4 -0.240104
>>> item 3 -4.40691 at 2 -2.55249 its 2 -2.06475 64th 1 -7.43317 and 1
>>> -2.20519 74th 0 -1.13665 meetings 1 -3.82205 , 2 -1.05335 on 3 -2.12476
>>> 15 3 -2.54839 may 4 -1.06142 and 4 -1.42049 2 3 -2.24962 june 4
>>> -0.381742 2000 2 -1.75696 . 3 -0.68658 </s> 4 -0.000255845 Total: -55.599
>>> After queries:
>>> user 18.0001
>>> sys 0.00047
>>> rss 316656 kB
>>> Total time including destruction:
>>> user 18.0001
>>> sys 0.00051
>>> rss 1312 kB
>>>
>>> It seems that it is adding the end-of-sentence token, but not the
>>> begin-of-sentence one.
>>>
>>> The score (-55.599) is different from SRILM's (-54.4623) and from
>>> IRSTLM's (-49.9141, or -55.3099 when adding <s> and </s>).
>>>
>>> Thanks for your help
>>> --
>>> Felipe
>>>
>>> On 28/10/10 18:57, Kenneth Heafield wrote:
>>>> Hi Felipe,
>>>>
>>>> Please run $recent_moses_build/kenlm/query langmodel.lm < text and
>>>> post the output (you don't need the statistics, just the line
>>>> containing "Total:"). That will tell you the score and n-gram length
>>>> at each word.
>>>>
>>>> Kenneth
>>>>
>>>> On 10/28/10 12:42, Felipe Sánchez Martínez wrote:
>>>>> Hello all,
>>>>>
>>>>> My question is about SRILM and IRSTLM; it is not directly related
>>>>> to Moses, but I did not know where else to ask.
>>>>>
>>>>> I am scoring individual sentences with a 5-gram language model and
>>>>> I get different scores with SRILM and IRSTLM.
>>>>>
>>>>> The language model was trained with SRILM using the following
>>>>> command line:
>>>>>
>>>>> $ srilm/bin/i686-m64/ngram-count -order $(LM_ORDER) -interpolate
>>>>> -kndiscount -text text.txt -lm langmodel.lm
>>>>>
>>>>> I do not know why I get different scores when scoring the same
>>>>> sentence. In this regard I have a few questions:
>>>>> * Does SRILM introduce begin-of-sentence and end-of-sentence tokens
>>>>> during training?
>>>>> * And during scoring (or decoding)?
>>>>> * Does IRSTLM introduce begin-of-sentence and end-of-sentence
>>>>> tokens during scoring (or decoding)?
>>>>> * I know SRILM uses log base 10. Does IRSTLM also use log base 10?
>>>>> (It seems so.)
>>>>>
>>>>> When I score the English sentence "the fifth committee resumed its
>>>>> consideration of the item at its 64th and 74th meetings , on 15 may
>>>>> and 2 june 2000 ." the scores (log prob) I get are:
>>>>> SRILM: -54.4623
>>>>> IRSTLM: -49.9141
>>>>>
>>>>> If I introduce <s> and </s> when scoring with IRSTLM I get a log
>>>>> prob of -55.3099 (very similar to that of SRILM).
>>>>>
>>>>> The code to score with IRSTLM was borrowed from Moses.
>>>>>
>>>>> Thank you very much for your help.
>>>>>
>>>>> Regards.
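Postscript, in case it helps anyone reproducing the comparison: below is a rough, untested sketch of per-word scoring through SRILM's C++ API that mirrors what kenlm/query prints. <s> seeds the context without being scored itself, </s> is scored explicitly, and all values are log base 10 (so totals are directly comparable to the SRILM and IRSTLM numbers above). LoadSRI() refers to the sketch near the top of this message; words/n are placeholder tokenized input.

#include "Vocab.h"
#include "Ngram.h"
#include <cstdio>

// Prints one "word logprob" line per token plus the total.
void ScoreSentence(Ngram &lm, Vocab &vocab, const char **words, unsigned n) {
  // SRILM contexts are reversed (most recent word first) and
  // Vocab_None-terminated. 16 slots is arbitrary; it only has to
  // exceed the LM order.
  VocabIndex context[16];
  unsigned len = 0;
  context[len++] = vocab.ssIndex();    // implicit <s>, as kenlm/query does
  context[len] = Vocab_None;
  LogP total = 0;
  for (unsigned i = 0; i < n; ++i) {
    VocabIndex w = vocab.getIndex(words[i], vocab.unkIndex());
    LogP p = lm.wordProb(w, context);  // log base 10
    printf("%s %f\n", words[i], p);
    total += p;
    if (len < 15) ++len;               // grow the window until it is full
    for (unsigned j = len; j > 0; --j) // shift the new word to the front
      context[j] = context[j - 1];
    context[0] = w;
    context[len] = Vocab_None;
  }
  total += lm.wordProb(vocab.seIndex(), context);  // explicit </s>
  printf("Total: %f\n", total);
}

Run on Felipe's tokenized sentence with a model loaded via LoadSRI(), this should print a trace in the same shape as the kenlm/query output quoted above, minus the n-gram lengths.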
