Hi, happy 2009 to all of you!

SRILM adds sentence boundaries (<s> and </s>) by default; however, the
latest version lets you disable this with the flags -no-sos and -no-eos
when training the language model with ngram-count.
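
To make the effect of those two flags concrete, here is a small Python
sketch. It is not SRILM code, and the function name count_ngrams is made
up for this example; it only mimics how the n-gram counts change when the
boundary tokens are or are not inserted around each text line.

```python
from collections import Counter

def count_ngrams(lines, order=2, add_sos=True, add_eos=True):
    """Count n-grams of a given order, optionally wrapping each line
    in <s> ... </s> (hypothetical helper, for illustration only)."""
    counts = Counter()
    for line in lines:
        tokens = line.split()
        if add_sos:
            tokens = ["<s>"] + tokens
        if add_eos:
            tokens = tokens + ["</s>"]
        # Slide a window of `order` tokens over the line.
        for i in range(len(tokens) - order + 1):
            counts[tuple(tokens[i:i + order])] += 1
    return counts

# Default behaviour: boundaries are added around the line.
with_bounds = count_ngrams(["the house"])
# Analogue of passing -no-sos -no-eos: only the words themselves are counted.
without_bounds = count_ngrams(["the house"], add_sos=False, add_eos=False)
```

With boundaries, the single line "the house" yields three bigrams, two of
which involve <s> or </s>; without them, only ("the", "house") is left.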

As for Moses, I think it does not add sentence boundaries before
computing the language model score. I do not know how it behaves when
computing the remaining scores.
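
To show why this matters for a phrase translated in isolation, the
following Python sketch lists the (context, word) pairs that an n-gram
model would be asked to score, with and without boundary tokens. This is
only an illustration under my own assumptions: lm_queries is a
hypothetical helper, and it does not reflect how Moses or SRILM are
actually implemented.

```python
def lm_queries(tokens, order=3, add_boundaries=False):
    """Return the (context, word) pairs an n-gram LM of the given order
    would score for `tokens` (hypothetical helper, illustration only)."""
    seq = list(tokens)
    if add_boundaries:
        seq = ["<s>"] + seq + ["</s>"]
    # With boundaries, <s> only ever serves as context; it is not predicted.
    start = 1 if add_boundaries else 0
    queries = []
    for i in range(start, len(seq)):
        # Context is at most the (order - 1) preceding tokens.
        context = tuple(seq[max(0, i - order + 1):i])
        queries.append((context, seq[i]))
    return queries

plain = lm_queries(["la", "casa"])
bounded = lm_queries(["la", "casa"], add_boundaries=True)
```

Without boundaries the first word is scored with an empty context; with
them it is scored as if it started a sentence, and an extra </s> event is
scored at the end, which penalises fragments that are not full sentences.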

Regards,
--
Felipe.

On Mon, 05-01-2009 at 17:33 +0100, Joerg Tiedemann wrote:
> happy new year to all of you!
> 
> I forgot to follow up on this topic of sentence boundaries in srilm and 
> moses. maybe I missed the answer - but I don't recall that someone 
> answered the discussion below.
> 
> how does moses do it? adding sentence boundaries or not? or does srilm 
> always assume that a string is a full sentence when called for computing 
> the LM score? what are the consequences for the incremental decoding 
> procedure in that case? and if sentence boundaries are added in the 
> internal calls to srilm - what happens when moses uses irstlm instead?
> 
> could someone clarify? thanks in advance!
> 
> jorg
> 
> 
> > 
> > On Fri, 14-11-2008 at 08:21 +0100, Marcello Federico wrote:
> >> Felipe,
> >>
> >> correct, irstlm does not add sentence boundaries.
> >> irstlm uses them only if you add them to the data.
> >>
> >> srilm adds sentence boundaries by default around each
> >> text line, but you can disable this (check the relevant
> >> option in the manual pages of ngram-count and ngram).
> >>
> >> i'm not sure about how moses calls srilm internally.
> >> my guess is that only single n-grams are passed to
> >> srilm and that no sentence boundary symbols are
> >> introduced by moses.
> >>
> >> marcello
> >>
> >> ________________________________________
> >> From: [email protected] [[email protected]] On 
> >> Behalf Of J.Tiedemann [[email protected]]
> >> Sent: Thursday, November 13, 2008 11:06 PM
> >> To: [email protected]; [email protected]
> >> Subject: Re: [Moses-support] Translating words or phrases in isolation
> >>
> >> I'm not 100% sure but I think that IRSTLM does not add sentence
> >> boundary tokens. maybe that's an option?
> >>
> >> jorg
> >>
> >>
> >> On Thu, 13 Nov 2008 20:58:54 +0100
> >>   Felipe Sánchez Martínez <[email protected]> wrote:
> >>> Hi all,
> >>>
> >>> I am using Moses to obtain translation candidates (in the form of
> >>> n-best lists) for phrases or words in isolation; that is, I am not
> >>> translating whole (well-formed) sentences.
> >>>
> >>> Does SRILM (the language model I am using with Moses) introduce a
> >>> begin-of-sentence token before computing the likelihood of the input
> >>> sentence (in my case a phrase or a word)?
> >>>
> >>> If the answer to the previous question is yes, how could I avoid
> >>> that?
> >>>
> >>> Thank you very much in advance,
> >>>
> >>> Kind regards
> >>>
> >>> --
> >>> Felipe
> >>>
> >>> _______________________________________________
> >>> Moses-support mailing list
> >>> [email protected]
> >>> http://mailman.mit.edu/mailman/listinfo/moses-support
> 
> 
-- 
Felipe Sánchez Martínez <[email protected]>
Departamento de Lenguajes y Sistemas Informáticos
Universidad de Alicante, E-03071 Alicante (Spain)
Tel.: +34 965 903 400, ext: 2038 Fax: +34 965 909 326
http://www.dlsi.ua.es/~fsanchez
