Hi,

Moses does in fact add a begin-of-sentence token at the beginning of the input to provide proper language model context. However, the recommended Kneser-Ney smoothed language model is also not fully appropriate for computing unigram probabilities for the first word of the phrase, due to the way smoothing in such back-off models works out.
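One likely reading of the smoothing remark: in Kneser-Ney models the lower-order (unigram) distribution is estimated from continuation counts, i.e. how many distinct words precede a word, rather than from how often the word occurs. The toy sketch below contrasts the two estimates. The function name, the corpus, and the simplification of leaving out discounting and interpolation are all assumptions made for illustration; this is not what SRILM computes internally.

```python
from collections import Counter

def unigram_estimates(sentences):
    """Return (mle, continuation) unigram estimates for tokenized sentences.

    mle[w]          = count(w) / total tokens
    continuation[w] = distinct left contexts of w / distinct bigram types
    (the Kneser-Ney idea, without discounting or interpolation)
    """
    word_counts = Counter()
    bigram_types = set()
    for tokens in sentences:
        word_counts.update(tokens)
        padded = ["<s>"] + tokens  # <s> counts as a left context here
        bigram_types.update(zip(padded, padded[1:]))

    total = sum(word_counts.values())
    mle = {w: c / total for w, c in word_counts.items()}

    left_contexts = Counter(w for _, w in bigram_types)
    n_types = len(bigram_types)
    continuation = {w: left_contexts[w] / n_types for w in word_counts}
    return mle, continuation

# Made-up corpus: "francisco" is as frequent as "is" but only ever follows "san".
corpus = [
    ["san", "francisco", "is", "foggy"],
    ["i", "like", "san", "francisco"],
    ["san", "francisco", "is", "big"],
    ["tea", "is", "hot"],
]
mle, cont = unigram_estimates(corpus)
print(mle["francisco"], mle["is"])    # same relative frequency
print(cont["francisco"], cont["is"])  # "francisco" gets the lower estimate
```

A word that is frequent but appears in few contexts therefore receives a low unigram estimate, which is reasonable as a back-off distribution but can be misleading when scoring a word in isolation.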
Not sure what to recommend here. You could recompute the language model score yourself with a language model that better suits your purposes. You would need to train a unigram, bigram, trigram, etc. language model. Then you could take the n-best list output from Moses and re-rank it.

-phi

On Thu, Nov 13, 2008 at 7:58 PM, Felipe Sánchez Martínez <[EMAIL PROTECTED]> wrote:
>
> Hi all,
>
> I am using Moses to obtain translation candidates (in the form of n-best
> lists) for phrases or words in isolation; that is, I am not translating
> whole (well-formed) sentences.
>
> Does SRILM (the language model I am using with Moses) introduce a
> begin-of-sentence token before computing the likelihood of the input
> sentence (in my case a phrase or a word)?
>
> If the answer to the previous question is yes, how could I avoid that?
>
> Thank you very much in advance,
>
> Kind regards
>
> --
> Felipe
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
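The re-ranking suggested in the reply could be sketched roughly as follows. The sketch assumes the usual Moses n-best layout of `id ||| hypothesis ||| feature scores ||| total score`; the function name, the unigram log-probability table, the weight, and the unknown-word penalty are hypothetical placeholders, not Moses or SRILM settings.

```python
def rerank_nbest(lines, unigram_logprob, lm_weight=1.0, unk_logprob=-10.0):
    """Re-rank Moses n-best lines using an external unigram score.

    Returns (sentence_id, new_score, hypothesis) tuples, grouped by
    sentence id with the best-scoring hypothesis first in each group.
    """
    rescored = []
    for line in lines:
        fields = [f.strip() for f in line.split("|||")]
        sent_id = int(fields[0])
        hypothesis = fields[1]
        moses_score = float(fields[3])  # total model score reported by Moses
        # External LM score: sum of per-token log-probabilities,
        # with a fixed (assumed) penalty for unknown words.
        lm_score = sum(unigram_logprob.get(tok, unk_logprob)
                       for tok in hypothesis.split())
        rescored.append((sent_id, moses_score + lm_weight * lm_score, hypothesis))
    rescored.sort(key=lambda r: (r[0], -r[1]))
    return rescored

# Made-up n-best entries and log-probabilities, for illustration only.
nbest = [
    "0 ||| the house ||| d: 0 lm: -8.2 tm: -1.1 w: -2 ||| -3.0",
    "0 ||| a house ||| d: 0 lm: -8.9 tm: -1.0 w: -2 ||| -3.2",
]
unigram = {"the": -1.5, "a": -1.0, "house": -2.0}
for sent_id, score, hyp in rerank_nbest(nbest, unigram):
    print(sent_id, round(score, 2), hyp)
```

Note that the Moses total already includes its own weighted language model feature. To replace that contribution rather than add to it, one would first subtract the weighted `lm:` value from the total, which requires knowing the weight used at decoding time.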
