Dear Moses experts, I am currently experimenting with translating at the *morpheme level* (each morpheme is treated as a token). Instead of using a morpheme-based Language Model (LM), I want to modify the Moses code to use a *word-based* LM so as to better capture long-range word sequence constraints. I'd like to seek your advice on whether my modification is sufficient, and whether there are any issues I need to consider.
I changed the Language Model scoring mechanism by modifying two methods, LanguageModel::CalcScore and LanguageModel::Evaluate. In both methods the input phrase is a morpheme sequence, so I first join the morphemes back into word form. The *contextFactor* used to query the LM is then supplied with word tokens instead of morpheme tokens.

As far as I know, CalcScore is only applied to each target phrase separately, so modifying it is relatively straightforward. Modifying Evaluate, on the other hand, is more complicated because it operates on a hypothesis and involves phrases from the previous hypothesis. I ensure that every n-gram scored there consists of words and includes at least part of the previous hypothesis's phrase.

* Here's an example for the Evaluate method:
** Assume we have: prev phrase = A B1, current phrase = B2 C1 C2 D1 D2 E (where B1 and B2 are the two morphemes of a word B, and similarly for C and D; A and E are stand-alone morphemes).
** After concatenating into word form, they become A B C D E. So the 3-gram LM score will be computed for "A B C" and "B C D" (since they both contain part of the previous phrase). The last n-gram, "C D E", will be used to get the LMState.

What I do not really understand is the use of the LMState: why do we need the last n-gram to get the state, and how is it reused in later score calculations?

Please correct me if my understanding is wrong. Any comments and advice are very much appreciated.

Thank you in advance,

Regards,
Thang

--
Luong Minh Thang
WING group, School of Computing, National University of Singapore
http://wing.comp.nus.edu.sg/~lmthang
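
P.S. To make the Evaluate example above concrete, here is a minimal Python sketch of the selection logic I have in mind. It is not Moses code: the function names (word_of, to_words, boundary_ngrams) are hypothetical, and the rule "strip trailing digits to get the word" is only an assumption that matches the toy notation B1, B2 -> B. It regroups the morphemes of the previous and current phrases into words, keeps the n-grams that contain at least one word with a morpheme from the previous phrase, and returns the last n-gram that I would use to get the LMState.

```python
from itertools import groupby


def word_of(morpheme):
    # Toy convention (assumption): "B1" and "B2" are morphemes of word "B",
    # so the word label is the morpheme with trailing digits removed.
    return morpheme.rstrip("0123456789")


def to_words(prev_phrase, cur_phrase):
    # Tag each morpheme with whether it comes from the previous phrase,
    # then merge consecutive morphemes that share a word label.
    # Limitation of this toy: two identical adjacent words would be merged.
    tagged = [(word_of(m), True) for m in prev_phrase]
    tagged += [(word_of(m), False) for m in cur_phrase]
    words = []
    for word, group in groupby(tagged, key=lambda t: t[0]):
        touches_prev = any(flag for _, flag in group)
        words.append((word, touches_prev))
    return words


def boundary_ngrams(prev_phrase, cur_phrase, order=3):
    # Return (n-grams to score, last n-gram).
    # An n-gram is scored only if one of its words has a morpheme
    # from the previous phrase.
    words = to_words(prev_phrase, cur_phrase)
    tokens = [w for w, _ in words]
    scored = []
    for i in range(len(words) - order + 1):
        window = words[i:i + order]
        if any(flag for _, flag in window):
            scored.append(tuple(tokens[i:i + order]))
    last_ngram = tuple(tokens[-order:])
    return scored, last_ngram


scored, last_ngram = boundary_ngrams(
    ["A", "B1"], ["B2", "C1", "C2", "D1", "D2", "E"], order=3
)
print(scored)      # [('A', 'B', 'C'), ('B', 'C', 'D')]
print(last_ngram)  # ('C', 'D', 'E')
```

In the real code the morpheme-to-word mapping would of course depend on how word boundaries are marked in my data, and the scoring itself would go through the LM query with the word-level contextFactor.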
