Hello all,

I am trying to do something rather fancy with Moses by modifying the way it uses LMs. What I want is somewhat akin to the "LanguageModelSkip.h" code in the repository: I want to score sequences over only certain factors from the string (to extend the reach of the LM and, hopefully, approximate a syntactic or dependency LM). I have a way of getting a single label for each entry in the phrase table (yes, it sounds crazy, but I managed to pull it off). I have distributed this label (identically) to every word in the phrase, so I want to feed the LM (1) the syntactic label factor of the first word in the current phrase and (2) the label factors of the first words of the n-1 previous *phrases* (NOT *words*) in the search hypothesis that the current phrase is extending. This essentially tells the LM the syntactic labels of the n most recent phrases in the current search hypothesis.
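
To make the scheme above concrete, here is a minimal standalone sketch of the phrase-level label window. It is not Moses code: ToyLabelLM, PhraseLabelState and ExtendWithPhrase are hypothetical names invented for illustration, and the toy LM is an exact-match table with no back-off, standing in for whatever SRILM or IRST LM would return. Each hypothesis carries the labels of its last n-1 phrases as a small state value, and extending it by one phrase yields a score and a new state.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a real LM back end: maps a label n-gram to a
// log-probability and returns a flat floor for anything unseen (no back-off).
struct ToyLabelLM {
  std::map<std::vector<std::string>, double> logprob;
  double unseen = -10.0;

  double Score(const std::vector<std::string>& ngram) const {
    auto it = logprob.find(ngram);
    return it == logprob.end() ? unseen : it->second;
  }
};

// Hypothetical per-hypothesis state: the labels of the most recent phrases,
// oldest first, never more than (order - 1) of them.
struct PhraseLabelState {
  std::vector<std::string> history;

  // Two hypotheses with equal label histories are indistinguishable to this LM.
  bool operator==(const PhraseLabelState& other) const {
    return history == other.history;
  }
};

// Extend a hypothesis by one phrase: score the new phrase's label in the
// context of the previous phrase labels, then slide the window forward.
std::pair<double, PhraseLabelState> ExtendWithPhrase(
    const ToyLabelLM& lm, std::size_t order,
    const PhraseLabelState& prev, const std::string& phraseLabel) {
  assert(order >= 1);

  // Context (at most order-1 labels) followed by the label of the new phrase.
  std::vector<std::string> ngram = prev.history;
  ngram.push_back(phraseLabel);
  const double score = lm.Score(ngram);

  // Keep only the last (order - 1) labels for the next extension.
  const std::size_t keep = order - 1;
  const std::size_t start = ngram.size() > keep ? ngram.size() - keep : 0;
  PhraseLabelState next;
  next.history.assign(ngram.begin() + static_cast<std::ptrdiff_t>(start),
                      ngram.end());
  return {score, next};
}
```

Starting from an empty PhraseLabelState and calling ExtendWithPhrase once per phrase gives one LM query per phrase rather than one per word, so the window size is counted in phrases however long each phrase is.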
This seems like it should be straightforward. I know I'll need to override the "Evaluate" and "CalcScore" member functions of the LanguageModel class in LanguageModel.cpp (they compute the inter-phrase and intra-phrase LM scores, right?), but I also see from some comments in the code that I shouldn't access previous hypotheses directly from the Evaluate function. This apparently will get me in "trouble". Instead, I need to pass the n-1 previous phrases into the FFState argument to the Evaluate function. (These comments appear in the online code documentation but not in my checked-out copy of the repository, so they could be out of date.)

This is similar to what the IRST LM asynchronous LM idea buys you, but without limiting what is fed to the LM to a fixed-length *word* window (the "<lmmacroSize>" parameter in the IRST LM chunkLM config file). The way I plan to implement things, IRST LM and SRILM will both be usable as the LM on the back end, because all of the work will be done by tracking what the n-1 previous phrases are in each hypothesis.

My question, then, is (at least) two-fold:

(1) Is this the best way to go about this (where "this" is my whole crazy idea)?

(2) If so, am I right in thinking that (in addition to adding an LM type to the LanguageModelFactory class) all I need to do is override "Evaluate" and "CalcScore"? Or am I completely off-base? (Or is this not really even possible at all?)

Any help is much appreciated.

Best,
D.N.