I think you'd be better off implementing your own
StatefulFeatureFunction, bypassing LanguageModel.{h,cpp}, which mostly
handles n-grams crossing phrase boundaries, and calling
LanguageModelImplementation as the backend.  You'll probably want larger
beams too.
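
A rough, untested sketch of the shape I mean (the class and member
names are mine, the backend call is left schematic, and the exact
virtuals may differ in your checkout -- check FeatureFunction.h):

    #include "FeatureFunction.h"
    #include "FFState.h"
    #include "Hypothesis.h"
    #include "ScoreComponentCollection.h"

    #include <deque>

    using namespace Moses;

    // State = labels of the most recent n-1 phrases in this hypothesis.
    class LabelState : public FFState {
    public:
      std::deque<const Factor*> labels;

      virtual int Compare(const FFState& other) const {
        const LabelState& o = static_cast<const LabelState&>(other);
        if (labels < o.labels) return -1;
        return o.labels < labels ? 1 : 0;
      }
    };

    // Plus the usual ScoreProducer boilerplate (GetNumScoreComponents
    // and friends), omitted here.
    class PhraseLabelLM : public StatefulFeatureFunction {
    public:
      virtual const FFState* EmptyHypothesisState(const InputType&) const {
        return new LabelState();
      }

      virtual FFState* Evaluate(const Hypothesis& hypo,
                                const FFState* prev,
                                ScoreComponentCollection* accumulator) const {
        const LabelState* p = static_cast<const LabelState*>(prev);
        // n-gram = labels of the n-1 previous phrases plus the current
        // phrase's label, which sits on its first word.
        std::deque<const Factor*> ngram(p->labels);
        ngram.push_back(
            hypo.GetCurrTargetPhrase().GetWord(0).GetFactor(m_labelFactor));
        // Score ngram with the backend (LanguageModelImplementation, or
        // SRILM/IRSTLM directly), then credit it to this feature:
        //   accumulator->PlusEquals(this, score);
        LabelState* next = new LabelState();
        next->labels = ngram;
        while (next->labels.size() >= m_order)  // keep only the last n-1
          next->labels.pop_front();
        return next;
      }

    private:
      FactorType m_labelFactor;  // which factor holds the label
      size_t m_order;            // n
    };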

Kenneth

On 03/18/11 13:38, Dennis Mehay wrote:
> Hello all,
> 
> I am trying to do something rather fancy with Moses by modifying the way
> Moses uses LMs.  What I want to do is somewhat akin to the
> "LanguageModelSkip.h" code in the repository, in that I want to score
> sequences over only certain factors of the string (to extend the LM's
> reach and, hopefully, better approximate syntactic or dependency LMs).
> What I have is a way of getting a single label for each entry in the
> phrase table (yes, it sounds crazy, but I managed to pull it off).  I
> have distributed this label (identically) to each word in the MT phrase,
> so I want to feed the LM the syntactic label factor of (1) the first
> word in the current phrase and (2) the label factors of the first words
> of the n-1 previous *phrases* (NOT *words*) in the search hypothesis
> that the current phrase extends.  This will essentially tell the LM the
> syntactic labels of the n most recent phrases in the current search
> hypothesis.
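> 
> To make the label part concrete, here is the sort of accessor I have
> in mind (factor index 1 is just where the label happens to live in my
> setup):
> 
>     #include "TargetPhrase.h"
> 
>     // Every word in the phrase carries the same label factor, so the
>     // first word is enough to recover the phrase's label.
>     const Moses::Factor* PhraseLabel(const Moses::TargetPhrase& phrase) {
>       return phrase.GetWord(0).GetFactor(1);
>     }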
> 
> This seems like it should be straightforward.  I know I'll need to
> override the "Evaluate" and "CalcScore" member functions of the
> LanguageModel class (they compute the inter-phrase and intra-phrase LM
> scores, right?), but I also see from comments in the code that I
> shouldn't access previous hypotheses directly from the Evaluate
> function.  This apparently will get me into "trouble".  Instead, I need
> to pass the n-1 previous phrases along in the FFState argument to the
> Evaluate function.  (These notes come from the online code
> documentation, which isn't in my checked-out repository, so they could
> be out of date.)
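> 
> If I read that correctly, the contract is that my override may only
> look at its state argument for history, something like (signature
> copied as best I can tell from the docs, so it may be stale):
> 
>     virtual FFState* Evaluate(const Hypothesis& cur_hypo,
>                               const FFState* prev_state,  // n-1 labels
>                               ScoreComponentCollection* accumulator) const;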
> 
> This is similar to what the IRST LM asynchronous LM idea buys you, but
> without limiting what is fed to the LM to a fixed-length *word* window
> (the "<lmmacroSize>" parameter in the IRST LM chunkLM config file).  The
> way I plan to implement things, IRST LM and SRILM will both be possible
> back-end LMs -- all of the work will be done by tracking the n-1
> previous phrases in each hypothesis.
> 
> My question, then, is (at least) two-fold: (1) Is this the best way to
> go about this (where "this" is my whole crazy idea)?  And (2) if so, am
> I right in thinking that (in addition to adding an LM type to the
> LanguageModelFactory class) all I need to do is override the "Evaluate"
> and "CalcScore" member functions?
> 
> Or am I completely off-base? (Or is this not really even possible at all?)
> 
> Any help is much appreciated.
> 
> Best,
> D.N.
> 
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
