[Moses-support] Trying to do fancy things with LMs; need some advice.

Dennis Mehay Fri, 18 Mar 2011 10:39:58 -0700

Hello all,

I am trying to do something rather fancy with Moses by modifying the way
Moses uses LMs.  What I want to do is somewhat akin to the
"LanguageModelSkip.h" code that is in the repository, in that I want to
score sequences over only certain factors from the string (to extend the
reach and, hopefully, the approximation to syntactic or dependency LMs).
What I have is a way of getting a single label for each entry in the phrase
table (yes, sounds crazy, but I managed to pull it off).  I have distributed
this label (identically) to each word in the MT phrase, and so I want to
feed the LM the syntactic label factor of (1) the first word in the current
phrase and (2) the label factors of the first words of the n-1 previous
*phrases* (NOT *words*) in the search hypothesis that the current phrase is
extending.  This will essentially tell it the syntactic labels of the n
phrases that make up the current search hypothesis.
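
To make the idea concrete, here is a rough sketch of the context the LM would be fed. It is not Moses code: Word, Phrase and PhraseLabelNgram are hypothetical stand-ins, and it assumes each word carries its phrase's label as a plain string factor.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are NOT Moses types.
struct Word {
    std::string surface;
    std::string label;  // syntactic label factor, identical for every word of a phrase
};
using Phrase = std::vector<Word>;

// Builds the label n-gram for the newest phrase in a hypothesis:
// the labels of the first words of up to n-1 preceding phrases,
// followed by the label of the first word of the newest phrase.
std::vector<std::string> PhraseLabelNgram(const std::vector<Phrase>& phrases,
                                          std::size_t n) {
    std::vector<std::string> ngram;
    if (phrases.empty() || n == 0) return ngram;
    // Index of the oldest phrase that still fits in an n-phrase window.
    const std::size_t start = phrases.size() > n ? phrases.size() - n : 0;
    for (std::size_t i = start; i < phrases.size(); ++i) {
        // Only the first word's label is read; empty phrases are skipped.
        if (!phrases[i].empty()) ngram.push_back(phrases[i].front().label);
    }
    return ngram;
}
```

The window is counted in phrases, not words, so a long phrase takes up exactly one slot of LM context no matter how many words it covers.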


This seems like it should be straightforward.  I know I'll need to override
the "Evaluate" and "CalcScore" member functions of the LanguageModel class
in LanguageModel.cpp (they compute the inter-phrase and intra-phrase LM
scores, right?), but I also see from some comments in the code that I
shouldn't access previous hypotheses directly from the Evaluate function.
This apparently will get me in "trouble".  Instead, I need to pass the n-1
previous phrases into the FFState argument to the Evaluate function.  (These
comments are from the online code documentation -- they aren't in my
checked-out repos, so they could be out of date.)
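
A minimal sketch of that state-passing pattern follows, under stated assumptions. PhraseLabelState, LabelLM and ScorePhrase are hypothetical names, not the Moses FFState or LanguageModel interfaces, and the LM back end is reduced to a callback. The point is only that the n-1 previous phrase labels travel in the state, so nothing has to walk back through earlier hypotheses.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical state object (NOT the Moses FFState class):
// labels of the most recent phrases, oldest first, at most n-1 of them.
struct PhraseLabelState {
    std::deque<std::string> labels;
    // Hypotheses with equal states get identical future label-LM scores.
    bool operator==(const PhraseLabelState& other) const {
        return labels == other.labels;
    }
};

// Stand-in for a back-end LM query: score of `label` given `context`.
using LabelLM = std::function<double(const std::vector<std::string>& context,
                                     const std::string& label)>;

// Scores the label of a newly added phrase against the labels stored in
// the previous state, and returns the score with the updated state.
std::pair<double, PhraseLabelState> ScorePhrase(const PhraseLabelState& prev,
                                                const std::string& label,
                                                std::size_t n,
                                                const LabelLM& lm) {
    const std::vector<std::string> context(prev.labels.begin(),
                                           prev.labels.end());
    const double score = lm(context, label);

    PhraseLabelState next = prev;
    next.labels.push_back(label);
    // Keep only the n-1 most recent labels for the next extension.
    while (!next.labels.empty() && next.labels.size() + 1 > n) {
        next.labels.pop_front();
    }
    return {score, next};
}
```

Each extension reads only the state it is handed and returns a fresh one, which is the property the documentation comments appear to be asking for.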

This is similar to what IRST LM's asynchronous LM idea buys you, but
without limiting what is fed to the LM by a fixed-length *word* window (the
"<lmmacroSize>" parameter in the IRST LM chunkLM config file).  The way I
plan to implement things, IRST LM and SRILM will both be possible LMs to use
on the back end -- all of the work will be done by tracking what the n-1
previous phrases are in each hypothesis.

My question, then, is (at least) two-fold: (1) Is this the best way to go
about this (where "this" is my whole crazy idea)?  And (2): If so, am I
right in thinking that (in addition to adding an LM type to the
LanguageModelFactory class) all I need to do is override the "Evaluate" and
"CalcScore" functions?

Or am I completely off-base? (Or is this not really even possible at all?)

Any help is much appreciated.

Best,
D.N.

