Re: [Moses-support] question about recombination when trying to output phrase lattices

Adam Lopez Fri, 05 Feb 2010 05:01:10 -0800

Hi Kevin -- the answer, as you have already guessed, is 1.  This is
a pretty common optimization; see, e.g., Zhifei Li's description in
Section 4.3 of this paper:
http://aclweb.org/anthology-new/W/W08/W08-0402.pdf
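
To illustrate option 1, here is a minimal sketch of LM-state-based
recombination. It is not Moses code: the toy n-gram set and the names
build_contexts, minimal_state, and can_recombine are all hypothetical.
It assumes a backoff LM in which a history only matters if some stored
n-gram can extend it, and it ignores backoff weights and every
non-LM condition for recombination (such as source coverage).

```python
from typing import Iterable, Set, Tuple

NGram = Tuple[str, ...]


def build_contexts(ngrams: Iterable[NGram]) -> Set[NGram]:
    """Collect every proper prefix of every stored n-gram.

    These are the histories the toy LM can still extend with one more word.
    """
    contexts: Set[NGram] = set()
    for ngram in ngrams:
        for k in range(1, len(ngram)):
            contexts.add(ngram[:k])
    return contexts


def minimal_state(history: NGram, contexts: Set[NGram], order: int) -> NGram:
    """Return the longest suffix of the last (order - 1) words that is a context.

    Words to the left of that suffix cannot affect any future n-gram lookup
    in this toy model, so they are dropped from the state.
    """
    tail = history[-(order - 1):] if order > 1 else ()
    for start in range(len(tail)):
        if tail[start:] in contexts:
            return tail[start:]
    return ()


def can_recombine(h1: NGram, h2: NGram, contexts: Set[NGram], order: int) -> bool:
    """Two hypotheses share an LM state if their minimal states are equal."""
    return minimal_state(h1, contexts, order) == minimal_state(h2, contexts, order)


# Hypothetical toy 4-gram model: only these n-grams are stored.
ORDER = 4
TOY_NGRAMS = [
    ("beer", "a", "good"),
    ("here", "is", "a", "beer"),
]
CONTEXTS = build_contexts(TOY_NGRAMS)

if __name__ == "__main__":
    # Neither full trigram history is a stored context, so both shrink
    # to the same shorter state and may be recombined.
    print(minimal_state(("<s>", "beer", "a"), CONTEXTS, ORDER))
    print(minimal_state((",", "beer", "a"), CONTEXTS, ORDER))
    print(can_recombine(("<s>", "beer", "a"), (",", "beer", "a"), CONTEXTS, ORDER))
```

In this toy model, "<s> beer a" and ", beer a" both reduce to the
state "beer a", because no stored n-gram begins with either full
trigram. Under these assumptions, merging them loses nothing, even
though the two histories differ in their third-from-last word.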


Cheers
Adam

On Fri, Feb 5, 2010 at 12:47 PM, Kevin Gimpel <[email protected]> wrote:
> Hey all,
>
> I'm trying to construct a phrase lattice as output from Moses.  I have been
> playing around with "-output-search-graph" and "-verbose 3" and have become
> confused about recombination and how it preserves language model states.
>
> For example, if I translate "hier ist ein bier" from German to English and
> use a 4-gram language model, I see the following lines as part of the output
> when using -output-search-graph:
>
> ...
> 0 hyp=17 stack=1 back=0 score=-5.57705 transition=-5.57705 forward=35
> fscore=-10.8062 covered=3-3 out=beer , pC=0.131725, c=-2.6988
> 0 hyp=18 stack=1 back=0 score=-8.39914 transition=-8.39914 forward=50
> fscore=-11.1884 covered=3-3 out=, beer , pC=-1.81449, c=-5.14484
> ...
> 0 hyp=47 stack=2 back=17 score=-11.4177 transition=-5.84061 forward=173
> fscore=-12.6408 covered=2-2 out=a , pC=-0.318772, c=-1.8764
> 0 hyp=62 stack=2 back=18 score=-13.6186 transition=-5.2195 recombined=47
> forward=173 fscore=-12.6408 covered=2-2 out=a , pC=-0.318772, c=-1.8764
>
> I am surprised that recombination occurs in the last line shown, because
> hypothesis 62 ends in ", beer a" while hypothesis 47 ends in "<s> beer a" --
> causing future hypotheses that come from 47 or 62 to have different 4-gram
> language model probabilities.  I had been thinking that recombination was a
> risk-free pruning method of the search space as described in the Moses
> background page / original Pharaoh paper
> (http://www.statmt.org/moses/?n=Moses.Background), but maybe my assumption
> is obsolete.
>
> I can see a few possibilities here:
> 1. Moses checks all necessary LM probabilities for the given trailing
> trigrams in each hypothesis and determines that the recombination can take
> place safely (e.g., no possible phrases following ", beer a" would give
> lower cost than "<s> beer a").  This is indeed a risk-free strategy.
> 2. Moses only checks the trailing words in the _current_ "hypotheses" when
> deciding to recombine and doesn't look at previous hypotheses. So, 62 would
> recombine with 47 because they both end with "a", regardless of what 17 and
> 18 end in.
> 3. Moses only checks at most the last two words in each hypothesis when
> trying to recombine, regardless of what order of language model is used.
> 4. Something else?
>
> Thanks!
> Kevin
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>

