Hi all,

I've been looking into the extraction code for lexicalized reordering
models, and I can't seem to wrap my head around it. In the phrase-based
lexicalized reordering extraction code, for example, I see that a phrase
is considered monotone wrt the previous phrase iff there is a phrase
consistent with the alignment (and within the phrase size limit) whose
bottom right corner is adjacent to the top left corner of the phrase in
question.

From github, the relevant code for
mosesdecoder/scripts/training/phrase-extract/extract.cpp is:

-----------------------------
 if((connectedLeftTop && !connectedRightTop) ||
...
      ((it = inBottomRight.find(startE - unit)) != inBottomRight.end() &&
       it->second.find(startF-unit) != it->second.end()))
    return LEFT;
------------------------------

(This is assuming 'unit'=1, i.e., we are extracting wrt the previous
phrase.)

So far, so good.  Next, however, a 'swap' orientation is predicted if
there is a phrase whose bottom left corner is adjacent to the bottom left
corner of the current phrase.  That doesn't make sense. Shouldn't it be the
*top right* of the adjacent phrase that is touching the bottom left of the
current phrase?

Again from github, the code here is:
------------------------------
 if((!connectedLeftTop && connectedRightTop) ||
      ((it = inBottomLeft.find(startE - unit)) != inBottomLeft.end() &&
       it->second.find(endF + unit) != it->second.end()))
    return RIGHT;
------------------------------
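
To make the two corner lookups concrete, here is a toy version of just those
tests. It is a sketch, not the Moses code: Corners, Box, addBox, hasCorner and
orient are made-up names, the connectedLeftTop / connectedRightTop tests are
left out, and it assumes (not confirmed by the snippets above) that "bottom"
means a phrase's last target word, "left" its first source word, and "right"
its last source word.

```cpp
#include <map>
#include <set>

// Hypothetical stand-in for the corner index in the quoted snippets:
// target position -> set of source positions.
typedef std::map<int, std::set<int> > Corners;

// A phrase pair as a box: source span [startF, endF], target span [startE, endE].
struct Box { int startF, endF, startE, endE; };

// ASSUMPTION: "bottom" = last target word (endE), "left" = first source
// word (startF), "right" = last source word (endF).
void addBox(const Box& b, Corners& bottomRight, Corners& bottomLeft) {
  bottomRight[b.endE].insert(b.endF);
  bottomLeft[b.endE].insert(b.startF);
}

// True if a corner is recorded at target position e, source position f.
bool hasCorner(const Corners& c, int e, int f) {
  Corners::const_iterator it = c.find(e);
  return it != c.end() && it->second.find(f) != it->second.end();
}

enum Orient { MONO, SWAP, OTHER };

// Only the two corner lookups from the quoted conditions, with unit = 1.
Orient orient(const Box& cur, const Corners& bottomRight,
              const Corners& bottomLeft) {
  const int unit = 1;
  // Some phrase ends at target startE-1 and at source startF-1.
  if (hasCorner(bottomRight, cur.startE - unit, cur.startF - unit)) return MONO;
  // Some phrase ends at target startE-1 and starts at source endF+1.
  if (hasCorner(bottomLeft, cur.startE - unit, cur.endF + unit)) return SWAP;
  return OTHER;
}
```

Under that assumption, the second lookup fires when some phrase ends at target
position startE-1 and begins at source position endF+1, i.e., its bottom left
corner sits diagonally next to the *top right* corner of the current phrase.
For instance, with the current phrase at source 2-3 / target 2, a previous
phrase at source 4-5 / target 0-1 gives SWAP, and one at source 0-1 /
target 0-1 gives MONO.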

If this is correct, I don't understand it. If it isn't, that could explain
why Moses is giving negative weights to my own home-brewed 'swap'
orientation feature (that I extract with another script) -- i.e., my notion
of a 'swap' and Moses's notion of it differ in incompatible ways.

Can anyone advise?

--D.N.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
