Hi Dennis,

Actually, this is not a bug; you have only misinterpreted the variables.
The map inBottomRight stores the positions of the bottom right corners of all extracted phrase pairs. Similarly, inBottomLeft stores the bottom left corners of all extracted phrase pairs. (startF,startE) represents the top left corner of the current phrase pair, while (endF,startE) represents the top right corner.

The orientation of the current phrase pair w.r.t. the previous phrase pair is defined by comparing the top corners of the current phrase pair with the bottom corners of all the other extracted phrase pairs.

Therefore, for a monotone orientation, we check whether there is a phrase pair whose bottom right corner lies at (startF-1,startE-1), i.e. touches the top left corner of the current phrase pair (startF,startE). For a swap orientation, we check whether there is a phrase pair whose bottom left corner lies at (endF+1,startE-1), i.e. touches the top right corner of the current phrase pair (endF,startE).

I hope this clarifies the situation.

Cheers,
Nadi

On Wed, 11 Apr 2012 14:18:20 -0400, Dennis Mehay wrote:
> Hi all,
>
> I've been looking into the extraction code for lexicalized reordering
> models, and I can't seem to wrap my head around it. Looking at the
> phrase-based lexicalized reordering extraction code, e.g., I see that
> a phrase is considered monotone wrt the previous phrase iff there is a
> phrase consistent with the alignment (and within the phrase size
> limit) whose bottom right corner is adjacent to the top left corner of
> the phrase in question.
>
> From github, the relevant code for
> mosesdecoder/scripts/training/phrase-extract/extract.cpp is:
>
> -----------------------------
>
> if((connectedLeftTop && !connectedRightTop) ||
> ...
>
>    ((it = inBottomRight.find(startE - unit)) != inBottomRight.end() &&
>     it->second.find(startF-unit) != it->second.end()))
>   return LEFT;
> ------------------------------
>
> (This is assuming 'unit'=1, i.e., we are extracting wrt the previous
> phrase.)
>
> So far, so good. Next, however, a 'swap' orientation is predicted
> if there is a phrase whose bottom left corner is adjacent to the
> bottom left corner of the current phrase. That doesn't make sense.
> Shouldn't it be the *top right* of the adjacent phrase that is
> touching the bottom left of the current phrase?
>
> Again from github, the code here is:
> ------------------------------
>
> if((!connectedLeftTop && connectedRightTop) ||
>    ((it = inBottomLeft.find(startE - unit)) != inBottomLeft.end() &&
>     it->second.find(endF + unit) != it->second.end()))
>   return RIGHT;
> ------------------------------
>
> If this is correct, I don't understand it. If it isn't, that could
> explain why moses is giving negative weights to my own home-brewed
> 'swap' orientation feature (that I extract with another script) --
> i.e., my notion of a 'swap' and moses's notion of it differ in
> incompatible ways.
>
> Can anyone advise?
>
> --D.N.

--
Nadi Tomeh
Univ. Paris XI, LIMSI-CNRS
LIMSI Bât 508, Bureau 118
BP 133
F-91403 ORSAY CEDEX
Tél : +33/0 1 69 85 80 68
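
For anyone following the thread later, the two corner checks discussed above can be sketched in isolation. This is a simplified illustration, not the actual extract.cpp code: the names CornerMap, recordPhrasePair, hasCorner and orientationWrtPrevious are hypothetical, the connectedLeftTop/connectedRightTop tests are left out, 'unit' is fixed to 1, and it assumes that "bottom" means the last target position (endE) of a phrase pair. As in the quoted snippets, the maps are keyed by the E position first and hold a set of F positions.

```cpp
#include <map>
#include <set>

// Hypothetical simplified type: target (E) position -> set of source (F) positions.
using CornerMap = std::map<int, std::set<int>>;

// LEFT and RIGHT are the return values in the quoted code (monotone and swap);
// UNKNOWN is used here for "neither check fired".
enum Orientation { LEFT, RIGHT, UNKNOWN };

// Record the bottom corners of an extracted phrase pair that covers the
// source span [startF, endF] and ends at target position endE.
// Assumption: the bottom row of a phrase pair is its last target position.
void recordPhrasePair(int startF, int endF, int endE,
                      CornerMap& inBottomRight, CornerMap& inBottomLeft) {
    inBottomRight[endE].insert(endF);   // bottom right corner (endF, endE)
    inBottomLeft[endE].insert(startF);  // bottom left corner (startF, endE)
}

// True if some recorded corner lies exactly at (f, e).
bool hasCorner(const CornerMap& corners, int e, int f) {
    auto it = corners.find(e);
    return it != corners.end() && it->second.count(f) > 0;
}

// Orientation of the phrase pair whose top left corner is (startF, startE)
// and whose top right corner is (endF, startE), w.r.t. the previous phrase.
Orientation orientationWrtPrevious(int startF, int endF, int startE,
                                   const CornerMap& inBottomRight,
                                   const CornerMap& inBottomLeft) {
    const int unit = 1;
    // Monotone: a bottom right corner touches our top left corner.
    if (hasCorner(inBottomRight, startE - unit, startF - unit)) return LEFT;
    // Swap: a bottom left corner touches our top right corner.
    if (hasCorner(inBottomLeft, startE - unit, endF + unit)) return RIGHT;
    return UNKNOWN;
}
```

With a previous phrase pair covering F 0..1 on E row 0, a current pair starting at (startF=2, startE=1) comes out as LEFT. With a previous pair covering F 2..3 on E row 0, a current pair with endF=1 and startE=1 comes out as RIGHT, because the stored bottom left corner (2,0) equals (endF+1,startE-1).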
