Re: [Moses-support] Format of phrase reordering file extract.o.gz

Philipp Koehn Wed, 11 Nov 2009 12:00:41 -0800

Hi,

In training, whether a phrase is labelled swap (with regard to the previous
phrase or the next one) is determined from the alignment points around the phrase.

Slide 112 in this tutorial defines which alignment points are looked at:
http://www.iccs.inf.ed.ac.uk/~pkoehn/publications/tutorial2006.pdf
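
As a rough illustration of this kind of corner check (a sketch only, not the
code in the Moses extractor), the function below looks at the four alignment
points diagonally adjacent to a phrase pair. The name `orientation`, the
(source, target) coordinate convention, taking previous/next along the target
side, and labelling a side "other" when neither or both of its corners are
aligned are all assumptions made for this sketch; the slide remains the
reference for the actual definition.

```python
def orientation(alignment, f_start, f_end, e_start, e_end):
    """Return (previous, next) orientation labels for one phrase pair.

    alignment: set of (source_index, target_index) word alignment points.
    f_start..f_end and e_start..e_end are inclusive source/target spans.
    The corner rule used here is an assumption, not taken from Moses.
    """
    def label(mono_corner, swap_corner):
        mono = mono_corner in alignment
        swap = swap_corner in alignment
        if mono and not swap:
            return "mono"
        if swap and not mono:
            return "swap"
        return "other"

    # Previous side: corners in the target row just before the phrase.
    prev_label = label((f_start - 1, e_start - 1), (f_end + 1, e_start - 1))
    # Next side: corners in the target row just after the phrase.
    next_label = label((f_end + 1, e_end + 1), (f_start - 1, e_end + 1))
    return prev_label, next_label


# Three one-word phrases; we label the middle one (source 1, target 1).
monotone = {(0, 0), (1, 1), (2, 2)}   # same order on both sides
inverted = {(0, 2), (1, 1), (2, 0)}   # target order is the reverse
isolated = {(1, 1)}                   # no aligned neighbours at all

print(orientation(monotone, 1, 1, 1, 1))
print(orientation(inverted, 1, 1, 1, 1))
print(orientation(isolated, 1, 1, 1, 1))
```

Under these assumptions the middle phrase of the inverted example is swapped
on both sides, while the isolated phrase gets "other" twice.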

So, yes, "swap swap" is possible: it happens when a sequence of
phrases is in inverse order, so that a phrase is swapped with respect to
both its previous and its next neighbour.
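
To see how often each pair of labels occurs in practice, the label field can
be tallied directly instead of grepping whole lines. This is a minimal sketch
that assumes the three-field "source ||| target ||| label label" layout quoted
further down in this thread; `count_orientations` is a hypothetical helper
name, not part of Moses.

```python
from collections import Counter


def count_orientations(lines):
    """Count (previous, next) label pairs in extract.o-style lines."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split(" ||| ")
        if len(fields) != 3:
            continue  # skip lines that do not have the assumed layout
        labels = tuple(fields[2].split())
        if len(labels) == 2:
            counts[labels] += 1
    return counts


sample = [
    "reanudación ||| resumption ||| mono mono\n",
    "reanudación del ||| resumption of the ||| mono mono\n",
    "este ||| this ||| swap other\n",
]
print(count_orientations(sample))

# On a real file (path assumed), the same function can read the gzip directly:
#   import gzip
#   with gzip.open("extract.o.gz", "rt", encoding="utf-8") as handle:
#       print(count_orientations(handle))
```

Because only the third field is inspected, a phrase whose text happens to
contain the words "swap" or "mono" is not miscounted.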

-phi

On Wed, Nov 11, 2009 at 7:49 PM, John DeNero <[email protected]> wrote:
> Thanks, Chris.  Just to clarify, am I interpreting the following cases
> correctly, where P is the phrase pair in question and X are word
> alignments in neighboring corners, and the source goes left to right?
>
> The "mono swap" case:
> $ zcat extract.o.gz | grep "mono swap"  | wc -l
> 41043
>
> X X
>  P
>
> The "swap swap" case:
> $ zcat extract.o.gz | grep "swap swap"  | wc -l
> 61745
>
>  X
>  P
> X
>
>
> The "swap mono" case:
> $ zcat extract.o.gz | grep "swap mono"  | wc -l
> 50403
>
>  P
> X X
>
> On Wed, Nov 11, 2009 at 11:39 AM, Chris Dyer <[email protected]> wrote:
>> Hi John-
>> The first label is the orientation of the phrase pair with respect to
>> its left context (on the source side), and the second is the
>> orientation with respect to its right context.  That's why you have to
>> have "swap other" or "other swap", since a phrase can only be inverted
>> on one side.
>> Hope this helps,
>> Chris
>>
>> On Wed, Nov 11, 2009 at 2:34 PM, John DeNero <[email protected]> wrote:
>>> Hi all,
>>>
>>> I'm trying to generate a replacement phrase extraction file to be used
>>> in estimating a lexical reordering model.  I'm running
>>> train-factored-phrase-model.perl with the "-reordering
>>> msd-bidirectional-fe" flag, which generates an extract.o.gz file with
>>> content like:
>>>
>>> reanudación ||| resumption ||| mono mono
>>> reanudación del ||| resumption of the ||| mono mono
>>> ...
>>> este ||| this ||| swap other
>>> ...
>>>
>>> I understand that the mono, swap, and other tags correspond to the
>>> "(m) monotone order, (s) switch with previous phrase, or (d)
>>> discontinuous" types described in the online Moses docs.  I don't
>>> really understand what the two different tags correspond to, though.
>>> What does the first entry vs. the second entry mean in each line?
>>> Apologies if this is explained somewhere in the docs or mailing list
>>> archives -- I didn't find it.
>>>
>>> Thanks,
>>> John
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>
>
