Re: [Moses-support] Format of phrase reordering file extract.o.gz

John DeNero Wed, 11 Nov 2009 12:54:11 -0800

Hi Philipp & Chris,

Thanks for the help so far.  One more question about a special case:
What is the reordering of the first source phrase with regard to the
previous phrase? Always mono?  (Another reasonable policy might be
mono if aligned to the first target phrase and discontinuous
otherwise.)  Does the last source phrase with regard to the following
context have the same policy?  If you don't know off the top of your
head, I'll dig into the data and figure it out.


Thanks,
John

On Wed, Nov 11, 2009 at 11:59 AM, Philipp Koehn <[email protected]> wrote:
> Hi,
>
> the determination in training, whether a phrase is swap (with regard to
> previous phrase or next) is based on alignment points around the phrase.
>
> Slide 112 in this tutorial defines which alignment points are looked at:
> http://www.iccs.inf.ed.ac.uk/~pkoehn/publications/tutorial2006.pdf
>
> So, yes, swap swap is possible - it happens if a sequence of
> phrases is in inverse order.
>
> -phi
>
> On Wed, Nov 11, 2009 at 7:49 PM, John DeNero <[email protected]> wrote:
>> Thanks, Chris.  Just to clarify, am I interpreting the following cases
>> correctly, where P is the phrase pair in question and X are word
>> alignments in neighboring corners, and the source goes left to right?
>>
>> The "mono swap" case:
>> $ zcat extract.o.gz | grep "mono swap"  | wc -l
>> 41043
>>
>> X X
>>  P
>>
>> The "swap swap" case:
>> $ zcat extract.o.gz | grep "swap swap"  | wc -l
>> 61745
>>
>>  X
>>  P
>> X
>>
>>
>> The "swap mono" case:
>> $ zcat extract.o.gz | grep "swap mono"  | wc -l
>> 50403
>>
>>  P
>> X X
>>
>> On Wed, Nov 11, 2009 at 11:39 AM, Chris Dyer <[email protected]> wrote:
>>> Hi John-
>>> The first label is the orientation of the phrase pair with respect to
>>> its left context (on the source side), and the second is the
>>> orientation with respect to its right context.  That's why you have to
>>> have "swap other" or "other swap", since a phrase can only be inverted
>>> on one side.
>>> Hope this helps,
>>> Chris
>>>
>>> On Wed, Nov 11, 2009 at 2:34 PM, John DeNero <[email protected]> wrote:
>>>> Hi all,
>>>>
>>>> I'm trying to generate a replacement phrase extraction file to be used
>>>> in estimating a lexical reordering model.  I'm running
>>>> train-factored-phrase-model.perl with the "-reordering
>>>> msd-bidirectional-fe" flag, which generates an extract.o.gz file with
>>>> content like:
>>>>
>>>> reanudación ||| resumption ||| mono mono
>>>> reanudación del ||| resumption of the ||| mono mono
>>>> ...
>>>> este ||| this ||| swap other
>>>> ...
>>>>
>>>> I understand that the mono, swap, and other tags correspond to the
>>>> "(m) monotone order, (s) switch with previous phrase, or (d)
>>>> discontinuous" types described in the online Moses docs.  I don't
>>>> really understand what the two different tags correspond to, though.
>>>> What does the first entry vs. the second entry mean in each line?
>>>> Apologies if this is explained somewhere in the docs or mailing list
>>>> archives -- I didn't find it.
>>>>
>>>> Thanks,
>>>> John
>>>>
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> [email protected]
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
>>>
>>
>>
>
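
For anyone tallying orientation pairs as in the zcat/grep counts quoted
above, here is a minimal Python sketch. It assumes every line has the
three-field "source ||| target ||| left right" layout shown in the original
post; the function names (parse_line, count_orientation_pairs) are made up
for illustration and are not part of Moses. Unlike a substring grep, it
compares the two labels exactly, so "mono swap" cannot be matched inside
some other field.

```python
from collections import Counter


def parse_line(line):
    """Split one extract.o line into (source, target, left, right).

    Assumes the "source ||| target ||| left right" layout shown in the
    thread. Returns None for lines that do not fit (e.g. a bare "...").
    """
    fields = [field.strip() for field in line.split("|||")]
    if len(fields) != 3:
        return None
    labels = fields[2].split()
    if len(labels) != 2:
        return None
    return fields[0], fields[1], labels[0], labels[1]


def count_orientation_pairs(lines):
    """Count (left-context label, right-context label) pairs."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip anything that is not a well-formed entry
        _, _, left, right = parsed
        counts[(left, right)] += 1
    return counts


# Sample lines copied from the original post.
sample = [
    "reanudación ||| resumption ||| mono mono",
    "reanudación del ||| resumption of the ||| mono mono",
    "...",
    "este ||| this ||| swap other",
]
counts = count_orientation_pairs(sample)
for (left, right), n in sorted(counts.items()):
    print(left, right, n)
```

To run it on the real file, the lines could be read with
gzip.open("extract.o.gz", "rt", encoding="utf-8") from the standard library
and passed straight to count_orientation_pairs.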
