Dear Marcin,

the short answer is: you need to avoid the blow-up.
The options that affect pruning during the creation of translation
options are:

-ttable-limit ... how many variants of a phrase to read from the
   phrase table.
-max-partial-trans-opt ... how many partial translation options are
   considered for a span. This is the critical pruning to contain the
   blow-up in memory.
-max-trans-opt-per-coverage ... how many finished options should then
   be passed to the search.
-translation-option-threshold ... the same limit as the previous one,
   but expressed relative to the score of the best option.

(An example invocation that sets all four options is sketched at the
very bottom of this message.)

If you set up the model so that it does blow up, but keep your machine
from thrashing by setting -max-partial-trans-opt reasonably low, you
are very likely to get a lot of search errors, because the pruning of
translation options then happens too early, without the linear context
of the surrounding translation options. Moses simply does not have
good means to handle the combinatorics of factored models.

Cheers, Ondrej.

On 06/10/2012 06:40 PM, Marcin Junczys-Dowmunt wrote:
> Hi,
> by the way, are there some best-practice decoder settings for heavily
> factored models with combinatorial blow-up? If I am not wrong, most
> settings affect hypothesis recombination later on. Here the heavy work
> happens during the creation of target phrases and the future score
> calculation, before the actual translation.
> Best,
> Marcin
>
> On 09.06.2012 16:45, Philipp Koehn wrote:
>> Hi,
>>
>> the idea here was to create a link between the
>> words and POS tags early on and to use this as
>> an additional scoring function. But if you see better
>> performance with your setting, please report back.
>>
>> -phi
>>
>> On Fri, Jun 8, 2012 at 6:03 PM, Marcin Junczys-Dowmunt
>> <[email protected]> wrote:
>>> Hi all,
>>> I have a question concerning the "Tutorial for Using Factored Models",
>>> section on "Train a morphological analysis and generation model".
>>>
>>> The following translation factors and generation factors are trained
>>> for the given example corpus:
>>>
>>> --translation-factors 1-1+3-2 \
>>> --generation-factors 1-2+1,2-0 \
>>> --decoding-steps t0,g0,t1,g1
>>>
>>> What is the advantage of using the first generation factor 1-2
>>> compared to the configuration below?
>>>
>>> --translation-factors 1-1+3-2 \
>>> --generation-factors 1,2-0 \
>>> --decoding-steps t0,t1,g1
>>>
>>> I understand that the 1-2 generation factor maps lemmas to POS+morph
>>> information, but the same information is also generated by the 3-2
>>> translation factor. Apart from that, this generation factor introduces
>>> a huge combinatorial blow-up, since every lemma can be mapped to
>>> basically every possible morphological analysis seen for this lemma.

--
Ondrej Bojar (mailto:[email protected] / [email protected])
http://www.cuni.cz/~obo

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
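To make the four pruning options above concrete, here is a minimal
sketch of a decoder invocation that sets all of them. The option names
are the ones discussed in the message; the file names and all numeric
values are illustrative assumptions only, not tuned recommendations:

   # Hypothetical invocation of the moses decoder; moses.ini,
   # input.txt and all numeric values below are placeholders
   # that would need to be tuned for a given factored model.
   moses -f moses.ini \
         -ttable-limit 20 \
         -max-partial-trans-opt 10000 \
         -max-trans-opt-per-coverage 50 \
         -translation-option-threshold 0.0001 \
         < input.txt > output.txt

Of the four, -max-partial-trans-opt is the one to lower first when
memory is the problem, since it caps the intermediate expansion for a
span before the other limits apply.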
