Hi Ondrej,
The blow-up is happening in "DecodeStepGeneration::Process(...)", right? 
If I understand the code correctly from a first glance, all possibilities 
are simply multiplied. And indeed, there seems to be no way to limit the 
number of combinations in this step. Could something like Cube-Pruning 
work here to limit the number of options right from the beginning?
Best,
Marcin

On 10.06.2012 19:02, Ondrej Bojar wrote:
> Dear Marcin,
>
> the short answer is: you need to avoid the blow-up.
>
> The options that affect pruning during creation of translation options are:
>
> -ttable-limit ...how many variants of a phrase to read from the phrase
> table.
>
> -max-partial-trans-opt ...how many partial translation options are
> considered for a span. This is the critical pruning to contain the
> blow-up in memory.
>
> -max-trans-opt-per-coverage ...how many finished options should then be
> passed to the search.
> -translation-option-threshold ...the same thing, but expressed relative
> to the score of the best one.
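>
> For example (the values below are only illustrative and have to be tuned
> for the model at hand):
>
>   moses -f moses.ini \
>     -ttable-limit 20 \
>     -max-partial-trans-opt 10000 \
>     -max-trans-opt-per-coverage 50 \
>     -translation-option-threshold 0.0001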
>
> If the model is set up so that it does blow up, and you keep your machine
> from thrashing by setting -max-partial-trans-opt reasonably low, you are
> very likely to get a lot of search errors, because the translation options
> are then pruned too early, without the linear context of the surrounding
> translation options. Moses simply does not have good means to handle the
> combinatorics of factored models.
>
> Cheers, Ondrej.
>
> On 06/10/2012 06:40 PM, Marcin Junczys-Dowmunt wrote:
>> Hi,
>> by the way, are there some best-practice decoder settings for heavily
>> factored models with combinatorial blow-up? If I am not wrong, most
>> settings affect hypothesis recombination later on. Here the heavy work
>> happens during the creation of target phrases and future score
>> calculation before the actual translation.
>> Best,
>> Marcin
>>
>> On 09.06.2012 16:45, Philipp Koehn wrote:
>>> Hi,
>>>
>>> the idea here was to create a link between the
>>> words and POS tags early on and use this as
>>> an additional scoring function. But if you see better
>>> performance with your setting, please report back.
>>>
>>> -phi
>>>
>>> On Fri, Jun 8, 2012 at 6:03 PM, Marcin Junczys-Dowmunt
>>> <[email protected]> wrote:
>>>> Hi all,
>>>> I have a question concerning the "Tutorial for Using Factored Models",
>>>> section on "Train a morphological analysis and generation model".
>>>>
>>>> The following translation factors and generation factors are trained
>>>> for the given example corpus:
>>>>
>>>> --translation-factors 1-1+3-2 \
>>>> --generation-factors 1-2+1,2-0 \
>>>> --decoding-steps t0,g0,t1,g1
>>>>
>>>> What is the advantage of using the first generation factor 1-2 compared
>>>> to the configuration below?
>>>>
>>>> --translation-factors 1-1+3-2 \
>>>> --generation-factors 1,2-0 \
>>>> --decoding-steps t0,t1,g1
>>>>
>>>> I understand the 1-2 generation factor maps lemmas to POS+morph
>>>> information, but the same information is also generated by the 3-2
>>>> translation factor. Apart from that, this generation factor introduces a
>>>> huge combinatorial blow-up, since every lemma can be mapped to basically
>>>> every POS+morph combination seen for that lemma.
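>>>> (E.g. with 4 lemmas in a phrase and on average 10 POS+morph variants per
>>>> lemma, this single generation step alone already produces 10^4 = 10000
>>>> candidate expansions for that phrase.)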
>>>
>>
>>
>


-- 
dr inż. Marcin Junczys-Dowmunt
Adam Mickiewicz University
Faculty of Mathematics and Computer Science
ul. Umultowska 87
61-614 Poznań
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
