Hi Ondrej,
thank you for the literature pointers. I am trying to build the best
possible baseline system for English-to-Polish SMT using factors. So far
it has been rather frustrating, but from what I have seen in the
literature that is not really surprising. Later I hope to come up with
something clever to improve upon it.

Among the systems you have built to address Czech morphology in
English-to-Czech MT, which one would you recommend as such a baseline?
Best,
Marcin

On 10.06.2012 22:16, Ondrej Bojar wrote:
> Hi, Marcin,
>
> yes, the root of the trouble is that all possibilities are multiplied.
> Cube pruning can be considered "just a clever speedup" (a very clever one,
> of course), but I think implementing a similar thing would not be very
> useful here. This of course depends on what you actually use the factors
> for, but if you use them for morphology or anything that has to get
> support for a particular choice in the context, you can't avoid the
> problem.
>
> Consider a noun: using a few factored steps, you can easily produce
> translation options for all potential cases. But without the context of
> the preceding verb, preposition or e.g. adjective, you can't pick the
> correct one. So pruning of the translation options for the noun is
> likely to prevent you from getting the agreement right. I've run into
> this issue a few times already (most recently this year,
> http://aclweb.org/anthology-new/W/W12/W12-3130.pdf) and I've tried
> circumventing it using a two-step approach, which postpones the
> morphological explosion to a separate search (where lemmas are already
> chosen). Needless to say, Alex Fraser (in the follow-up work of
> http://www.statmt.org/wmt09/pdf/WMT-0920.pdf) was somewhat more successful.
>
> So you don't want to just limit the number of options; what you actually
> want is to select the good ones...
>
> O.
>
> On 06/10/2012 08:21 PM, Marcin Junczys-Dowmunt wrote:
>> Hi Ondrej,
>> The blow-up is happening in "DecodeStepGeneration::Process(...)", right?
>> If I understand the code correctly from a first glance, all possibilities
>> are simply multiplied. And indeed, there seems to be no way to limit the
>> number of combinations in this step. Could something like cube pruning
>> work here to limit the number of options right from the beginning?
>> Best,
>> Marcin
>>
>> On 10.06.2012 19:02, Ondrej Bojar wrote:
>>> Dear Marcin,
>>>
>>> the short answer is: you need to avoid the blow-up.
>>>
>>> The options that affect pruning during creation of translation
>>> options are:
>>>
>>> -ttable-limit ...how many variants of a phrase to read from the phrase
>>> table
>>>
>>> -max-partial-trans-opt ...how many partial translation options are
>>> considered for a span. This is the critical pruning to contain the
>>> blowup in memory.
>>>
>>> -max-trans-opt-per-coverage ...how many finished options should then be
>>> passed to the search.
>>> -translation-option-threshold ...the same thing, but expressed relative
>>> to the score of the best one.
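>>>
>>> Just as an illustration (the values below are placeholders to tune, not
>>> recommendations), all of these can be given on the decoder command line
>>> or set in moses.ini:
>>>
>>> moses -f moses.ini \
>>>    -ttable-limit 20 \
>>>    -max-partial-trans-opt 5000 \
>>>    -max-trans-opt-per-coverage 50 \
>>>    -translation-option-threshold 0.0001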
>>>
>>> If you set up the model so that it does blow up, but keep
>>> -max-partial-trans-opt reasonably low so that you don't thrash your
>>> machine, you are very likely to get a lot of search errors, because the
>>> pruning of translation options happens too early, without the linear
>>> context of the surrounding translation options. Moses simply does not
>>> have good means to handle the combinatorics of factored models.
>>>
>>> Cheers, Ondrej.
>>>
>>> On 06/10/2012 06:40 PM, Marcin Junczys-Dowmunt wrote:
>>>> Hi,
>>>> by the way, are there some best-practice decoder settings for heavily
>>>> factored models with combinatorial blow-up? If I am not mistaken, most
>>>> settings affect hypothesis recombination later on. Here, the heavy work
>>>> happens during the creation of target phrases and the future-score
>>>> calculation, before the actual translation.
>>>> Best,
>>>> Marcin
>>>>
>>>> On 09.06.2012 16:45, Philipp Koehn wrote:
>>>>> Hi,
>>>>>
>>>>> the idea here was to create a link between the
>>>>> words and POS tags early on and use this as
>>>>> an additional scoring function. But if you see better
>>>>> performance with your setting, please report back.
>>>>>
>>>>> -phi
>>>>>
>>>>> On Fri, Jun 8, 2012 at 6:03 PM, Marcin Junczys-Dowmunt
>>>>> <[email protected]> wrote:
>>>>>> Hi all,
>>>>>> I have a question concerning the "Tutorial for Using Factored
>>>>>> Models",
>>>>>> section on "Train a morphological analysis and generation model".
>>>>>>
>>>>>> The following translation factors and generation factors are trained
>>>>>> for
>>>>>> the given example corpus:
>>>>>>
>>>>>> --translation-factors 1-1+3-2 \
>>>>>> --generation-factors 1-2+1,2-0 \
>>>>>> --decoding-steps t0,g0,t1,g1
>>>>>>
>>>>>> What is the advantage of using the first generation factor 1-2
>>>>>> compared
>>>>>> to the configuration below?
>>>>>>
>>>>>> --translation-factors 1-1+3-2 \
>>>>>> --generation-factors 1,2-0 \
>>>>>> --decoding-steps t0,t1,g1
>>>>>>
>>>>>> I understand that the 1-2 generation factor maps lemmas to POS+morph
>>>>>> information, but the same information is also produced by the 3-2
>>>>>> translation factor. Apart from that, this generation factor introduces a
>>>>>> huge combinatorial blow-up, since every lemma can be mapped to basically
>>>>>> every morphological analysis seen for that lemma.
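>>>>>>
>>>>>> For concreteness, my reading of the tutorial setup is the following
>>>>>> (the factor layout is my assumption, inferred from the indices above):
>>>>>>
>>>>>> # assumed factors: source 1 = lemma, source 3 = morphological tag;
>>>>>> #                  target 0 = surface, 1 = lemma, 2 = POS+morph
>>>>>> # t0: 1-1    source lemma -> target lemma
>>>>>> # g0: 1-2    target lemma -> target POS+morph   (the step in question)
>>>>>> # t1: 3-2    source tag   -> target POS+morph
>>>>>> # g1: 1,2-0  target lemma + POS+morph -> target surface form
>>>>>> --decoding-steps t0,g0,t1,g1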
>>>>>
>>>>
>>>>
>>>
>>
>>
>


-- 
dr inż. Marcin Junczys-Dowmunt
Uniwersytet im. Adama Mickiewicza
Wydział Matematyki i Informatyki
ul. Umultowska 87
61-614 Poznań
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
