Re: [Moses-support] Tuning and decoding of lattices in the new Moses.

Hieu Hoang Fri, 06 Sep 2013 15:26:58 -0700

Good to know. I don't think it's obvious that you need that switch for lattice 
input. Maybe there should be a check of some sort in the mert script.
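
A minimal sketch of what such a check could look like. It is written in Python for illustration only (mert-moses.pl itself is Perl), and the function and argument names are hypothetical, not anything that exists in the script. It assumes the decoder convention used in this thread, where inputtype 2 means lattice input.

```python
def check_lattice_filtering(inputtype, filter_phrase_table):
    """Return a warning string if lattice input is combined with filtering.

    Hypothetical helper, not part of mert-moses.pl. `inputtype` follows the
    decoder convention from this thread (2 = lattice). `filter_phrase_table`
    is False when the user passed --no-filter-phrase-table.
    """
    lattice = 2
    if inputtype == lattice and filter_phrase_table:
        return ("lattice input (inputtype 2) with phrase-table filtering "
                "enabled; rerun with --no-filter-phrase-table")
    return None


# Lattice tuning without the flag produces the warning.
print(check_lattice_filtering(2, True))
# Sentence input, or lattice input with filtering disabled, passes silently.
print(check_lattice_filtering(0, True))
print(check_lattice_filtering(2, False))
```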


Sent while bumping into things

On 6 Sep 2013, at 15:42, Yulia Tsvetkov <[email protected]> wrote:

> Hi Hieu, 
> 
> A quick update: I should have used the --no-filter-phrase-table flag; 
> otherwise the phrase table gets filtered. Thanks a lot for your help!
> 
> Yulia
> 
> 
> On Wed, Sep 4, 2013 at 12:34 PM, Hieu Hoang <[email protected]> wrote:
>> Ok. If you're still stuck, please send me your phrase table and I'll try to 
>> debug it.
>> 
>> Sent while bumping into things
>> 
>> On 4 Sep 2013, at 17:07, Yulia Tsvetkov <[email protected]> wrote:
>> 
>>> phrase table is not empty, it looks normal, here is the snippet:
>>> 
>>> no one ||| aucun de ceux ||| 1 0.00157474 0.0060241 5.00684e-06 ||| 0-0 0-1 1-2 ||| 1 166 1
>>> no one ||| ce que personne ||| 0.5 3.7494e-05 0.0060241 5.89199e-06 ||| 0-0 1-2 ||| 2 166 1
>>> no one ||| il que personne ||| 1 9.31515e-05 0.0060241 1.11289e-05 ||| 0-0 1-2 ||| 1 166 1
>>> no one ||| n'est pas le seul ||| 0.0714286 0.0073779 0.0060241 4.54759e-07 ||| 0-0 0-1 1-3 ||| 14 166 1
>>> no one ||| on ne ||| 0.00444444 0.000152764 0.0060241 0.000497078 ||| 1-0 0-1 ||| 225 166 1
>>> no one ||| pas ||| 6.5066e-05 0.000267155 0.0060241 0.294497 ||| 0-0 ||| 15369 166 1
>>> 
>>> I don't filter the phrase table...
>>> 
>>> I'll debug more, and Chris was going to look at it too. I will send you an 
>>> update.
>>> 
>>> Thanks!
>>> 
>>> Yulia
>>> 
>>> 
>>> 
>>> On Wed, Sep 4, 2013 at 10:41 AM, Hieu Hoang <[email protected]> wrote:
>>>> Hmm, strange. The moses.ini file looks ok. There shouldn't be an issue 
>>>> with initialisation. Is the phrase-table empty? 
>>>> 
>>>> Make sure you're not filtering the phrase table; I don't think the filter 
>>>> script understands lattices.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 4 September 2013 15:10, Yulia Tsvetkov <[email protected]> wrote:
>>>>> Hi Hieu,
>>>>> 
>>>>>> did you manage to get moses working with lattices again? it would be 
>>>>>> nice to get some feedback
>>>>> 
>>>>> Sorry for not sending feedback earlier -- I was just trying to debug by 
>>>>> myself before sending feedback or asking the next question...
>>>>> 
>>>>> I was able to run a pipeline with the new settings, thanks a lot for the 
>>>>> detailed answer!
>>>>> 
>>>>> There is still a problem (with feature initialization?). Here is the 
>>>>> first lattice translation. It looks like all input words are treated as 
>>>>> OOVs (and they are not), and then MERT gets killed:
>>>>> 
>>>>> BEST TRANSLATION: no|UNK|UNK|UNK one|UNK|UNK|UNK of|UNK|UNK|UNK 
>>>>> the|UNK|UNK|UNK intense|UNK|UNK|UNK closures|UNK|UNK|UNK of|UNK|UNK|UNK 
>>>>> travel|UNK|UNK|UNK and|UNK|UNK|UNK one|UNK|UNK|UNK of|UNK|UNK|UNK 
>>>>> the|UNK|UNK|UNK delights|UNK|UNK|UNK of|UNK|UNK|UNK 
>>>>> ethnographic|UNK|UNK|UNK research|UNK|UNK|UNK is|UNK|UNK|UNK 
>>>>> the|UNK|UNK|UNK opportunity|UNK|UNK|UNK to|UNK|UNK|UNK live|UNK|UNK|UNK 
>>>>> amongst|UNK|UNK|UNK those|UNK|UNK|UNK who|UNK|UNK|UNK have|UNK|UNK|UNK 
>>>>> not|UNK|UNK|UNK forgotten|UNK|UNK|UNK the|UNK|UNK|UNK old|UNK|UNK|UNK 
>>>>> ways|UNK|UNK|UNK to|UNK|UNK|UNK still|UNK|UNK|UNK feel|UNK|UNK|UNK 
>>>>> their|UNK|UNK|UNK pass|UNK|UNK|UNK in|UNK|UNK|UNK the|UNK|UNK|UNK 
>>>>> when|UNK|UNK|UNK touch|UNK|UNK|UNK and|UNK|UNK|UNK stones|UNK|UNK|UNK 
>>>>> caused|UNK|UNK|UNK by|UNK|UNK|UNK rain|UNK|UNK|UNK tasted|UNK|UNK|UNK 
>>>>> leaves|UNK|UNK|UNK of|UNK|UNK|UNK the|UNK|UNK|UNK bitter|UNK|UNK|UNK 
>>>>> plants|UNK|UNK|UNK 
>>>>> [1111111111111111111111111111111111111111111111111111111111111]  
>>>>> [total=-6405.459] 
>>>>> core=(-6100.000,-50.000,61.000,0.000,0.000,0.000,0.000,-8.000,-1952.355,0.000)
>>>>>   
>>>>> Line 0: Translation took 0.000 seconds total
>>>>> Translating line 1  in thread id 47061808453376
>>>>> sh: line 1:  7333 Killed                  
>>>>> /home/ytsvetko/tools/mosesdecoder/bin/moses -config filtered/moses.ini 
>>>>> -inputtype 2 -weight-overwrite 'InputFeature0= 0.066667 PhrasePenalty0= 
>>>>> 0.066667 WordPenalty0= -0.333333 TranslationModel0= 0.066667 0.066667 
>>>>> 0.066667 0.066667 Distortion0= 0.100000 LM0= 0.166667' -n-best-list 
>>>>> run1.best100.out 100 -input-file 
>>>>> /share/workhorse4/ytsvetko/projects/mt_proj/mt_eval/baselines/fr-base-1-lats/tuning/corpus.en
>>>>>  > run1.out
>>>>> Exit code: 137
>>>>> The decoder died. CONFIG WAS -weight-overwrite 'InputFeature0= 0.066667 
>>>>> PhrasePenalty0= 0.066667 WordPenalty0= -0.333333 TranslationModel0= 
>>>>> 0.066667 0.066667 0.066667 0.066667 Distortion0= 0.100000 LM0= 0.166667' 
>>>>> 
>>>>> I attach my config file, and here is the exact command that I am 
>>>>> executing:
>>>>> 
>>>>> mert-moses.pl ./tuning/corpus.en ./tuning/corpus.fr 
>>>>> /home/ytsvetko/tools/mosesdecoder/bin/moses ./moses.ini --working-dir 
>>>>> ./tuning --mertdir /home/ytsvetko/tools/mosesdecoder/mert --inputtype 2
>>>>> 
>>>>> 
>>>>> Thanks a lot for your help!
>>>>> Yulia
>>>>> 
>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 2 September 2013 17:03, Hieu Hoang <[email protected]> wrote:
>>>>>>> Hi Yulia
>>>>>>> 
>>>>>>> 
>>>>>>> On 1 September 2013 22:46, Yulia Tsvetkov <[email protected]> 
>>>>>>> wrote:
>>>>>>>> Dear Moses developers, 
>>>>>>>> 
>>>>>>>> I am trying to use the new version of Moses. It seems things have 
>>>>>>>> changed quite a bit, and I am having a hard time finding up-to-date 
>>>>>>>> documentation. For debugging I used very small train/tune/test corpora 
>>>>>>>> (10 lines each).
>>>>>>>> 
>>>>>>>> First, running the following command produces a phrase table 
>>>>>>>> with only 4 features:
>>>>>>>> train-model.perl --root-dir $root_dir --corpus $root_dir/$corpus_name  
>>>>>>>> --f $src_lng --e $trg_lng --alignment grow-diag-final --lm 0:3:$LM 
>>>>>>>> -external-bin-dir $external_bin_dir
>>>>>>>> 
>>>>>>>> Here is a snippet from the produced moses.ini:
>>>>>>>> PhraseDictionaryMemory name=TranslationModel0 table-limit=20 num-features=4 
>>>>>>>> path=/usr1/projects/mt_proj/mt_eval/baselines/fr-base-1-lats/model/phrase-table.gz 
>>>>>>>> input-factor=0 output-factor=0
>>>>>>> 
>>>>>>> Yes, the phrase-table now has 4 scores instead of 5. The 5th score was 
>>>>>>> a constant 2.718. This has now moved into its own feature function, 
>>>>>>> PhrasePenalty.
>>>>>>> 
>>>>>>> It saves 3% of disk space, and I think it is better for research, e.g. 
>>>>>>> creating better, non-constant phrase penalty feature functions, or asking 
>>>>>>> whether we need just 1 phrase penalty if we have 2 phrase tables.
>>>>>>> 
>>>>>>>> 
>>>>>>>> Second, I am trying to run tuning and decoding of lattices in plf 
>>>>>>>> format.
>>>>>>>> Can you point me to example commands and moses.ini for running mert 
>>>>>>>> and decoding lattices with the new Moses?
>>>>>>> 
>>>>>>> an example ini file for lattices can be seen here
>>>>>>>    
>>>>>>> https://github.com/moses-smt/moses-regression-tests/blob/master/tests/phrase.lattice-surface/moses.ini
>>>>>>> 
>>>>>>> Mert should run as it always has. However, if you upgrade the 
>>>>>>> decoder, you should use the upgraded mert script too.
>>>>>>> 
>>>>>>> Decoding with a lattice is exactly the same as for a sentence, except 
>>>>>>> for 2 things:
>>>>>>>    1. inputtype=2. This can be on the command line or in the ini file, 
>>>>>>> eg.
>>>>>>>            ./moses -inputtype 2
>>>>>>> 
>>>>>>>        or
>>>>>>>             [inputtype]
>>>>>>>             2
>>>>>>> 
>>>>>>>    2. You should use the InputFeature feature function. This is the 
>>>>>>> score of the path through the lattice. You can see the InputFeature in 
>>>>>>> the ini file:
>>>>>>>       [feature]
>>>>>>>       ....      
>>>>>>>       InputFeature num-features=1 num-input-features=1 real-word-count=0
>>>>>>>   
>>>>>>>       [weight]
>>>>>>>       ...
>>>>>>>       InputFeature0 = 1
>>>>>>> 
>>>>>>>    Before the refactoring, this was hacked in as an extra feature in 
>>>>>>> the phrase-table.
>>>>>>>>  
>>>>>>>> So far I tried training and tuning on text files and decoding on 
>>>>>>>> lattices because I could not figure out the right settings for tuning.
>>>>>>>> According to some old documentation I am supposed to convert the 
>>>>>>>> phrase table to a binary format. Is it still needed?
>>>>>>> 
>>>>>>> You no longer need to convert it to binary format. It's good to convert 
>>>>>>> to binary format to save memory, but it is not required. Lattice 
>>>>>>> decoding works with all phrase-table implementations now.
>>>>>>>> 
>>>>>>>> When I ran it with the following command:
>>>>>>>> moses -inputtype 2 -weight-i 0.62 -weight-l 12.5 -f 
>>>>>>>> $tune_dir/moses.ini < $eval_dir/69.plf > $eval_dir/69.plf.out
>>>>>>>> I got an error:
>>>>>>>> Don't mix old and new ini file format
>>>>>>>> What is the new equivalent of weight-i and weight-l?
>>>>>>>  
>>>>>>>    -weight-i 0.62
>>>>>>> now becomes
>>>>>>>    -weight-overwrite 'InputFeature0= 0.62'
>>>>>>> 
>>>>>>>   -weight-l 12.5
>>>>>>> now becomes
>>>>>>>    -weight-overwrite 'LM0= 12.5'
>>>>>>> 
>>>>>>> The updated mert script should be doing this anyway. 
>>>>>>>> 
>>>>>>>> Without those parameters I get a Segmentation Fault with both a .gz 
>>>>>>>> and a binary phrase table.
>>>>>>> 
>>>>>>> If you're still having problems, give me your ini file and the exact 
>>>>>>> command you're executing and I'll try to debug it.
>>>>>>>> 
>>>>>>>> Could you help me figure out the right settings?
>>>>>>>> 
>>>>>>>> Thanks in advance.
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> Moses-support mailing list
>>>>>>>> [email protected]
>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -- 
>>>>>>> Hieu Hoang
>>>>>>> Research Associate
>>>>>>> University of Edinburgh
>>>>>>> http://www.hoang.co.uk/hieu
>
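
For reference, the old-to-new weight flag mapping quoted above (-weight-i becomes InputFeature0, -weight-l becomes LM0) can be sketched as a small helper that builds the -weight-overwrite argument. This is an illustration only: the function name and the dictionary are hypothetical, and it covers just the two mappings given in this thread rather than guessing at the others.

```python
# Old weight flags and the new feature names they map to, as given in this
# thread. Other old flags are deliberately left out rather than guessed.
OLD_TO_NEW = {
    "-weight-i": "InputFeature0",
    "-weight-l": "LM0",
}


def weight_overwrite(old_weights):
    """Build the value for -weight-overwrite from {old_flag: [weights]}."""
    parts = []
    for flag, values in old_weights.items():
        name = OLD_TO_NEW[flag]  # raises KeyError for flags not covered here
        parts.append(name + "= " + " ".join(str(v) for v in values))
    return " ".join(parts)


arg = weight_overwrite({"-weight-i": [0.62], "-weight-l": [12.5]})
print("-weight-overwrite '" + arg + "'")
```

The "Name= value" layout follows the form the mert script itself passes to the decoder in the log quoted above.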

