Hi Anh

* Looking for MT/NLP opportunities *
Hieu Hoang
http://moses-smt.org/
On 12 March 2017 at 23:44, Tran Anh <[email protected]> wrote:

> I have done experiments with a factored model. The tuning and testing are
> done with source text annotated with the same factors as during the
> training. The target text is clean, without factors.
>
> I found that my factored model (BLEU = 22.2) scores higher than the
> baseline (BLEU = 21.11, no factors).
> The training command has translation-factor and generation-factor steps:
> (......--translation-factors 0-0+1-1+2-2 --generation-factors 2,3-0.....).
>
> *This is the moses.ini file (training is finished, but not yet tuned):*
>
> #########################
> ### MOSES CONFIG FILE ###
> #########################
>
> # input factors
> [input-factors]
> 0
> 1
> 2
>
> # mapping steps
> [mapping]
> 0 T 0
> 0 T 1
> 0 T 2
>
> [distortion-limit]
> 6
>
> # feature functions
> [feature]
> UnknownWordPenalty
> WordPenalty
> PhrasePenalty
> PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/yychen/55factor-hz4new-VC/train2-ge3/train/model/phrase-table.0-0.gz input-factor=0 output-factor=0
> PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/home/yychen/55factor-hz4new-VC/train2-ge3/train/model/phrase-table.1-1.gz input-factor=1 output-factor=1
> PhraseDictionaryMemory name=TranslationModel2 num-features=4 path=/home/yychen/55factor-hz4new-VC/train2-ge3/train/model/phrase-table.2-2.gz input-factor=2 output-factor=2
> Generation name=GenerationModel0 num-features=2 path=/home/yychen/55factor-hz4new-VC/train2-ge3/train/model/generation.2,3-0.gz input-factor=2,3 output-factor=0
> LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/yychen/55factor-hz4new-VC/train2-ge3/train/model/reordering-table.0-0.wbe-msd-bidirectional-fe.gz
> Distortion
> KENLM lazyken=0 name=LM0 factor=0 path=/home/yychen/55factor-hz4new-VC/train2-ge3/vi-ch.lm.ch order=3
>
> # dense weights for feature functions
> [weight]
> # The default weights are NOT optimized for translation quality. You MUST tune the weights.
> # Documentation for tuning is here: http://www.statmt.org/moses/?n=FactoredTraining.Tuning
> UnknownWordPenalty0= 1
> WordPenalty0= -1
> PhrasePenalty0= 0.2
> TranslationModel0= 0.2 0.2 0.2 0.2
> TranslationModel1= 0.2 0.2 0.2 0.2
> TranslationModel2= 0.2 0.2 0.2 0.2
> GenerationModel0= 0.3 0
> LexicalReordering0= 0.3 0.3 0.3 0.3 0.3 0.3
> Distortion0= 0.3
> LM0= 0.5
>
>
> *This is the moses.ini file (tuning is finished):*
>
> # MERT optimized configuration
> # decoder /opt/moses/bin/moses
> # BLEU 0.200847 on dev /home/yychen/55factor-hz4new-VC/tun2-ge3/vi.tun4-new
> # We were before running iteration 4
> # finished Sunday 08 January 2017, 19:51:49 CST
> #########################
> ### MOSES CONFIG FILE ###
> #########################
>
> # input factors
> [input-factors]
> 0
> 1
> 2
>
> # mapping steps
> [mapping]
> 0 T 0
>
>
>
> #[decoding-graph-backoff]
> #0
> #1

Why did you comment out this section? Did you retune after you commented it
out?
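
With only "0 T 0" left in [mapping] and the backoff section commented out,
the decoder builds a single decoding path that only ever consults
TranslationModel0, which would be consistent with the 0 0 0 0 scores you
show below. For comparison, a multi-path configuration would look roughly
like this (a sketch only; the table indices are my assumptions from your
training command, so check them against your own model directory):

[mapping]
0 T 0
1 T 2
1 G 0

[decoding-graph-backoff]
0
1

If I remember the semantics correctly, path 0 (direct surface translation)
is always used, and path 1 (translate factor 2, then generate factor 0 from
it) fires only as a backoff, for spans of up to 1 word that path 0 cannot
cover. Note also that, if I read your training command correctly, your
generation table expects target factors 2,3 as input, but none of your
translation steps ever produces a target factor 3, so the generation step
may never have both of its inputs available. That is another reason to
simplify first.
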
>
> [distortion-limit]
> 6
>
> # feature functions
> [feature]
> UnknownWordPenalty
> WordPenalty
> PhrasePenalty
> PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/yychen/55factor-hz4new-VC/train2-ge3/train/model/phrase-table.0-0.gz input-factor=0 output-factor=0
> PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/home/yychen/55factor-hz4new-VC/train2-ge3/train/model/phrase-table.1-1.gz input-factor=1 output-factor=1
> PhraseDictionaryMemory name=TranslationModel2 num-features=4 path=/home/yychen/55factor-hz4new-VC/train2-ge3/train/model/phrase-table.2-2.gz input-factor=2 output-factor=2
> Generation name=GenerationModel0 num-features=2 path=/home/yychen/55factor-hz4new-VC/train2-ge3/train/model/generation.2,3-0.gz input-factor=2,3 output-factor=0
> LexicalReordering name=LexicalReordering0 num-features=6 type=wbe-msd-bidirectional-fe-allff input-factor=0 output-factor=0 path=/home/yychen/55factor-hz4new-VC/train2-ge3/train/model/reordering-table.0-0.wbe-msd-bidirectional-fe.gz
> Distortion
> KENLM lazyken=0 name=LM0 factor=0 path=/home/yychen/55factor-hz4new-VC/train2-ge3/vi-ch.lm.ch order=3
>
> # dense weights for feature functions
> [weight]
>
> LexicalReordering0= 0.0421305 0.0145905 0.0421305 0.0419472 0.0571605 0.110762
> Distortion0= 0.0357908
> LM0= 0.0702177
> WordPenalty0= -0.140435
> PhrasePenalty0= 0.037449
> TranslationModel0= 0.00820789 0.0280871 0.117941 -0.00550954
> TranslationModel1= 0.0280871 0.0273782 -0.0150248 0.0280871
> TranslationModel2= 0.0453928 0.00576192 0.0280871 0.0276907
> GenerationModel0= 0.0421305 0
> UnknownWordPenalty0= 1
>
> Here, I want to translate "留 学 生" from the source language into the
> target language and inspect the n-best list.
> I want to demonstrate why this result is better than the BASELINE by
> using the n-best list (% moses -f moses.ini -n-best-list listfile2 2 < in).
> When the tuning process finished, I tried to translate some source
> sentences into target sentences. The scores for TranslationModel0 (map
> 0-0) change, while the scores for TranslationModel1, TranslationModel2
> and GenerationModel0 are all 0 0 0 0. The translation results are as
> follows *(here, n = 2)*:
>
> 0 ||| 留 学 生 ||| LexicalReordering0= -1.60944 0 0 0 0 0 Distortion0= 0 LM0= -15.2278 LM1= -699.809 WordPenalty0= -3 PhrasePenalty0= 1 TranslationModel0= -1.38629 -2.20651 0 -2.21554 *TranslationModel1= 0 0 0 0 TranslationModel2= 0 0 0 0 GenerationModel0= 0 0* ||| -0.589076
> 0 ||| 留 学 生 ||| LexicalReordering0= -1.86048 0 0 -0.510826 0 0 Distortion0= 0 LM0= -15.2278 LM1= -699.809 WordPenalty0= -3 PhrasePenalty0= 2 TranslationModel0= -2.86909 -2.20651 -0.09912 -2.21554 *TranslationModel1= 0 0 0 0 TranslationModel2= 0 0 0 0 GenerationModel0= 0 0* ||| -0.727864
>
> I want to compare my factored model with the baseline at every translation
> step in SMT, to explain why my model is good.
> So I want to ask you:
>
> 1. Can you explain why those parameters are 0 0 0 0?
> 2. Are the factors I added to the factored model useful or not?
> 3. How can I get the scores of *TranslationModel1*, *TranslationModel2*
> and *GenerationModel0* in the n-best output to be different from 0 0 0 0?

Start with a simple, non-factored model and make sure it works. Build it up
slowly, adding a phrase table or generation table at each step, and retune
at each step.
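
To make that concrete, the progression could look roughly like this. This
is a sketch only: the corpus name, language codes, dev files and LM path
are placeholders rather than your actual setup, and it assumes the training
corpus already carries your factor annotations, so double-check every flag
against the manual before running it.

  # step 1: plain, non-factored baseline, then tune it
  train-model.perl --root-dir step1 --corpus corpus/train --f vi --e ch \
      --alignment grow-diag-final-and --lm 0:3:/path/to/lm.ch:8
  mert-moses.pl dev.vi dev.ch $MOSES/bin/moses step1/model/moses.ini \
      --mertdir $MOSES/bin

  # step 2: add one extra phrase table (source factors 0,1 jointly to
  # surface) as an alternative decoding path, then retune
  train-model.perl --root-dir step2 --corpus corpus/train --f vi --e ch \
      --alignment grow-diag-final-and --lm 0:3:/path/to/lm.ch:8 \
      --translation-factors 0-0+0,1-0 --decoding-steps t0:t1
  mert-moses.pl dev.vi dev.ch $MOSES/bin/moses step2/model/moses.ini \
      --mertdir $MOSES/bin

  # step 3: add a generation step (translate factor 2, generate the
  # surface form from it), then retune again
  train-model.perl --root-dir step3 --corpus corpus/train --f vi --e ch \
      --alignment grow-diag-final-and --lm 0:3:/path/to/lm.ch:8 \
      --translation-factors 0-0+2-2 --generation-factors 2-0 \
      --decoding-steps t0:t1,g0
  mert-moses.pl dev.vi dev.ch $MOSES/bin/moses step3/model/moses.ini \
      --mertdir $MOSES/bin

If the dev BLEU stops improving at one of these steps, the table you added
at that step is the one that is not helping.
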
>
> I am waiting for your reply ~~!
> Thank you so much!
>
> With best regards,
>
> Tran Anh
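
One further check you can run at each step, with the same n-best switch you
are already using:

  moses -f moses.ini -n-best-list listfile 100 distinct < in

Any feature function whose scores are 0 for every hypothesis in the list
(as TranslationModel1, TranslationModel2 and GenerationModel0 are in your
output above) is a feature the decoder never actually applied, so this is
the quickest way to see whether a newly added table is being used at all.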
