Re: Re: [Moses-support] mert-moses.pl script

Li Xianhua Tue, 01 Nov 2011 02:41:00 -0700

Hi Barry,

        Thank you very much. I have run MERT three times on my baseline, and the
variation is 0.11 BLEU-4 on the dev set and 0.28 on the test set. I think I
should run more experiments.
        By the way, the problem I am focusing on is reordering in
Japanese-Chinese translation, and I am afraid BLEU will not reflect the
differences because it is based on n-gram matching. In other words, I am afraid
it is not very sensitive to reordering. Which evaluation tool would you suggest
in this case?


----------------------------------------------------
Best wishes!
Xianhua Li 
Information Technology Laboratory
Fujitsu Research & Development Center Co.,LTD.
13F Tower A, Ocean International Center,
No.56 Dong Si Huan Zhong Rd, Chaoyang District, Beijing, China ,100025
E-mail:[email protected]


-----Original Message-----
From: Barry Haddow [mailto:[email protected]]
Sent: 1 November 2011 17:17
To: Li Xianhua
Cc: [email protected]
Subject: Re: Re: [Moses-support] mert-moses.pl script

Hi Xianhua

> so as to reduce the server burden? So the question is: will the
> improvement we get when running deterministic mert still be observed
> when running non-deterministic mert?

It depends on how large the improvement is. If you run (non-deterministic) mert
a few times on your baseline, you will see how much variation there is, and
that will tell you (roughly) how big your improvement needs to be before you
can be confident about it.
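
As a rough illustration of that check, here is a minimal Python sketch. It is
not part of Moses: the function name, the BLEU numbers and the threshold of
two standard deviations (k=2.0) are all assumptions made for the example, and
a proper significance test would be more careful.

```python
from statistics import mean, stdev


def improvement_exceeds_noise(baseline_bleu, system_bleu, k=2.0):
    """Compare a mean BLEU gain against run-to-run variation of the baseline.

    baseline_bleu: BLEU scores from repeated tuning runs of the baseline.
    system_bleu:   BLEU scores from one or more tuning runs of the new system.
    k:             hypothetical threshold, in baseline standard deviations.

    Returns (delta, noise, exceeds).
    """
    if len(baseline_bleu) < 2:
        raise ValueError("need at least two baseline runs to estimate variation")
    if not system_bleu:
        raise ValueError("need at least one run of the new system")
    delta = mean(system_bleu) - mean(baseline_bleu)  # average gain
    noise = stdev(baseline_bleu)                     # sample standard deviation
    return delta, noise, delta > k * noise


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    baseline = [24.10, 24.21, 24.15]
    system = [24.90, 25.05, 24.98]
    delta, noise, exceeds = improvement_exceeds_noise(baseline, system)
    print(f"gain={delta:.2f} BLEU, baseline std={noise:.2f}, exceeds noise: {exceeds}")
```

With only three runs the estimate of the variation is itself noisy, so more
runs give a more trustworthy threshold.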

cheers - Barry


On Tuesday 01 November 2011 09:06:47 Li Xianhua wrote:
> Hi Barry,
> 
>       Thanks for your suggestion. I am working on Japanese-Chinese machine
> translation, addressing the reordering problem, and I want to verify
> whether my method improves the results. Our server is not very powerful,
> and several people run experiments on it. So I am wondering: can I first
> run deterministic mert to check that the method is effective, and only
> then run non-deterministic mert several times and average the results,
> so as to reduce the server burden? In other words, will the improvement
> we get when running deterministic mert still be observed when running
> non-deterministic mert?
> 
> ----------------------------------------------------
> Best wishes!
> Xianhua Li
> Information Technology Laboratory
> Fujitsu Research & Development Center Co.,LTD.
> 13F Tower A, Ocean International Center,
> No.56 Dong Si Huan Zhong Rd, Chaoyang District, Beijing, China ,100025 
> E-mail:[email protected]
> 
> 
> -----Original Message-----
> From: Barry Haddow [mailto:[email protected]]
> Sent: 1 November 2011 16:51
> To: [email protected]
> Cc: Li Xianhua
> Subject: Re: [Moses-support] mert-moses.pl script
> 
> Hi Xianhua
> 
> Way 1 uses no random restarts at all, so each iteration starts where
> the last one left off, which is why the results are always the same.
> The other methods both use random restarts (which make them different
> from Way 1) and they each use fixed seeds (so they are deterministic)
> but the seeds are different (so they are different from each other).
> 
> My guess is that Ways 2 & 3 will give better results as they explore a
> larger part of the parameter space. However the best method is to
> follow Clark et al. and run mert several times and average. This
> advice also applies to most other methods of tuning MT systems, in
> particular online algorithms where the order of the data may matter.
> 
> cheers - Barry
> 
> On Tuesday 01 November 2011 01:42:26 Li Xianhua wrote:
> > Hi all,
> >
> >     Regarding the Moses MERT problem Neda mentioned, I tried three ways
> > to make MERT deterministic, but they give different results.
> >     Way 1: edit mert-moses.pl line 105: "my $_RANDOM-RESTARTS=20",
> > change 20 to 0
> >     Way 2: add to the mert-moses.pl call: --mertargs=" -r $seed ", as
> > Patrik said
> >     Way 3: activate the flag "--predictable-seed", as Nicola and Barry said
> >
> >     I ran 9 experiments in total, 3 for each way, with the same corpus
> > and parameters, and compared the results. The results within each way
> > are identical, but the results across different ways differ, even
> > though the corpus is the same. So my questions are:
> >     Are all three ways valid for making MERT deterministic? Which is best?
> > Why do they give different results?
> >
> > ----------------------------------------------------
> > Best wishes!
> > Xianhua Li
> > Information Technology Laboratory
> > Fujitsu Research & Development Center Co.,LTD.
> > 13F Tower A, Ocean International Center,
> > No.56 Dong Si Huan Zhong Rd, Chaoyang District, Beijing, China ,100025
> > E-mail:[email protected]
> >
> >
> >
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support
> 
> --
> Barry Haddow
> University of Edinburgh
> +44 (0) 131 651 3173
> 
> --
> The University of Edinburgh is a charitable body, registered in 
> Scotland,  with registration number SC005336.
> 
 
--
Barry Haddow
University of Edinburgh
+44 (0) 131 651 3173

--
The University of Edinburgh is a charitable body, registered in Scotland, with 
registration number SC005336.


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
