Hi Xianhua

Good point about the evaluation. I don't have an easy answer, but these two 
papers might help:

http://www.aclweb.org/anthology-new/P/P11/P11-1103.pdf
http://www.aclweb.org/anthology-new/W/W11/W11-2102.pdf
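
For illustration only, here is a minimal sketch of one common way to score
word order directly: a normalised Kendall's tau over a permutation. It is not
taken from the papers above. It assumes you already have, for each sentence,
a permutation giving the reference position of each aligned hypothesis word
(how you get that from word alignments is left out). The function name is
made up for this sketch.

```python
from itertools import combinations


def kendall_tau_score(permutation):
    """Return 1.0 for a perfectly monotone order and 0.0 for a full inversion.

    `permutation[i]` is the (assumed, pre-computed) reference position of the
    i-th hypothesis word. The score is 1 minus the fraction of word pairs
    whose relative order is swapped.
    """
    n = len(permutation)
    if n < 2:
        return 1.0  # nothing to reorder
    discordant = sum(
        1
        for i, j in combinations(range(n), 2)
        if permutation[i] > permutation[j]
    )
    total_pairs = n * (n - 1) // 2
    return 1.0 - discordant / total_pairs


if __name__ == "__main__":
    print(kendall_tau_score([0, 1, 2, 3]))  # monotone order
    print(kendall_tau_score([3, 2, 1, 0]))  # fully reversed order
    print(kendall_tau_score([1, 0, 2, 3]))  # one adjacent swap
```

Unlike n-gram matching, a score of this kind also penalises long-distance
swaps in proportion to how many word pairs end up in the wrong order.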

cheers - Barry

On Tuesday 01 November 2011 09:37:18 Li Xianhua wrote:
> Hi Barry,
> 
>       Thank you very much. I have run my baseline three times, and the
>  variation is 0.11 BLEU-4 on the dev set and 0.28 on the test set. I think I
>  should run more experiments. By the way, since the problem I am focusing on
>  is reordering in Japanese-Chinese translation, I am afraid BLEU won't
>  reflect the differences, because it is based on n-grams. In other words, I
>  am afraid it is not very sensitive to reordering. Which evaluation tool
>  would you suggest in this case?
> 
> ----------------------------------------------------
> Best wishes!
> Xianhua Li
> Information Technology Laboratory
> Fujitsu Research & Development Center Co.,LTD.
> 13F Tower A, Ocean International Center,
> No.56 Dong Si Huan Zhong Rd, Chaoyang District, Beijing, China ,100025
> E-mail:[email protected]
> 
> 
> -----Original Message-----
> From: Barry Haddow [mailto:[email protected]]
> Sent: 1 November 2011 17:17
> To: Li Xianhua
> Cc: [email protected]
> Subject: Re: Re: [Moses-support] mert-moses.pl script
> 
> Hi Xianhua
> 
> >  so as to reduce the server load? In other words, will the
> > improvement we get when running deterministic mert still be observed
> > when running non-deterministic mert?
> 
> It depends how large the improvement is. If you run (non-deterministic) mert
>  a few times on your baseline then you will see how much variation there
>  is, and that will tell you (roughly) how big your improvement needs to be
>  before you can be confident about it.
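
A rough sketch of that check, for illustration: summarise the run-to-run
variation of each system and ask whether the gap between the mean scores is
larger than that variation. The scores below are placeholders, not results
from this thread, and the function names are made up. This is a crude
heuristic, not a significance test and not the procedure from Clark et al.

```python
import statistics


def run_variation(scores):
    """Return (mean, sample standard deviation, max - min) for BLEU scores
    obtained from repeated tuning runs of one system."""
    mean = statistics.mean(scores)
    stdev = statistics.stdev(scores)  # requires at least two runs
    spread = max(scores) - min(scores)
    return mean, stdev, spread


def clearly_better(baseline_scores, system_scores):
    """Crude check: is the gap between the mean scores larger than the
    bigger of the two run-to-run spreads?"""
    base_mean, _, base_spread = run_variation(baseline_scores)
    sys_mean, _, sys_spread = run_variation(system_scores)
    return (sys_mean - base_mean) > max(base_spread, sys_spread)


# Placeholder BLEU scores from three hypothetical runs each (NOT real data).
baseline = [24.10, 24.25, 24.38]
system = [24.90, 25.05, 24.95]

if __name__ == "__main__":
    print(run_variation(baseline))
    print(run_variation(system))
    print(clearly_better(baseline, system))
```

With only three runs per system the spread estimate is itself noisy, so more
runs give a more trustworthy threshold.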
> 
> cheers - Barry
> 
> On Tuesday 01 November 2011 09:06:47 Li Xianhua wrote:
> > Hi Barry,
> >
> >     Thanks for your suggestion. I am working on Japanese-Chinese machine
> > translation, specifically the reordering problem, and I want to check
> > whether my method improves the results. Our server is not very powerful
> > and several people run experiments on it. So I am wondering: can I
> > first run deterministic mert to check that the method is effective,
> > and then run non-deterministic mert several times and average the
> > results, so as to reduce the server load? In other words, will the
> > improvement we get when running deterministic mert still be observed
> > when running non-deterministic mert?
> >
> > -----Original Message-----
> > From: Barry Haddow [mailto:[email protected]]
> > Sent: 1 November 2011 16:51
> > To: [email protected]
> > Cc: Li Xianhua
> > Subject: Re: [Moses-support] mert-moses.pl script
> >
> > Hi Xianhua
> >
> > Way 1 uses no random restarts at all, so each iteration starts where
> > the last one left off, which is why the results are always the same.
> > The other methods both use random restarts (which makes them different
> > from Way 1) and they each use fixed seeds (so they are deterministic),
> > but the seeds are different (so they differ from each other).
> >
> > My guess is that Ways 2 & 3 will give better results as they explore a
> > larger part of the parameter space. However, the best method is to
> > follow Clark et al. and run mert several times and average. This
> > advice also applies to most other methods of tuning MT systems, in
> > particular online algorithms where the order of the data may matter.
> >
> > cheers - Barry
> >
> > On Tuesday 01 November 2011 01:42:26 Li Xianhua wrote:
> > > Hi all,
> > >
> > > >   About the Moses MERT problem Neda mentioned: I tried three ways to
> > > > make MERT deterministic, but they give different results from each
> > > > other.
> > > >   Way 1: edit mert-moses.pl line 105, "my $_RANDOM-RESTARTS=20",
> > > > changing 20 to 0
> > > >   Way 2: add --mertargs=" -r $seed " to the mert-moses.pl call, as
> > > > Patrik said
> > > >   Way 3: activate the flag "--predictable-seed", as Nicola and Barry said
> > > >
> > > >   I ran 9 experiments in total, 3 for each way, with the same corpus
> > > > and parameters, and compared the results. The results within each
> > > > way are identical, but the results across different ways, on the
> > > > same corpus, differ. So my questions are:
> > > >   Are all three ways valid for making MERT deterministic? Which is
> > > > best? Why do they give different results?
> > >
> > > ----------------------------------------------------
> > > Best wishes!
> > > Xianhua Li
> > > Information Technology Laboratory
> > > Fujitsu Research & Development Center Co.,LTD.
> > > 13F Tower A, Ocean International Center,
> > > No.56 Dong Si Huan Zhong Rd, Chaoyang District, Beijing, China
> > > ,100025 E-mail:[email protected]
> > >
> > >
> > >
> > > _______________________________________________
> > > Moses-support mailing list
> > > [email protected]
> > > http://mailman.mit.edu/mailman/listinfo/moses-support
> >
> > --
> > Barry Haddow
> > University of Edinburgh
> > +44 (0) 131 651 3173
> >
> > --
> > The University of Edinburgh is a charitable body, registered in
> > Scotland,  with registration number SC005336.
> 
 
--
Barry Haddow
University of Edinburgh
+44 (0) 131 651 3173

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
