Hi Mohamed

One notable difference in the new mert is that, during tuning, it does not
retokenise the reference before calculating BLEU. It therefore assumes that
the reference is already tokenised. There is a Python script in the new mert
directory to do the official BLEU tokenisation, but it has not yet been
integrated into moses-mert-new.pl. I don't know how much of a difference the
tokenisation change would make to the intermediate BLEU calculations.
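
To illustrate why an untokenised reference can matter, here is a minimal
Python sketch that counts clipped n-gram matches (the quantity underlying
BLEU precision) between a tokenised hypothesis and the same reference with
and without tokenisation. The tokeniser below is a hypothetical stand-in
that only splits punctuation off words; it is not the official BLEU
tokenisation and not the script shipped with the new mert. The example
sentence is made up.

```python
import re
from collections import Counter


def simple_tokenise(text):
    """Hypothetical stand-in tokeniser: split punctuation off words.

    This is NOT the official BLEU tokenisation, only an illustration.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_matches(hyp, ref, n):
    """Number of hypothesis n-grams also found in the reference,
    with each n-gram's count clipped to its count in the reference."""
    hyp_counts = ngrams(hyp, n)
    ref_counts = ngrams(ref, n)
    return sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())


# Decoder output is assumed to be tokenised already (made-up example).
hypothesis = "the cat sat on the mat .".split()

raw_reference = "the cat sat on the mat."
untokenised_ref = raw_reference.split()          # "mat." stays one token
tokenised_ref = simple_tokenise(raw_reference)   # "mat" and "." are separate

for n in (1, 2):
    print(
        f"{n}-gram matches: untokenised ref = "
        f"{clipped_matches(hypothesis, untokenised_ref, n)}, "
        f"tokenised ref = {clipped_matches(hypothesis, tokenised_ref, n)}"
    )
print("reference length:", len(untokenised_ref), "vs", len(tokenised_ref))
```

In this toy case the hypothesis is identical to the reference once both are
tokenised, yet against the untokenised reference it loses the matches
involving "mat" and ".", and the reference length differs by one token.
How large the effect is on real data depends on how the references were
prepared.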

regards
Barry 

On Wednesday 23 July 2008 22:55:16 Mohamed F. Noamany wrote:
> I am using AR-EN GALE data (~5.5 M sentences) for training, tuning on
> NIST MT06 and testing on GALE-DEV07.
> I also noticed a serious change in the brevity penalty, from 0.9839
> to 0.9137.
> GALE-DEV07 has two parts: newswire, where there is a significant improvement
> with the new MERT implementation, and web, where there is degradation.
> So far that is not a big deal. What worries me is the degradation on BLEU
> and improvement on TER while I am optimizing toward BLEU.
>
> Thanks,
> Mohamed
>
> -----Original Message-----
> From: Barry Haddow [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, July 23, 2008 5:34 PM
> To: [email protected]
> Cc: Mohamed F. Noamany
> Subject: Re: [Moses-support] New MERT
>
> Hi
>
> The new mert is a rewrite of mert to provide a cleaner, more flexible
> codebase allowing for easier experimentation/extension. It's the same
> algorithm as the old mert so should give very similar results, however the
> results won't be exactly the same, and it hasn't been tested as extensively
> as the old mert so there may still be bugs.
>
> What's your train/test setup? Using the fr-en europarl data and testing on
> the wmt06/07 test sets I got slightly higher bleu scores, but the
> differences are probably not significant.
>
> regards
> Barry
>
> On Wednesday 23 July 2008 22:17:04 Mohamed F. Noamany wrote:
> > Hi,
> > Can someone please elaborate on what has changed in the MERT
> > optimization in the latest version (2008-07-08)?
> >
> > Comparing it to the previous one, I noticed it tends to degrade on
> > BLEU and improve on TER (comparison on the test/unseen set). I cannot
> > understand that, since I am tuning toward BLEU.
> > Any feedback?
> >
> > Thanks,
> > Mohamed
> >
> >
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support


