A couple of other possibilities for you to check:

- MERT is non-deterministic and there are many local optima.  If you
re-run the same implementation on the same data, you will get a
different BLEU.  Usually the difference is small, but on a few
occasions I have observed large differences in BLEU simply by
re-tuning.  If you're observing systematic differences between the old
and new implementation then there might be a problem; otherwise I'd
advise you to re-run it several times just to be sure you aren't
looking at an outlier.

- Be sure that you are tuning and scoring the test set with the same
implementation of BLEU (specifically the same implementation of the
brevity penalty).  Different implementations of the brevity penalty
will cause BLEU to rank systems differently (even to invert the
rankings in certain regions of the space).  In the old MERT you could
choose the implementation of the brevity penalty via a flag and it was
important to set this correctly; I'm not sure about the new one.
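
To make the brevity penalty point concrete, here is a minimal Python
sketch, not taken from either MERT implementation.  It assumes the
usual BLEU definition BP = 1 if c >= r, else exp(1 - r/c), and three
conventions for picking the effective reference length r when there
are multiple references (shortest, closest, average).  The function
names and the toy lengths are hypothetical, chosen only for
illustration.

```python
import math

def effective_ref_length(cand_len, ref_lens, mode):
    """Pick the reference length for one sentence (hypothetical helper)."""
    if mode == "shortest":
        return min(ref_lens)
    if mode == "closest":
        # Closest to the candidate length; ties go to the shorter reference.
        return min(ref_lens, key=lambda r: (abs(r - cand_len), r))
    if mode == "average":
        return sum(ref_lens) / len(ref_lens)
    raise ValueError("unknown mode: " + mode)

def corpus_brevity_penalty(cand_lens, ref_lens_per_sent, mode):
    """Corpus-level BP: lengths are summed over sentences before comparing."""
    c = sum(cand_lens)
    r = sum(effective_ref_length(cl, rl, mode)
            for cl, rl in zip(cand_lens, ref_lens_per_sent))
    if c == 0:
        return 0.0
    if c >= r:
        return 1.0
    return math.exp(1.0 - r / c)

# Toy data (made up): two output sentences, two references each.
cand_lens = [18, 22]
ref_lens = [[12, 19], [15, 23]]

for mode in ("shortest", "closest", "average"):
    print(mode, round(corpus_brevity_penalty(cand_lens, ref_lens, mode), 4))
```

On this toy data the same output is unpenalised under "shortest" and
"average" but penalised under "closest", which is the kind of mismatch
that can make tuning and test scores disagree.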


Cheers
Adam

On 23 Jul 2008, at 23:01, Barry Haddow wrote:

> Hi Mohamed
>
> One notable difference in the new mert is that, during tuning, it does
> not retokenise the reference before calculating bleu. It therefore
> assumes that the reference is already tokenised. There is a python
> script in the new mert directory to do the official bleu tokenisation,
> but it has not yet been integrated into moses-mert-new.pl. I don't know
> how much of a difference the tokenisation change would make to the
> intermediate bleu calculations.
>
> regards
> Barry
>
> On Wednesday 23 July 2008 22:55:16 Mohamed F. Noamany wrote:
>> I am using AR-EN GALE data (~5.5 M sentences) for training, tuning on
>> NIST MT06, and testing on GALE-DEV07.
>> I also noticed a serious change in the brevity penalty: from 0.9839
>> to 0.9137.
>> GALE-DEV07 has two parts: newswire, where there is a significant
>> improvement with the new MERT implementation, and web, where there is
>> degradation.
>> So far that is not a big deal. What worries me is the degradation in
>> BLEU and improvement in TER while I am optimizing toward BLEU.
>>
>> Thanks,
>> Mohamed
>>
>> -----Original Message-----
>> From: Barry Haddow [mailto:[EMAIL PROTECTED]
>> Sent: Wednesday, July 23, 2008 5:34 PM
>> To: [email protected]
>> Cc: Mohamed F. Noamany
>> Subject: Re: [Moses-support] New MERT
>>
>> Hi
>>
>> The new mert is a rewrite of mert to provide a cleaner, more flexible
>> codebase, allowing for easier experimentation/extension. It's the same
>> algorithm as the old mert so should give very similar results; however,
>> the results won't be exactly the same, and it hasn't been tested as
>> extensively as the old mert, so there may still be bugs.
>>
>> What's your train/test setup? Using the fr-en europarl data and testing
>> on the wmt06/07 test sets I got slightly higher bleu scores, but the
>> differences are probably not significant.
>>
>> regards
>> Barry
>>
>> On Wednesday 23 July 2008 22:17:04 Mohamed F. Noamany wrote:
>>> Hi,
>>> Can someone please elaborate on what has changed in the MERT
>>> optimization in the latest version (2008-07-08)?
>>>
>>> Comparing it to the previous one, I noticed it tends to degrade BLEU
>>> and improve TER (comparison on the test/unseen set). I cannot
>>> understand that, since I am tuning toward BLEU.
>>> Any feedback?
>>>
>>> Thanks,
>>> Mohamed
>>>
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
