Jia,

 Yes, mert's purpose is to optimize the feature weights in the 
 configuration so that the BLEU score on the development set increases.
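
 In case it helps to compare, a typical tuning run looks roughly like 
 the following. The paths and file names here are only placeholders for 
 your own setup; the positional arguments are the dev source, the dev 
 reference, the decoder binary and the config, and --working-dir / 
 --mertdir point mert-moses.pl at a scratch directory and the mert 
 binaries:

    $MOSES/scripts/training/mert-moses.pl dev.src dev.ref \
        $MOSES/bin/moses model/moses.ini \
        --working-dir mert-work --mertdir $MOSES/bin

 The tuned weights should end up in a moses.ini inside the working 
 directory, which is what you'd then decode the test set with.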

 I had a similar case where mert didn't change the BLEU scores. Our 
 troubleshooting found that the tuning set hadn't been prepared the same 
 way as the training data, i.e. we had forgotten to lower-case and 
 tokenize the tuning set. This is probably a good place for you to start.
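
 For what it's worth, that preprocessing would look something like the 
 following with the standard Moses scripts (file names and the language 
 code "en" are only placeholders; the point is to run exactly the same 
 steps you ran on the training data):

    $MOSES/scripts/tokenizer/tokenizer.perl -l en < dev.raw.en > dev.tok.en
    $MOSES/scripts/tokenizer/lowercase.perl < dev.tok.en > dev.lc.en

 Both the source side and the reference side of the dev set need the 
 same treatment.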

 Tom


 On Mon, 21 Feb 2011 09:35:41 +1100, Suzy Howlett <[email protected]> 
 wrote:
> Hi Jia,
>
> It could very well be that the training data isn't very good. Tuning
> changes how much each feature is weighted, but if the estimates of the
> feature values aren't reasonable in the first place, I can't imagine it
> helps too much. Perhaps you're not using enough training data, or the
> training data is just too different from your test data (e.g. genre)?
> Someone with more experience than me may be able to give you more
> advice.
>
> Best,
> Suzy
>
> On 21/02/11 2:46 AM, Jia Xu wrote:
>> Hi,
>>
>> In my experiments, tuning with mert-moses.pl or mert-moses-new.pl on 
>> a development set did not improve the translation quality on a test 
>> set; the score with tuning is about half a BLEU point worse than 
>> without. Does anyone have a similar experience, or am I calling 
>> something incorrectly?
>>
>> nbest=100
>> dev: wmt-test08
>> test: wmt-test10
>> Running with/without tuning is achieved by turning weight-config in 
>> the config file off/on.
>>
>> Thank you!
>> Best Wishes,
>> Jia
>>
_______________________________________________
Moses-support mailing list
moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
