Hi Jörg

In each MERT iteration, the first action is to decode the tuning set and 
create an n-best list, using the current weight set. The 1-bests from 
this decoding run are the hypotheses which get scored by --return-best-dev.

After that decoding, MERT searches for a weight set that reranks the 
n-best lists to give a better BLEU, and stops when it reaches a local 
maximum. This is the BLEU that is reported in the moses.ini file, so it 
is a BLEU obtained by decoding with one weight set and then reranking 
with a different weight set. When you redecode using the new weight set 
you do not get the same set of translations: the n-best list is 
just a tiny sample of the hypotheses that are considered during 
decoding, so there will normally be hypotheses outwith the n-best list 
which have a higher model score under the new weights.
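
To make the gap between reranking and redecoding concrete, here is a toy 
sketch in Python. It is not Moses code: the three hypotheses, their 
feature values, the "quality" numbers (standing in for a per-sentence 
BLEU contribution) and both weight sets are all invented for 
illustration.

```python
# Toy illustration only: hypotheses, features, weights and "quality"
# values are invented and do not come from Moses.

def model_score(features, weights):
    """Linear model score: dot product of feature values and weights."""
    return sum(f * w for f, w in zip(features, weights))

def best(hyps, weights):
    """Return the hypothesis with the highest model score."""
    return max(hyps, key=lambda h: model_score(h["features"], weights))

# Pretend this is everything the decoder could produce for one sentence.
search_space = [
    {"text": "A", "features": (1.0, 0.0), "quality": 0.30},
    {"text": "B", "features": (0.8, 0.5), "quality": 0.50},
    {"text": "C", "features": (0.0, 2.0), "quality": 0.10},
]

old_weights = (1.0, 0.1)  # weights used to decode in this iteration
new_weights = (0.5, 0.5)  # weights found by optimising on the n-best list

# Decode with the old weights and keep only a 2-best list.
nbest = sorted(
    search_space,
    key=lambda h: model_score(h["features"], old_weights),
    reverse=True,
)[:2]

one_best = nbest[0]                        # 1-best of this decoding run
reranked = best(nbest, new_weights)        # what the optimiser sees
redecoded = best(search_space, new_weights)  # what a fresh decode returns

print("1-best with old weights:", one_best["text"], one_best["quality"])
print("reranked n-best        :", reranked["text"], reranked["quality"])
print("redecoded, new weights :", redecoded["text"], redecoded["quality"])
```

In this toy case the reranked n-best list picks B, but redecoding with 
the same new weights picks C, which never appeared in the n-best list 
and has a lower quality than either A or B.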

We haven't generally used --return-best-dev with MERT - does it help? 
It's really designed for PRO and kbmira.

cheers - Barry

On 06/03/14 11:28, Jorg Tiedemann wrote:
> Hi,
>
> I have a question about the --return-best-dev flag in mert-moses.pl
> I have run several experiments using this flag and I don't really 
> understand how it influences the choice of settings during MERT. In 
> many cases, the system will select an early iteration whose BLEU is 
> much lower than that of many later iterations. Maybe my confusion 
> is related to the BLEU score mentioned in the moses.ini files printed 
> after each iteration? Can someone help me? Thanks!
>
>
> Cheers,
> Jörg
>
>
> Jörg Tiedemann
> [email protected] <mailto:[email protected]>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
