Hi,

the size of the tuning set is one factor in the variance between different
tuning runs.

Run MERT (or kbmira) multiple times and see what test BLEU scores you get.
If they vary too much between tuning runs, then a larger tuning set would
be better.

The motivation to keep variance low is that you can actually observe the
impact of your improved system over the baseline.
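To make that concrete, here is a small sketch of the comparison: collect the test-set BLEU scores from several independent tuning runs and check whether the run-to-run spread is small relative to the improvement you hope to measure. The scores and the 0.5-BLEU expected gain below are hypothetical placeholders, not values from the thread.

```python
import statistics

# Hypothetical test-set BLEU scores from several independent tuning runs
# (e.g. MERT restarted several times on the same tuning set).
bleu_scores = [27.8, 28.3, 27.5, 28.1, 27.9]

mean_bleu = statistics.mean(bleu_scores)
stdev_bleu = statistics.stdev(bleu_scores)  # sample standard deviation

# Hypothetical improvement (in BLEU) you expect your system to give
# over the baseline.
expected_gain = 0.5

print(f"mean={mean_bleu:.2f} stdev={stdev_bleu:.2f}")

# If the run-to-run spread is on the order of the expected gain, you
# cannot tell a real improvement from tuning noise; use a larger
# tuning set or average over more runs.
if stdev_bleu >= expected_gain:
    print("variance too high to observe the improvement reliably")
else:
    print("variance acceptable for this comparison")
```

The exact threshold is a judgment call; the point is simply that the noise between tuning runs has to be smaller than the effect you want to demonstrate.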

-phi

On Mon, Dec 28, 2015 at 11:16 AM, Vincent Nguyen <[email protected]> wrote:

>
> this is fine for tuning. if you want to make it quicker, drag it down to
> 1000 sentences.
>
>
> On 28/12/2015 16:37, Read, James C wrote:
>
> Hi,
>
>
> I'm setting up some Moses baseline systems for various language pairs to
> compare the systems against my own work. I've largely been following the
> baseline tutorial here http://www.statmt.org/moses/?n=moses.baseline
>
>
> What would a fair amount of tuning data be to do the Moses system justice?
> I've currently managed to isolate a set of about 1,900 sentences which are
> roughly common to all language pairs. Would that be enough? Too much? Not
> enough?
>
>
> James
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
>
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
