firstly, run MERT and make sure that everything has reasonable parameters!

this is how to think about testing: you are trying to estimate the
error of your model (which you trained up in the usual way). when
estimating this error, the test set plays the role that a training
set normally does, so the more `test' material you have, the better
your confidence in that error estimate.

in short, the more test material you use, the more reliable your
results will be. results can vary in both directions -- up (you
got lucky) and down (you got unlucky). increasing the test set size
reduces the chance of either of these happening.
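this is easy to see with a toy simulation (a sketch only -- the score
distribution here is made up, just to show that the spread of the
averaged score shrinks as the test set grows, roughly like 1/sqrt(n)):

```python
import random
import statistics

random.seed(0)

def estimate_spread(test_size, trials=2000):
    """Repeatedly draw a test set of `test_size` sentence-level
    scores (from a made-up toy distribution) and measure how much
    the averaged score varies across draws."""
    means = []
    for _ in range(trials):
        scores = [random.gauss(0.30, 0.15) for _ in range(test_size)]
        means.append(sum(scores) / test_size)
    return statistics.stdev(means)

# the spread of the estimate shrinks as the test set grows,
# so a score from a tiny test set is much less trustworthy
for n in (100, 500, 2000):
    print(n, round(estimate_spread(n), 4))
```

the same logic applies to BLEU: a high score measured on a handful of
sentences tells you very little, because a different handful could
easily have scored much lower.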

when working with a narrow domain, you should need fewer sentences;
exactly how few will depend on what you are doing.

Miles

2008/9/18 Vineet Kashyap <[EMAIL PROTECTED]>:
> Hi Miles
>
> Thanks for the fast reply.
>
> I am very sure that both testing and training data is different.
>
> Also, no optimization has been done using MERT and the training
> set is about 8948 sentences. But generally speaking would
> testing on a small set of sentences increase the BLEU
> scores and is it possible to get good scores with a small corpus
> when working with a narrow domain.
>
> I am doing further testing and will look at Corpus size vs BLEU
>
> Thanks
>
> Vineet
>
>
>
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.