It looks like you have made an error of some kind; are you testing on
training-set sentences, for example? And did you rerun MERT after adding
those new sentences?
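
One quick way to test the first possibility is to count how many of your test
sentences also appear verbatim in the training data. A minimal sketch in
Python (the file names train.src and test.src are just placeholders for your
own corpus files, one sentence per line):

def load_sentences(path):
    """Read one sentence per line, normalising whitespace."""
    with open(path, encoding="utf-8") as f:
        return [" ".join(line.split()) for line in f if line.strip()]

train = set(load_sentences("train.src"))   # training source side
test = load_sentences("test.src")          # test source side

# any non-trivial overlap here will inflate BLEU on the test set
overlap = sum(1 for s in test if s in train)
print(f"{overlap} of {len(test)} test sentences also occur in the training data")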

(And a training set of 6-8k sentence pairs is very small ...)

The other scores are the individual n-gram precisions: unigram, bigram,
trigram, and four-gram matches.
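
If it helps, you can sanity-check how those components combine: BLEU is just
the brevity penalty times the geometric mean of the four n-gram precisions.
A rough sketch, plugging in the numbers from your first run:

import math

# n-gram precisions (in %) and brevity penalty as printed by multi-bleu
precisions = [63.4, 38.8, 27.8, 21.3]
bp = 0.996

# BLEU = BP * exp(mean of the log precisions)
bleu = bp * math.exp(sum(math.log(p / 100.0) for p in precisions) / len(precisions))
print(f"BLEU = {100 * bleu:.2f}")   # ~34.6, matching your 34.62

Plugging in 11.1/4.0/2.6/1.9 with BP=1 gives roughly 3.8 in the same way, so
both scores are internally consistent -- the scorer itself isn't the problem.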

Miles

2008/9/18 Vineet Kashyap <[EMAIL PROTECTED]>:
> Hi all
>
> I would just like to know if there is a significant difference
> when scoring translations using multi-bleu.
>
> With multi-bleu I got the following scores when testing on 2000 sentences:
>
> BLEU = 34.62, 63.4/38.8/27.8/21.3 (BP=0.996, ratio=0.996, hyp_len=16587,
> ref_len=16660)
>
> and the following when testing on 5082 sentences:
>
> BLEU = 3.82, 11.1/4.0/2.6/1.9 (BP=1.000, ratio=1.017, hyp_len=44536, ref_len=43809)
>
> The only change I made was increasing the corpus size from 6053 to 8948 sentences.
>
> Would this have caused such a big drop in the BLEU score?
>
> Another question: what do the other numbers, apart from the first one (the
> BLEU score itself), mean?
>
> Also, is multi-bleu on par with mteval? Can I consider a BLEU of 34.62
> to be correct?
>
> Thanks
>
> Vineet
>
>



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
