It looks like you have made an error of some kind; are you testing on training-set sentences, for example? Did you rerun MERT after adding those new sentences?
(And a training set of 6--8k pairs is very small...) The other scores are the unigram, bigram, trigram and four-gram match precisions (see the worked sketch after the quoted message for how they combine into the final score).

Miles

2008/9/18 Vineet Kashyap <[EMAIL PROTECTED]>:
> Hi all,
>
> I would just like to know if there is a significant difference
> when scoring translations using multi-bleu.
>
> With multi-bleu I got the following scores for testing on 2000 sentences:
>
> BLEU = 34.62, 63.4/38.8/27.8/21.3 (BP=0.996, ratio=0.996, hyp_len=16587, ref_len=16660)
>
> and the following for 5082 sentences:
>
> BLEU = 3.82, 11.1/4.0/2.6/1.9 (BP=1, ratio=1.017, hyp_len=44536, ref_len=43809)
>
> The only change I made was to increase the corpus size from 6053 to 8948.
>
> Would this have caused such a big change in the BLEU scores?
>
> Another question: what do the parameters other than the first one (the BLEU score) mean?
>
> Also, is multi-bleu on par with mteval? Can I consider a BLEU of 34.62 to be correct?
>
> Thanks,
> Vineet
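For reference, multi-bleu reports the modified 1- to 4-gram precisions (in percent), the brevity penalty BP, the hypothesis/reference length ratio, and the two lengths. The snippet below is a minimal sketch of how those numbers combine into the final score using the standard BLEU formula (brevity penalty times the geometric mean of the four precisions); it is not the actual multi-bleu.perl code, and bleu_from_precisions is just an illustrative helper:

import math

def bleu_from_precisions(precisions, hyp_len, ref_len):
    # Standard BLEU: brevity penalty times the geometric mean of the
    # modified n-gram precisions (given here in percent).
    bp = 1.0 if hyp_len > ref_len else math.exp(1.0 - float(ref_len) / hyp_len)
    log_mean = sum(math.log(p / 100.0) for p in precisions) / len(precisions)
    return 100.0 * bp * math.exp(log_mean)

# First result quoted above: reproduces roughly BLEU = 34.6 (BP about 0.996).
print(bleu_from_precisions([63.4, 38.8, 27.8, 21.3], hyp_len=16587, ref_len=16660))

# Second result: roughly 3.8, so the drop comes from the much lower
# n-gram precisions, not from the brevity penalty (BP = 1 there).
print(bleu_from_precisions([11.1, 4.0, 2.6, 1.9], hyp_len=44536, ref_len=43809))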
