On 19 Sep 2008, at 02:25, Vineet Kashyap wrote:

> Hi Adam
>
> Thanks for the detailed explanation.
>
> I sorted out my training and testing sets and finally
> I have managed to get the following scores:
>
>     NIST      BLEU      Multi-bleu   IBM-BLEU
>     6.0819    0.2070    20.70        0.2438
>
> These were evaluated on a testing set of 5082 sentences; as
> suggested, to get reliable scores one needs to test on more
> data. The system was trained using only 6053 sentences.
>
> Obviously, accuracy can be further improved by tuning,
> dropping unknown words and increasing the corpus size.
>
> The decoding process took about 70 minutes, which I guess
> can be improved by running Moses in parallel.
>
> Now, my question is: can such a small corpus really give a
> BLEU score of 20% accuracy, given that I have made sure the
> training and testing sets are kept separate?
>
> Also, what is the range of each of these scores?
I'm not sure that I understand either of these questions. The absolute
value of the BLEU score is not especially meaningful -- it is a tool for
comparing multiple systems.

> With multi-bleu I guess 100% accuracy will be 100, and
> BLEU in general will be between 0 and 1, multiplied by 100
> to get accuracy as a percentage.

Yes, that's correct.

> But NIST, not sure

I don't recall the scale for NIST scores, but maybe someone else on the
list does.

> Thanks again
>
> Vineet
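Coming back to the question about the range of the scores: roughly speaking,
BLEU is a brevity penalty multiplied by the geometric mean of the modified
n-gram precisions, so it always falls between 0 and 1, and multi-bleu.perl
simply reports that value times 100 (your 0.2070 vs. 20.70). The snippet
below is only a rough single-sentence, single-reference sketch of that
formula -- not the actual multi-bleu.perl code, which works at the corpus
level and handles multiple references -- but it shows where the bounds
come from:

    import math
    from collections import Counter

    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    def bleu(candidate, reference, max_n=4):
        """Single-sentence, single-reference BLEU (illustration only)."""
        precisions = []
        for n in range(1, max_n + 1):
            cand = ngrams(candidate, n)
            ref = ngrams(reference, n)
            # clipped n-gram matches / total candidate n-grams
            overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
            total = sum(cand.values())
            precisions.append(overlap / total if total else 0.0)
        if min(precisions) == 0.0:
            return 0.0
        # brevity penalty is at most 1, so it can only lower the score
        bp = min(1.0, math.exp(1.0 - len(reference) / len(candidate)))
        # geometric mean of the n-gram precisions, each in [0, 1]
        return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

    candidate = "the cat sat on the mat".split()
    reference = "the cat sat on a mat".split()
    score = bleu(candidate, reference)
    print(score)        # something like 0.54, i.e. between 0 and 1
    print(score * 100)  # the "percentage" form that multi-bleu.perl prints

Since the brevity penalty and each precision are at most 1, the product can
never exceed 1, so the 20.70 from multi-bleu and the 0.2070 BLEU are the
same number on different scales.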
