Hi Adam,

Thanks for the detailed explanation.
I sorted out my training and testing sets and have finally managed to get the following scores:

NIST      BLEU      Multi-bleu   IBM-BLEU
6.0819    0.2070    20.70        0.2438

These were evaluated on a test set of 5082 sentences, since it was suggested that reliable scores need more test data. The system was trained on only 6053 sentences. Obviously, accuracy can be improved further by tuning, dropping unknown words, and increasing the corpus size. Decoding took about 70 minutes, which I guess can be improved by running Moses in parallel.

Now, my question: can such a small corpus plausibly give a BLEU score of about 20%, given that I have made sure the training and test sets are separate?

Also, what is the range of each of these BLEU scores? With multi-bleu I guess a perfect score is 100, and BLEU in general lies between 0 and 1 and is multiplied by 100 to get a percentage. But for NIST I am not sure.

Thanks again,
Vineet
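
To make the scale question above concrete, here is a minimal Python sketch that puts the reported figures on a common percentage scale. The scale bounds in ASSUMED_SCALE_MAX are exactly the guesses stated in the message (0 to 1 for BLEU and IBM-BLEU, 0 to 100 for multi-bleu), not confirmed facts about the scoring scripts, and the helper name to_percent is made up for illustration. NIST is left out because its range is the open question.

```python
# Scores as reported in the message above.
reported = {
    "NIST": 6.0819,
    "BLEU": 0.2070,
    "Multi-bleu": 20.70,
    "IBM-BLEU": 0.2438,
}

# ASSUMPTION: these upper bounds are the guesses from the message,
# not values taken from the scoring scripts' documentation.
# NIST is deliberately omitted: its range is not known here.
ASSUMED_SCALE_MAX = {
    "BLEU": 1.0,
    "Multi-bleu": 100.0,
    "IBM-BLEU": 1.0,
}


def to_percent(score, scale_max):
    """Map a score on [0, scale_max] to a percentage on [0, 100]."""
    if not 0 <= score <= scale_max:
        raise ValueError(f"score {score} is outside [0, {scale_max}]")
    return 100.0 * score / scale_max


# Percentages for every metric whose scale we are assuming.
percentages = {
    name: round(to_percent(reported[name], scale_max), 2)
    for name, scale_max in ASSUMED_SCALE_MAX.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct:.2f}%")
```

Under these assumptions, BLEU (0.2070) and multi-bleu (20.70) are the same number on two different scales, while IBM-BLEU (0.2438) is a genuinely different value.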
