Hello,

I want to estimate the BLEU score for translating between two languages using Moses. I have a corpus of one million sentence pairs that needs to be divided into training, development, and test sets. I used 1000 sentences for tuning and split the rest into training and test sets. When I increase the size of the test set, the BLEU score decreases.
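The kind of split described above could be sketched roughly as follows. This is only an illustration: the function name `split_parallel_corpus`, the toy data, and the test-set size of 2000 are assumptions, not the actual script or settings used. Only the 1000-sentence tuning set comes from the description.

```python
import random


def split_parallel_corpus(pairs, n_dev=1000, n_test=2000, seed=0):
    """Shuffle sentence pairs and cut them into disjoint train/dev/test sets.

    n_dev=1000 matches the tuning set mentioned above; n_test=2000 is a
    hypothetical value chosen only for this example.
    """
    if n_dev + n_test >= len(pairs):
        raise ValueError("dev and test sets must leave sentences for training")
    shuffled = list(pairs)                  # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test


# Toy stand-in for a real parallel corpus (hypothetical data).
pairs = [(f"source sentence {i}", f"target sentence {i}") for i in range(10000)]
train, dev, test = split_parallel_corpus(pairs, n_dev=1000, n_test=2000)
print(len(train), len(dev), len(test))
```

Shuffling before cutting keeps the three sets disjoint and drawn from the same distribution, and because every sentence moved into the test set is removed from the training data, a larger test set leaves less data for training.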
What is a reasonable size for the training and test sets to get a reliable BLEU score?

Regards,
Arefeh
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
