Hi all,

I am still trying to figure out why my BLEU baseline score differs from the scores reported in the literature, so I am backtracking through every step. Maybe you could help me with the following.

I would like to train on the same Europarl corpus that WMT08 used. Could you please tell me whether the download paths below are correct?

For the preprocessing step, I download the parallel data from:

  http://www.statmt.org/wmt08/training-parallel.tar

For a FR->EN system I use:

  europarl-v3b.fr-en.en.gz
  europarl-v3b.fr-en.fr.gz

In the data preparation step of the WMT08 baseline system, the unpacked files are called:

  wmt08/training/europarl-v3.fr-en.fr
  wmt08/training/europarl-v3.fr-en.en

There is no "b" in these names. Are these the same (and the correct) files?

I have the same question about the language model data, which I download from:

  http://www.statmt.org/wmt08/training-monolingual.tar
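
If I had both versions of a file locally, one way to check whether they are identical would be to compare checksums of their uncompressed contents. Below is a minimal sketch in Python. The function names content_digest and same_content are my own and not part of Moses or the WMT08 scripts, and it assumes a file is gzip-compressed exactly when its name ends in ".gz".

```python
import gzip
import hashlib


def content_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's uncompressed bytes.

    Assumption: a name ending in ".gz" means the file is gzip-compressed;
    anything else is read as-is.
    """
    opener = gzip.open if path.endswith(".gz") else open
    digest = hashlib.sha256()
    with opener(path, "rb") as handle:
        # Read in chunks so a large corpus file never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files hold byte-identical data once decompressed."""
    return content_digest(path_a) == content_digest(path_b)
```

For example, same_content("europarl-v3b.fr-en.fr.gz", "wmt08/training/europarl-v3.fr-en.fr") would tell me whether the "v3b" download and the "v3" file named in the baseline are byte-for-byte the same. A result of False would only show that the bytes differ, not which file is the right one.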
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support
