Dear Moses users,

The latest version (1.15) of Moses for Mere Mortals (MMM) has been published.

Main changes:
* MMM is now hosted in the Moses project
* works with Ubuntu 10.04 LTS
* uses the new version of Moses published/updated on 13/14 August 2010.

Link: http://mosesdecoder.svn.sourceforge.net/viewvc/mosesdecoder/trunk/scripts/moses-for-mere-mortals/
For more info on MMM, please read below.

Regards,
Hilário Leal Fontes
Maria José Machado
João Rosas

MOSES FOR MERE MORTALS - A PROTOTYPE OF A REAL WORLD TRANSLATION CHAIN

MMM is a set of scripts that enables a quick, single-step installation of the Moses machine translation system, the training of large corpora, the translation of documents and the automatic (BLEU and NIST) scoring of the output. MMM makes it simple to use very large corpora and is being used for that purpose (translation for real translators) in our working environment. We have therefore coupled it with two Windows add-ins: one converts TMX files into Moses corpora, the other converts Moses translations into TMX files that can be used with a translation memory tool.

MMM has been tested with Ubuntu 10.04 LTS and the Moses version published on August 13, 2010 and updated on August 14, 2010.

A) SOME CHARACTERISTICS:

1) Compiles all the packages used by these scripts with a single instruction;
2) Removes control characters from the input files (these can crash a training);
3) Extracts 2 test files from the corpus files by pseudorandomly selecting non-consecutive segments, which are then erased from the corpus files;
4) A new training does not interfere with the files of a previous training;
5) A new training reuses as much as possible the files created in previous trainings (thus saving time);
6) Detects inversions of corpora (e.g., from en-pt to pt-en), allowing a much quicker training than that of the original language pair (also checks that the inverse training is correct);
7) Stops with an informative message if any of the phases of training (language model building, recaser training, corpus training, memory-mapping, tuning or training test) doesn't produce the expected results;
8) Can limit the duration of tuning;
9) Generates the BLEU and NIST scores of a translation or of a set of translations placed in a single directory (either for each whole document or for each segment of it);
10) Allows you to transfer your trainings to someone else's computer or to another Moses installation on the same computer;
11) All the mkcls, GIZA and MGIZA parameters can be controlled through parameters of the train script;
12) Selected parameters of the Moses scripts and the Moses decoder can be controlled with the train and translate scripts.

B) MOSES VERY SIMPLE DEMO
-------------------------

MMM is meant to let you get results with Moses quickly. It can be placed anywhere on the hard disk of a computer, and each of its scripts can then be run with a single command:

a) create (in order to compile Moses and the packages it uses with a single command);
b) make-test-files;
c) train;
d) translate;
e) score the translation(s) you got; and
f) transfer trained corpora between users or to other places on your disk.

MMM uses non-factored training, a type of training that in our experience already produces good results in a significant number of language pairs, mainly with languages that are not morphologically rich or with language pairs in which the target language is not morphologically rich.

A Quick-Start-Guide helps you to quickly get the feel of it and start getting results. MMM comes with a small demo corpus, too small to do justice to the quality that Moses can achieve, but sufficient to get a general overview of SMT and an idea of how useful Moses can be.
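
To make characteristics 2) and 3) above more concrete, here is a minimal Python sketch of the two ideas: stripping control characters from corpus lines, and pseudorandomly holding out non-consecutive segments as test data. This is not MMM's actual code. The function names, the exact set of characters removed, the seed handling and the representation of the corpus as two aligned lists are all assumptions made for illustration only.

```python
import random
import re

# Assumption: "control characters" means the C0 range plus DEL, keeping
# tab, newline and carriage return. MMM's exact character set may differ.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_control_chars(line):
    """Return the line with control characters removed."""
    return CONTROL_CHARS.sub("", line)


def pick_test_indices(n_segments, n_test, seed=0):
    """Pseudorandomly pick n_test segment indices, no two of them adjacent."""
    # Sample distinct values from a range shortened by (n_test - 1), then
    # shift the i-th smallest value by i. Neighbouring picks therefore
    # always end up at least two positions apart.
    reduced = n_segments - (n_test - 1)
    if n_test < 0 or reduced < n_test:
        raise ValueError("corpus too small for that many non-consecutive segments")
    rng = random.Random(seed)
    base = sorted(rng.sample(range(reduced), n_test))
    return [value + rank for rank, value in enumerate(base)]


def split_corpus(src_lines, tgt_lines, n_test, seed=0):
    """Split aligned source/target lines into (train, test) lists of pairs."""
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target corpora must have the same length")
    held_out = set(pick_test_indices(len(src_lines), n_test, seed))
    train, test = [], []
    for i, pair in enumerate(zip(src_lines, tgt_lines)):
        cleaned = tuple(strip_control_chars(text) for text in pair)
        # Held-out segments go to the test set only, so they are
        # effectively erased from the training corpus.
        (test if i in held_out else train).append(cleaned)
    return train, test


if __name__ == "__main__":
    # Toy corpus; the \x07 (bell) character stands in for stray control bytes.
    src = [f"source sentence {i}\x07" for i in range(10)]
    tgt = [f"target sentence {i}" for i in range(10)]
    train, test = split_corpus(src, tgt, n_test=3, seed=42)
    print(len(train), "training pairs,", len(test), "test pairs")
```

Fixing the seed makes the selection reproducible, which is convenient when the same test segments must be excluded again in a later training. Whether MMM does this is not stated above.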
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support
