Dear Moses users,

The latest version (1.15) of Moses for Mere Mortals (MMM) has been
published.
Main changes:
* MMM is now hosted in the Moses project
* works with Ubuntu 10.04 LTS
* uses the new version of Moses published on 13 August 2010 and updated on
  14 August 2010.
Link:
http://mosesdecoder.svn.sourceforge.net/viewvc/mosesdecoder/trunk/scripts/moses-for-mere-mortals/

For more info on MMM, please read below.

Regards,
Hilário Leal Fontes    Maria José Machado    João Rosas

MOSES FOR MERE MORTALS - A PROTOTYPE OF A REAL WORLD TRANSLATION CHAIN
MMM is a set of scripts that installs the Moses machine translation system
in a single step and supports the training of large corpora, the translation
of documents and the automatic scoring (BLEU and NIST) of the output.

MMM makes it simple to use very large corpora and is being used for that
purpose (translation for real translators) in our working environment. We
have therefore coupled it with two Windows add-ins that convert TMX files
into Moses corpora and Moses translations into TMX files that can be used
with a translation memory tool.

MMM has been tested with Ubuntu 10.04 LTS and the Moses version published on
August 13, 2010 and updated on August 14, 2010.

A) SOME CHARACTERISTICS:

1) Compiles all the packages used by these scripts with a single
instruction;
2) Removes control characters from the input files (these can crash a
training);
3) Extracts 2 test files from the corpus files by pseudorandomly selecting
non-consecutive segments, which are then removed from the corpus files;
4) A new training does not interfere with the files of a previous training;
5) A new training reuses as much as possible the files created in previous
trainings (thus saving time);
6) Detects inversions of corpora (e.g., from en-pt to pt-en), allowing a
much quicker training than that of the original language pair (also checks
that the inverse training is correct);
7) Stops with an informative message if any of the phases of training
(language model building, recaser training, corpus training, memory-mapping,
tuning or training test) doesn't produce the expected results;
8) Can limit the duration of tuning;
9) Generates the BLEU and NIST scores of a translation or of a set of
translations placed in a single directory (either for each whole document or
for each segment of it);
10) Allows you to transfer your trainings to someone else's computer or to
another Moses installation on the same computer;
11) All the mkcls, GIZA and MGIZA parameters can be controlled through
parameters of the train script;
12) Selected parameters of the Moses scripts and the Moses decoder can be
controlled with the train and translate scripts.
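
Items 2 and 3 above can be illustrated with a short Python sketch. This is
not the code of the MMM scripts; the function names, the seed parameter and
the choice to keep tab characters are assumptions made for illustration. The
"2 test files" are read here as one file per language side, which is also an
assumption.

```python
import random
import unicodedata


def strip_control_chars(line: str) -> str:
    """Drop control characters (Unicode category Cc) from one segment.

    Tabs are kept (an assumption). The segment is expected without its
    trailing newline, since a newline is also a control character.
    """
    return "".join(
        ch for ch in line if ch == "\t" or unicodedata.category(ch) != "Cc"
    )


def split_off_test_set(source, target, test_size, seed=0):
    """Move `test_size` non-consecutive segment pairs out of a parallel corpus.

    `source` and `target` are lists of aligned segments.
    Returns (train_source, train_target, test_source, test_target).
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same number of segments")

    rng = random.Random(seed)  # fixed seed makes the pseudorandom choice repeatable
    candidates = list(range(len(source)))
    rng.shuffle(candidates)

    chosen = set()
    for i in candidates:
        if len(chosen) == test_size:
            break
        # Skip an index whose direct neighbour has already been picked.
        if i - 1 in chosen or i + 1 in chosen:
            continue
        chosen.add(i)
    if len(chosen) < test_size:
        raise ValueError("could not pick that many non-consecutive segments")

    # The selected pairs are removed from the training data.
    train_src = [s for i, s in enumerate(source) if i not in chosen]
    train_tgt = [t for i, t in enumerate(target) if i not in chosen]
    test_src = [source[i] for i in sorted(chosen)]
    test_tgt = [target[i] for i in sorted(chosen)]
    return train_src, train_tgt, test_src, test_tgt
```

Picking the same indices on both language sides keeps the test files aligned,
and removing them from the training data keeps the later scores from being
inflated by segments the system has already seen.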

B) MOSES VERY SIMPLE DEMO
-------------------------
MMM is meant to get you results with Moses quickly. It can be placed
anywhere on the hard disk of a computer, and each of its scripts can then
be run with a single command:
a) create (in order to compile Moses and the packages it uses with a single
command);
b) make-test-files;
c) train;
d) translate;
e) score the translation(s) you got; and
f) transfer trained corpora between users or to other locations on your disk.

MMM uses non-factored training, a type of training that in our experience
already produces good results for a significant number of language pairs,
mainly those in which the languages, or at least the target language, are
not morphologically rich. A Quick-Start-Guide helps you get the feel of it
and start getting results quickly.

It comes with a small demo corpus, too small to do justice to the quality
that Moses can achieve, but sufficient to get a general overview of SMT and
an idea of how useful Moses can be.