Dear Moses team and users,

I am using Moses to translate from an imaginary language "French" to
English, and was hoping I could get some comments on my current setup.

Does the following use of Moses sound reasonable to anybody?  I have
posted it below as a commented Makefile excerpt.  Note that it is
based on the tutorials:

  http://www.statmt.org/moses/?n=FactoredTraining.HomePage
  http://www.statmt.org/moses/?n=Moses.Tutorial

software
--------
- GIZA++ 1.0.2
   (compiled /without/ the -DBINARY_SEARCH_FOR_TTABLE flag)
- SRILM
   (standard)
- moses 2008-7-11
   (standard)

usage
-----
My corpus consists of two text files,
 foo/train-corpus.en
 foo/train-corpus.fr

Each line in the file consists of a sentence in the respective language,
with (for example) the sentence in line 3 of the English file
corresponding to the sentence in line 3 of the "French" file.
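
Since everything downstream relies on this line-by-line correspondence, a
quick sanity check before training may be worthwhile.  The following is only
a sketch: check_parallel is a hypothetical helper name, not something that
ships with Moses, and it compares line counts only, not content.

```shell
# Hypothetical helper (not part of Moses): succeed only if both sides of a
# parallel corpus have the same number of lines.
# Usage: check_parallel foo/train-corpus.en foo/train-corpus.fr
check_parallel() {
    # tr strips the padding that some wc implementations add
    n_src=$(wc -l < "$1" | tr -d ' ')
    n_tgt=$(wc -l < "$2" | tr -d ' ')
    if [ "$n_src" -ne "$n_tgt" ]; then
        echo "line count mismatch: $1 has $n_src, $2 has $n_tgt" >&2
        return 1
    fi
    echo "$n_src sentence pairs"
}
```

A mismatch here would mean the sentence pairs are misaligned somewhere, so
it seems safer to stop than to train on the files as they are.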

> %/m-corpus.en %/m-corpus.fr : %/train-corpus.en %/train-corpus.fr
>         cd $(<D) ; $(MOSES_SCRIPTS)/training/clean-corpus-n.perl train-corpus en fr m-corpus 1 100

Rather than using my corpus directly, I first clean it up with the
clean-corpus-n.perl script, which keeps only sentence pairs of 1 to 100
tokens and produces the files foo/m-corpus.en and foo/m-corpus.fr.
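
To make the effect of the "1 100" arguments concrete, here is a rough
approximation of the length filter alone.  It is not the implementation of
clean-corpus-n.perl, which may apply further checks; filter_by_length is a
made-up name, and the sketch assumes that no sentence contains a tab
character.

```shell
# Hypothetical approximation of a min/max length filter on a parallel corpus.
# $1 = source file, $2 = target file, $3 = min tokens, $4 = max tokens
# Prints the surviving pairs as "source<TAB>target", one pair per line.
filter_by_length() {
    paste "$1" "$2" | awk -F '\t' -v min="$3" -v max="$4" '
        {
            # split on runs of whitespace to count tokens on each side
            n_src = split($1, src, " ")
            n_tgt = split($2, tgt, " ")
            if (n_src >= min && n_src <= max && n_tgt >= min && n_tgt <= max)
                print
        }'
}
```

A pair is dropped as soon as either side is empty or too long, so the two
output files stay aligned line by line.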

> %.lm : %
>         $(SRILM_BINDIR)/ngram-count -text $< -lm $@

From foo/m-corpus.en, I train a language model (foo/m-corpus.en.lm) using
SRILM's ngram-count, passing only the -text and -lm options and leaving
everything else at its defaults.  I assume these are reasonable options to
pass to SRILM.

> %/model/moses.ini: %/m-corpus.en.lm
>         cd $(<D); $(MOSES_SCRIPTS)/training/train-factored-phrase-model.perl\
>           --root-dir .\
>           --corpus $(basename $(basename $(<F)))\
>           --f fr --e en --lm 0:3:$(<F):0

Armed with an English language model, I use the script
  train-factored-phrase-model.perl
to train the translation model.  For simplicity, the model is unfactored.
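
For readers unfamiliar with the --lm argument, this sketch splits the value
passed above into its colon-separated fields.  The field meanings (factor,
n-gram order, file name, LM type, with type 0 meaning SRILM) reflect my
understanding of the Moses documentation and should be treated as an
assumption; parse_lm_spec is a hypothetical helper, and it would break on a
file name that contains a colon.

```shell
# Hypothetical helper: split an --lm value of the form
# factor:order:filename:type into labelled fields.
parse_lm_spec() {
    old_ifs=$IFS
    IFS=:
    # word-split the single argument on ":" into $1..$4
    set -- $1
    IFS=$old_ifs
    echo "factor=$1 order=$2 file=$3 type=$4"
}

parse_lm_spec "0:3:m-corpus.en.lm:0"
```

The order field (3 here) should agree with the order of the model that
ngram-count actually built, which as far as I know is 3 when no -order
option is given.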

This produces foo/model/moses.ini, among other files in foo/model,
notably foo/model/phrase-table.0-0.gz.

> %/test.results: %/test-corpus.fr %/test-corpus.en %/model/moses.ini
>         cd $(<D); moses -f model/moses.ini < $(<F) > $(@F)

Finally, some translation.  I call Moses with the configuration file
foo/model/moses.ini on the input foo/test-corpus.fr, and I produce
foo/test.results, which looks a bit like English indeed.

Any thoughts?

Thanks!

-- 
Eric Kow <http://www.nltg.brighton.ac.uk/home/Eric.Kow>
PGP Key ID: 08AC04F9

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
