All,
After this year's MT Marathon in Edinburgh, a very simple Moses training
pipeline has been implemented. It was developed as an MT Marathon project
in order to assess an arrow-based pipeline library written in Python. The
library is freely available at
https://github.com/ianj-als/pypeline .
The training pipeline can be found in the GitHub repository
https://github.com/ianj-als/mosesdecoder , which is a fork of the Moses
decoder Git repo. If you're interested in taking a look at the arrow-based
training pipeline, please clone the fork:
$ git clone https://github.com/ianj-als/mosesdecoder
$ cd mosesdecoder
$ git submodule init
$ git submodule update
$ cd contrib/arrow-pipelines/python
Five environment variables are needed by manage.py:
MOSES_HOME = <a directory where the Moses bin and scripts directories can
be found>
IRSTLM = <an installation directory of IRSTLM>
GIZA_HOME = <an installation directory of GIZA++>
PYTHONPATH = `pwd`/libs/pypeline/src:`pwd`/training
MANAGE_HOME = `pwd`
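As a rough sketch, the five variables above could be set in a POSIX shell as follows. The three install locations under $HOME are hypothetical placeholders, not paths taken from the repository, and the snippet assumes it is run from contrib/arrow-pipelines/python so that the current directory is the one manage.py lives in. The loop at the end simply reports any variable that is still empty.

```shell
# Hypothetical install locations: replace these with your own paths.
export MOSES_HOME="$HOME/mosesdecoder"
export IRSTLM="$HOME/irstlm"
export GIZA_HOME="$HOME/giza-pp"

# Assumes the current directory is contrib/arrow-pipelines/python.
export MANAGE_HOME="$(pwd)"
export PYTHONPATH="$MANAGE_HOME/libs/pypeline/src:$MANAGE_HOME/training"

# Report any of the five variables that is still empty.
missing=0
for name in MOSES_HOME IRSTLM GIZA_HOME PYTHONPATH MANAGE_HOME; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "missing: $name" >&2
        missing=1
    fi
done
```

Note that this overwrites any existing PYTHONPATH, as the listing above does; append to it instead if you rely on other entries.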
Create a working directory wherever you like, change into it, and run:
$ python $MANAGE_HOME/manage.py <source language code, e.g. en> <target language code, e.g. de> <source filename> <target filename>
Please note that the source and target files should be cleaned beforehand
using the Moses clean-corpus-n.perl script. Unfortunately, this step was
omitted from the pipeline, but can be left as an exercise for the reader!
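For anyone attempting that exercise by hand, a minimal sketch of the cleaning step followed by the manage.py call might look like this. It assumes the parallel files are named corpus.en and corpus.de in the current directory, that the script sits under $MOSES_HOME/scripts/training, and that sentence lengths of 1 to 80 tokens are acceptable; all three are illustrative choices rather than anything the pipeline prescribes.

```shell
# Illustrative language pair; the files corpus.en and corpus.de are assumed.
SRC=en
TGT=de

# Assumed location of the cleaning script inside the Moses scripts directory.
CLEANER="${MOSES_HOME:-}/scripts/training/clean-corpus-n.perl"

if [ -f "$CLEANER" ]; then
    # Arguments: corpus stem, source code, target code, output stem, min and
    # max sentence length. Writes corpus.clean.en and corpus.clean.de.
    perl "$CLEANER" corpus "$SRC" "$TGT" corpus.clean 1 80

    # Train on the cleaned files.
    python "$MANAGE_HOME/manage.py" "$SRC" "$TGT" \
        "corpus.clean.$SRC" "corpus.clean.$TGT"
else
    echo "clean-corpus-n.perl not found at $CLEANER" >&2
fi
```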
Happy pipelining!
--
Kind regards,
Ian Johnson
Software Engineer
[email protected]
Applied Language Solutions