Fix formatting...

Hey,

BilingualLM is implemented and, as of last week, resides in Moses master:
https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/BilingualLM.cpp

To compile it you need a neural network backend. Two are currently
supported: OxLM and NPLM. Adding a new backend is relatively easy: you need
to implement the interface shown here:
https://github.com/moses-smt/mosesdecoder/blob/master/moses/LM/bilingual-lm/BiLM_NPLM.h

To compile with the OxLM backend, build Moses with the switch
    --with-oxlm=/path/to/oxlm
To compile with the NPLM backend, build Moses with the switch
    --with-nplm=/path/to/nplm
(You need this fork of nplm: https://github.com/rsennrich/nplm)

Unfortunately, documentation is not yet available, so here is a short
summary of how to train a model and use it with the NPLM backend.

Use the extract_training script to prepare the aligned bilingual corpus:
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/bilingual-lm/extract_training.py

You need the following options:

    "-e", "--target-language", type="string", dest="target_language")   // Mandatory, for example es
    "-f", "--source-language", type="string", dest="source_language")   // Mandatory, for example en
    "-c", "--corpus", type="string", dest="corpus_stem")                // path/to/corpus. The directory you specify should contain the files corpus.sourcelang and corpus.targetlang
    "-t", "--tagged-corpus", type="string", dest="tagged_stem")         // Optional, for backoff to POS tags
    "-a", "--align", type="string", dest="align_file")                  // Mandatory alignment file
    "-w", "--working-dir", type="string", dest="working_dir")           // Output directory of the model
    "-n", "--target-context", type="int", dest="n")
    "-m", "--source-context", type="int", dest="m")                     // The actual context size is 2*m + 1; m is the number of words on each side (left and right)
    "-s", "--prune-source-vocab", type="int", dest="sprune")            // Cutoff vocabulary threshold
    "-p", "--prune-target-vocab", type="int", dest="tprune")            // Cutoff vocabulary threshold

Then use the training script to train the model:
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/training/bilingual-lm/train_nplm.py

Example execution:

    train_nplm.py -w de-en-500250source/ -r de-en150nopos-source750 -n 16 -d 0 --nplm-home=/home/abmayne/code/deepathon/nplm_one_layer/ -c corpus.1.word -i 750 -o 750

where
    -i and -o are the input and output embedding sizes
    -n is the total n-gram size
    -d is the number of hidden layers
    -w and -c are the same as the extract_training options
    -r is the output directory of the model

Consult the Python script for a more detailed description of the options.

Once training has finished, the output directory should contain a trained
bilingual neural network language model.

To run it in Moses as a feature function you need the following line:

    BilingualNPLM filepath=/mnt/gna0/nbogoych/new_nplm_german/de-en150nopos/train.10k.model.nplm.10 target_ngrams=4 source_ngrams=9 source_vocab=/mnt/gna0/nbogoych/new_nplm_german/de-enIWSLTnopos/vocab.source target_vocab=/mnt/gna0/nbogoych/new_nplm_german/de-enIWSLTnopos/vocab.targe

The source and target vocab files are located in the working directory used
to prepare the neural network language model.
target_ngrams doesn't include the predicted word (so target_ngrams=4 means
1 predicted word and 4 target context words).
The total order of the model would be target_ngrams + source_ngrams + 1.

I will write proper documentation in the coming weeks. If you have any
problems running it, please contact me.

Cheers,

Nick

On Wed, Nov 26, 2014 at 1:02 PM, Nikolay Bogoychev <[email protected]> wrote:

> [...]
>
> On Wed, Nov 26, 2014 at 11:53 AM, Tom Hoar <[email protected]> wrote:
>
>> Hieu,
>>
>> Sorry I missed you in Vancouver. I just reviewed your slide deck from the
>> MosesCore TAUS Round Table in Vancouver
>> (taus-moses-industry-roundtable-2014-changes-in-moses-hieu-hoang-university-of-edinburgh).
>>
>> In particular, I'm interested in the "Bilingual Language Models" that
>> "replicate Delvin et al, 2014". A search on statmt.org/moses doesn't
>> show any hits searching for "delvin". So, A) is the code finished? If so,
>> B) are there any instructions how to enable/use this feature? If not,
>> C) what kind of help do you need to test the code for release?
>>
>> --
>>
>> Best regards,
>> Tom Hoar
>> Managing Director
>> *Precision Translation Tools Co., Ltd.*
>> Bangkok, Thailand
>> Web: www.precisiontranslationtools.com
>> Mobile: +66 87 345-1875
>> Skype: tahoar
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
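
For anyone scripting the corpus preparation step, the extract_training.py options listed earlier in this message can be assembled into a command line roughly as follows. This is only a sketch: build_extract_command is a hypothetical helper, not part of Moses, and every path and context size in the example call is a made-up placeholder. Only the flag names come from the option list above.

```python
import shlex


def build_extract_command(source_lang, target_lang, corpus_stem, align_file,
                          working_dir, target_context, source_context,
                          script="extract_training.py"):
    """Return an argv list for extract_training.py.

    Hypothetical helper: it only arranges the flags named in the message
    and does not run or validate anything.
    """
    return [
        script,
        "--source-language", source_lang,
        "--target-language", target_lang,
        "--corpus", corpus_stem,
        "--align", align_file,
        "--working-dir", working_dir,
        "--target-context", str(target_context),
        "--source-context", str(source_context),
    ]


# Placeholder values, chosen only for illustration.
cmd = build_extract_command(
    source_lang="en",
    target_lang="es",
    corpus_stem="data/corpus",
    align_file="data/corpus.align",
    working_dir="work/",
    target_context=4,
    source_context=4,
)

# Print a shell-quoted version that could be pasted into a terminal.
print(shlex.join(cmd))
```

Keeping the command as a list means it can be handed to subprocess.run unchanged, without going through a shell.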
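
The context-size arithmetic described in the message (2*m + 1 source words, and target_ngrams + source_ngrams + 1 for the total order) can be checked with a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, the paths are placeholders, and treating source_ngrams as equal to 2*m + 1 is my reading of the option description, not something the scripts are confirmed to enforce.

```python
def source_window(m):
    """Source context size for m words on each side: 2*m + 1."""
    return 2 * m + 1


def model_order(target_ngrams, source_ngrams):
    """Total order: target context + source context + 1 predicted word."""
    return target_ngrams + source_ngrams + 1


def feature_line(model_path, source_vocab, target_vocab,
                 target_ngrams, source_ngrams):
    """Format a BilingualNPLM feature line with the keys used in the message."""
    return (
        "BilingualNPLM filepath={} target_ngrams={} source_ngrams={} "
        "source_vocab={} target_vocab={}"
    ).format(model_path, target_ngrams, source_ngrams,
             source_vocab, target_vocab)


# Values from the example feature line: target_ngrams=4, source_ngrams=9.
# Assumption: source_ngrams=9 corresponds to m=4 words on each side.
src = source_window(4)
order = model_order(4, src)

# Placeholder paths, not real files.
line = feature_line("/path/to/model.nplm", "/path/to/source_vocab",
                    "/path/to/target_vocab", 4, src)

print(src, order)
print(line)
```

With target_ngrams=4 and source_ngrams=9 this gives a total order of 14.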
