You should start with a simple factored model first, perhaps using only one translation model with two factors. Then move on to one translation model and one generation model.
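
For example, a minimal two-factor setup (translating surface form and tag together, with no generation step) could use flags roughly like these, assuming the factor numbering in your data (0 = surface, 1 = lemma, 2 = tag):

--translation-factors 0-0,2 \
--decoding-steps t0

and a setup with one translation model and one generation model would then look something like:

--translation-factors 1-1 \
--generation-factors 1-0 \
--decoding-steps t0,g0

Once those decode cleanly, add the remaining factors back in one at a time.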

Factored models are difficult to control; they use a lot of memory and take a lot of time. You may be getting errors because the decoder runs out of memory.
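
To rule out memory as the cause (a general suggestion, with placeholder paths), you can filter the trained model down to just your test input before decoding, so that only the phrase-table entries that can apply get loaded:

~/mosesdecoder/scripts/training/filter-model-given-input.pl \
~/FactoredModel/SmallModel/filtered \
~/FactoredModel/SmallModel/model/moses.ini test.en

~/mosesdecoder/bin/moses -f ~/FactoredModel/SmallModel/filtered/moses.ini < test.en

If the filtered model still crashes on a single sentence, the problem is more likely the configuration than memory.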

On 28/12/15 10:01, gozde gul wrote:
Hi,

I am trying to perform a three-factor translation from English to Turkish. My example input is as follows:

En: Life+NNP|Life|NNP end+VBZ_never+RB|end|VBZ_never+RB but+CC|but|CC earthly+JJ|earthly|JJ life+NN|life|NN do+VBZ|do|VBZ .+.|.|.
Tr: Hayat|hayat|+Noun+A3sg+Pnon+Nom hiç|hiç|+Adverb bitmez|bit|+Verb+Neg+Aor+A3sg fakat|fakat|+Conj dünyadaki|dünya|+Noun+A3sg+Pnon+Loc^DB+Adj+Rel hayat|hayat|+Noun+A3sg+Pnon+Nom biter|bit|+Verb+Pos+Aor+A3sg .|.|+Punc

My translation and generation factors and decoding steps are as follows. I am pretty sure they are correct:
--translation-factors 1-1+2-2+0-0 \
--generation-factors 1,2-0 \
--decoding-steps t2:t0,t1,g0
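
If I understand the generated configuration correctly, these steps should appear in the [mapping] section of moses.ini roughly as follows (path 0 uses the direct surface-to-surface table t2; path 1 translates lemma and tag separately and then generates the surface form):

[mapping]
0 T 2
1 T 0
1 T 1
1 G 0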

I have created language models for all 3 factors with IRSTLM, following the steps explained on the Moses website.
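
For each factor the steps were roughly the following (simplified here with placeholder file names; the exact --text flag syntax depends on the IRSTLM version):

~/irstlm/bin/add-start-end.sh < corpus.factor0.tr > corpus.factor0.sb.tr
~/irstlm/bin/build-lm.sh -i corpus.factor0.sb.tr -t ./tmp -p -s improved-kneser-ney -o sur.lm.ilm.gz
~/irstlm/bin/compile-lm --text=yes sur.lm.ilm.gz sur.lm.arpa.tr
~/mosesdecoder/bin/build_binary sur.lm.arpa.tr sur.lm.blm.tr

(The last step binarises the ARPA file with KenLM's build_binary, which I believe is what LM type 8 in the --lm flags below refers to.)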

If I train with the following command, it creates a moses.ini. When I manually check the phrase tables and the generation table, they look meaningful.
~/mosesdecoder/scripts/training/train-model.perl \
--parallel --mgiza \
--external-bin-dir ~/workspace/bin/training-tools/mgizapp \
--root-dir ~/FactoredModel/SmallModel/  \
--corpus ~/FactoredModel/SmallModel/factored-corpus/training/korpus_1000K.en-tr.KO.recleaned_new \
--f en --e tr --alignment grow-diag-final-and \
--reordering msd-bidirectional-fe \
--lm 0:3:$HOME/corpus/FilteredCorpus/training/lm/surLM/sur.lm.blm.tr:8 \
--lm 1:3:$HOME/corpus/FilteredCorpus/training/lm/lemmaLM/lemma.lm.blm.tr:8 \
--lm 2:3:$HOME/corpus/FilteredCorpus/training/lm/postagLM/postags.lm.blm.tr:8 \
--alignment-factors 1-1 \
--translation-factors 1-1+2-2+0-0 \
--generation-factors 1,2-0 \
--decoding-steps t2:t0,t1,g0 >& ~/FactoredModel/trainingSmall3lm.out

But when I try to decode a very simple one-line sentence, I get a "Segmentation fault (core dumped)" message. You can see the detailed decoding log here: <https://www.dropbox.com/s/2xgi681k2ssus5z/error.txt?dl=0>. I have tried many things and I am at a dead end, so I would really appreciate your help.
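
One general way to get more detail on a crash like this (a debugging sketch with placeholder paths) is to decode with higher verbosity and, if it still dies, run the decoder under gdb to capture a backtrace:

~/mosesdecoder/bin/moses -f ~/FactoredModel/SmallModel/model/moses.ini -v 2 < test.en

gdb --args ~/mosesdecoder/bin/moses -f ~/FactoredModel/SmallModel/model/moses.ini
(gdb) run < test.en
(gdb) bt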

Thanks,

Gozde



--
Hieu Hoang
http://www.hoang.co.uk/hieu
