Hi all again,

Here is a little more information, in case someone has an idea; I still
haven't been able to figure it out.

When I tune a model that has only one translation table, tuning works
fine, but only with a non-factored tuning corpus. If I use a factored tuning
corpus, Moses does one run and sets all weights to zero. If I have two
translation tables, Moses does not tune with the non-factored corpus because
it is missing factors. If I use the factored corpus, I get a result similar
to the one above: tuning stops after one run and sets all weights to zero.
A similar error was mentioned a few months back and the solution was to turn
off MBR decoding; however, I am not using it. I just use the command:

~/mosesdecoder/scripts/training/mert-moses.pl \
  ~/working/IT_corpus/TMX/txt/tuning_corpus/tuning_corpus.tagged.en \
  ~/working/IT_corpus/TMX/txt/tuning_corpus/tuning_corpus.tagged.sl \
  ~/mosesdecoder/bin/moses \
  ~/working/IT_corpus/TMX/txt/factored_corpus/complex/model/moses.ini \
  --mertdir ~/mosesdecoder/bin/ --decoder-flags="-threads 32"

Is there something I am missing? Do I have to add anything else for tuning
a factored model?
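
To rule out a mismatch between the tuning files and the input factors the
model expects, the number of factors per token can be counted. Below is a
minimal sketch in Python (not part of Moses; the function names are made up
for illustration, and it assumes factors are separated by "|" and tokens by
whitespace):

```python
from collections import Counter


def factor_counts(lines, sep="|"):
    """Count tokens by the number of sep-separated factors they carry."""
    counts = Counter()
    for line in lines:
        for token in line.split():
            counts[len(token.split(sep))] += 1
    return counts


def report(path):
    """Print one line per distinct factor count found in the file at path."""
    with open(path, encoding="utf-8") as handle:
        for n_factors, n_tokens in sorted(factor_counts(handle).items()):
            print(f"{n_factors} factor(s): {n_tokens} tokens")
```

Calling report() on each tuning file shows whether every token carries the
same number of factors; a file that mixes counts would be worth a closer look.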

Any help will be greatly appreciated.

Best regards,

Saso

---------- Forwarded message ----------
From: Sašo Kuntaric <saso.kunta...@gmail.com>
Date: 2016-06-20 19:36 GMT+02:00
Subject: Binarization fails with the Segmentation Fault error
To: moses-support <moses-support@mit.edu>


Hi all,

Me again (last time I hope). I have successfully trained and tuned my
factored model. Here are both moses.ini files:

#########################
### MOSES CONFIG FILE ###
#########################

# input factors
[input-factors]
0
1

# mapping steps
[mapping]
0 T 0
0 G 0
0 T 1

[distortion-limit]
6

# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz input-factor=0 output-factor=1
PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz input-factor=1 output-factor=2
Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
Distortion
KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3

# dense weights for feature functions
[weight]
# The default weights are NOT optimized for translation quality. You MUST tune the weights.
# Documentation for tuning is here: http://www.statmt.org/moses/?n=FactoredTraining.Tuning
UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
TranslationModel1= 0.2 0.2 0.2 0.2
GenerationModel0= 0.3 0
Distortion0= 0.3
LM0= 0.5
LM1= 0.5

# MERT optimized configuration
# decoder /home/ksaso/mosesdecoder/bin/moses
# BLEU 0 on dev /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/tuning/tuning-corpus.tagged.en
# We were before running iteration 2
# finished Mon Jun 20 16:19:08 CEST 2016
### MOSES CONFIG FILE ###
#########################

# input factors
[input-factors]
0
1

# mapping steps
[mapping]
0 T 0
0 G 0
0 T 1

[distortion-limit]
6

# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz input-factor=0 output-factor=1
PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz input-factor=1 output-factor=2
Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
Distortion
KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3

# dense weights for feature functions

[threads]
16
[weight]

Distortion0= 0
LM0= 0
LM1= 0
WordPenalty0= 0
PhrasePenalty0= 0
TranslationModel0= 0 0 0 0
TranslationModel1= 0 0 0 0
GenerationModel0= 0 0
UnknownWordPenalty0= 1

First of all, is it strange that I get all zeroes after tuning?
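
A quick way to spot this condition in a tuned moses.ini is to read the
[weight] section and list the features whose weights are all zero. A minimal
sketch in Python (not part of Moses; both function names are made up for
illustration):

```python
def read_weights(ini_text):
    """Return {feature name: [floats]} from the [weight] section of a moses.ini."""
    weights = {}
    in_weight_section = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new section header starts; only [weight] is of interest.
            in_weight_section = (line == "[weight]")
            continue
        if not in_weight_section or not line or line.startswith("#"):
            continue
        if "=" not in line:
            continue
        name, values = line.split("=", 1)
        try:
            weights[name.strip()] = [float(v) for v in values.split()]
        except ValueError:
            # Not a weight line (for example a wrapped comment); skip it.
            continue
    return weights


def all_zero_features(weights):
    """Return the sorted names of features whose weights are all exactly zero."""
    return sorted(
        name for name, vals in weights.items()
        if vals and all(v == 0.0 for v in vals)
    )
```

Running all_zero_features(read_weights(text)) on the tuned file above would
list every feature except UnknownWordPenalty0.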

My problem is that translation with this model is spectacularly slow (a
few days to translate a couple of thousand words with a 2.4 million line
corpus), so naturally I tried to binarize my phrase tables with the commands

~/mosesdecoder/bin/processPhraseTableMin \
  -in ~/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz \
  -nscores 4 -out ~/working/binarised_model/phrase-table.0-1

and

~/mosesdecoder/bin/processPhraseTableMin \
  -in ~/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz \
  -nscores 4 -out ~/working/binarised_model/phrase-table.1-2

The process itself finishes without errors and I can run the translation
with the command

~/mosesdecoder/bin/moses \
  -f /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/moses.ini

But when I try to enter my text, I get the following:

 Translating: use|NN of|IN light|JJ
Line 1: Initialize search took 0.000 seconds total
Segmentation fault (core dumped)

When I try to filter my model, I get the same error. Any ideas what could
be causing this?

My final moses.ini file looks like this:

# MERT optimized configuration
# decoder /home/ksaso/mosesdecoder/bin/moses
# BLEU 0 on dev /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/tuning/tuning-corpus.tagged.en
# We were before running iteration 2
# finished Mon Jun 20 16:19:08 CEST 2016
### MOSES CONFIG FILE ###
#########################

# input factors
[input-factors]
0
1

# mapping steps
[mapping]
0 T 0
0 G 0
0 T 1

[distortion-limit]
6

# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/phrase-table.0-1.minphr input-factor=0 output-factor=1
PhraseDictionaryCompact name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/phrase-table.1-2.minphr input-factor=1 output-factor=2
Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
Distortion
KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3

# dense weights for feature functions

[threads]
16
[weight]

Distortion0= 0
LM0= 0
LM1= 0
WordPenalty0= 0
PhrasePenalty0= 0
TranslationModel0= 0 0 0 0
TranslationModel1= 0 0 0 0
GenerationModel0= 0 0
UnknownWordPenalty0= 1

And one more question ... can I run a translation (with the
~/mosesdecoder/bin/moses command) multi-threaded?

Thanks for all the help and best regards,

Saso