I don't know the exact problem, but your factored model looks too
complicated, so the tuning algorithm more or less gives up.
I would try a very simple model first, e.g.
translate 0 -> 0,1,2,3
or
translate 0,1 -> 0,1,2,3
Once you see that working correctly, add a generation model.
You have to do this bit by bit and see what happens at each step.
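For the first of those simple models, the input side needs only factor 0. A minimal sketch of stripping a factored file down to chosen factors, assuming tokens are whitespace-separated and factors are joined with "|" as in the input line quoted further down the thread. The helper name keep_factors is hypothetical and not part of Moses:

```python
def keep_factors(line, factors, sep="|"):
    """Keep only the selected factor indices of every token in one line."""
    reduced = []
    for token in line.split():
        parts = token.split(sep)
        # Rebuild the token from the requested factors, in the given order.
        reduced.append(sep.join(parts[i] for i in factors))
    return " ".join(reduced)


# Input line taken from the decoder log quoted later in this thread.
line = "use|NN of|IN light|JJ"
print(keep_factors(line, [0]))     # use of light
print(keep_factors(line, [0, 1]))  # use|NN of|IN light|JJ
```

Running the same function over every line of the tuning input gives a file that matches a "0 -> ..." model; the target side stays fully factored.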
On 28/06/2016 20:44, Sašo Kuntaric wrote:
Well, I installed Moses only a few months ago, so it should be the
latest version.
I find it really strange. I have tried everything - binarizing tables
(which finishes with no problems), using the --no-filter-phrase-table
parameter, adding language models for all the factors I have (this one
gave me a segmentation fault) and I always get the same result. Tuning
stops after two runs and all the weights get set to zero with the message
(2) BEST at 2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 => 0 at Tue Jun 28 17:38:43 CEST 2016
None of the weights changed more than 1e-05. Stopping.
The translation models themselves are created with no issues. If I
have one translation table, I can tune them with an unfactored corpus,
but as soon as I use a factored one, everything goes south. If I have
two translation tables, I cannot tune with an unfactored file, since
it wants the stated number of factors.
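Since the decoder insists on the stated number of factors, one thing worth ruling out is a tuning file in which some tokens carry too few, too many, or empty factors. A minimal sketch of such a check, assuming "|" as the factor separator; find_bad_tokens is a hypothetical helper, not a Moses tool:

```python
def find_bad_tokens(lines, expected, sep="|"):
    """Return (line_no, token) pairs whose factors do not match `expected`."""
    bad = []
    for line_no, line in enumerate(lines, start=1):
        for token in line.split():
            parts = token.split(sep)
            # Flag a wrong factor count, and also empty factors such as
            # "use|" or a bare "|" that was never escaped.
            if len(parts) != expected or "" in parts:
                bad.append((line_no, token))
    return bad


sample = ["use|NN of|IN light|JJ", "use of|IN light|JJ"]
print(find_bad_tokens(sample, expected=2))  # [(2, 'use')]
```

An empty result on the real tuning file would suggest the factor layout itself is not the cause.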
I would really appreciate it if someone has an idea of what to do.
Best regards,
Saso
2016-06-27 14:45 GMT+02:00 Rajen Chatterjee
<rajen.k.chatter...@gmail.com>:
Hi, in the past I had a similar problem: the weights were going to 0
after one iteration of tuning. I do not know the cause of
this, but if I remember correctly, when I used another version of
Moses (I think Release-3.0) I didn't have this problem.
On Sun, Jun 26, 2016 at 1:40 PM, Sašo Kuntaric
<saso.kunta...@gmail.com> wrote:
Hi all again,
A little more info, in case someone has any ideas, as I still
haven't been able to figure it out.
When I tune models that have only one translation table,
tuning works fine, but only with a non-factored tuning
corpus. If I use a factored tuning corpus, Moses does one run
and sets all weights to zero. If I have two translation
tables, Moses refuses to tune with the non-factored corpus
because factors are missing. If I use the factored corpus, I
get a similar result as above: tuning stops after one run and
sets all weights to zero. A similar error was mentioned a few
months back, and the solution was to turn off MBR decoding;
however, I am not using it. I just use the command:
~/mosesdecoder/scripts/training/mert-moses.pl \
  ~/working/IT_corpus/TMX/txt/tuning_corpus/tuning_corpus.tagged.en \
  ~/working/IT_corpus/TMX/txt/tuning_corpus/tuning_corpus.tagged.sl \
  ~/mosesdecoder/bin/moses \
  ~/working/IT_corpus/TMX/txt/factored_corpus/complex/model/moses.ini \
  --mertdir ~/mosesdecoder/bin/ --decoder-flags="-threads 32"
Is there something I am missing? Do I have to add anything
else for tuning a factored model?
Any help will be greatly appreciated.
Best regards,
Saso
---------- Forwarded message ----------
From: Sašo Kuntaric <saso.kunta...@gmail.com>
Date: 2016-06-20 19:36 GMT+02:00
Subject: Binarization fails with the Segmentation Fault error
To: moses-support <moses-support@mit.edu>
Hi all,
Me again (last time I hope). I have successfully trained and
tuned my factored model. Here are both moses.ini files:
#########################
### MOSES CONFIG FILE ###
#########################
# input factors
[input-factors]
0
1
# mapping steps
[mapping]
0 T 0
0 G 0
0 T 1
[distortion-limit]
6
# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz input-factor=0 output-factor=1
PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz input-factor=1 output-factor=2
Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
Distortion
KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3
# dense weights for feature functions
[weight]
# The default weights are NOT optimized for translation quality. You MUST tune the weights.
# Documentation for tuning is here: http://www.statmt.org/moses/?n=FactoredTraining.Tuning
UnknownWordPenalty0= 1
WordPenalty0= -1
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
TranslationModel1= 0.2 0.2 0.2 0.2
GenerationModel0= 0.3 0
Distortion0= 0.3
LM0= 0.5
LM1= 0.5
# MERT optimized configuration
# decoder /home/ksaso/mosesdecoder/bin/moses
# BLEU 0 on dev /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/tuning/tuning-corpus.tagged.en
# We were before running iteration 2
# finished Mon Jun 20 16:19:08 CEST 2016
### MOSES CONFIG FILE ###
#########################
# input factors
[input-factors]
0
1
# mapping steps
[mapping]
0 T 0
0 G 0
0 T 1
[distortion-limit]
6
# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz input-factor=0 output-factor=1
PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz input-factor=1 output-factor=2
Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
Distortion
KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3
# dense weights for feature functions
[threads]
16
[weight]
Distortion0= 0
LM0= 0
LM1= 0
WordPenalty0= 0
PhrasePenalty0= 0
TranslationModel0= 0 0 0 0
TranslationModel1= 0 0 0 0
GenerationModel0= 0 0
UnknownWordPenalty0= 1
First of all, is it strange that I get all zeroes after tuning?
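The all-zero result can be detected mechanically before a long decoding run is started. A minimal sketch that reads the [weight] section of a moses.ini and lists the features whose weights are all zero. It assumes one "Name= v1 v2 ..." entry per line, as in the tuned file above; read_weights and zeroed_features are hypothetical helper names:

```python
def read_weights(ini_text):
    """Parse the [weight] section of a moses.ini into {name: [floats]}."""
    weights, in_section = {}, False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Any section header switches parsing on or off.
            in_section = (line == "[weight]")
            continue
        if in_section and line and not line.startswith("#") and "=" in line:
            name, values = line.split("=", 1)
            weights[name.strip()] = [float(v) for v in values.split()]
    return weights


def zeroed_features(weights):
    """Names of features that have weights and whose weights are all zero."""
    return sorted(n for n, vals in weights.items()
                  if vals and all(v == 0.0 for v in vals))


# The tuned [weight] section quoted above.
tuned = """[threads]
16
[weight]
Distortion0= 0
LM0= 0
LM1= 0
WordPenalty0= 0
PhrasePenalty0= 0
TranslationModel0= 0 0 0 0
TranslationModel1= 0 0 0 0
GenerationModel0= 0 0
UnknownWordPenalty0= 1
"""
print(zeroed_features(read_weights(tuned)))
```

On this file every feature except UnknownWordPenalty0 is reported, which matches the "BLEU 0 on dev" line in the header: the tuner never found a direction that improved the score.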
My problem is that translation with this model is
spectacularly slow (a few days to translate a couple of
thousand words with a 2.4-million-line corpus), so naturally I
tried to binarize my phrase tables with the commands
~/mosesdecoder/bin/processPhraseTableMin \
  -in ~/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz \
  -nscores 4 -out ~/working/binarised_model/phrase-table.0-1
and
~/mosesdecoder/bin/processPhraseTableMin \
  -in ~/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz \
  -nscores 4 -out ~/working/binarised_model/phrase-table.1-2
The process itself finishes without errors and I can run the
translation with the command
~/mosesdecoder/bin/moses -f /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/moses.ini
But when I try to enter my text, I get the following:
Translating: use|NN of|IN light|JJ
Line 1: Initialize search took 0.000 seconds total
Segmentation fault (core dumped)
When I try to filter my model, I get the same error. Any ideas
what could be causing this?
My final moses.ini file looks like this:
# MERT optimized configuration
# decoder /home/ksaso/mosesdecoder/bin/moses
# BLEU 0 on dev /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/tuning/tuning-corpus.tagged.en
# We were before running iteration 2
# finished Mon Jun 20 16:19:08 CEST 2016
### MOSES CONFIG FILE ###
#########################
# input factors
[input-factors]
0
1
# mapping steps
[mapping]
0 T 0
0 G 0
0 T 1
[distortion-limit]
6
# feature functions
[feature]
UnknownWordPenalty
WordPenalty
PhrasePenalty
PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/phrase-table.0-1.minphr input-factor=0 output-factor=1
PhraseDictionaryCompact name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/phrase-table.1-2.minphr input-factor=1 output-factor=2
Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
Distortion
KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3
# dense weights for feature functions
[threads]
16
[weight]
Distortion0= 0
LM0= 0
LM1= 0
WordPenalty0= 0
PhrasePenalty0= 0
TranslationModel0= 0 0 0 0
TranslationModel1= 0 0 0 0
GenerationModel0= 0 0
UnknownWordPenalty0= 1
And one more question ... can I run a translation (with the
~/mosesdecoder/bin/moses command) multi-threaded?
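The tuning command earlier in this thread already passes "-threads 32" to the decoder through --decoder-flags, and the tuned moses.ini carries a [threads] section, so the same switch should apply to a direct run. A minimal sketch that only assembles such a command line, assuming the binary location used elsewhere in this thread; moses_command is a hypothetical helper and nothing is executed here:

```python
import os


def moses_command(ini_path, threads, moses_bin="~/mosesdecoder/bin/moses"):
    """Build an argv list for a decoder run with an explicit thread count."""
    return [
        os.path.expanduser(moses_bin),  # resolve "~" since no shell is involved
        "-f", ini_path,
        "-threads", str(threads),
    ]


cmd = moses_command("moses.ini", 16)
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run together with stdin and stdout file handles for the input and output text.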
Thanks for all the help and best regards,
Saso
--
lp,
Sašo
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
--
-Regards,
Rajen Chatterjee.
--
lp,
Sašo