I don't know the exact problem, but your factored model looks too complicated, so the tuning algorithm kind of just gives up.

I would try a very simple model first, e.g.
   translate 0 -> 0,1,2,3
or
   translate 0,1 -> 0,1,2,3
Once you see that working correctly, add a generation model.

You have to do this bit by bit and see what happens at each step.
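
As a rough sketch of those first two steps (not a tested recipe): the script below only prints train-model.perl calls, it does not run them. The factor mappings are the ones listed above; the install path, corpus stem, root directories and the en/sl language pair are placeholder assumptions, and a real call also needs the usual language model and external-bin-dir options, which are left out here.

```shell
#!/bin/sh
# Dry run: print one train-model.perl call per incremental step.
# Only the factor mappings come from the advice above; the paths,
# corpus stem and language pair are placeholder assumptions.

MOSES="${MOSES:-$HOME/mosesdecoder}"      # assumed install location
CORPUS="${CORPUS:-corpus/train.tagged}"   # hypothetical corpus stem

# Map a step number to the factor options for train-model.perl.
factor_flags() {
    case "$1" in
        1) echo "--translation-factors 0-0,1,2,3 --decoding-steps t0" ;;
        2) echo "--translation-factors 0,1-0,1,2,3 --decoding-steps t0" ;;
        *) echo "unknown step: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the training command for one step.
# The LM and external-bin-dir options are deliberately omitted.
train_cmd() {
    flags=$(factor_flags "$1") || return 1
    echo "$MOSES/scripts/training/train-model.perl" \
         "--root-dir step$1 --corpus $CORPUS --f en --e sl $flags"
}

train_cmd 1
train_cmd 2
```

Tune each step and check that the weights move before going on to the next one. A generation step would presumably add --generation-factors and a decoding-steps value such as t0,g0, but which factors to generate from depends on your setup.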

On 28/06/2016 20:44, Sašo Kuntaric wrote:
Well, I installed Moses only a few months ago, so it should be the latest version.

I find it really strange. I have tried everything: binarizing the tables (which finishes with no problems), using the --no-filter-phrase-table parameter, and adding language models for all the factors I have (this one gave me a segmentation fault). I always get the same result: tuning stops after two runs and all the weights are set to zero, with the message

(2) BEST at 2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 => 0 at Tue Jun 28 17:38:43 CEST 2016
None of the weights changed more than 1e-05. Stopping.

The translation models themselves are created with no issues. If I have one translation table, I can tune them with an unfactored corpus, but as soon as I use a factored one, everything goes south. If I have two translation tables, I cannot tune with an unfactored file, since it wants the stated number of factors.

I would really appreciate it if someone has an idea what to do.

Best regards,

Saso

2016-06-27 14:45 GMT+02:00 Rajen Chatterjee <rajen.k.chatter...@gmail.com>:

    Hi, in the past I had a similar problem: the weights after 1
    iteration of tuning were going to 0. I do not know the cause of
    this, but if I remember correctly, when I used another version of
    Moses (I think Release-3.0) I didn't have this problem.

    On Sun, Jun 26, 2016 at 1:40 PM, Sašo Kuntaric
    <saso.kunta...@gmail.com> wrote:

        Hi all again,

        A little more info, if someone has any ideas as I still
        haven't been able to figure it out.

        When I tune models that have only one translation table, it
        works fine, but only with a non-factored tuning corpus. If I
        use a factored tuning corpus, Moses does one run and sets all
        weights to zero. If I have two translation tables, Moses
        doesn't do the tuning, as it is missing factors. If I use the
        factored corpus, I get a similar result as above: tuning stops
        after one run and sets all weights to zero. There was a
        similar error mentioned a few months back and the solution was
        to turn off MBR decoding, however I am not using it. I just
        use the command:

        ~/mosesdecoder/scripts/training/mert-moses.pl \
          ~/working/IT_corpus/TMX/txt/tuning_corpus/tuning_corpus.tagged.en \
          ~/working/IT_corpus/TMX/txt/tuning_corpus/tuning_corpus.tagged.sl \
          ~/mosesdecoder/bin/moses \
          ~/working/IT_corpus/TMX/txt/factored_corpus/complex/model/moses.ini \
          --mertdir ~/mosesdecoder/bin/ --decoder-flags="-threads 32"

        Is there something I am missing? Do I have to add anything
        else for tuning a factored model?

        Any help will be greatly appreciated.

        Best regards,

        Saso

        ---------- Forwarded message ----------
        From: Sašo Kuntaric <saso.kunta...@gmail.com>
        Date: 2016-06-20 19:36 GMT+02:00
        Subject: Binarization fails with the Segmentation Fault error
        To: moses-support <moses-support@mit.edu>


        Hi all,

        Me again (last time I hope). I have successfully trained and
        tuned my factored model. Here are both moses.ini files:

        #########################
        ### MOSES CONFIG FILE ###
        #########################

        # input factors
        [input-factors]
        0
        1

        # mapping steps
        [mapping]
        0 T 0
        0 G 0
        0 T 1

        [distortion-limit]
        6

        # feature functions
        [feature]
        UnknownWordPenalty
        WordPenalty
        PhrasePenalty
        PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz input-factor=0 output-factor=1
        PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz input-factor=1 output-factor=2
        Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
        Distortion
        KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
        KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3

        # dense weights for feature functions
        [weight]
        # The default weights are NOT optimized for translation quality. You MUST tune the weights.
        # Documentation for tuning is here: http://www.statmt.org/moses/?n=FactoredTraining.Tuning
        UnknownWordPenalty0= 1
        WordPenalty0= -1
        PhrasePenalty0= 0.2
        TranslationModel0= 0.2 0.2 0.2 0.2
        TranslationModel1= 0.2 0.2 0.2 0.2
        GenerationModel0= 0.3 0
        Distortion0= 0.3
        LM0= 0.5
        LM1= 0.5

        # MERT optimized configuration
        # decoder /home/ksaso/mosesdecoder/bin/moses
        # BLEU 0 on dev /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/tuning/tuning-corpus.tagged.en
        # We were before running iteration 2
        # finished Mon Jun 20 16:19:08 CEST 2016
        ### MOSES CONFIG FILE ###
        #########################

        # input factors
        [input-factors]
        0
        1

        # mapping steps
        [mapping]
        0 T 0
        0 G 0
        0 T 1

        [distortion-limit]
        6

        # feature functions
        [feature]
        UnknownWordPenalty
        WordPenalty
        PhrasePenalty
        PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz input-factor=0 output-factor=1
        PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz input-factor=1 output-factor=2
        Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
        Distortion
        KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
        KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3

        # dense weights for feature functions

        [threads]
        16
        [weight]

        Distortion0= 0
        LM0= 0
        LM1= 0
        WordPenalty0= 0
        PhrasePenalty0= 0
        TranslationModel0= 0 0 0 0
        TranslationModel1= 0 0 0 0
        GenerationModel0= 0 0
        UnknownWordPenalty0= 1

        First of all, is it strange that I get all zeroes after tuning?

        My problem is that the translation with this model is
        spectacularly slow (a few days to translate a couple of
        thousand words with a 2.4 million line corpus), so naturally I
        tried to binarize my phrase tables with the command

        ~/mosesdecoder/bin/processPhraseTableMin -in ~/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.0-1.gz -nscores 4 -out ~/working/binarised_model/phrase-table.0-1

        and

        ~/mosesdecoder/bin/processPhraseTableMin -in ~/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/phrase-table.1-2.gz -nscores 4 -out ~/working/binarised_model/phrase-table.1-2

        The process itself finishes without errors and I can run the
        translation with the command

        ~/mosesdecoder/bin/moses -f /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/moses.ini

        But when I try to enter my text, I get the following:

         Translating: use|NN of|IN light|JJ
        Line 1: Initialize search took 0.000 seconds total
        Segmentation fault (core dumped)

        When I try to filter my model, I get the same error. Any ideas
        what could be causing this?

        My final moses.ini file looks like this:

        # MERT optimized configuration
        # decoder /home/ksaso/mosesdecoder/bin/moses
        # BLEU 0 on dev /home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/tuning/tuning-corpus.tagged.en
        # We were before running iteration 2
        # finished Mon Jun 20 16:19:08 CEST 2016
        ### MOSES CONFIG FILE ###
        #########################

        # input factors
        [input-factors]
        0
        1

        # mapping steps
        [mapping]
        0 T 0
        0 G 0
        0 T 1

        [distortion-limit]
        6

        # feature functions
        [feature]
        UnknownWordPenalty
        WordPenalty
        PhrasePenalty
        PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/phrase-table.0-1.minphr input-factor=0 output-factor=1
        PhraseDictionaryCompact name=TranslationModel1 num-features=4 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/binarised_model/phrase-table.1-2.minphr input-factor=1 output-factor=2
        Generation name=GenerationModel0 num-features=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/morphgen/model/generation.1-0,3.gz input-factor=1 output-factor=0,3
        Distortion
        KENLM name=LM0 factor=0 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_surface.blm.sl order=3
        KENLM name=LM1 factor=2 path=/home/ksaso/working/IT_corpus/TMX/txt/factored_corpus/language_model/IT_corpus_parts.blm.sl order=3

        # dense weights for feature functions

        [threads]
        16
        [weight]

        Distortion0= 0
        LM0= 0
        LM1= 0
        WordPenalty0= 0
        PhrasePenalty0= 0
        TranslationModel0= 0 0 0 0
        TranslationModel1= 0 0 0 0
        GenerationModel0= 0 0
        UnknownWordPenalty0= 1

        And one more question ... can I run a translation (with the
        ~/mosesdecoder/bin/moses command) multi-threaded?

        Thanks for all the help and best regards,

        Saso








-- lp,

        Sašo

        _______________________________________________
        Moses-support mailing list
        Moses-support@mit.edu
        http://mailman.mit.edu/mailman/listinfo/moses-support




-- -Regards,
     Rajen Chatterjee.




--
lp,

Sašo

