Hi,

The hierarchical mode requires a different binarizer: CreateOnDiskPt instead of processPhraseTable.

Typically, you would pass this to filter-model-given-input.pl:
-Binarizer "/path/to/moses/bin/CreateOnDiskPt 1 1 5 100 2"
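
Applied to the filtering call from step (3) of your message, that would look roughly like the sketch below. The variable names are the ones you use; the default paths are placeholders I am assuming, so adjust them to your setup. The important detail is the quoting: the binarizer and its five arguments have to reach the script as a single argument.

```shell
#!/bin/sh
# Sketch only. Variable names follow the quoted message; the default
# values are assumptions and must be adjusted to the local installation.
MOSES_BIN="${MOSES_BIN:-$HOME/mosesdecoder/bin}"
MOSES_SCRIPTS="${MOSES_SCRIPTS:-$HOME/mosesdecoder/scripts}"
TRAINING_DIR="${TRAINING_DIR:-$HOME/working/hier}"
TUNING_CORPUS="${TUNING_CORPUS:-$HOME/corpus/dev/news-test2008}"
F="${F:-de}"

# The binarizer program and its arguments form ONE string.
BINARIZER="$MOSES_BIN/CreateOnDiskPt 1 1 5 100 2"

filter_and_binarize() {
    "$MOSES_SCRIPTS/training/filter-model-given-input.pl" \
        "$TRAINING_DIR/filtered" \
        "$TRAINING_DIR/model/moses.ini" \
        "$TUNING_CORPUS.tok.$F" \
        -Hierarchical \
        -Binarizer "$BINARIZER"
}

# Only call the script if it is installed at the assumed location.
if [ -x "$MOSES_SCRIPTS/training/filter-model-given-input.pl" ]; then
    filter_and_binarize
else
    echo "filter-model-given-input.pl not found under $MOSES_SCRIPTS" >&2
fi
```

Without the quotes around "$BINARIZER", the shell would split the string and the script would see the numbers as separate arguments.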

-phi


On Fri, Mar 29, 2013 at 3:26 PM, Ken Fasano <kenfas...@hotmail.com> wrote:

>
>
> Thank you,
>
> Ken Fasano
>
> Begin forwarded message:
>
> *From:* "Ken Fasano" <kfas...@motionpoint.com>
> *Date:* March 29, 2013, 11:25:05 AM EDT
> *To:* <kenfas...@hotmail.com>
> *Subject:* *I am having problems with binarizing in a hierarchical system
> (but not a phrase-based system).*
>
> I am having problems with *binarizing* in a *hierarchical* system (but
> not a phrase-based system). I can run the hierarchical system without
> binarization; decoding and getting a BLEU score are both fine. With
> filtering and binarizing, however, I am not able to get past filtering
> without an error. It's all downhill from there, and perhaps I am losing
> track of which moses.ini is which.
>
> *Thank you so much.* I have managed to get a lot done, but this one
> probably seems a lot more daunting than it really is. *I'm thinking it's
> just some missing parameter or a '1' when I should have a '0'... I’ve
> underlined where I think the problems are.*
>
> (1) Tuning the system and filtering. *This works:*
>
>       $MOSES_SCRIPTS/training/filter-model-given-input.pl \
>            "$TRAINING_DIR/filtered" \
>            "$TRAINING_DIR/model/moses.ini" \
>            "$TUNING_CORPUS.tok.$F" \
>            -Hierarchical
>
> (2) Filtering on the way to a BLEU score, following
> http://www.statmt.org/moses/?n=Moses.Baseline. *This works, too:*
>
>       $MOSES_SCRIPTS/training/filter-model-given-input.pl \
>            $WORKING_FILTERED \
>            $TRAINING_DIR/model/moses-tuned.ini \
>            $HOME/corpus/dev/news2011.true.$F \
>            -Hierarchical
>
> (3) Filtering AND binarizing during tuning fails. I assume this is the
> preferred method, since it is done with one command:
>
>       # filter the phrase table.
>       # This changes moses.ini so that the binary files are used. CORRECT???
>       # [ttable-file]
>       #     2 0 0 5 /home/kenny/working/ebay-terms/hier/filtered/phrase-table.0-0.1.1.bin
>       # This will fail:
>       # ERROR: unknown option '/home/kenny/working/ebay-terms/hier/filtered/phrase-table.0-0.1.1'
>       $MOSES_SCRIPTS/training/filter-model-given-input.pl \
>            "$TRAINING_DIR/filtered" \
>            "$TRAINING_DIR/model/moses.ini" \
>            "$TUNING_CORPUS.tok.$F" \
>            -Hierarchical \
>            -Binarizer $MOSES_BIN/processPhraseTable
>
>       # run MERT
>       time nice $MOSES_SCRIPTS/training/mert-moses.pl \
>            "$TUNING_CORPUS.true.$F" \
>            "$TUNING_CORPUS.true.$E" \
>            "$MOSES_BIN/moses_chart" \
>            "$TRAINING_DIR/filtered/moses.ini" \
>            --no-filter-phrase-table \
>            --working-dir "$TRAINING_DIR/mert" \
>            --mertdir "$MOSES_BIN" \
>            --decoder-flags "-threads $threads -v 0"
>            &> "$TRAINING_DIR/mert.out"
>
>       # Insert weights into configuration file.
>       $MOSES_SCRIPTS/ems/support/reuse-weights.perl \
>            "$TRAINING_DIR/mert/moses.ini" \
>            < "$TRAINING_DIR/model/moses.ini" \
>            > "$TRAINING_DIR/model/moses-tuned.ini"
>
>       # update "$TRAINING_DIR/model/moses-tuned.ini" to read from rules folder
>       # IS THIS NEEDED? Looks like filter-model-given-input.pl has already done it for me!
>       $HOME/scripts/binarizeMoses.perl \
>            "moses-tuned.ini" \
>            "$TRAINING_DIR/model" \
>            "$TRAINING_DIR/model"
>
> binarizeMoses.perl is basically this:
>
>     } elsif (/^6 0 0 5/) {
>         # hierarchical (and others)
>         s/6 0 0 5/2 0 0 5/;
>         s/rule-table\.gz/rule-table/;
>
> filtered/phrase-table.0-0.1.1 exists and has data:
>
> -rw-rw-r-- 1 kenny kenny 111719 Mar 29 08:35 /home/kenny/working/ebay-terms/hier/filtered/phrase-table.0-0.1.1
>
> Tuning proceeds as follows:
>
> ***************************** CONSOLE OUTPUT *****************************
>
> Starting tuning...
> Tokenizer Version 1.1
> Language: de
> Number of threads: 1
> Tokenizer Version 1.1
> Language: en
> Number of threads: 1
>
> **** MOSES.INI BEFORE FILTER ****
> [ttable-file]
> 6 0 0 5 /home/kenny/working/ebay-terms/hier/model/rule-table.gz
> 6 0 0 1 /home/kenny/working/ebay-terms/hier/model/glue-grammar
>
> # translation model weights
> [weight-t]
> 0.20
> 0.20
> 0.20
> 0.20
> 0.20
> 1.0
> **** MOSES.INI BEFORE FILTER ****
>
> Executing: mkdir -p /home/kenny/working/ebay-terms/hier/filtered
> Considering factor 0
> Done.
> filtering /home/kenny/working/ebay-terms/hier/model/rule-table.gz -> /home/kenny/working/ebay-terms/hier/filtered/phrase-table.0-0.1.1...
> binarizing.../home/kenny/mosesdecoder/bin/processPhraseTable /home/kenny/working/ebay-terms/hier/filtered/phrase-table.0-0.1.1 /home/kenny/working/ebay-terms/hier/filtered/phrase-table.0-0.1.1.bin
> ERROR: unknown option '/home/kenny/working/ebay-terms/hier/filtered/phrase-table.0-0.1.1'
> To run the decoder, please call:
>   moses -f /home/kenny/working/ebay-terms/hier/filtered/moses.ini -i /home/kenny/corpus/dev/news-test2008.tok.de
>
> **** MOSES.INI AFTER FILTER ****
> [ttable-file]
> 2 0 0 5 /home/kenny/working/ebay-terms/hier/filtered/phrase-table.0-0.1.1.bin
>
> [n.b. -rw-rw-r-- 1 kenny kenny 111719 Mar 29 09:51 phrase-table.0-0.1.1]
>
> 6 0 0 1 /home/kenny/working/ebay-terms/hier/model/glue-grammar
>
> # translation model weights
> [weight-t]
> 0.20
> 0.20
> 0.20
> 0.20
> 0.20
> 1.0
> **** MOSES.INI AFTER FILTER ****
>
> Using SCRIPTS_ROOTDIR: /home/kenny/mosesdecoder/scripts
> Asking moses for feature names and values from /home/kenny/working/ebay-terms/hier/filtered/moses.ini
> Executing: /home/kenny/mosesdecoder/bin/moses_chart -threads 2 -v 0 -config /home/kenny/working/ebay-terms/hier/filtered/moses.ini  -inputtype 0 -show-weights > ./features.list
> /home/kenny/mosesdecoder/bin
> max-chart-span: 20
> max-chart-span: 1000
> Start loading text SCFG phrase table. Moses  format : [0.001] seconds
> MERT starting values and ranges for random generation:
>      lm =   0.500 ( 0.00 ..  1.00)
>      tm =   0.200 ( 0.00 ..  1.00)
>      tm =   0.200 ( 0.00 ..  1.00)
>      tm =   0.200 ( 0.00 ..  1.00)
>      tm =   0.200 ( 0.00 ..  1.00)
>      tm =   0.200 ( 0.00 ..  1.00)
>      tm =   1.000 ( 0.00 ..  1.00)
>       w =  -1.000 ( 0.00 ..  1.00)
> run 1 start at Fri Mar 29 09:51:20 EDT 2013
> Parsing --decoder-flags: |-threads 2 -v 0|
> Saving new config to: ./run1.moses.ini
> Saved: ./run1.moses.ini
> (1) run decoder to produce n-best lists
> params = -threads 2 -v 0
> Normalizing lambdas: 0.500000 0.200000 0.200000 0.200000 0.200000 0.200000 1.000000 -1.000000
> DECODER_CFG = -w -0.285714 -lm 0.142857 -tm 0.057143 0.057143 0.057143 0.057143 0.057143 0.285714
> decoder_config = -w -0.285714 -lm 0.142857 -tm 0.057143 0.057143 0.057143 0.057143 0.057143 0.285714
> Executing: /home/kenny/mosesdecoder/bin/moses_chart -threads 2 -v 0 -config /home/kenny/working/ebay-terms/hier/filtered/moses.ini -inputtype 0 -w -0.285714 -lm 0.142857 -tm 0.057143 0.057143 0.057143 0.057143 0.057143 0.285714  -n-best-list run1.best100.out 100 -input-file /home/kenny/corpus/dev/news-test2008.true.de > run1.out
> /home/kenny/mosesdecoder/bin
> max-chart-span: 20
> max-chart-span: 1000
> Start loading text SCFG phrase table. Moses  format : [0.001] seconds
> Start loading binary SCFG phrase table.  : [0.009] seconds
> Check m_fileSource.is_open() failed in OnDiskPt/OnDiskWrapper.cpp:61
> Start loading binary SCFG phrase table. Aborted (core dumped)
> Exit code: 134
> The decoder died. CONFIG WAS -w -0.285714 -lm 0.142857 -tm 0.057143 0.057143 0.057143 0.057143 0.057143 0.285714
>
> real  0m0.168s
> user  0m0.048s
> sys   0m0.000s
>
> **** MOSES.INI AFTER MERT ****
> [ttable-file]
> 2 0 0 5 /home/kenny/working/ebay-terms/hier/filtered/phrase-table.0-0.1.1.bin
> 6 0 0 1 /home/kenny/working/ebay-terms/hier/model/glue-grammar
>
> # translation model weights
> [weight-t]
> 0.20
> 0.20
> 0.20
> 0.20
> 0.20
> 1.0
> **** MOSES.INI AFTER MERT ****
> ERROR: could not open weight file: /home/kenny/working/ebay-terms/hier/mert/moses.ini at /home/kenny/mosesdecoder/scripts/ems/support/reuse-weights.perl line 15.
>
> **** MOSES_TUNED.INI AFTER MERT ****
> cat: /home/kenny/working/ebay-terms/hier/model/filtered/moses-tuned.ini: No such file or directory
> **** MOSES_TUNED.INI AFTER MERT ****
> TUNING FAILED!
> *************************** END CONSOLE OUTPUT ***************************
>
> Here's the rule-table directory after all that. If I try to decode, I get
> nothing.
>
>            -rw-rw-r-- 1 kenny kenny   77 Mar 29 08:24 Misc.dat
>            -rw-rw-r-- 1 kenny kenny   21 Mar 29 08:24 Source.dat
>            -rw-rw-r-- 1 kenny kenny    9 Mar 29 08:24 TargetColl.dat
>            -rw-rw-r-- 1 kenny kenny    1 Mar 29 08:24 TargetInd.dat
>            -rw-rw-r-- 1 kenny kenny    0 Mar 29 08:24 Vocab.dat
>
> (4) Instead, I try binarizing myself. All the files exist. This fails:
>
>       $MOSES_BIN/CreateOnDiskPt 1 1 5 100 2 \
>            "$TRAINING_DIR/filtered/rule-table.gz" \
>            "$TRAINING_DIR/model/rule-table"
>
> This step is run after tuning. Before this, everything works. This
> binarization takes almost no time (0.000455144 seconds) and produces a
> rule-table directory with empty or nearly empty files. There are no error
> messages, but if I try to decode, I get nothing.
>
>            -rw-rw-r-- 1 kenny kenny   77 Mar 29 08:24 Misc.dat
>            -rw-rw-r-- 1 kenny kenny   21 Mar 29 08:24 Source.dat
>            -rw-rw-r-- 1 kenny kenny    9 Mar 29 08:24 TargetColl.dat
>            -rw-rw-r-- 1 kenny kenny    1 Mar 29 08:24 TargetInd.dat
>            -rw-rw-r-- 1 kenny kenny    0 Mar 29 08:24 Vocab.dat
>
> (5) With the non-binarized system from #1, I try binarizing on my way to
> getting a BLEU score. This fails, too:
>
>       $MOSES_SCRIPTS/training/filter-model-given-input.pl \
>            $WORKING_FILTERED \
>            $TRAINING_DIR/model/moses-tuned.ini \
>            $HOME/corpus/dev/news2011.true.$F \
>            -Hierarchical \
>            -Binarizer $MOSES_BIN/processPhraseTable
>
> ***************************** CONSOLE OUTPUT *****************************
>
> ERROR: unknown option '/home/kenny/working/dev/newstest2011/filtered/hier/phrase-table.0-0.1.1'
> [ n.b. this is the same error I got when I tried to binarize during filtering, above. Hmm… ]
>
> To run the decoder, please call:
>   moses -f /home/kenny/working/dev/newstest2011/filtered/hier/moses.ini -i /home/kenny/corpus/dev/newstest2011.true.de
> Filtering and binarization complete.
>
> Starting decoder test (this will take a while)...
>
> ./bleuEbayHier.sh: line 178: 10784 Aborted                 (core dumped) nice $DECODER -f $MOSES_INI_FILTERED -i $TEST_SET_PATH.true.$F > $TRANSLATION_PATH 2> $WORKING/moses.out
> Decoding test failed! rc = 134
> ***************************** CONSOLE OUTPUT *****************************
>
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>