Hi,
My EMS setup (factored, MIRA) crashes at the tuning stage after a single run.
config.toy (attaching only the training and tuning sections):
# TRANSLATION MODEL TRAINING
[TRAINING]
### training script to be used: either a legacy script or
# current moses training script (default)
#
script = $moses-script-dir/training/train-model.perl
### general options
# these are options that are passed on to train-model.perl, for instance
# * "-mgiza -mgiza-cpus 8" to use mgiza instead of giza
# * "-sort-buffer-size 8G -sort-compress gzip" to reduce on-disk sorting
# * "-sort-parallel 8 -cores 8" to speed up phrase table building
#
#training-options = ""
### factored training: specify here which factors used
# if none specified, single factor training is assumed
# (one translation step, surface to surface)
#
input-factors = word pos
output-factors = word pos
alignment-factors = "word -> word"
translation-factors = "word+pos -> word+pos"
reordering-factors = "word -> word"
#generation-factors = "pos -> word"
decoding-steps = "t0"
### parallelization of data preparation step
# the two directions of the data preparation can be run in parallel
# comment out if not needed
#
parallel = yes
### pre-computation for giza++
# giza++ has a more efficient data structure that needs to be
# initialized with snt2cooc. if run in parallel, this may reduce
# memory requirements. set here the number of parts
#
#run-giza-in-parts = 5
### symmetrization method to obtain word alignments from giza output
# (commonly used: grow-diag-final-and)
#
alignment-symmetrization-method = grow-diag-final-and
### use of berkeley aligner for word alignment
#
#use-berkeley = true
#alignment-symmetrization-method = berkeley
#berkeley-train = $moses-script-dir/ems/support/berkeley-train.sh
#berkeley-process = $moses-script-dir/ems/support/berkeley-process.sh
#berkeley-jar = /your/path/to/berkeleyaligner-1.1/berkeleyaligner.jar
#berkeley-java-options = "-server -mx30000m -ea"
#berkeley-training-options = "-Main.iters 5 5 -EMWordAligner.numThreads 8"
#berkeley-process-options = "-EMWordAligner.numThreads 8"
#berkeley-posterior = 0.5
### use of baseline alignment model (incremental training)
#
#baseline = 68
#baseline-alignment-model = "$working-dir/training/prepared.$baseline/$input-extension.vcb \
#  $working-dir/training/prepared.$baseline/$output-extension.vcb \
#  $working-dir/training/giza.$baseline/${output-extension}-$input-extension.cooc \
#  $working-dir/training/giza-inverse.$baseline/${input-extension}-$output-extension.cooc \
#  $working-dir/training/giza.$baseline/${output-extension}-$input-extension.thmm.5 \
#  $working-dir/training/giza.$baseline/${output-extension}-$input-extension.hhmm.5 \
#  $working-dir/training/giza-inverse.$baseline/${input-extension}-$output-extension.thmm.5 \
#  $working-dir/training/giza-inverse.$baseline/${input-extension}-$output-extension.hhmm.5"
### if word alignment should be skipped,
# point to word alignment files
#
#word-alignment = $working-dir/model/aligned.1
### filtering some corpora with modified Moore-Lewis
# specify corpora to be filtered and ratio to be kept, either before or after word alignment
#mml-filter-corpora = toy
#mml-before-wa = "-proportion 0.9"
#mml-after-wa = "-proportion 0.9"
### create a bilingual concordancer for the model
#
#biconcor = $moses-script-dir/ems/biconcor/biconcor
### lexicalized reordering: specify orientation type
# (default: only distance-based reordering model)
#
lexicalized-reordering = msd-bidirectional-fe
### hierarchical rule set
#
#hierarchical-rule-set = true
### settings for rule extraction
#
#extract-settings = ""
max-phrase-length = 5
### add extracted phrases from baseline model
#
#baseline-extract = $working-dir/model/extract.$baseline
#
# requires aligned parallel corpus for re-estimating lexical translation probabilities
#baseline-corpus = $working-dir/training/corpus.$baseline
#baseline-alignment = $working-dir/model/aligned.$baseline.$alignment-symmetrization-method
### unknown word labels (target syntax only)
# enables use of unknown word labels during decoding
# label file is generated during rule extraction
#
#use-unknown-word-labels = true
### if phrase extraction should be skipped,
# point to stem for extract files
#
# extracted-phrases =
### settings for rule scoring
#
score-settings = "--GoodTuring"
### include word alignment in phrase table
#
include-word-alignment-in-rules = yes
### sparse lexical features
#
#sparse-lexical-features = "target-word-insertion top 50, source-word-deletion top 50, word-translation top 50 50, phrase-length"
### domain adaptation settings
# options: sparse, any of: indicator, subset, ratio
#domain-features = "subset"
### if phrase table training should be skipped,
# point to phrase translation table
#
# phrase-translation-table =
### if reordering table training should be skipped,
# point to reordering table
#
# reordering-table =
### filtering the phrase table based on significance tests
# Johnson, Martin, Foster and Kuhn. (2007): "Improving Translation Quality by Discarding Most of the Phrasetable"
# options: -n number of translations; -l 'a+e', 'a-e', or a positive real value -log prob threshold
#salm-index = /path/to/project/salm/Bin/Linux/Index/IndexSA.O64
#sigtest-filter = "-l a+e -n 50"
### if training should be skipped,
# point to a configuration file that contains
# pointers to all relevant model files
#
#config-with-reused-weights =
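For context (my understanding of what EMS generates, not copied from its output): the factor names above should map to indices word = 0 and pos = 1, so train-model.perl ought to be called roughly like this sketch:

  $moses-script-dir/training/train-model.perl [...other arguments assembled by EMS...] \
      --alignment-factors 0-0 \
      --translation-factors 0,1-0,1 \
      --reordering-factors 0-0 \
      --decoding-steps t0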
#####################################################
### TUNING: finding good weights for model components
[TUNING]
### instead of tuning with this setting, old weights may be recycled
# specify here an old configuration file with matching weights
#
#weight-config = $working-dir/model/weight.ini
### tuning script to be used
#
tuning-script = $moses-script-dir/training/mert-moses.pl
tuning-settings = "-mertdir $moses-bin-dir --batch-mira --return-best-dev --batch-mira-args '-J 100 -C 0.001'"
### specify the corpus used for tuning
# it should contain 1000s of sentences
#
input-sgm = $toy-data/dev.en.sgm
#raw-input =
#tokenized-input = $toy-data/dev.en
factorized-input = $toy-data/dev.en
#factorized-input =
#input =
#
reference-sgm = $toy-data/dev.hi.sgm
#raw-reference =
factorized-reference = $toy-data/dev.hi
#factorized-reference =
#reference =
### size of n-best list used (typically 100)
#
nbest = 100
### ranges for weights for random initialization
# if not specified, the tuning script will use generic ranges
# it is not clear, if this matters
#
# lambda =
### additional flags for the filter script
#
filter-settings = ""
### additional flags for the decoder
#
decoder-settings = ""
### if tuning should be skipped, specify this here
# and also point to a configuration file that contains
# pointers to all relevant model files
#
#config =
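With these settings, EMS should end up invoking the tuning script roughly as below (a sketch with illustrative paths; the real command and file names are assembled by EMS):

  $moses-script-dir/training/mert-moses.pl \
      $toy-data/dev.en $toy-data/dev.hi \
      $moses-bin-dir/moses $working-dir/model/moses.ini \
      --nbest=100 --mertdir=$moses-bin-dir \
      --batch-mira --return-best-dev \
      --batch-mira-args='-J 100 -C 0.001'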
The TUNING_tune.1.STDERR file has the following lines:
Translating line 1078 in thread id 139965279725312
Translating line 1079 in thread id 139965279725312
Translating line 1080 in thread id 139965279725312
Translating line 1081 in thread id 139965279725312
The decoder returns the scores in this order: d d d d d d d lm w tm tm tm tm tm
Executing: gzip -f run1.best100.out
Scoring the nbestlist.
exec: /home/eilmt/wrk-dir/wrk-jhu-fact/tuning/tmp.1/extractor.sh
Executing: /home/eilmt/wrk-dir/wrk-jhu-fact/tuning/tmp.1/extractor.sh >
extract.out 2> extract.err
Executing: \cp -f init.opt run1.init.opt
Executing: echo 'not used' > weights.txt
exec: /tools/mosesdecoder-master_2/bin/kbmira -J 100 -C 0.001 --dense-init run1.init.opt --ffile run1.features.dat --scfile run1.scores.dat
Executing: /tools/mosesdecoder-master_2/bin/kbmira -J 100 -C 0.001 --dense-init run1.init.opt --ffile run1.features.dat --scfile run1.scores.dat
Executing: \cp -f extract.err run1.extract.err
Executing: \cp -f extract.out run1.extract.out
Executing: \cp -f mert.out run1.mert.out
cp: cannot stat `mert.out': No such file or directory
Exit code: 1
Died at /tools/mosesdecoder-master_2/scripts/training/mert-moses.pl line 956.
cp: cannot stat `/home/eilmt/wrk-dir/wrk-jhu-fact/tuning/tmp.1/moses.ini': No such file or directory
Opening mert.log shows that the BLEU score is initialized to a value of zero. (On a side note, the BLEU score seems to initialize fine for non-factored models.)
kbmira with c=0.001 decay=0.999 no_shuffle=0
Initialising random seed from system clock
..........Initial BLEU = 0
0/1082 updates, avg loss = 0, BLEU = 0
0/1082 updates, avg loss = 0, BLEU = 0
0/1082 updates, avg loss = 0, BLEU = 0
0/1082 updates, avg loss = 0, BLEU = 0
0/1082 updates, avg loss = 0, BLEU = 0
...
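If it helps with diagnosis, the files that kbmira reads can be inspected with something like the following (paths as in the STDERR log above; just a sketch of the checks, I am not quoting their output):

  cat /home/eilmt/wrk-dir/wrk-jhu-fact/tuning/tmp.1/extract.err
  head -3 /home/eilmt/wrk-dir/wrk-jhu-fact/tuning/tmp.1/run1.scores.dat
  head -3 /home/eilmt/wrk-dir/wrk-jhu-fact/tuning/tmp.1/run1.features.dat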
Kindly suggest a solution.
Thank you,
--
- Jayendra Rakesh.
BTech CSD.