Hi all,
I'm just getting started with Moses and have hit a stumbling block while trying to train a model following the instructions in section 2.2.1 of the manual. train-model.perl appears to run without complaint throughout (although, as I said, I'm new to this and may be missing something), but the phrase table at ~/unfactored/model/phrase-table.gz extracts to a completely empty file.
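Just to be specific about "empty": this is roughly the check I ran (same path as above), and it reports 0 lines:

  gunzip -c ~/unfactored/model/phrase-table.gz | wc -l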
I've had to modify the command slightly from the documentation to account for (a) absolute paths for each of the files and (b) the --external-bin-dir argument, so the command I ran was:
train-model.perl \
    --corpus /Users/sam/mosesdecoder/factored-corpus/proj-syndicate.1000 \
    --root-dir unfactored \
    --f de --e en \
    --lm 0:3:/Users/sam/mosesdecoder/factored-corpus/surface.lm:0 \
    --external-bin-dir /Users/sam/moses-external-bins
I noticed this issue was addressed on the mailing list on 12th June ("Empty phrase-table"), but I'm only using absolute paths, and my log shows no sign of a script location mismatch (although I don't understand about 80% of the output, so I may well be missing something!). I've attached the output log, in the hope that someone less of a novice than me will spot what's wrong.
I'm able to decode with the sample phrase table (seemingly) without issue, if that helps at all.
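In case it helps narrow things down, I'm assuming the intermediate files from the run (paths as they appear in the log below) are the place to look for where things go empty, e.g.:

  wc -l /Users/sam/unfactored/model/aligned.grow-diag-final
  gunzip -c /Users/sam/unfactored/model/extract.sorted.gz | head
  gunzip -c /Users/sam/unfactored/model/extract.inv.sorted.gz | head
  wc -l /Users/sam/unfactored/model/lex.f2e /Users/sam/unfactored/model/lex.e2f

I'm happy to paste the output of any of these if that would be useful.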
Thanks in advance,
Sam
***** 2 runs. (algorithm:TA)*****
;KategProblem:cats: 50 words: 6287
start-costs: MEAN: 214294 (214177-214411) SIGMA:116.535
end-costs: MEAN: 187487 (187184-187791) SIGMA:303.251
start-pp: MEAN: 718.444 (714.636-722.252) SIGMA:3.80785
end-pp: MEAN: 212.292 (209.364-215.22) SIGMA:2.92781
iterations: MEAN: 141051 (139921-142181) SIGMA:1130
time: MEAN: 2.91946 (2.91118-2.92774) SIGMA:0.0082825
***** 2 runs. (algorithm:TA)*****
;KategProblem:cats: 50 words: 5097
start-costs: MEAN: 207260 (207207-207314) SIGMA:53.4682
end-costs: MEAN: 182497 (182492-182502) SIGMA:5.08775
start-pp: MEAN: 513.767 (512.481-515.053) SIGMA:1.28575
end-pp: MEAN: 161.209 (161.171-161.248) SIGMA:0.0383895
iterations: MEAN: 118656 (118268-119044) SIGMA:388
time: MEAN: 2.52304 (2.4968-2.54927) SIGMA:0.0262315
/Users/sam/moses-external-bins/snt2cooc.out /Users/sam/unfactored/corpus/en.vcb
/Users/sam/unfactored/corpus/de.vcb
/Users/sam/unfactored/corpus/de-en-int-train.snt >
/Users/sam/unfactored/giza.de-en/de-en.cooc
/Users/sam/moses-external-bins/GIZA++ -CoocurrenceFile
/Users/sam/unfactored/giza.de-en/de-en.cooc -c
/Users/sam/unfactored/corpus/de-en-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3
-model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o
/Users/sam/unfactored/giza.de-en/de-en -onlyaldumps 1 -p0 0.999 -s
/Users/sam/unfactored/corpus/en.vcb -t /Users/sam/unfactored/corpus/de.vcb
Parameter 'coocurrencefile' changed from '' to
'/Users/sam/unfactored/giza.de-en/de-en.cooc'
Parameter 'c' changed from '' to
'/Users/sam/unfactored/corpus/de-en-int-train.snt'
Parameter 'm3' changed from '5' to '3'
Parameter 'm4' changed from '5' to '3'
Parameter 'model1dumpfrequency' changed from '0' to '1'
Parameter 'model4smoothfactor' changed from '0.2' to '0.4'
Parameter 'nodumps' changed from '0' to '1'
Parameter 'nsmooth' changed from '64' to '4'
Parameter 'o' changed from '112-06-29.200509.sam' to
'/Users/sam/unfactored/giza.de-en/de-en'
Parameter 'onlyaldumps' changed from '0' to '1'
Parameter 'p0' changed from '-1' to '0.999'
Parameter 's' changed from '' to '/Users/sam/unfactored/corpus/en.vcb'
Parameter 't' changed from '' to '/Users/sam/unfactored/corpus/de.vcb'
general parameters:
-------------------
ml = 101 (maximum sentence length)
No. of iterations:
-------------------
hmmiterations = 5 (mh)
model1iterations = 5 (number of iterations for Model 1)
model2iterations = 0 (number of iterations for Model 2)
model3iterations = 3 (number of iterations for Model 3)
model4iterations = 3 (number of iterations for Model 4)
model5iterations = 0 (number of iterations for Model 5)
model6iterations = 0 (number of iterations for Model 6)
parameter for various heuristics in GIZA++ for efficient training:
------------------------------------------------------------------
countincreasecutoff = 1e-06 (Counts increment cutoff threshold)
countincreasecutoffal = 1e-05 (Counts increment cutoff threshold for
alignments in training of fertility models)
mincountincrease = 1e-07 (minimal count increase)
peggedcutoff = 0.03 (relative cutoff probability for alignment-centers in
pegging)
probcutoff = 1e-07 (Probability cutoff threshold for lexicon probabilities)
probsmooth = 1e-07 (probability smoothing (floor) value )
parameters for describing the type and amount of output:
-----------------------------------------------------------
compactalignmentformat = 0 (0: detailled alignment format, 1: compact
alignment format )
hmmdumpfrequency = 0 (dump frequency of HMM)
l = 112-06-29.200509.sam.log (log file name)
log = 0 (0: no logfile; 1: logfile)
model1dumpfrequency = 1 (dump frequency of Model 1)
model2dumpfrequency = 0 (dump frequency of Model 2)
model345dumpfrequency = 0 (dump frequency of Model 3/4/5)
nbestalignments = 0 (for printing the n best alignments)
nodumps = 1 (1: do not write any files)
o = /Users/sam/unfactored/giza.de-en/de-en (output file prefix)
onlyaldumps = 1 (1: do not write any files)
outputpath = (output path)
transferdumpfrequency = 0 (output: dump of transfer from Model 2 to 3)
verbose = 0 (0: not verbose; 1: verbose)
verbosesentence = -10 (number of sentence for which a lot of information
should be printed (negative: no output))
parameters describing input files:
----------------------------------
c = /Users/sam/unfactored/corpus/de-en-int-train.snt (training corpus file
name)
d = (dictionary file name)
s = /Users/sam/unfactored/corpus/en.vcb (source vocabulary file name)
t = /Users/sam/unfactored/corpus/de.vcb (target vocabulary file name)
tc = (test corpus file name)
smoothing parameters:
---------------------
emalsmooth = 0.2 (f-b-trn: smoothing factor for HMM alignment model (can be
ignored by -emSmoothHMM))
model23smoothfactor = 0 (smoothing parameter for IBM-2/3 (interpolation with
constant))
model4smoothfactor = 0.4 (smooting parameter for alignment probabilities in
Model 4)
model5smoothfactor = 0.1 (smooting parameter for distortion probabilities in
Model 5 (linear interpolation with constant))
nsmooth = 4 (smoothing for fertility parameters (good value: 64): weight for
wordlength-dependent fertility parameters)
nsmoothgeneral = 0 (smoothing for fertility parameters (default: 0): weight
for word-independent fertility parameters)
parameters modifying the models:
--------------------------------
compactadtable = 1 (1: only 3-dimensional alignment table for IBM-2 and IBM-3)
deficientdistortionforemptyword = 0 (0: IBM-3/IBM-4 as described in (Brown et
al. 1993); 1: distortion model of empty word is deficient; 2: distoriton model
of empty word is deficient (differently); setting this parameter also helps to
avoid that during IBM-3 and IBM-4 training too many words are aligned with the
empty word)
depm4 = 76 (d_{=1}: &1:l, &2:m, &4:F, &8:E, d_{>1}&16:l, &32:m, &64:F, &128:E)
depm5 = 68 (d_{=1}: &1:l, &2:m, &4:F, &8:E, d_{>1}&16:l, &32:m, &64:F, &128:E)
emalignmentdependencies = 2 (lextrain: dependencies in the HMM alignment
model. &1: sentence length; &2: previous class; &4: previous position; &8:
French position; &16: French class)
emprobforempty = 0.4 (f-b-trn: probability for empty word)
parameters modifying the EM-algorithm:
--------------------------------------
m5p0 = -1 (fixed value for parameter p_0 in IBM-5 (if negative then it is
determined in training))
manlexfactor1 = 0 ()
manlexfactor2 = 0 ()
manlexmaxmultiplicity = 20 ()
maxfertility = 10 (maximal fertility for fertility models)
p0 = 0.999 (fixed value for parameter p_0 in IBM-3/4 (if negative then it is
determined in training))
pegging = 0 (0: no pegging; 1: do pegging)
general parameters:
-------------------
ml = 101 (maximum sentence length)
No. of iterations:
-------------------
hmmiterations = 5 (mh)
model1iterations = 5 (number of iterations for Model 1)
model2iterations = 0 (number of iterations for Model 2)
model3iterations = 3 (number of iterations for Model 3)
model4iterations = 3 (number of iterations for Model 4)
model5iterations = 0 (number of iterations for Model 5)
model6iterations = 0 (number of iterations for Model 6)
parameter for various heuristics in GIZA++ for efficient training:
------------------------------------------------------------------
countincreasecutoff = 1e-06 (Counts increment cutoff threshold)
countincreasecutoffal = 1e-05 (Counts increment cutoff threshold for
alignments in training of fertility models)
mincountincrease = 1e-07 (minimal count increase)
peggedcutoff = 0.03 (relative cutoff probability for alignment-centers in
pegging)
probcutoff = 1e-07 (Probability cutoff threshold for lexicon probabilities)
probsmooth = 1e-07 (probability smoothing (floor) value )
parameters for describing the type and amount of output:
-----------------------------------------------------------
compactalignmentformat = 0 (0: detailled alignment format, 1: compact
alignment format )
hmmdumpfrequency = 0 (dump frequency of HMM)
l = 112-06-29.200509.sam.log (log file name)
log = 0 (0: no logfile; 1: logfile)
model1dumpfrequency = 1 (dump frequency of Model 1)
model2dumpfrequency = 0 (dump frequency of Model 2)
model345dumpfrequency = 0 (dump frequency of Model 3/4/5)
nbestalignments = 0 (for printing the n best alignments)
nodumps = 1 (1: do not write any files)
o = /Users/sam/unfactored/giza.de-en/de-en (output file prefix)
onlyaldumps = 1 (1: do not write any files)
outputpath = (output path)
transferdumpfrequency = 0 (output: dump of transfer from Model 2 to 3)
verbose = 0 (0: not verbose; 1: verbose)
verbosesentence = -10 (number of sentence for which a lot of information
should be printed (negative: no output))
parameters describing input files:
----------------------------------
c = /Users/sam/unfactored/corpus/de-en-int-train.snt (training corpus file
name)
d = (dictionary file name)
s = /Users/sam/unfactored/corpus/en.vcb (source vocabulary file name)
t = /Users/sam/unfactored/corpus/de.vcb (target vocabulary file name)
tc = (test corpus file name)
smoothing parameters:
---------------------
emalsmooth = 0.2 (f-b-trn: smoothing factor for HMM alignment model (can be
ignored by -emSmoothHMM))
model23smoothfactor = 0 (smoothing parameter for IBM-2/3 (interpolation with
constant))
model4smoothfactor = 0.4 (smooting parameter for alignment probabilities in
Model 4)
model5smoothfactor = 0.1 (smooting parameter for distortion probabilities in
Model 5 (linear interpolation with constant))
nsmooth = 4 (smoothing for fertility parameters (good value: 64): weight for
wordlength-dependent fertility parameters)
nsmoothgeneral = 0 (smoothing for fertility parameters (default: 0): weight
for word-independent fertility parameters)
parameters modifying the models:
--------------------------------
compactadtable = 1 (1: only 3-dimensional alignment table for IBM-2 and IBM-3)
deficientdistortionforemptyword = 0 (0: IBM-3/IBM-4 as described in (Brown et
al. 1993); 1: distortion model of empty word is deficient; 2: distoriton model
of empty word is deficient (differently); setting this parameter also helps to
avoid that during IBM-3 and IBM-4 training too many words are aligned with the
empty word)
depm4 = 76 (d_{=1}: &1:l, &2:m, &4:F, &8:E, d_{>1}&16:l, &32:m, &64:F, &128:E)
depm5 = 68 (d_{=1}: &1:l, &2:m, &4:F, &8:E, d_{>1}&16:l, &32:m, &64:F, &128:E)
emalignmentdependencies = 2 (lextrain: dependencies in the HMM alignment
model. &1: sentence length; &2: previous class; &4: previous position; &8:
French position; &16: French class)
emprobforempty = 0.4 (f-b-trn: probability for empty word)
parameters modifying the EM-algorithm:
--------------------------------------
m5p0 = -1 (fixed value for parameter p_0 in IBM-5 (if negative then it is
determined in training))
manlexfactor1 = 0 ()
manlexfactor2 = 0 ()
manlexmaxmultiplicity = 20 ()
maxfertility = 10 (maximal fertility for fertility models)
p0 = 0.999 (fixed value for parameter p_0 in IBM-3/4 (if negative then it is
determined in training))
pegging = 0 (0: no pegging; 1: do pegging)
reading vocabulary files
Source vocabulary list has 5098 unique tokens
Target vocabulary list has 6288 unique tokens
Calculating vocabulary frequencies from corpus
/Users/sam/unfactored/corpus/de-en-int-train.snt
Reading more sentence pairs into memory ...
Corpus fits in memory, corpus has: 1000 sentence pairs.
Train total # sentence pairs (weighted): 1000
Size of source portion of the training corpus: 20365 tokens
Size of the target portion of the training corpus: 20987 tokens
In source portion of the training corpus, only 5097 unique tokens appeared
In target portion of the training corpus, only 6286 unique tokens appeared
lambda for PP calculation in IBM-1,IBM-2,HMM:= 20987/(21365-1000)== 1.03054
There are 294100 294100 entries in table
==========================================================
Model1 Training Started at: Fri Jun 29 20:05:09 2012
-----------
Model1: Iteration 1
Model1: (1) TRAIN CROSS-ENTROPY 12.8036 PERPLEXITY 7149.51
Model1: (1) VITERBI TRAIN CROSS-ENTROPY 17.346 PERPLEXITY 166599
Model 1 Iteration: 1 took: 0 seconds
-----------
Model1: Iteration 2
Model1: (2) TRAIN CROSS-ENTROPY 6.18931 PERPLEXITY 72.9739
Model1: (2) VITERBI TRAIN CROSS-ENTROPY 9.05015 PERPLEXITY 530.11
Model 1 Iteration: 2 took: 0 seconds
-----------
Model1: Iteration 3
Model1: (3) TRAIN CROSS-ENTROPY 5.80148 PERPLEXITY 55.7726
Model1: (3) VITERBI TRAIN CROSS-ENTROPY 8.33014 PERPLEXITY 321.827
Model 1 Iteration: 3 took: 0 seconds
-----------
Model1: Iteration 4
Model1: (4) TRAIN CROSS-ENTROPY 5.61238 PERPLEXITY 48.9209
Model1: (4) VITERBI TRAIN CROSS-ENTROPY 7.82462 PERPLEXITY 226.696
Model 1 Iteration: 4 took: 0 seconds
-----------
Model1: Iteration 5
Model1: (5) TRAIN CROSS-ENTROPY 5.50897 PERPLEXITY 45.5371
Model1: (5) VITERBI TRAIN CROSS-ENTROPY 7.503 PERPLEXITY 181.396
Model 1 Iteration: 5 took: 0 seconds
Entire Model1 Training took: 0 seconds
NOTE: I am doing iterations with the HMM model!
Read classes: #words: 5097 #classes: 51
Read classes: #words: 6287 #classes: 51
==========================================================
Hmm Training Started at: Fri Jun 29 20:05:10 2012
-----------
Hmm: Iteration 1
A/D table contains 28719 parameters.
Hmm: (1) TRAIN CROSS-ENTROPY 5.44812 PERPLEXITY 43.6563
Hmm: (1) VITERBI TRAIN CROSS-ENTROPY 7.29299 PERPLEXITY 156.823
Hmm Iteration: 1 took: 0 seconds
-----------
Hmm: Iteration 2
A/D table contains 28719 parameters.
Hmm: (2) TRAIN CROSS-ENTROPY 5.43689 PERPLEXITY 43.3177
Hmm: (2) VITERBI TRAIN CROSS-ENTROPY 6.47398 PERPLEXITY 88.8916
Hmm Iteration: 2 took: 1 seconds
-----------
Hmm: Iteration 3
A/D table contains 28719 parameters.
Hmm: (3) TRAIN CROSS-ENTROPY 4.95496 PERPLEXITY 31.0163
Hmm: (3) VITERBI TRAIN CROSS-ENTROPY 5.55296 PERPLEXITY 46.947
Hmm Iteration: 3 took: 0 seconds
-----------
Hmm: Iteration 4
A/D table contains 28719 parameters.
Hmm: (4) TRAIN CROSS-ENTROPY 4.42608 PERPLEXITY 21.4973
Hmm: (4) VITERBI TRAIN CROSS-ENTROPY 4.79448 PERPLEXITY 27.7512
Hmm Iteration: 4 took: 1 seconds
-----------
Hmm: Iteration 5
A/D table contains 28719 parameters.
Hmm: (5) TRAIN CROSS-ENTROPY 4.04137 PERPLEXITY 16.4655
Hmm: (5) VITERBI TRAIN CROSS-ENTROPY 4.29508 PERPLEXITY 19.6313
Hmm Iteration: 5 took: 1 seconds
Entire Hmm Training took: 3 seconds
==========================================================
Read classes: #words: 5097 #classes: 51
Read classes: #words: 6287 #classes: 51
Read classes: #words: 5097 #classes: 51
Read classes: #words: 6287 #classes: 51
==========================================================
Starting H333444: Viterbi Training
H333444 Training Started at: Fri Jun 29 20:05:13 2012
---------------------
THTo3: Iteration 1
#centers(pre/hillclimbed/real): 1 1 1 #al: 735.364
#alsophisticatedcountcollection: 0 #hcsteps: 0
#peggingImprovements: 0
A/D table contains 28719 parameters.
A/D table contains 27324 parameters.
p0_count is 17944.9 and p1 is 1521.04; p0 is 0.999 p1: 0.001
THTo3: TRAIN CROSS-ENTROPY 3.76471 PERPLEXITY 13.5922
THTo3: (1) TRAIN VITERBI CROSS-ENTROPY 3.85524 PERPLEXITY 14.4725
THTo3 Viterbi Iteration : 1 took: 0 seconds
---------------------
Model3: Iteration 2
#centers(pre/hillclimbed/real): 1 1 1 #al: 736.492
#alsophisticatedcountcollection: 0 #hcsteps: 2.522
#peggingImprovements: 0
A/D table contains 28719 parameters.
A/D table contains 27324 parameters.
p0_count is 19426.4 and p1 is 780.3; p0 is 0.999 p1: 0.001
Model3: TRAIN CROSS-ENTROPY 5.04396 PERPLEXITY 32.9901
Model3: (2) TRAIN VITERBI CROSS-ENTROPY 5.11314 PERPLEXITY 34.6106
Model3 Viterbi Iteration : 2 took: 1 seconds
---------------------
Model3: Iteration 3
#centers(pre/hillclimbed/real): 1 1 1 #al: 736.642
#alsophisticatedcountcollection: 0 #hcsteps: 2.654
#peggingImprovements: 0
A/D table contains 28719 parameters.
A/D table contains 27324 parameters.
p0_count is 19789.1 and p1 is 598.958; p0 is 0.999 p1: 0.001
Model3: TRAIN CROSS-ENTROPY 4.87336 PERPLEXITY 29.3108
Model3: (3) TRAIN VITERBI CROSS-ENTROPY 4.92739 PERPLEXITY 30.4292
Model3 Viterbi Iteration : 3 took: 0 seconds
---------------------
T3To4: Iteration 4
#centers(pre/hillclimbed/real): 1 1 1 #al: 736.79
#alsophisticatedcountcollection: 24.877 #hcsteps: 2.702
#peggingImprovements: 0
D4 table contains 510545 parameters.
A/D table contains 28719 parameters.
A/D table contains 27324 parameters.
p0_count is 19969.8 and p1 is 508.619; p0 is 0.999 p1: 0.001
T3To4: TRAIN CROSS-ENTROPY 4.81185 PERPLEXITY 28.0874
T3To4: (4) TRAIN VITERBI CROSS-ENTROPY 4.86024 PERPLEXITY 29.0455
T3To4 Viterbi Iteration : 4 took: 1 seconds
---------------------
Model4: Iteration 5
#centers(pre/hillclimbed/real): 1 1 1 #al: 736.754
#alsophisticatedcountcollection: 20.703 #hcsteps: 2.262
#peggingImprovements: 0
D4 table contains 510545 parameters.
A/D table contains 28719 parameters.
A/D table contains 27348 parameters.
p0_count is 19756.8 and p1 is 615.098; p0 is 0.999 p1: 0.001
Model4: TRAIN CROSS-ENTROPY 4.34695 PERPLEXITY 20.3499
Model4: (5) TRAIN VITERBI CROSS-ENTROPY 4.38164 PERPLEXITY 20.8452
Model4 Viterbi Iteration : 5 took: 1 seconds
---------------------
Model4: Iteration 6
#centers(pre/hillclimbed/real): 1 1 1 #al: 736.789
#alsophisticatedcountcollection: 15.938 #hcsteps: 2.201
#peggingImprovements: 0
D4 table contains 510545 parameters.
A/D table contains 28719 parameters.
A/D table contains 27348 parameters.
p0_count is 19766.9 and p1 is 610.062; p0 is 0.999 p1: 0.001
Model4: TRAIN CROSS-ENTROPY 4.2227 PERPLEXITY 18.6706
Model4: (6) TRAIN VITERBI CROSS-ENTROPY 4.25028 PERPLEXITY 19.031
Model4 Viterbi Iteration : 6 took: 1 seconds
H333444 Training Finished at: Fri Jun 29 20:05:17 2012
Entire Viterbi H333444 Training took: 4 seconds
==========================================================
Entire Training took: 8 seconds
Program Finished at: Fri Jun 29 20:05:17 2012
==========================================================
/Users/sam/moses-external-bins/snt2cooc.out /Users/sam/unfactored/corpus/de.vcb
/Users/sam/unfactored/corpus/en.vcb
/Users/sam/unfactored/corpus/en-de-int-train.snt >
/Users/sam/unfactored/giza.en-de/en-de.cooc
/Users/sam/moses-external-bins/GIZA++ -CoocurrenceFile
/Users/sam/unfactored/giza.en-de/en-de.cooc -c
/Users/sam/unfactored/corpus/en-de-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3
-model1dumpfrequency 1 -model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o
/Users/sam/unfactored/giza.en-de/en-de -onlyaldumps 1 -p0 0.999 -s
/Users/sam/unfactored/corpus/de.vcb -t /Users/sam/unfactored/corpus/en.vcb
Parameter 'coocurrencefile' changed from '' to
'/Users/sam/unfactored/giza.en-de/en-de.cooc'
Parameter 'c' changed from '' to
'/Users/sam/unfactored/corpus/en-de-int-train.snt'
Parameter 'm3' changed from '5' to '3'
Parameter 'm4' changed from '5' to '3'
Parameter 'model1dumpfrequency' changed from '0' to '1'
Parameter 'model4smoothfactor' changed from '0.2' to '0.4'
Parameter 'nodumps' changed from '0' to '1'
Parameter 'nsmooth' changed from '64' to '4'
Parameter 'o' changed from '112-06-29.200518.sam' to
'/Users/sam/unfactored/giza.en-de/en-de'
Parameter 'onlyaldumps' changed from '0' to '1'
Parameter 'p0' changed from '-1' to '0.999'
Parameter 's' changed from '' to '/Users/sam/unfactored/corpus/de.vcb'
Parameter 't' changed from '' to '/Users/sam/unfactored/corpus/en.vcb'
general parameters:
-------------------
ml = 101 (maximum sentence length)
No. of iterations:
-------------------
hmmiterations = 5 (mh)
model1iterations = 5 (number of iterations for Model 1)
model2iterations = 0 (number of iterations for Model 2)
model3iterations = 3 (number of iterations for Model 3)
model4iterations = 3 (number of iterations for Model 4)
model5iterations = 0 (number of iterations for Model 5)
model6iterations = 0 (number of iterations for Model 6)
parameter for various heuristics in GIZA++ for efficient training:
------------------------------------------------------------------
countincreasecutoff = 1e-06 (Counts increment cutoff threshold)
countincreasecutoffal = 1e-05 (Counts increment cutoff threshold for
alignments in training of fertility models)
mincountincrease = 1e-07 (minimal count increase)
peggedcutoff = 0.03 (relative cutoff probability for alignment-centers in
pegging)
probcutoff = 1e-07 (Probability cutoff threshold for lexicon probabilities)
probsmooth = 1e-07 (probability smoothing (floor) value )
parameters for describing the type and amount of output:
-----------------------------------------------------------
compactalignmentformat = 0 (0: detailled alignment format, 1: compact
alignment format )
hmmdumpfrequency = 0 (dump frequency of HMM)
l = 112-06-29.200518.sam.log (log file name)
log = 0 (0: no logfile; 1: logfile)
model1dumpfrequency = 1 (dump frequency of Model 1)
model2dumpfrequency = 0 (dump frequency of Model 2)
model345dumpfrequency = 0 (dump frequency of Model 3/4/5)
nbestalignments = 0 (for printing the n best alignments)
nodumps = 1 (1: do not write any files)
o = /Users/sam/unfactored/giza.en-de/en-de (output file prefix)
onlyaldumps = 1 (1: do not write any files)
outputpath = (output path)
transferdumpfrequency = 0 (output: dump of transfer from Model 2 to 3)
verbose = 0 (0: not verbose; 1: verbose)
verbosesentence = -10 (number of sentence for which a lot of information
should be printed (negative: no output))
parameters describing input files:
----------------------------------
c = /Users/sam/unfactored/corpus/en-de-int-train.snt (training corpus file
name)
d = (dictionary file name)
s = /Users/sam/unfactored/corpus/de.vcb (source vocabulary file name)
t = /Users/sam/unfactored/corpus/en.vcb (target vocabulary file name)
tc = (test corpus file name)
smoothing parameters:
---------------------
emalsmooth = 0.2 (f-b-trn: smoothing factor for HMM alignment model (can be
ignored by -emSmoothHMM))
model23smoothfactor = 0 (smoothing parameter for IBM-2/3 (interpolation with
constant))
model4smoothfactor = 0.4 (smooting parameter for alignment probabilities in
Model 4)
model5smoothfactor = 0.1 (smooting parameter for distortion probabilities in
Model 5 (linear interpolation with constant))
nsmooth = 4 (smoothing for fertility parameters (good value: 64): weight for
wordlength-dependent fertility parameters)
nsmoothgeneral = 0 (smoothing for fertility parameters (default: 0): weight
for word-independent fertility parameters)
parameters modifying the models:
--------------------------------
compactadtable = 1 (1: only 3-dimensional alignment table for IBM-2 and IBM-3)
deficientdistortionforemptyword = 0 (0: IBM-3/IBM-4 as described in (Brown et
al. 1993); 1: distortion model of empty word is deficient; 2: distoriton model
of empty word is deficient (differently); setting this parameter also helps to
avoid that during IBM-3 and IBM-4 training too many words are aligned with the
empty word)
depm4 = 76 (d_{=1}: &1:l, &2:m, &4:F, &8:E, d_{>1}&16:l, &32:m, &64:F, &128:E)
depm5 = 68 (d_{=1}: &1:l, &2:m, &4:F, &8:E, d_{>1}&16:l, &32:m, &64:F, &128:E)
emalignmentdependencies = 2 (lextrain: dependencies in the HMM alignment
model. &1: sentence length; &2: previous class; &4: previous position; &8:
French position; &16: French class)
emprobforempty = 0.4 (f-b-trn: probability for empty word)
parameters modifying the EM-algorithm:
--------------------------------------
m5p0 = -1 (fixed value for parameter p_0 in IBM-5 (if negative then it is
determined in training))
manlexfactor1 = 0 ()
manlexfactor2 = 0 ()
manlexmaxmultiplicity = 20 ()
maxfertility = 10 (maximal fertility for fertility models)
p0 = 0.999 (fixed value for parameter p_0 in IBM-3/4 (if negative then it is
determined in training))
pegging = 0 (0: no pegging; 1: do pegging)
general parameters:
-------------------
ml = 101 (maximum sentence length)
No. of iterations:
-------------------
hmmiterations = 5 (mh)
model1iterations = 5 (number of iterations for Model 1)
model2iterations = 0 (number of iterations for Model 2)
model3iterations = 3 (number of iterations for Model 3)
model4iterations = 3 (number of iterations for Model 4)
model5iterations = 0 (number of iterations for Model 5)
model6iterations = 0 (number of iterations for Model 6)
parameter for various heuristics in GIZA++ for efficient training:
------------------------------------------------------------------
countincreasecutoff = 1e-06 (Counts increment cutoff threshold)
countincreasecutoffal = 1e-05 (Counts increment cutoff threshold for
alignments in training of fertility models)
mincountincrease = 1e-07 (minimal count increase)
peggedcutoff = 0.03 (relative cutoff probability for alignment-centers in
pegging)
probcutoff = 1e-07 (Probability cutoff threshold for lexicon probabilities)
probsmooth = 1e-07 (probability smoothing (floor) value )
parameters for describing the type and amount of output:
-----------------------------------------------------------
compactalignmentformat = 0 (0: detailled alignment format, 1: compact
alignment format )
hmmdumpfrequency = 0 (dump frequency of HMM)
l = 112-06-29.200518.sam.log (log file name)
log = 0 (0: no logfile; 1: logfile)
model1dumpfrequency = 1 (dump frequency of Model 1)
model2dumpfrequency = 0 (dump frequency of Model 2)
model345dumpfrequency = 0 (dump frequency of Model 3/4/5)
nbestalignments = 0 (for printing the n best alignments)
nodumps = 1 (1: do not write any files)
o = /Users/sam/unfactored/giza.en-de/en-de (output file prefix)
onlyaldumps = 1 (1: do not write any files)
outputpath = (output path)
transferdumpfrequency = 0 (output: dump of transfer from Model 2 to 3)
verbose = 0 (0: not verbose; 1: verbose)
verbosesentence = -10 (number of sentence for which a lot of information
should be printed (negative: no output))
parameters describing input files:
----------------------------------
c = /Users/sam/unfactored/corpus/en-de-int-train.snt (training corpus file
name)
d = (dictionary file name)
s = /Users/sam/unfactored/corpus/de.vcb (source vocabulary file name)
t = /Users/sam/unfactored/corpus/en.vcb (target vocabulary file name)
tc = (test corpus file name)
smoothing parameters:
---------------------
emalsmooth = 0.2 (f-b-trn: smoothing factor for HMM alignment model (can be
ignored by -emSmoothHMM))
model23smoothfactor = 0 (smoothing parameter for IBM-2/3 (interpolation with
constant))
model4smoothfactor = 0.4 (smooting parameter for alignment probabilities in
Model 4)
model5smoothfactor = 0.1 (smooting parameter for distortion probabilities in
Model 5 (linear interpolation with constant))
nsmooth = 4 (smoothing for fertility parameters (good value: 64): weight for
wordlength-dependent fertility parameters)
nsmoothgeneral = 0 (smoothing for fertility parameters (default: 0): weight
for word-independent fertility parameters)
parameters modifying the models:
--------------------------------
compactadtable = 1 (1: only 3-dimensional alignment table for IBM-2 and IBM-3)
deficientdistortionforemptyword = 0 (0: IBM-3/IBM-4 as described in (Brown et
al. 1993); 1: distortion model of empty word is deficient; 2: distoriton model
of empty word is deficient (differently); setting this parameter also helps to
avoid that during IBM-3 and IBM-4 training too many words are aligned with the
empty word)
depm4 = 76 (d_{=1}: &1:l, &2:m, &4:F, &8:E, d_{>1}&16:l, &32:m, &64:F, &128:E)
depm5 = 68 (d_{=1}: &1:l, &2:m, &4:F, &8:E, d_{>1}&16:l, &32:m, &64:F, &128:E)
emalignmentdependencies = 2 (lextrain: dependencies in the HMM alignment
model. &1: sentence length; &2: previous class; &4: previous position; &8:
French position; &16: French class)
emprobforempty = 0.4 (f-b-trn: probability for empty word)
parameters modifying the EM-algorithm:
--------------------------------------
m5p0 = -1 (fixed value for parameter p_0 in IBM-5 (if negative then it is
determined in training))
manlexfactor1 = 0 ()
manlexfactor2 = 0 ()
manlexmaxmultiplicity = 20 ()
maxfertility = 10 (maximal fertility for fertility models)
p0 = 0.999 (fixed value for parameter p_0 in IBM-3/4 (if negative then it is
determined in training))
pegging = 0 (0: no pegging; 1: do pegging)
reading vocabulary files
Source vocabulary list has 6288 unique tokens
Target vocabulary list has 5098 unique tokens
Calculating vocabulary frequencies from corpus
/Users/sam/unfactored/corpus/en-de-int-train.snt
Reading more sentence pairs into memory ...
Corpus fits in memory, corpus has: 1000 sentence pairs.
Train total # sentence pairs (weighted): 1000
Size of source portion of the training corpus: 20987 tokens
Size of the target portion of the training corpus: 20365 tokens
In source portion of the training corpus, only 6287 unique tokens appeared
In target portion of the training corpus, only 5096 unique tokens appeared
lambda for PP calculation in IBM-1,IBM-2,HMM:= 20365/(21987-1000)== 0.970363
There are 292910 292910 entries in table
==========================================================
Model1 Training Started at: Fri Jun 29 20:05:18 2012
-----------
Model1: Iteration 1
Model1: (1) TRAIN CROSS-ENTROPY 12.5126 PERPLEXITY 5843.34
Model1: (1) VITERBI TRAIN CROSS-ENTROPY 17.0956 PERPLEXITY 140054
Model 1 Iteration: 1 took: 0 seconds
-----------
Model1: Iteration 2
Model1: (2) TRAIN CROSS-ENTROPY 5.83597 PERPLEXITY 57.122
Model1: (2) VITERBI TRAIN CROSS-ENTROPY 8.96147 PERPLEXITY 498.508
Model 1 Iteration: 2 took: 0 seconds
-----------
Model1: Iteration 3
Model1: (3) TRAIN CROSS-ENTROPY 5.45191 PERPLEXITY 43.7711
Model1: (3) VITERBI TRAIN CROSS-ENTROPY 8.15832 PERPLEXITY 285.693
Model 1 Iteration: 3 took: 0 seconds
-----------
Model1: Iteration 4
Model1: (4) TRAIN CROSS-ENTROPY 5.26308 PERPLEXITY 38.4012
Model1: (4) VITERBI TRAIN CROSS-ENTROPY 7.60196 PERPLEXITY 194.276
Model 1 Iteration: 4 took: 0 seconds
-----------
Model1: Iteration 5
Model1: (5) TRAIN CROSS-ENTROPY 5.15986 PERPLEXITY 35.7497
Model1: (5) VITERBI TRAIN CROSS-ENTROPY 7.26275 PERPLEXITY 153.569
Model 1 Iteration: 5 took: 0 seconds
Entire Model1 Training took: 0 seconds
NOTE: I am doing iterations with the HMM model!
Read classes: #words: 6287 #classes: 51
Read classes: #words: 5097 #classes: 51
==========================================================
Hmm Training Started at: Fri Jun 29 20:05:18 2012
-----------
Hmm: Iteration 1
A/D table contains 27583 parameters.
Hmm: (1) TRAIN CROSS-ENTROPY 5.09948 PERPLEXITY 34.2845
Hmm: (1) VITERBI TRAIN CROSS-ENTROPY 7.04569 PERPLEXITY 132.119
Hmm Iteration: 1 took: 1 seconds
-----------
Hmm: Iteration 2
A/D table contains 27583 parameters.
Hmm: (2) TRAIN CROSS-ENTROPY 5.05472 PERPLEXITY 33.2371
Hmm: (2) VITERBI TRAIN CROSS-ENTROPY 6.14789 PERPLEXITY 70.9088
Hmm Iteration: 2 took: 1 seconds
-----------
Hmm: Iteration 3
A/D table contains 27583 parameters.
Hmm: (3) TRAIN CROSS-ENTROPY 4.51059 PERPLEXITY 22.7942
Hmm: (3) VITERBI TRAIN CROSS-ENTROPY 5.13864 PERPLEXITY 35.2277
Hmm Iteration: 3 took: 0 seconds
-----------
Hmm: Iteration 4
A/D table contains 27583 parameters.
Hmm: (4) TRAIN CROSS-ENTROPY 3.93751 PERPLEXITY 15.3218
Hmm: (4) VITERBI TRAIN CROSS-ENTROPY 4.31435 PERPLEXITY 19.8952
Hmm Iteration: 4 took: 1 seconds
-----------
Hmm: Iteration 5
A/D table contains 27583 parameters.
Hmm: (5) TRAIN CROSS-ENTROPY 3.57073 PERPLEXITY 11.8822
Hmm: (5) VITERBI TRAIN CROSS-ENTROPY 3.82587 PERPLEXITY 14.1808
Hmm Iteration: 5 took: 1 seconds
Entire Hmm Training took: 4 seconds
==========================================================
Read classes: #words: 6287 #classes: 51
Read classes: #words: 5097 #classes: 51
Read classes: #words: 6287 #classes: 51
Read classes: #words: 5097 #classes: 51
==========================================================
Starting H333444: Viterbi Training
H333444 Training Started at: Fri Jun 29 20:05:22 2012
---------------------
THTo3: Iteration 1
#centers(pre/hillclimbed/real): 1 1 1 #al: 728.798
#alsophisticatedcountcollection: 0 #hcsteps: 0
#peggingImprovements: 0
A/D table contains 27583 parameters.
A/D table contains 28393 parameters.
p0_count is 17172.1 and p1 is 1596.44; p0 is 0.999 p1: 0.001
THTo3: TRAIN CROSS-ENTROPY 3.28646 PERPLEXITY 9.75715
THTo3: (1) TRAIN VITERBI CROSS-ENTROPY 3.37376 PERPLEXITY 10.3658
THTo3 Viterbi Iteration : 1 took: 0 seconds
---------------------
Model3: Iteration 2
#centers(pre/hillclimbed/real): 1 1 1 #al: 730.032
#alsophisticatedcountcollection: 0 #hcsteps: 2.56
#peggingImprovements: 0
A/D table contains 27583 parameters.
A/D table contains 28393 parameters.
p0_count is 18879.8 and p1 is 742.594; p0 is 0.999 p1: 0.001
Model3: TRAIN CROSS-ENTROPY 4.58527 PERPLEXITY 24.0051
Model3: (2) TRAIN VITERBI CROSS-ENTROPY 4.6569 PERPLEXITY 25.227
Model3 Viterbi Iteration : 2 took: 1 seconds
---------------------
Model3: Iteration 3
#centers(pre/hillclimbed/real): 1 1 1 #al: 730.15
#alsophisticatedcountcollection: 0 #hcsteps: 2.774
#peggingImprovements: 0
A/D table contains 27583 parameters.
A/D table contains 28393 parameters.
p0_count is 19305.4 and p1 is 529.782; p0 is 0.999 p1: 0.001
Model3: TRAIN CROSS-ENTROPY 4.4032 PERPLEXITY 21.1591
Model3: (3) TRAIN VITERBI CROSS-ENTROPY 4.46107 PERPLEXITY 22.025
Model3 Viterbi Iteration : 3 took: 0 seconds
---------------------
T3To4: Iteration 4
#centers(pre/hillclimbed/real): 1 1 1 #al: 730.216
#alsophisticatedcountcollection: 25.321 #hcsteps: 2.802
#peggingImprovements: 0
D4 table contains 512981 parameters.
A/D table contains 27583 parameters.
A/D table contains 28393 parameters.
p0_count is 19504.7 and p1 is 430.15; p0 is 0.999 p1: 0.001
T3To4: TRAIN CROSS-ENTROPY 4.32785 PERPLEXITY 20.0822
T3To4: (4) TRAIN VITERBI CROSS-ENTROPY 4.37912 PERPLEXITY 20.8088
T3To4 Viterbi Iteration : 4 took: 1 seconds
---------------------
Model4: Iteration 5
#centers(pre/hillclimbed/real): 1 1 1 #al: 730.206
#alsophisticatedcountcollection: 21.808 #hcsteps: 2.362
#peggingImprovements: 0
D4 table contains 512981 parameters.
A/D table contains 27583 parameters.
A/D table contains 28393 parameters.
p0_count is 19311.9 and p1 is 526.566; p0 is 0.999 p1: 0.001
Model4: TRAIN CROSS-ENTROPY 3.9899 PERPLEXITY 15.8884
Model4: (5) TRAIN VITERBI CROSS-ENTROPY 4.02534 PERPLEXITY 16.2836
Model4 Viterbi Iteration : 5 took: 1 seconds
---------------------
Model4: Iteration 6
#centers(pre/hillclimbed/real): 1 1 1 #al: 730.136
#alsophisticatedcountcollection: 16.958 #hcsteps: 2.35
#peggingImprovements: 0
D4 table contains 512981 parameters.
A/D table contains 27583 parameters.
A/D table contains 28393 parameters.
p0_count is 19351.3 and p1 is 506.847; p0 is 0.999 p1: 0.001
Model4: TRAIN CROSS-ENTROPY 3.83768 PERPLEXITY 14.2974
Model4: (6) TRAIN VITERBI CROSS-ENTROPY 3.86522 PERPLEXITY 14.5729
Model4 Viterbi Iteration : 6 took: 1 seconds
H333444 Training Finished at: Fri Jun 29 20:05:26 2012
Entire Viterbi H333444 Training took: 4 seconds
==========================================================
Entire Training took: 8 seconds
Program Finished at: Fri Jun 29 20:05:26 2012
==========================================================
FILE: /Users/sam/mosesdecoder/factored-corpus/proj-syndicate.1000.en
FILE: /Users/sam/mosesdecoder/factored-corpus/proj-syndicate.1000.de
FILE: /Users/sam/unfactored/model/aligned.grow-diag-final
MAX 7 0 0
Started Fri Jun 29 20:05:27 2012
total=1000 line-per-split=1001
/Users/sam/moses/scripts/generic/score-parallel.perl 1 "sort "
/Users/sam/moses/scripts/../bin/score
/Users/sam/unfactored/model/extract.sorted.gz
/Users/sam/unfactored/model/lex.f2e
/Users/sam/unfactored/model/phrase-table.half.f2e.gz 0
Started Fri Jun 29 20:05:28 2012
/Users/sam/moses/scripts/generic/score-parallel.perl 1 "sort "
/Users/sam/moses/scripts/../bin/score
/Users/sam/unfactored/model/extract.inv.sorted.gz
/Users/sam/unfactored/model/lex.e2f
/Users/sam/unfactored/model/phrase-table.half.e2f.gz --Inverse 1
Started Fri Jun 29 20:05:28 2012
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support