Dear all, I've managed to train a hierarchical model using the following command:
nohup /mosesdecoder/scripts/training/train-model.perl --hierarchical \
  --glue-grammar --score-options="--GoodTuring" \
  -root-dir /disque2/Preparation/syntactic/hierarchical_PSCT \
  -corpus /disque2/Preparation/backoff/PSCT.tok.uni.low -f fr -e en \
  -lm 0:5:/disque2/Preparation/syntactic/hierarchical/LM.sur.en.blm \
  -external-bin-dir /root/external-bin-dir/ -mgiza -mgiza-cpus 30 \
  >& training.out &
The resulting model works great but it is very slow.
So I used CreateOnDiskPt to binarise the rule table as follows:
/home/Moses/mosesdecoder/bin/CreateOnDiskPt 1 1 5 20 2 model/rule-table.gz rules-table
This outputs the following files:
-rw-r--r-- 1 root root 85 20 mars 11:35 Misc.dat
-rw-r--r-- 1 root root 781509322 20 mars 11:35 Source.dat
-rw-r--r-- 1 root root 1758588137 20 mars 11:35 TargetColl.dat
-rw-r--r-- 1 root root 1555353459 20 mars 11:35 TargetInd.dat
-rw-r--r-- 1 root root 400714 20 mars 11:35 Vocab.dat
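As a sanity check on the binarisation output, something like the sketch below could confirm that all five files are present in the output directory. The function name check_ondisk_pt is hypothetical (it is not part of Moses), and the list of expected files is simply taken from the listing above.

```shell
#!/bin/sh
# Hypothetical helper, not part of Moses: verify that a directory holds the
# five files that CreateOnDiskPt produced in the listing above.
check_ondisk_pt() {
    dir="$1"
    # The on-disk table is expected to be a directory, not a single file.
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir"
        return 1
    fi
    missing=0
    for f in Misc.dat Source.dat TargetColl.dat TargetInd.dat Vocab.dat; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    # Return 0 when every file was found, 1 otherwise.
    return "$missing"
}
```

With the command above, it would be called as: check_ondisk_pt rules-table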
I've also updated my moses.ini as follows:
6 0 0 1 /disque2/Preparation/syntactic/hierarchical_PSCT/rule-table
6 0 0 1 /disque2/Preparation/syntactic/hierarchical_PSCT/model/glue-grammar
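To double-check that these entries point at something that exists on disk, a rough sketch like the following could be used. It assumes the two lines above sit under a [ttable-file] section header in moses.ini and that the path is the last field of each entry; list_ttable_paths and report_ttable_paths are hypothetical names, not Moses tools.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Moses.

# Print the last field (assumed to be the path) of every entry found in the
# [ttable-file] section of the given moses.ini.
list_ttable_paths() {
    awk '
        /^\[/ { in_section = ($1 == "[ttable-file]"); next }
        in_section && NF > 0 && $1 !~ /^#/ { print $NF }
    ' "$1"
}

# Report, for each path, whether anything exists at that location.
report_ttable_paths() {
    list_ttable_paths "$1" | while read -r p; do
        if [ -e "$p" ]; then
            echo "found: $p"
        else
            echo "not found: $p"
        fi
    done
}
```

It would be run as: report_ttable_paths moses.ini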
But when I try to use it, I get this:
Defined parameters (per moses.ini or switch):
config: moses.ini
cube-pruning-pop-limit: 1000
input-factors: 0
inputtype: 3
lmodel-file: 0 0 5 /disque2/Preparation/syntactic/hierarchical/LM.sur.en.blm
mapping: 0 T 0 1 T 1
max-chart-span: 20 1000
non-terminals: X
search-algorithm: 3
ttable-file: 6 0 0 1 /disque2/Preparation/syntactic/hierarchical_PSCT/rule-table 6 0 0 1 /disque2/Preparation/syntactic/hierarchical_PSCT/model/glue-grammar
ttable-limit: 20
weight-l: 0.5000
weight-t: 0.20 0.20 0.20 0.20 0.20 1.0
weight-w: -1
/mosesdecoder/bin
ScoreProducer: WordPenalty start: 0 end: 1
ScoreProducer: !UnknownWordPenalty start: 1 end: 2
Loading lexical distortion models...have 0 models
Start loading LanguageModel /disque2/Preparation/syntactic/hierarchical/LM.sur.en.blm : [0.000] seconds
/disque2/Preparation/syntactic/hierarchical/LM.sur.en.blm: line 80492: reached EOF before \end\
ScoreProducer: LM start: 2 end: 3
Finished loading LanguageModels : [0.056] seconds
Using uniform ttable-limit of 20 for all translation tables.
Start loading PhraseTable /disque2/Preparation/syntactic/hierarchical_PSCT/rule-table : [0.056] seconds
filePath: /disque2/Preparation/syntactic/hierarchical_PSCT/rule-table
ScoreProducer: PhraseModel start: 3 end: 4
Start loading PhraseTable /disque2/Preparation/syntactic/hierarchical_PSCT/model/glue-grammar : [0.056] seconds
filePath: /disque2/Preparation/syntactic/hierarchical_PSCT/model/glue-grammar
ScoreProducer: PhraseModel:2 start: 4 end: 5
Finished loading phrase tables : [0.056] seconds
max-chart-span: 20
max-chart-span: 1000
Start loading phrase table from /disque2/Preparation/syntactic/hierarchical_PSCT/rule-table : [0.056] seconds
Can't read /disque2/Preparation/syntactic/hierarchical_PSCT/rule-table
So I am wondering whether I did something wrong in my training
command or binarisation, or in the parameters in moses.ini.
Many thanks
Regards
MA
--
The Translation Trustee
1, Place Charles de Gaulle
78180 Montigny-le-Bretonneux
Tel: +33 1 30 44 04 23  Mobile: +33 7 61 44 40 84
Email: [email protected]
Website: www.linguacustodia.com - www.thetranslationtrustee.com
