Hi, I'm trying to train a tree-to-string model and have gotten train-model.perl to run properly, but when I decode, the decoder seems to use nothing but the glue rules. My train-model.perl command is as follows:
$SCRIPTS_ROOTDIR/training/train-model.perl -scripts-root-dir
$SCRIPTS_ROOTDIR -root-dir work -corpus moses-0128.ja-en -f en -e ja
-alignment grow-diag-final-and -lm 0:3:`pwd`/ja-ja.arpa -source-syntax
-glue-grammar -extract-options "--MinWords 0 --MinHoleSource 1
--MinWords 0 --NonTermConsecSource" &> err.txt
Using this model, a decoded sentence looks like this in Moses's output and the
trace file:
Translating: <s> he revolutionized the japanese ink painting . </s>
||| [0,0]=X (1) [0,1]=X (1) [0,2]=X (1) [0,3]=X (1) [0,4]=X (1)
[0,5]=X (1) [0,6]=X (1) [0,7]=X (1) [0,8]=X (1) [1,1]=X (1) [1,1]=NP
(1) [1,1]=PRP (1) [1,2]=X (1) [1,3]=X (1) [1,4]=X (1) [1,5]=X (1)
[1,6]=X (1) [1,7]=X (1) [1,7]=S (1) [1,8]=X (1) [2,2]=X (1) [2,2]=VBD
(1) [2,3]=X (1) [2,4]=X (1) [2,5]=X (1) [2,6]=X (1) [2,6]=VP (1)
[2,7]=X (1) [2,8]=X (1) [3,3]=X (1) [3,3]=DT (1) [3,4]=X (1) [3,5]=X
(1) [3,6]=X (1) [3,6]=NP (1) [3,7]=X (1) [3,8]=X (1) [4,4]=X (1)
[4,4]=JJ (1) [4,5]=X (1) [4,6]=X (1) [4,7]=X (1) [4,8]=X (1) [5,5]=X
(1) [5,5]=NN (1) [5,6]=X (1) [5,7]=X (1) [5,8]=X (1) [6,6]=X (1)
[6,6]=NN (1) [6,7]=X (1) [6,8]=X (1) [7,7]=X (1) [7,7]=. (1) [7,8]=X
(1) [8,8]=X (1)
Num of hypo = 10065 --- cells:
0 1 2 3 4 5 6 7 8
1 20 2 19 19 1 2 20 0
20 40 0 200 41 4 0 0
2 0 0 200 86 0 0
20 0 0 2 0 0
39 0 6 12 0
43 176 29 0
27 168 0
11 0
1
BEST TRANSLATION: 10055 S </s> :0-0 : pC=0.000, c=-0.519 [0..8] 9016
[total=-6.947] <<-5.646, 0.000, -17.078, -0.975, -11.623, -8.514,
-23.155, 4.000, 4.000>>
つまり 日本 の 水墨 画 を 一変 さ せ た 。
Trans Opt 0 [0..8]: [8..8]=</s> [0..7]=S : S ->S </s> :0-0 : pC=0,
c=-0.519086 -6.94668<<-5.64583, 0, -17.0777, -0.974503, -11.6227,
-8.51364, -23.1549, 3.99959, 3.99959>>
Trans Opt 0 [0..7]: [7..7]=X [0..6]=S : S ->S X :0-0 1-1 :
pC=0.999896, c=0.999896 -7.38058<<-5.21153, 0, -17.077, -0.974503,
-11.6227, -8.51364, -23.1549, 3.99959, 3.99959>>
Trans Opt 0 [0..6]: [3..6]=X [0..2]=S : S ->S X :0-0 1-1 :
pC=0.999896, c=0.999896 -8.60037<<-4.77724, 0, -17.0175, -0.9619,
-11.5975, -7.04926, -22.7336, 2.99969, 2.99969>>
Trans Opt 0 [0..2]: [2..2]=X [0..1]=S : S ->S X :0-0 1-1 :
pC=0.999896, c=0.999896 -3.05231<<-1.73718, 0, -5.89485, -0.9619,
-4.65396, -6.35611, -9.23707, 1.99979, 1.99979>>
Trans Opt 0 [0..1]: [1..1]=X [0..0]=S : S ->S X :0-0 1-1 :
pC=0.999896, c=0.999896 -2.31227<<-0.868589, 0, -4.07733, 0, 0,
-5.66296, -6.04736, 0.999896, 0.999896>>
Trans Opt 0 [0..0]: [0..0]=<s> : S -><s> :: pC=0, c=0.434294
0.434294<<-0.434294, 0, 0, 0, 0, 0, 0, 0, 0>>
Trans Opt 0 [1..1]: [1..1]=he : X ->つまり :: pC=-2.14208, c=-3.88913
-3.88913<<-0.434294, 0, -4.36269, 0, 0, -5.66296, -6.04736, 0.999896,
0>>
Trans Opt 0 [2..2]: [2..2]=revolutionized : X ->日本 の :: pC=-1.69976,
c=-2.8204 -2.8204<<-0.868589, 0, -3.97844, -0.9619, -4.65396,
-0.693147, -3.18971, 0.999896, 0>>
Trans Opt 0 [3..6]: [6..6]=painting [5..5]=ink [4..4]=japanese
[3..3]=the : X ->水墨 画 を 一変 さ せ た :: pC=-4.02668, c=-7.27479
-7.27479<<-3.04006, 0, -12.5763, 0, -6.94358, -0.693147, -13.4966,
0.999896, 0>>
Trans Opt 0 [7..7]: [7..7]=. : X ->。 :: pC=-0.184696, c=-0.720382
-0.720382<<-0.434294, 0, -1.93996, -0.0126031, -0.025123, -1.46439,
-0.421262, 0.999896, 0>>
I would expect the translation to use non-terminals other than "X" and
"S", but it does not seem to be using syntactic labels at all (for any of
the sentences, not just the first). I've attached part of the rule table
and the glue grammar file. Any clues about why it isn't using any
syntactic information in translation?
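To check this over a whole trace file rather than by eye, something like the following minimal sketch could tally the left-hand-side labels of the rules that were applied. It assumes the "Trans Opt" line format shown above, where the label sits between " : " and "->"; the function name count_lhs_labels is made up for illustration and is not part of Moses.

```python
import re
from collections import Counter

# Assumed format (taken from the trace above):
#   Trans Opt 0 [0..7]: [7..7]=X [0..6]=S : S ->S X :0-0 1-1 : pC=...
# The label captured is the token between " : " and "->".
# re.S lets the pattern span line breaks if the trace has been re-wrapped.
_LHS = re.compile(r"Trans Opt \d+ \[\d+\.\.\d+\]:.*?\s:\s+(\S+)\s+->", re.S)


def count_lhs_labels(trace_text):
    """Count how often each left-hand-side label appears in 'Trans Opt' entries."""
    return Counter(m.group(1) for m in _LHS.finditer(trace_text))


if __name__ == "__main__":
    # Three entries copied (shortened) from the trace above.
    sample = (
        "Trans Opt 0 [0..7]: [7..7]=X [0..6]=S : S ->S X :0-0 1-1 : pC=0.999896\n"
        "Trans Opt 0 [0..0]: [0..0]=<s> : S -><s> :: pC=0, c=0.434294\n"
        "Trans Opt 0 [1..1]: [1..1]=he : X ->つまり :: pC=-2.14208\n"
    )
    for label, n in count_lhs_labels(sample).most_common():
        print(label, n)
```

If only "S" and "X" ever show up in the counts, that would match what I am seeing: no rule with a syntactic left-hand side is being applied.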
Thanks in advance,
Graham
rule-table.bz2
Description: BZip2 compressed data
glue-grammar
Description: Binary data
